
Client Library API

The WebLLM client library provides both Vercel AI SDK-compatible and OpenAI SDK-compatible interfaces. It includes robust installation prompts and readiness detection to make integration seamless.

promptInstall()

Shows an interactive modal that guides users through extension installation. This is the recommended way to handle a missing extension.

Returns: Promise<void>

Throws: Error if installation is cancelled or the browser is not supported

import { promptInstall, generateText } from 'webllm';

try {
  await promptInstall();
  // Extension is now ready
  const result = await generateText({ prompt: 'Hello' });
} catch (error) {
  console.error('Installation failed:', error.message);
}

Features:

  • Shows browser-specific installation links (Chrome Web Store, Edge Add-ons)
  • Waits for extension to auto-connect after installation
  • Provides refresh option if timeout occurs
  • Handles unsupported browsers gracefully
  • Returns immediately if the extension is already installed (see the sketch below)
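
Because the call resolves immediately when the extension is already present, it can be wired straight to a user action. A minimal sketch, assuming a hypothetical #ask-ai button in the page:

// Safe to call on every click: resolves immediately when the
// extension is already installed, otherwise shows the modal.
// '#ask-ai' is an illustrative selector, not part of the library.
import { promptInstall, generateText } from 'webllm';

document.querySelector('#ask-ai')?.addEventListener('click', async () => {
  try {
    await promptInstall();
    const result = await generateText({ prompt: 'Hello' });
    console.log(result.text);
  } catch (error) {
    console.error('AI unavailable:', error.message);
  }
});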

webLlmReady(timeout?)

Waits for the WebLLM extension to become available and ready. Works across page refreshes and extension installations.

Parameters:

  • timeout (number, optional) - Maximum time to wait in milliseconds (default: 30000)

Returns: Promise<void>

Throws: Error if the timeout expires or the browser is not supported

import { webLlmReady, generateText } from 'webllm';

try {
  // Wait up to 10 seconds
  await webLlmReady(10000);
  // Extension is ready
  const result = await generateText({ prompt: 'Hello' });
} catch (error) {
  console.error('Extension not available:', error.message);
}

Use Cases:

  • Progressive enhancement (short timeout, silent failure; see the sketch below)
  • App initialization (wait for extension before enabling features)
  • Post-installation detection (listens for extension connection)
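
For the progressive-enhancement case, a minimal sketch (enableAiFeatures is a hypothetical app callback, not part of webllm):

import { webLlmReady } from 'webllm';

// Try briefly; if the extension is absent, fail silently and keep
// the non-AI experience. enableAiFeatures is hypothetical.
webLlmReady(2000)
  .then(() => enableAiFeatures())
  .catch(() => { /* no extension: leave AI features disabled */ });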

isAvailable()

Synchronously checks whether the WebLLM extension is currently installed and available.

Returns: boolean

import { isAvailable } from 'webllm';

if (isAvailable()) {
  console.log('Extension is installed');
} else {
  console.log('Extension not found');
}

getBrowserInfo()

Returns information about browser compatibility with WebLLM.

Returns: BrowserInfo

interface BrowserInfo {
  isSupported: boolean;
  browserName: string;
  reason?: string;
  installUrl?: string;
}

import { getBrowserInfo } from 'webllm';

const info = getBrowserInfo();
console.log('Browser:', info.browserName);
console.log('Supported:', info.isSupported);
if (!info.isSupported) {
  console.log('Reason:', info.reason);
} else if (info.installUrl) {
  console.log('Install from:', info.installUrl);
}

Detected Browsers:

  • Chrome/Chromium (supported)
  • Microsoft Edge (supported)
  • Firefox (coming soon)
  • Safari (not supported)
  • Mobile browsers (not supported)
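
Combining this check with the install prompt avoids showing a dead-end modal on unsupported browsers. A minimal sketch using only the functions documented here (top-level await assumes a module context):

import { getBrowserInfo, promptInstall } from 'webllm';

// Only offer installation where the extension can actually run.
const info = getBrowserInfo();
if (info.isSupported) {
  await promptInstall();
} else {
  console.warn(`WebLLM unavailable on ${info.browserName}: ${info.reason}`);
}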

generateText(options)

Generates text using a Vercel AI SDK-compatible interface.

Parameters:

  • options (GenerateTextOptions)

interface GenerateTextOptions {
  // Intelligent model selection (recommended)
  task?: 'general' | 'summarization' | 'translation' | 'qa' | 'coding' | 'creative' | 'extraction';
  hints?: ModelHints; // Performance and capability hints

  // Expert overrides (optional)
  model?: string; // Specific model name (e.g., 'claude-sonnet-4', 'gpt-4o')
  provider?: string; // Specific provider ('anthropic', 'openai', etc.)

  // Content
  system?: string; // System prompt
  prompt?: string; // User prompt (alternative to messages)
  messages?: Message[]; // Conversation messages

  // Generation parameters
  temperature?: number; // Randomness (0.0-1.0)
  maxTokens?: number; // Maximum tokens to generate
  topP?: number; // Nucleus sampling
  topK?: number; // Top-k sampling
  frequencyPenalty?: number; // Reduce repetition (0.0-2.0)
  presencePenalty?: number; // Encourage new topics (0.0-2.0)
  stopSequences?: string[]; // Stop generation at these strings
}

interface ModelHints {
  speed?: 'fastest' | 'fast' | 'balanced' | 'quality';
  quality?: 'draft' | 'standard' | 'high' | 'best';
  maxModelSize?: number; // Max model size in GB
  maxMemory?: number; // Max memory in GB
  capabilities?: {
    multilingual?: boolean;
    codeGeneration?: boolean;
    reasoning?: boolean;
    longContext?: boolean;
    math?: boolean;
    functionCalling?: boolean;
  };
  modelId?: string; // Force specific model
  excludeModels?: string[]; // Exclude models from selection
}

interface Message {
  role: 'user' | 'assistant' | 'system';
  content: string;
}

Returns: Promise<GenerateTextResult>

interface GenerateTextResult {
  text: string;
  finishReason: string;
  usage: {
    promptTokens: number;
    completionTokens: number;
    totalTokens: number;
  };
  model?: string;
  provider?: string;
  requestId?: string;
  timestamp?: number;
}

Simple prompt:

import { generateText } from 'webllm';

const result = await generateText({
  prompt: 'Write a haiku about coding'
});
console.log(result.text);

With task and hints (intelligent selection):

const result = await generateText({
  task: 'creative',
  hints: {
    quality: 'high'
  },
  prompt: 'Write a haiku about coding'
});
console.log(result.text);

With system prompt:

const result = await generateText({
  system: 'You are a helpful translator.',
  prompt: 'Translate to Spanish: Hello, how are you?'
});
console.log(result.text);

Multi-turn conversation:

const result = await generateText({
  messages: [
    { role: 'user', content: 'What is the capital of France?' },
    { role: 'assistant', content: 'The capital of France is Paris.' },
    { role: 'user', content: 'What is its population?' }
  ]
});
console.log(result.text);

With all parameters:

const result = await generateText({
  model: 'claude-sonnet-4',
  prompt: 'Write a creative story',
  temperature: 0.9,
  maxTokens: 500,
  topP: 0.95,
  frequencyPenalty: 0.5,
  presencePenalty: 0.5,
  stopSequences: ['THE END']
});
console.log(result.text);
console.log('Tokens used:', result.usage.totalTokens);

Task-based routing for coding:

const result = await generateText({
  task: 'coding',
  hints: {
    quality: 'best',
    capabilities: {
      reasoning: true,
      codeGeneration: true
    }
  },
  prompt: 'Write a React component for a todo list'
});
console.log(result.text);

Expert override - specify provider:

// Force Anthropic (when you need a specific provider)
const result1 = await generateText({
  provider: 'anthropic',
  model: 'claude-sonnet-4',
  prompt: 'Explain quantum computing'
});

// Force OpenAI
const result2 = await generateText({
  provider: 'openai',
  model: 'gpt-4o',
  prompt: 'Explain quantum computing'
});
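
Resource-constrained selection (a sketch: the hint values and the excluded model id are illustrative, not recommendations):

const result = await generateText({
  task: 'summarization',
  hints: {
    speed: 'fast',
    maxModelSize: 4,                       // prefer models under ~4 GB
    excludeModels: ['example-large-model'] // illustrative id
  },
  prompt: 'Summarize the following article: ...'
});
console.log(result.text);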

chat.completions.create(options)

An OpenAI-compatible chat completions interface.

Parameters:

  • options (ChatCompletionOptions)

interface ChatCompletionOptions {
  model?: string;
  messages: ChatCompletionMessage[];
  temperature?: number;
  max_tokens?: number;
  top_p?: number;
  frequency_penalty?: number;
  presence_penalty?: number;
  stop?: string | string[];
  stream?: boolean; // Not yet supported
}

interface ChatCompletionMessage {
  role: 'user' | 'assistant' | 'system';
  content: string;
  name?: string;
}

Returns: Promise<ChatCompletionResponse>

interface ChatCompletionResponse {
  id: string;
  object: 'chat.completion';
  created: number;
  model: string;
  choices: Array<{
    index: number;
    message: ChatCompletionMessage;
    finish_reason: string;
  }>;
  usage: {
    prompt_tokens: number;
    completion_tokens: number;
    total_tokens: number;
  };
}

Basic chat:

import { webllm } from 'webllm';

const completion = await webllm.chat.completions.create({
  model: 'gpt-4o',
  messages: [
    { role: 'user', content: 'Hello!' }
  ]
});
console.log(completion.choices[0].message.content);

With system message:

const completion = await webllm.chat.completions.create({
  model: 'claude-sonnet-4',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Explain quantum computing' }
  ],
  temperature: 0.7,
  max_tokens: 500
});
console.log(completion.choices[0].message.content);

Drop-in OpenAI replacement:

// Import as openai for seamless replacement
import { webllm as openai } from 'webllm';

// Now use exactly like OpenAI SDK
const completion = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Hello!' }]
});

WebLLMClient

import { WebLLMClient } from 'webllm';
const client = new WebLLMClient();

The constructor automatically:

  • Initializes readiness promise tracking
  • Listens for webllm:ready events
  • Checks if extension is already installed
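
If you want to react to readiness yourself, the same signal can be observed directly. A minimal sketch, assuming the webllm:ready event is dispatched on window (an assumption based on the constructor behavior described above):

// Assumption: the extension dispatches 'webllm:ready' on window,
// which is what the client constructor listens for.
window.addEventListener('webllm:ready', () => {
  console.log('WebLLM extension connected');
});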

All the functions above are also available as instance methods:

const client = new WebLLMClient();

// Installation & setup
await client.promptInstall();
await client.webLlmReady(10000);
const available = client.isAvailable();
const browserInfo = client.getBrowserInfo();

// Text generation
const result = await client.generateText({ prompt: 'Hello' });

// OpenAI-compatible
const completion = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Hello' }]
});

Complete Integration Example

import {
  promptInstall,
  webLlmReady,
  generateText,
  getBrowserInfo
} from 'webllm';

async function initializeApp() {
  // Check browser compatibility
  const browserInfo = getBrowserInfo();
  if (!browserInfo.isSupported) {
    showMessage(`WebLLM is not supported on ${browserInfo.browserName}`);
    return;
  }

  // Try to wait for the extension (with a short timeout)
  try {
    await webLlmReady(3000);
    enableAiFeatures();
  } catch {
    // Extension not installed, continue without AI features
    disableAiFeatures();
  }
}

async function handleAiRequest(prompt) {
  try {
    // Prompt for installation if needed
    await promptInstall();

    // Generate text
    const result = await generateText({
      prompt,
      temperature: 0.7,
      maxTokens: 500
    });
    displayResult(result.text);
    console.log('Tokens used:', result.usage.totalTokens);
  } catch (error) {
    if (error.message.includes('cancelled')) {
      showMessage('Installation was cancelled');
    } else {
      showError('AI request failed: ' + error.message);
    }
  }
}

// Initialize on page load
initializeApp();

TypeScript Support

Full TypeScript definitions are included:

import type {
  BrowserInfo,
  GenerateTextOptions,
  GenerateTextResult,
  ChatCompletionOptions,
  ChatCompletionResponse,
  Message,
  Usage
} from 'webllm';
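
The exported types keep call sites type-checked. A minimal sketch (summarize is a hypothetical helper, not part of the library):

import { generateText } from 'webllm';
import type { GenerateTextOptions, GenerateTextResult } from 'webllm';

// Hypothetical helper: the compiler validates option names and values.
async function summarize(text: string): Promise<GenerateTextResult> {
  const options: GenerateTextOptions = {
    task: 'summarization',
    maxTokens: 200,
    prompt: `Summarize:\n${text}`
  };
  return generateText(options);
}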

Error Handling

Extension not available:

try {
  await generateText({ prompt: 'Hello' });
} catch (error) {
  if (error.message.includes('extension not available')) {
    // Extension not installed
    await promptInstall();
  }
}

Browser not supported:

try {
  await promptInstall();
} catch (error) {
  if (error.message.includes('not supported')) {
    // Show browser compatibility message
    const info = getBrowserInfo();
    console.log(info.reason);
  }
}

Installation cancelled:

try {
  await promptInstall();
} catch (error) {
  if (error.message.includes('cancelled')) {
    // User clicked cancel
    console.log('User cancelled installation');
  }
}

Best Practices

  1. Use promptInstall() for user-initiated features - Best UX for required AI features
  2. Use webLlmReady() for optional features - Progressive enhancement with timeout
  3. Check getBrowserInfo() before prompting - Avoid showing prompts on unsupported browsers
  4. Always provide fallbacks - App should work without AI features
  5. Handle installation cancellation - User might click cancel
  6. Cache readiness state - Don’t prompt repeatedly in the same session (see the sketch below)
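
A minimal caching sketch for point 6 (ensureReady is a hypothetical helper, not part of the library):

import { promptInstall } from 'webllm';

// Cache the install promise so repeated calls in one session reuse
// the same prompt instead of opening the modal again.
let readyPromise = null;

function ensureReady() {
  if (!readyPromise) {
    readyPromise = promptInstall().catch((error) => {
      readyPromise = null; // allow a retry after cancellation
      throw error;
    });
  }
  return readyPromise;
}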