# Client SDK API
## Installation
```bash
npm install webllm
# or
yarn add webllm
# or
pnpm add webllm
```

## Basic Usage
```typescript
import { generateText } from 'webllm';

const result = await generateText({
  prompt: 'Explain quantum computing in simple terms',
});

console.log(result.text);
```

## Core Functions
Section titled “Core Functions”generateText(options)
Generate text using the WebLLM extension with a Vercel AI SDK-compatible interface.
#### Parameters
```typescript
interface GenerateTextOptions {
  // Intelligent model selection (recommended)
  task?:
    | 'general'
    | 'summarization'
    | 'translation'
    | 'qa'
    | 'coding'
    | 'creative'
    | 'extraction';
  hints?: ModelHints;

  // Expert overrides (optional)
  model?: string; // Specific model name (e.g., 'claude-sonnet-4', 'gpt-4o')
  provider?: string; // Specific provider ('anthropic', 'openai', etc.)

  // Content
  system?: string; // System prompt
  prompt?: string; // User prompt (alternative to messages)
  messages?: Message[]; // Conversation messages

  // Generation parameters
  temperature?: number; // Randomness (0.0-1.0, default: 0.7)
  maxTokens?: number; // Maximum tokens to generate
  topP?: number; // Nucleus sampling (0.0-1.0)
  topK?: number; // Top-k sampling
  frequencyPenalty?: number; // Reduce repetition (0.0-2.0)
  presencePenalty?: number; // Encourage new topics (0.0-2.0)
  stopSequences?: string[]; // Stop generation at these strings
}

interface ModelHints {
  speed?: 'fastest' | 'fast' | 'balanced' | 'quality';
  quality?: 'draft' | 'standard' | 'high' | 'best';
  maxModelSize?: number; // Max model size in GB
  maxMemory?: number; // Max memory in GB
  capabilities?: {
    multilingual?: boolean;
    codeGeneration?: boolean;
    reasoning?: boolean;
    longContext?: boolean;
    math?: boolean;
    functionCalling?: boolean;
  };
  modelId?: string; // Force specific model
  excludeModels?: string[]; // Exclude models from selection
}

interface Message {
  role: 'user' | 'assistant' | 'system';
  content: string;
}
```

#### Returns
```typescript
CancellablePromise<GenerateTextResult>;

// CancellablePromise extends Promise with cancellation support
interface CancellablePromise<T> extends Promise<T> {
  cancel(): void; // Cancel the in-flight request
  readonly requestId: string; // Unique identifier for this request
}

interface GenerateTextResult {
  text: string; // Generated text
  finishReason: string; // Reason for completion ('stop', 'length', etc.)
  usage: {
    promptTokens: number; // Tokens in the prompt
    completionTokens: number; // Tokens in the completion
    totalTokens: number; // Total tokens used
  };
  model?: string; // Model that was used
  provider?: string; // Provider that was used
  requestId?: string; // Unique request ID
  timestamp?: number; // Unix timestamp
}
```
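The result's fields can be consumed directly. A brief sketch (the prompt and logged values here are illustrative, not part of the API):

```typescript
import { generateText } from 'webllm';

// Illustrative usage of the documented result fields.
const result = await generateText({ prompt: 'Summarize WebGPU in one sentence' });

console.log(result.text); // the generated text
console.log(result.finishReason); // e.g., 'stop' or 'length'
console.log(result.usage.totalTokens); // promptTokens + completionTokens
if (result.model && result.provider) {
  console.log(`Served by ${result.provider}/${result.model}`);
}
```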
#### Examples

Simple prompt:
```typescript
const result = await generateText({
  prompt: 'Write a haiku about coding',
});
```

With system prompt:
```typescript
const result = await generateText({
  system: 'You are a helpful translator.',
  prompt: 'Translate to Spanish: Hello, how are you?',
});
```

Multi-turn conversation:
```typescript
const result = await generateText({
  messages: [
    { role: 'user', content: 'What is the capital of France?' },
    { role: 'assistant', content: 'The capital of France is Paris.' },
    { role: 'user', content: 'What is its population?' },
  ],
});
```

With request cancellation:
```typescript
// All requests return CancellablePromise - cancel anytime before resolution
const request = generateText({ prompt: 'Write a long story' });

// Cancel after 5 seconds if still running
setTimeout(() => request.cancel(), 5000);

try {
  const result = await request;
  console.log(result.text);
} catch (error) {
  if (error.name === 'AbortError') {
    console.log('Request was cancelled');
  }
}
```

With generation parameters:
```typescript
const result = await generateText({
  prompt: 'Write a creative story',
  temperature: 0.9,
  maxTokens: 500,
  topP: 0.95,
  frequencyPenalty: 0.5,
  presencePenalty: 0.5,
});
```

Task-based intelligent routing:
```typescript
const result = await generateText({
  task: 'coding',
  hints: {
    quality: 'best',
    capabilities: {
      reasoning: true,
      codeGeneration: true,
    },
  },
  prompt: 'Write a React component for a todo list',
});
```

Force specific provider:
```typescript
const result = await generateText({
  provider: 'anthropic',
  model: 'claude-sonnet-4',
  prompt: 'Explain quantum computing',
});
```

### promptInstall()
Shows an interactive modal that guides users through extension installation.
#### Parameters
None
#### Returns
```typescript
Promise<void>;
```

#### Throws
- Error if installation is cancelled by the user
- Error if the browser is not supported
#### Example
```typescript
import { promptInstall, generateText } from 'webllm';

try {
  await promptInstall();
  // Extension is now ready
  const result = await generateText({ prompt: 'Hello' });
} catch (error) {
  console.error('Installation failed:', error.message);
}
```

### webLlmReady(timeout?)
Waits for the WebLLM extension to become available and ready.
#### Parameters
- `timeout` (number, optional) - Maximum time to wait in milliseconds (default: 30000)
#### Returns
```typescript
Promise<void>;
```

#### Throws
- Error if the timeout expires
- Error if the browser is not supported
#### Example
```typescript
import { webLlmReady, generateText } from 'webllm';

try {
  await webLlmReady(10000); // Wait up to 10 seconds
  const result = await generateText({ prompt: 'Hello' });
} catch (error) {
  console.error('Extension not available:', error.message);
}
```

### isAvailable()
Synchronously checks if the WebLLM extension is currently installed and available.
#### Parameters
None
#### Returns
```typescript
boolean;
```

#### Example
```typescript
import { isAvailable } from 'webllm';

if (isAvailable()) {
  console.log('Extension is installed');
} else {
  console.log('Extension not found');
}
```

### getBrowserInfo()
Returns information about browser compatibility with WebLLM.
#### Parameters
None
#### Returns
```typescript
BrowserInfo;

interface BrowserInfo {
  isSupported: boolean; // Whether WebLLM is supported on this browser
  browserName: string; // Browser name (e.g., 'Chrome', 'Edge', 'Firefox')
  reason?: string; // Reason if not supported
  installUrl?: string; // URL to install the extension
}
```

#### Example
```typescript
import { getBrowserInfo } from 'webllm';

const info = getBrowserInfo();

console.log('Browser:', info.browserName);
console.log('Supported:', info.isSupported);

if (!info.isSupported) {
  console.log('Reason:', info.reason);
} else if (info.installUrl) {
  console.log('Install from:', info.installUrl);
}
```

## OpenAI SDK Compatible API
### webllm.chat.completions.create(options)
An OpenAI-compatible chat completions interface.
#### Parameters
```typescript
interface ChatCompletionOptions {
  model?: string; // Model name (optional)
  messages: ChatCompletionMessage[]; // Conversation messages
  temperature?: number; // Randomness (0.0-2.0)
  max_tokens?: number; // Maximum tokens to generate
  top_p?: number; // Nucleus sampling (0.0-1.0)
  frequency_penalty?: number; // Reduce repetition (0.0-2.0)
  presence_penalty?: number; // Encourage new topics (0.0-2.0)
  stop?: string | string[]; // Stop sequences
  stream?: boolean; // Enable streaming (not yet supported)
}

interface ChatCompletionMessage {
  role: 'user' | 'assistant' | 'system';
  content: string;
  name?: string; // Optional message author name
}
```

#### Returns
```typescript
Promise<ChatCompletionResponse>;

interface ChatCompletionResponse {
  id: string;
  object: 'chat.completion';
  created: number; // Unix timestamp
  model: string;
  choices: Array<{
    index: number;
    message: ChatCompletionMessage;
    finish_reason: string;
  }>;
  usage: {
    prompt_tokens: number;
    completion_tokens: number;
    total_tokens: number;
  };
}
```
#### Examples

Basic chat:
```typescript
import { webllm } from 'webllm';

const completion = await webllm.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Hello!' }],
});

console.log(completion.choices[0].message.content);
```

With system message:
```typescript
const completion = await webllm.chat.completions.create({
  model: 'claude-sonnet-4',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Explain quantum computing' },
  ],
  temperature: 0.7,
  max_tokens: 500,
});
```

Drop-in OpenAI replacement:
```typescript
// Import as openai for seamless replacement
import { webllm as openai } from 'webllm';

const completion = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Hello!' }],
});
```

## WebLLMClient Class
### Constructor
```typescript
import { WebLLMClient } from 'webllm';

const client = new WebLLMClient();
```

The constructor automatically:
- Initializes readiness promise tracking
- Listens for `webllm:ready` events (sketched below)
- Checks if the extension is already installed
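If you need to observe readiness outside the client, one option is to listen for the same event yourself. This is a sketch that assumes `webllm:ready` is dispatched on `window`; the dispatch target and any event payload are assumptions, not documented API:

```typescript
// Sketch: assumes the extension dispatches 'webllm:ready' on window.
// The dispatch target and any event payload are assumptions here.
window.addEventListener(
  'webllm:ready',
  () => {
    console.log('WebLLM extension signalled readiness');
  },
  { once: true }, // readiness only needs to be observed once per session
);
```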
### Instance Methods
All functions above are available as instance methods:
```typescript
const client = new WebLLMClient();

// Installation & setup
await client.promptInstall();
await client.webLlmReady(10000);
const available = client.isAvailable();
const browserInfo = client.getBrowserInfo();

// Text generation
const result = await client.generateText({ prompt: 'Hello' });

// OpenAI-compatible
const completion = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Hello' }],
});
```

## TypeScript Support
Full TypeScript definitions are included:
```typescript
import type {
  BrowserInfo,
  CancellablePromise,
  GenerateTextOptions,
  GenerateTextResult,
  ChatCompletionOptions,
  ChatCompletionResponse,
  Message,
  ModelHints,
  Usage,
} from 'webllm';
```
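These types compose with your own code as usual. For example, a small typed wrapper (the helper itself is illustrative, not part of the SDK):

```typescript
import { generateText } from 'webllm';
import type { GenerateTextOptions, GenerateTextResult } from 'webllm';

// Illustrative helper, not part of the SDK: forwards fully typed options
// and narrows the result to the generated text plus token usage.
async function generate(
  options: GenerateTextOptions,
): Promise<Pick<GenerateTextResult, 'text' | 'usage'>> {
  const { text, usage } = await generateText(options);
  return { text, usage };
}
```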
## Error Handling

### Extension Not Available
Section titled “Extension Not Available”try { await generateText({ prompt: 'Hello' });} catch (error) { if (error.message.includes('extension not available')) { await promptInstall(); }}Browser Not Supported
Section titled “Browser Not Supported”try { await promptInstall();} catch (error) { if (error.message.includes('not supported')) { const info = getBrowserInfo(); console.log(info.reason); }}Installation Cancelled
Section titled “Installation Cancelled”try { await promptInstall();} catch (error) { if (error.message.includes('cancelled')) { console.log('User cancelled installation'); }}Request Cancelled
```typescript
const request = generateText({ prompt: 'Hello' });

// Cancel the request
request.cancel();

try {
  await request;
} catch (error) {
  if (error.name === 'AbortError') {
    console.log('Request was cancelled');
  }
}
```

## Best Practices
- Check browser compatibility first - Use `getBrowserInfo()` before prompting for installation
- Use `promptInstall()` for required features - Best UX for features that need AI
- Use `webLlmReady()` for optional features - Progressive enhancement with a timeout
- Always provide fallbacks - The app should work without AI features (see the sketch after this list)
- Handle installation cancellation - The user might click cancel
- Cache readiness state - Don't prompt repeatedly in the same session (see the sketch after this list)
- Use task-based routing - Let WebLLM select the best model for your use case
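A minimal sketch combining cached readiness with a non-AI fallback; the helper names, the 5-second timeout, and the truncation fallback are illustrative choices, not SDK API:

```typescript
import { generateText, webLlmReady } from 'webllm';

// Illustrative pattern: cache the readiness check for the session so
// repeated calls don't wait (or prompt) again.
let readiness: Promise<boolean> | undefined;

function ensureReady(): Promise<boolean> {
  readiness ??= webLlmReady(5000).then(
    () => true,
    () => false, // timeout, or browser not supported
  );
  return readiness;
}

// Degrade gracefully: fall back to a non-AI path if the extension never appears.
async function summarize(text: string): Promise<string> {
  if (!(await ensureReady())) {
    return text.slice(0, 200); // fallback: simple truncation
  }
  const result = await generateText({ task: 'summarization', prompt: text });
  return result.text;
}
```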
## Related Documentation
- SDK Getting Started - Integration tutorial
- Vercel AI Provider - Using with Vercel AI SDK
- Best Practices - Production tips
- Gateway Service - Server-side token generation