# Developer Best Practices
Building with WebLLM is different from traditional AI integrations. This guide covers best practices for creating great user experiences while respecting user control and privacy.
## Core Principles
### 1. Respect User Control
Users choose the providers, not you. Design accordingly.
❌ Don’t:
```js
// Don't assume a specific provider
const response = await client.generate({
  model: 'gpt-4', // User might not have OpenAI configured!
  prompt: 'Hello'
});
```

✅ Do:
```js
// Describe your intent with task and hints
const response = await generateText({
  task: 'general',
  hints: { speed: 'fast' },
  prompt: 'Hello'
  // WebLLM intelligently selects the best provider based on task/hints
});
```
### 2. Handle Extension Not Installed

Always handle the case where users don’t have the extension yet.
✅ Good UX:
```js
import { promptInstall, isAvailable } from 'webllm';

async function initializeAI() {
  try {
    // Prompt for installation if needed
    await promptInstall();

    // Now safe to use WebLLM
    return true;
  } catch (error) {
    // User declined or browser not supported
    console.log('WebLLM not available:', error.message);
    return false;
  }
}

// Use it
const canUseAI = await initializeAI();

if (canUseAI) {
  // Show AI features
} else {
  // Hide AI features or show fallback
}
```
### 3. Fail Gracefully

AI features should enhance your app, not break it.
✅ Progressive enhancement:
```js
import { webLlmReady } from 'webllm';

// Feature detection
const hasAI = await webLlmReady().catch(() => false);

if (hasAI) {
  // Enhanced experience with AI
  showAISummaryButton();
} else {
  // Core functionality still works
  hideAISummaryButton();
}
```
## Installation & Setup

### Recommended Installation Flow
**Step 1: Check availability**
```js
import { isAvailable, getBrowserInfo } from 'webllm';

function checkWebLLM() {
  if (isAvailable()) {
    return { available: true };
  }

  const browserInfo = getBrowserInfo();
  return {
    available: false,
    supported: browserInfo.isSupported,
    installUrl: browserInfo.installUrl,
    reason: browserInfo.reason
  };
}
```

**Step 2: Show appropriate UI**
```js
const status = checkWebLLM();

if (status.available) {
  // Show AI features
  showAIFeatures();
} else if (status.supported) {
  // Show install prompt
  showInstallPrompt(status.installUrl);
} else {
  // Browser not supported, hide AI features
  hideAIFeatures();
  console.warn('WebLLM not supported:', status.reason);
}
```

**Step 3: Use `promptInstall()` for best UX**
```js
import { promptInstall } from 'webllm';

async function enableAIFeatures() {
  try {
    // Shows modal with install instructions
    await promptInstall();

    // User installed and configured
    showAIFeatures();
    return true;
  } catch (error) {
    // User cancelled
    return false;
  }
}
```
## Request Patterns

### Simple Text Generation
For straightforward prompts:
Using the WebLLM SDK directly:

```js
import { generateText } from 'webllm';

async function summarizeArticle(articleText) {
  try {
    const result = await generateText({
      task: 'summarization',
      hints: { quality: 'high', speed: 'balanced' },
      prompt: `Summarize this article:\n\n${articleText}`,
      maxTokens: 200,
      temperature: 0.3 // Lower for factual tasks
    });

    return result.text;
  } catch (error) {
    console.error('Summarization failed:', error);
    // Fallback: return first paragraph
    return articleText.split('\n\n')[0];
  }
}
```

Or with the Vercel AI SDK provider:

```js
import { generateText } from 'ai';
import { webllm } from 'webllm-ai-provider';

async function summarizeArticle(articleText) {
  try {
    const result = await generateText({
      model: webllm({
        task: 'summarization',
        hints: { quality: 'high', speed: 'balanced' }
      }),
      prompt: `Summarize this article:\n\n${articleText}`,
      maxTokens: 200,
      temperature: 0.3 // Lower for factual tasks
    });

    return result.text;
  } catch (error) {
    console.error('Summarization failed:', error);
    // Fallback: return first paragraph
    return articleText.split('\n\n')[0];
  }
}
```
### Chat Conversations

For multi-turn conversations:
Using the WebLLM SDK directly:

```js
import { streamText } from 'webllm';

async function chatWithAI(messages, onChunk) {
  const stream = await streamText({
    messages: [
      { role: 'system', content: 'You are a helpful assistant' },
      ...messages
    ]
  });

  let fullResponse = '';

  for await (const chunk of stream) {
    fullResponse += chunk.text;
    onChunk(chunk.text); // Update UI incrementally
  }

  return fullResponse;
}
```

Or with the Vercel AI SDK provider:

```js
import { streamText } from 'ai';
import { webllm } from 'webllm-ai-provider';

async function chatWithAI(messages, onChunk) {
  const stream = await streamText({
    model: webllm('browser'),
    messages: [
      { role: 'system', content: 'You are a helpful assistant' },
      ...messages
    ]
  });

  let fullResponse = '';

  for await (const textPart of stream.textStream) {
    fullResponse += textPart;
    onChunk(textPart); // Update UI incrementally
  }

  return fullResponse;
}
```
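A typical call site keeps the running conversation and appends both sides of each exchange. A minimal sketch built on the `chatWithAI` helper above (`renderIncremental` is an illustrative UI callback):

```js
const history = [];

async function sendUserMessage(text) {
  history.push({ role: 'user', content: text });

  // Stream the reply into the UI, then record it in the history
  const reply = await chatWithAI(history, renderIncremental);
  history.push({ role: 'assistant', content: reply });
  return reply;
}
```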
### Structured Output

For JSON or structured data:
Using the WebLLM SDK directly:

```js
import { generateText } from 'webllm';

async function extractData(text) {
  const result = await generateText({
    task: 'extraction',
    hints: {
      quality: 'high',
      capabilities: {
        functionCalling: true // Prefer models with structured output
      }
    },
    prompt: `Extract contact information from this text as JSON:

${text}

Return only valid JSON with fields: name, email, phone`,
    temperature: 0.1, // Very low for structured output
    maxTokens: 500
  });

  try {
    return JSON.parse(result.text);
  } catch (error) {
    console.error('Failed to parse JSON:', error);
    return null;
  }
}
```

Or with the Vercel AI SDK provider:

```js
import { generateText } from 'ai';
import { webllm } from 'webllm-ai-provider';

async function extractData(text) {
  const result = await generateText({
    model: webllm({
      task: 'extraction',
      hints: {
        quality: 'high',
        capabilities: {
          functionCalling: true // Prefer models with structured output
        }
      }
    }),
    prompt: `Extract contact information from this text as JSON:

${text}

Return only valid JSON with fields: name, email, phone`,
    temperature: 0.1, // Very low for structured output
    maxTokens: 500
  });

  try {
    return JSON.parse(result.text);
  } catch (error) {
    console.error('Failed to parse JSON:', error);
    return null;
  }
}
```
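`JSON.parse` only guarantees valid JSON, not the shape you asked for. A minimal follow-up check for the fields requested in the prompt above (the `isContact` helper and the `text` input are illustrative):

```js
function isContact(value) {
  // Check the exact fields the prompt asked for: name, email, phone
  return (
    value !== null &&
    typeof value === 'object' &&
    typeof value.name === 'string' &&
    typeof value.email === 'string' &&
    typeof value.phone === 'string'
  );
}

const contact = await extractData(text);
if (contact && !isContact(contact)) {
  console.warn('Extraction returned an unexpected shape:', contact);
}
```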
## Error Handling

### Comprehensive Error Handling
```js
import { generateText, WebLLMError } from 'webllm';

async function generateWithRetry(prompt, maxRetries = 2) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      const result = await generateText({ prompt });
      return result.text;
    } catch (error) {
      if (error instanceof WebLLMError) {
        switch (error.code) {
          case 'EXTENSION_NOT_FOUND':
            // Extension not installed
            throw new Error('Please install WebLLM extension');

          case 'NO_PROVIDER_AVAILABLE':
            // User hasn't configured any providers
            throw new Error('Please configure an AI provider in WebLLM');

          case 'PERMISSION_DENIED':
            // User denied permission
            throw new Error('Permission denied');

          case 'RATE_LIMIT_EXCEEDED':
            // Rate limited, retry after delay
            if (i < maxRetries - 1) {
              await new Promise(resolve => setTimeout(resolve, 2000));
              continue;
            }
            throw new Error('Rate limit exceeded. Please try again later.');

          case 'PROVIDER_ERROR':
            // Provider-specific error
            console.error('Provider error:', error.message);
            throw error;

          default:
            throw error;
        }
      }
      throw error;
    }
  }
}
```
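In application code, the retry helper pairs naturally with the friendly messages shown next. A sketch (`showResult` and `showErrorToUser` are illustrative UI helpers):

```js
try {
  const text = await generateWithRetry('Explain WebLLM in one sentence');
  showResult(text);
} catch (error) {
  showErrorToUser(getUserFriendlyError(error));
}
```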
### User-Friendly Error Messages

```js
function getUserFriendlyError(error) {
  if (!error) return 'Unknown error occurred';

  const errorMessages = {
    'EXTENSION_NOT_FOUND': 'WebLLM extension not installed. Please install it to use AI features.',
    'NO_PROVIDER_AVAILABLE': 'No AI provider configured. Please set up a provider in WebLLM settings.',
    'PERMISSION_DENIED': 'Permission denied. Click the WebLLM icon to grant access.',
    'RATE_LIMIT_EXCEEDED': 'Too many requests. Please wait a moment and try again.',
    'QUOTA_EXCEEDED': 'Your usage quota has been exceeded. Check your WebLLM settings.',
    'MODEL_NOT_AVAILABLE': 'The requested model is not available. Try a different provider.',
    'NETWORK_ERROR': 'Network error. Please check your connection and try again.'
  };

  return errorMessages[error.code] || error.message || 'An error occurred';
}

// Usage
try {
  await generateText({ prompt: 'Hello' });
} catch (error) {
  showErrorToUser(getUserFriendlyError(error));
}
```
## Performance Optimization

### Cache Responses
For repeated queries:
```js
const cache = new Map();

async function generateWithCache(prompt) {
  // Check cache first
  if (cache.has(prompt)) {
    return cache.get(prompt);
  }

  // Generate
  const result = await generateText({ prompt });

  // Cache result
  cache.set(prompt, result.text);

  // Limit cache size
  if (cache.size > 100) {
    const firstKey = cache.keys().next().value;
    cache.delete(firstKey);
  }

  return result.text;
}
```
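If your calls vary in more than the prompt (task, temperature, maxTokens), key the cache on everything that affects the output, not just the prompt string. A minimal sketch (the `cacheKey` helper is illustrative):

```js
function cacheKey(options) {
  // Build a stable key from every input that affects the response
  return JSON.stringify({
    prompt: options.prompt,
    task: options.task,
    maxTokens: options.maxTokens,
    temperature: options.temperature
  });
}

async function generateCached(options) {
  const key = cacheKey(options);
  if (cache.has(key)) return cache.get(key);

  const result = await generateText(options);
  cache.set(key, result.text);
  return result.text;
}
```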
### Debounce User Input

For real-time features:
```js
function debounce(fn, delay) {
  let timeout;
  return (...args) => {
    clearTimeout(timeout);
    timeout = setTimeout(() => fn(...args), delay);
  };
}

// Usage: AI-powered autocomplete
const generateSuggestions = debounce(async (text) => {
  const result = await generateText({
    prompt: `Complete this text: ${text}`,
    maxTokens: 50
  });
  updateSuggestions(result.text);
}, 500); // Wait 500ms after typing stops

inputElement.addEventListener('input', (e) => {
  generateSuggestions(e.target.value);
});
```
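Debouncing alone doesn’t guarantee ordering: a slow earlier request can resolve after a faster later one and overwrite fresher suggestions. A minimal sketch that tags each request with a sequence number and drops stale responses:

```js
let latestRequestId = 0;

const generateSuggestions = debounce(async (text) => {
  const requestId = ++latestRequestId;
  const result = await generateText({
    prompt: `Complete this text: ${text}`,
    maxTokens: 50
  });

  // Ignore responses that arrive after a newer request has started
  if (requestId === latestRequestId) {
    updateSuggestions(result.text);
  }
}, 500);
```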
### Show Progress for Long Operations

```js
import { streamText } from 'webllm';

async function generateWithProgress(prompt, onProgress) {
  onProgress({ status: 'starting', percent: 0 });

  const stream = await streamText({ prompt });
  let chunks = 0;
  let fullText = '';

  for await (const chunk of stream) {
    fullText += chunk.text;
    chunks++;

    // Update progress
    onProgress({
      status: 'generating',
      percent: Math.min(chunks * 5, 95), // Estimate
      text: fullText
    });
  }

  onProgress({ status: 'complete', percent: 100, text: fullText });
  return fullText;
}
```
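A sketch wiring the progress callback to a progress bar (the element IDs are illustrative):

```js
const bar = document.getElementById('progress-bar');
const output = document.getElementById('progress-output');

const story = await generateWithProgress('Write a short story', ({ percent, text }) => {
  bar.style.width = `${percent}%`;
  if (text !== undefined) output.textContent = text;
});
```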
## UI/UX Best Practices

### Show Extension Status
Let users know if WebLLM is available:
```jsx
import { isAvailable } from 'webllm';

function WebLLMStatus() {
  const available = isAvailable();

  return (
    <div className="status-badge">
      {available ? (
        <span className="status-active">
          ✓ AI Features Enabled
        </span>
      ) : (
        <button onClick={handleInstall}>
          Enable AI Features (Install WebLLM)
        </button>
      )}
    </div>
  );
}
```
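The component above assumes a `handleInstall` handler. A minimal sketch built on `promptInstall` from the setup flow earlier:

```js
import { promptInstall } from 'webllm';

async function handleInstall() {
  try {
    await promptInstall();
    // Installed: re-check availability and re-render
  } catch {
    // User declined; keep the install button visible
  }
}
```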
### Streaming for Better UX

Show responses as they generate:
```js
import { streamText } from 'webllm';

async function displayStreamingResponse(prompt, outputElement) {
  outputElement.textContent = ''; // Clear

  const stream = await streamText({ prompt });

  for await (const chunk of stream) {
    outputElement.textContent += chunk.text;
    // Scroll to bottom
    outputElement.scrollTop = outputElement.scrollHeight;
  }
}
```
### Loading States

Show clear feedback during generation:
```js
async function generateAndDisplay(prompt) {
  const button = document.getElementById('generate-btn');
  const output = document.getElementById('output');

  // Show loading state
  button.disabled = true;
  button.textContent = 'Generating...';
  output.innerHTML = '<div class="spinner">⏳ Thinking...</div>';

  try {
    const result = await generateText({ prompt });
    output.textContent = result.text;
  } catch (error) {
    output.innerHTML = `<div class="error">Error: ${getUserFriendlyError(error)}</div>`;
  } finally {
    button.disabled = false;
    button.textContent = 'Generate';
  }
}
```
## Privacy & Security

### Don’t Log User Prompts
Respect user privacy: don’t send prompts to your analytics.
❌ Don’t:
```js
const result = await generateText({ prompt });
analytics.track('ai_generation', { prompt }); // Don't do this!
```

✅ Do:

```js
const result = await generateText({ prompt });
analytics.track('ai_generation', {
  promptLength: prompt.length,
  tokensUsed: result.usage.totalTokens
  // No actual prompt content
});
```
### Sanitize User Input

Always sanitize prompts built from user input:
```js
function sanitizePrompt(userInput) {
  // Remove potentially harmful content
  return userInput
    .replace(/<script>/gi, '')
    .replace(/javascript:/gi, '')
    .trim()
    .substring(0, 10000); // Limit length
}

async function generateFromUserInput(userInput) {
  const sanitized = sanitizePrompt(userInput);
  return await generateText({ prompt: sanitized });
}
```
### Clear Sensitive Data

Don’t keep sensitive data in memory longer than needed:
```js
async function processSecureData(sensitiveData) {
  try {
    const result = await generateText({
      prompt: `Analyze: ${sensitiveData}`
    });

    return result.text;
  } finally {
    // Drop the local reference so it can be garbage-collected
    // (JavaScript strings are immutable; memory can't be zeroed directly)
    sensitiveData = null;
  }
}
```
## Testing

### Mock for Tests
Use mocks in unit tests:
```js
import { generateText } from 'webllm';
// summarizeArticle comes from your app code (see Simple Text Generation above)

// Mock in test
vi.mock('webllm', () => ({
  generateText: vi.fn(async ({ prompt }) => ({
    text: 'Mocked response',
    model: 'mock-model',
    usage: { totalTokens: 10 }
  })),
  isAvailable: vi.fn(() => true)
}));

// Test
test('summarizes article', async () => {
  const summary = await summarizeArticle('Long article...');
  expect(summary).toBe('Mocked response');
  // objectContaining: summarizeArticle also passes task and hints
  expect(generateText).toHaveBeenCalledWith(
    expect.objectContaining({
      prompt: expect.stringContaining('Summarize'),
      maxTokens: 200,
      temperature: 0.3
    })
  );
});
```
### Test Without Extension

Test graceful degradation:
```jsx
import { render } from '@testing-library/react';
import { isAvailable } from 'webllm';

test('handles missing extension', async () => {
  // Mock extension not available
  vi.mocked(isAvailable).mockReturnValue(false);

  const app = render(<App />);

  // AI features should be hidden
  expect(app.queryByText('AI Summary')).toBeNull();

  // Core features should work
  expect(app.getByText('Read Article')).toBeTruthy();
});
```
## Framework-Specific Tips

### React

Use hooks for better integration:
```jsx
import { useState, useEffect } from 'react';
import { isAvailable, promptInstall, generateText } from 'webllm';

function useWebLLM() {
  const [available, setAvailable] = useState(false);
  const [loading, setLoading] = useState(true);

  useEffect(() => {
    setAvailable(isAvailable());
    setLoading(false);
  }, []);

  return { available, loading };
}

function useGenerate() {
  const [generating, setGenerating] = useState(false);
  const [error, setError] = useState(null);

  const generate = async (prompt) => {
    setGenerating(true);
    setError(null);

    try {
      const result = await generateText({ prompt });
      return result.text;
    } catch (err) {
      setError(err);
      throw err;
    } finally {
      setGenerating(false);
    }
  };

  return { generate, generating, error };
}

// Usage
function AISummary({ text }) {
  const { available } = useWebLLM();
  const { generate, generating, error } = useGenerate();
  const [summary, setSummary] = useState('');

  if (!available) return null;

  const handleSummarize = async () => {
    const result = await generate(`Summarize: ${text}`);
    setSummary(result);
  };

  return (
    <div>
      <button onClick={handleSummarize} disabled={generating}>
        {generating ? 'Summarizing...' : 'Summarize'}
      </button>
      {error && <div className="error">{getUserFriendlyError(error)}</div>}
      {summary && <div className="summary">{summary}</div>}
    </div>
  );
}
```

### Vue

Use composables:
```js
import { ref, onMounted } from 'vue';
import { isAvailable, generateText } from 'webllm';

export function useWebLLM() {
  const available = ref(false);

  onMounted(() => {
    available.value = isAvailable();
  });

  return { available };
}

export function useGenerate() {
  const generating = ref(false);
  const error = ref(null);

  const generate = async (prompt) => {
    generating.value = true;
    error.value = null;

    try {
      const result = await generateText({ prompt });
      return result.text;
    } catch (err) {
      error.value = err;
      throw err;
    } finally {
      generating.value = false;
    }
  };

  return { generate, generating, error };
}
```
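For symmetry with the React usage example above, a minimal sketch using these composables inside a component’s `setup` (the module path is illustrative; template omitted):

```js
import { ref } from 'vue';
import { useWebLLM, useGenerate } from './webllm';

export default {
  setup() {
    const { available } = useWebLLM();
    const { generate, generating, error } = useGenerate();
    const summary = ref('');

    async function handleSummarize() {
      summary.value = await generate('Summarize this...');
    }

    return { available, generating, error, summary, handleSummarize };
  }
};
```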
### Svelte

Use stores:
```js
// webllm.js
import { writable } from 'svelte/store';
import { isAvailable, generateText } from 'webllm';

export const webllmAvailable = writable(isAvailable());

export async function generate(prompt) {
  const result = await generateText({ prompt });
  return result.text;
}
```

```svelte
<!-- Component.svelte -->
<script>
  import { webllmAvailable, generate } from './webllm';

  let generating = false;
  let summary = '';

  async function handleSummarize() {
    generating = true;
    summary = await generate('Summarize this...');
    generating = false;
  }
</script>

{#if $webllmAvailable}
  <button on:click={handleSummarize} disabled={generating}>
    Summarize
  </button>
  {#if summary}
    <p>{summary}</p>
  {/if}
{/if}
```
## Checklist

Before shipping WebLLM features:
- Handle extension not installed gracefully
- Test with extension disabled
- Add loading states for all AI operations
- Implement error handling with user-friendly messages
- Don’t assume specific providers or models
- Don’t log user prompts to analytics
- Sanitize user input
- Add retry logic for transient errors
- Use streaming for better UX when appropriate
- Cache responses when applicable
- Document which features require WebLLM
- Test on supported browsers
- Implement progressive enhancement
## Common Mistakes
### ❌ Assuming Extension is Installed
```js
// BAD: Will break if extension not installed
const result = await generateText({ prompt: 'Hello' });

// GOOD: Check availability first
if (await webLlmReady()) {
  const result = await generateText({ prompt: 'Hello' });
} else {
  showInstallPrompt();
}
```

### ❌ Not Handling Errors
```js
// BAD: Errors crash the app
const result = await generateText({ prompt });
showResult(result.text);

// GOOD: Handle errors gracefully
try {
  const result = await generateText({ prompt });
  showResult(result.text);
} catch (error) {
  showError(getUserFriendlyError(error));
}
```

### ❌ Blocking UI
```js
// BAD: No feedback while generation runs; the UI appears frozen
const result = await generateText({ prompt });
updateUI(result.text);

// GOOD: Show loading state
showLoading();
const result = await generateText({ prompt });
hideLoading();
updateUI(result.text);
```

### ❌ Not Respecting User Choice
```js
// BAD: Forces a specific model
const result = await generateText({
  model: 'gpt-4', // User might not have this!
  prompt: 'Hello'
});

// GOOD: Describe your intent with task/hints
const result = await generateText({
  task: 'general',
  hints: { speed: 'fast' },
  prompt: 'Hello'
  // WebLLM intelligently selects the best model
});
```

## Resources
Build great AI experiences that respect user control and privacy.