Quickstart
WebLLM brings AI directly into the browser without server costs or API key management. Get started in minutes and let your users choose their preferred AI provider—whether it’s OpenAI, Anthropic, local models, or any of 16+ supported providers.
Installation
First, install WebLLM using your preferred package manager:
npm install webllm
yarn add webllm
pnpm add webllm
bun add webllm

Your First AI Request
Create your first browser-native AI request. WebLLM intelligently routes to the user’s best available provider:
import { promptInstall, generateText } from 'webllm';
// Ensure extension is installed
await promptInstall();

// Generate text - WebLLM handles provider selection
const result = await generateText({
  messages: [
    { role: 'system', content: 'Respond like a cynical irritated teenager' },
    { role: 'user', content: 'Hi how are you today' }
  ]
});
alert(result.text); // Funny teenager response displayed in browser
console.log(result.text);

Alternatively, you can make the same request through the Vercel AI SDK:

import { promptInstall } from 'webllm';
import { generateText } from 'ai';
import { webllm } from 'webllm-ai-provider';
// Ensure extension is installed
await promptInstall();

// Generate text using Vercel AI SDK
const result = await generateText({
  model: webllm('browser'),
  messages: [
    { role: 'system', content: 'Respond like a cynical irritated teenager' },
    { role: 'user', content: 'Hi how are you today' }
  ]
});
alert(result.text); // Funny teenager response displayed in browser
console.log(result.text);

Note: For Vercel AI SDK, also install:

npm install webllm-ai-provider ai

That’s it! No API keys to manage, no server costs, and your users control which AI provider powers the request.
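If you want to handle the case where the user declines the install prompt, you can wrap the install step in a try/catch. This is a minimal sketch that assumes promptInstall() rejects when the extension is unavailable or the user declines; see Fallback Strategies below for more robust options:

import { promptInstall, generateText } from 'webllm';

let extensionReady = false;
try {
  // Assumption: promptInstall() rejects if the user declines or the
  // extension cannot be installed in this browser.
  await promptInstall();
  extensionReady = true;
} catch (error) {
  console.warn('WebLLM extension unavailable:', error.message);
  // Fall back to your own server-side endpoint or show a friendly message here.
}

if (extensionReady) {
  const result = await generateText({
    messages: [{ role: 'user', content: 'Hi how are you today' }]
  });
  console.log(result.text);
}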
Guide Provider Selection
Help WebLLM choose the optimal provider for your use case with hints:
import { generateText } from 'webllm';
const result = await generateText({
  // Provide hints about your needs
  task: 'creative',
  hints: {
    speed: 'fast',       // Favor fast responses
    quality: 'standard'  // But flexible on quality
  },

  // Enable streaming for real-time responses
  stream: true,

  // Specify tool support if needed
  tools: [],

  messages: [
    { role: 'system', content: 'Respond like a cynical irritated teenager. Favour fast responses as this is a joking agent, but we\'re flexible if user wants a better model.' },
    { role: 'user', content: 'Tell me a joke' }
  ]
});
// Handle streaming response
for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}

The same hints work through the Vercel AI SDK:

import { generateText, streamText } from 'ai';
import { webllm } from 'webllm-ai-provider';
// For streaming responses
const result = await streamText({
  model: webllm({
    task: 'creative',
    hints: {
      speed: 'fast',       // Favor fast responses
      quality: 'standard'  // But flexible on quality
    }
  }),

  messages: [
    { role: 'system', content: 'Respond like a cynical irritated teenager. Favour fast responses as this is a joking agent, but we\'re flexible if user wants a better model.' },
    { role: 'user', content: 'Tell me a joke' }
  ],

  // Tool support
  tools: {}
});
// Handle streaming
for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}

Available Hints:
- speed: 'fastest', 'fast', 'balanced', 'slow'
- quality: 'draft', 'standard', 'high', 'best'
- task: 'general', 'creative', 'qa', 'coding', 'translation', 'summarization'
WebLLM uses these hints along with user preferences to select the best provider. Users might have GPT-4, Claude, local models, or others configured—your app works with all of them.
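For example, a code-review feature can favour quality over speed. The sketch below uses only the hint values listed above; reviewCode is a hypothetical helper name, and the provider that actually serves the request still depends on what the user has configured:

import { generateText } from 'webllm';

// Hypothetical helper: a coding task that prefers quality over raw speed
async function reviewCode(snippet) {
  const result = await generateText({
    task: 'coding',
    hints: {
      speed: 'balanced',   // Willing to wait a little longer
      quality: 'high'      // Prefer a stronger model if available
    },
    messages: [
      { role: 'system', content: 'You are a concise, constructive code reviewer.' },
      { role: 'user', content: 'Review this code:\n\n' + snippet }
    ]
  });
  return result.text;
}

console.log(await reviewCode('function add(a, b) { return a - b; }'));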
Complete Example
Here’s a complete browser-ready example:
<!DOCTYPE html>
<html>
<head>
  <title>WebLLM Quickstart</title>
  <script type="module">
    import { promptInstall, generateText } from 'https://esm.sh/webllm';

    async function askAI() {
      try {
        // Ensure extension is available
        await promptInstall();

        // Get user input
        const userMessage = document.getElementById('input').value;

        // Show loading
        document.getElementById('output').textContent = 'Thinking...';

        // Generate response
        const result = await generateText({
          task: 'creative',
          hints: { speed: 'fast' },
          messages: [
            { role: 'system', content: 'Respond like a cynical irritated teenager' },
            { role: 'user', content: userMessage }
          ]
        });

        // Display result
        document.getElementById('output').textContent = result.text;

      } catch (error) {
        document.getElementById('output').textContent = 'Error: ' + error.message;
      }
    }

    // Make function globally available
    window.askAI = askAI;
  </script>
</head>
<body>
  <h1>Cynical Teenager Bot</h1>

  <input
    id="input"
    type="text"
    placeholder="Say something..."
    style="width: 300px; padding: 8px;"
  />

  <button onclick="askAI()">Ask</button>

  <div
    id="output"
    style="margin-top: 20px; padding: 16px; background: #f0f0f0; border-radius: 8px;"
  >
    Response will appear here...
  </div>
</body>
</html>
What Makes WebLLM Special?

Unlike traditional AI APIs:
- ✅ No Server Required: Runs entirely in the browser
- ✅ No API Keys: Users bring their own provider
- ✅ Privacy First: User data never leaves their machine (unless their chosen provider requires it)
- ✅ Provider Agnostic: Works with OpenAI, Anthropic, local models, and 16+ providers
- ✅ Cost Efficient: You pay nothing—users use their own accounts
- ✅ Intelligent Routing: Automatically selects the best available provider
Next Steps
Now that you’ve got the basics, explore these topics to build production-ready applications:
Essential Reading
- User Onboarding - Guide users to install the extension and configure providers
- Fallback Strategies - Provide API tokens as fallback when extension isn’t available
- Best Practices - Production-ready patterns and recommendations
Deep Dives
- Client Library API - Complete API reference with all available methods
- Vercel AI Provider - Full integration guide for Vercel AI SDK
- Provider System - Understand how WebLLM routes requests
- Architecture Overview - Learn how WebLLM works under the hood
Advanced Topics
- Task-Specific Routing - Optimize provider selection for different use cases
- Streaming Responses - Real-time response handling
- Error Handling - Graceful degradation patterns
- Framework Integration - React, Vue, and other frameworks
Need Help?
- 📖 Browse the full documentation
- 🎮 Try the interactive playground
- 💬 Join our community (coming soon)
- 🐛 Report issues on GitHub
Ready to build something amazing? WebLLM makes it easy to add AI to any web application—no servers, no API management, just pure browser-native intelligence.