
Quickstart

WebLLM brings AI directly into the browser without server costs or API key management. Get started in minutes and let your users choose their preferred AI provider—whether it’s OpenAI, Anthropic, local models, or any of 16+ supported providers.

First, install WebLLM using your preferred package manager:

npm install webllm
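
If you prefer pnpm or yarn, the equivalent commands install the same package:

pnpm add webllm
yarn add webllm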

Create your first browser-native AI request. WebLLM intelligently routes to the user’s best available provider:

import { promptInstall, generateText } from 'webllm';

// Ensure the extension is installed
await promptInstall();

// Generate text - WebLLM handles provider selection
const result = await generateText({
  messages: [
    { role: 'system', content: 'Respond like a cynical irritated teenager' },
    { role: 'user', content: 'Hi how are you today' }
  ]
});

alert(result.text); // Funny teenager response displayed in browser
console.log(result.text);

That’s it! No API keys to manage, no server costs, and your users control which AI provider powers the request.
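
In real applications, wrap the install prompt in a try/catch, since the user may decline or the extension may fail to install. The exact error shape here is an assumption; the full HTML example later on uses the same pattern:

import { promptInstall } from 'webllm';

try {
  await promptInstall();
} catch (error) {
  // Assumption: promptInstall() rejects when the extension
  // is unavailable or the user declines the prompt
  console.error('WebLLM is unavailable:', error.message);
}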

Help WebLLM choose the optimal provider for your use case with hints:

import { generateText } from 'webllm';

const result = await generateText({
  // Provide hints about your needs
  task: 'creative',
  hints: {
    speed: 'fast',       // Favor fast responses
    quality: 'standard'  // But flexible on quality
  },
  // Enable streaming for real-time responses
  stream: true,
  // Specify tool support if needed
  tools: [],
  messages: [
    {
      role: 'system',
      content: 'Respond like a cynical irritated teenager. Favour fast responses as this is a joking agent, but we\'re flexible if user wants a better model.'
    },
    { role: 'user', content: 'Tell me a joke' }
  ]
});

// Handle the streaming response (process.stdout is Node-only;
// in the browser, append each chunk to the page as it arrives)
for await (const chunk of result.textStream) {
  document.body.append(chunk);
}

Available Hints:

  • speed: 'fastest', 'fast', 'balanced', 'slow'
  • quality: 'draft', 'standard', 'high', 'best'
  • task (passed alongside hints, as in the examples above): 'general', 'creative', 'qa', 'coding', 'translation', 'summarization'

WebLLM uses these hints along with user preferences to select the best provider. Users might have GPT-4, Claude, local models, or others configured—your app works with all of them.
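
For contrast, here is a sketch of the same call tuned for a different use case, using other values from the list above (the prompt itself is purely illustrative):

import { generateText } from 'webllm';

// A coding task that favours answer quality over latency
const review = await generateText({
  task: 'coding',
  hints: {
    speed: 'balanced',
    quality: 'high'
  },
  messages: [
    { role: 'user', content: 'Explain what this regex matches: /^\\d{4}-\\d{2}-\\d{2}$/' }
  ]
});

console.log(review.text);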

Here’s a complete browser-ready example:

<!DOCTYPE html>
<html>
  <head>
    <title>WebLLM Quickstart</title>
    <script type="module">
      import { promptInstall, generateText } from 'https://esm.sh/webllm';

      async function askAI() {
        try {
          // Ensure extension is available
          await promptInstall();

          // Get user input
          const userMessage = document.getElementById('input').value;

          // Show loading
          document.getElementById('output').textContent = 'Thinking...';

          // Generate response
          const result = await generateText({
            task: 'creative',
            hints: {
              speed: 'fast'
            },
            messages: [
              { role: 'system', content: 'Respond like a cynical irritated teenager' },
              { role: 'user', content: userMessage }
            ]
          });

          // Display result
          document.getElementById('output').textContent = result.text;
        } catch (error) {
          document.getElementById('output').textContent = 'Error: ' + error.message;
        }
      }

      // Make function globally available for the inline onclick handler
      window.askAI = askAI;
    </script>
  </head>
  <body>
    <h1>Cynical Teenager Bot</h1>
    <input
      id="input"
      type="text"
      placeholder="Say something..."
      style="width: 300px; padding: 8px;"
    />
    <button onclick="askAI()">Ask</button>
    <div
      id="output"
      style="margin-top: 20px; padding: 16px; background: #f0f0f0; border-radius: 8px;"
    >
      Response will appear here...
    </div>
  </body>
</html>
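
To try it, save the file as index.html and serve it over HTTP; most browsers block module scripts on file:// URLs, so opening the file directly usually won't work. Any static file server does the job, for example:

npx serve .
python3 -m http.server 8000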

Unlike traditional AI APIs:

  • No Server Required: Runs entirely in the browser
  • No API Keys: Users bring their own provider
  • Privacy First: User data never leaves their machine (unless their chosen provider requires it)
  • Provider Agnostic: Works with OpenAI, Anthropic, local models, and 16+ providers
  • Cost Efficient: You pay nothing—users use their own accounts
  • Intelligent Routing: Automatically selects the best available provider

Now that you’ve got the basics, explore the rest of the documentation to build production-ready applications.


Ready to build something amazing? WebLLM makes it easy to add AI to any web application—no servers, no API management, just pure browser-native intelligence.