# LLM Integration
Configure AI providers for your agent.
## Supported Providers
| Provider | Models |
|---|---|
| Workers AI | Llama, Mistral, Qwen, etc. |
| OpenAI | GPT-4, GPT-3.5, etc. |
| Anthropic | Claude 3, Claude 2, etc. |
## Configuration

### Via Settings
- Navigate to Settings
- Configure the LLM section:
  - Provider
  - Model
  - Temperature
  - Max tokens
  - System prompt
### Settings Schema
```json
{
"llm_provider": "workers-ai",
"llm_model": "@cf/meta/llama-3.1-8b-instruct",
"llm_temperature": 0.7,
"llm_max_tokens": 2048,
"llm_system_prompt": "You are a helpful assistant."
}
```

## Workers AI Models
### Available Models
| Model | Context | Best For |
|---|---|---|
| `@cf/meta/llama-3.1-8b-instruct` | 128K | General chat |
| `@cf/meta/llama-3.1-70b-instruct` | 128K | Complex tasks |
| `@cf/mistral/mistral-7b-instruct-v0.1` | 8K | Fast responses |
| `@cf/qwen/qwen1.5-14b-chat-awq` | 32K | Multilingual |
### No API Key Required
Workers AI is built-in and requires no API key.
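The settings schema above maps directly onto a Workers AI chat request. A minimal sketch, assuming the settings shape shown earlier; the `env.AI.run` call appears in a comment because it requires a live `AI` binding in `wrangler.toml`:

```typescript
// Sketch: map the agent's LLM settings onto a Workers AI chat request.
// The settings keys mirror the schema above.
type LlmSettings = {
  llm_model: string;
  llm_temperature: number;
  llm_max_tokens: number;
  llm_system_prompt: string;
};

function buildChatRequest(settings: LlmSettings, userMessage: string) {
  return {
    messages: [
      { role: "system", content: settings.llm_system_prompt },
      { role: "user", content: userMessage },
    ],
    temperature: settings.llm_temperature,
    max_tokens: settings.llm_max_tokens,
  };
}

// Inside a Worker with an AI binding:
//   const result = await env.AI.run(settings.llm_model, buildChatRequest(settings, "Hello!"));

const req = buildChatRequest(
  {
    llm_model: "@cf/meta/llama-3.1-8b-instruct",
    llm_temperature: 0.7,
    llm_max_tokens: 2048,
    llm_system_prompt: "You are a helpful assistant.",
  },
  "Hello!"
);
```

Keeping the request construction in a pure function like this also makes it easy to test without a deployed Worker.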
## OpenAI Integration

### Configuration
```json
{
"llm_provider": "openai",
"llm_model": "gpt-4-turbo-preview"
}
```

### API Key
Store in agent settings or secrets:
```bash
npx wrangler secret put OPENAI_API_KEY
```

## Anthropic Integration
### Configuration
```json
{
"llm_provider": "anthropic",
"llm_model": "claude-3-sonnet-20240229"
}
```

### API Key
Store in agent settings or secrets:
```bash
npx wrangler secret put ANTHROPIC_API_KEY
```

## Parameters
### Temperature
Controls randomness (0-1):
- `0`: Deterministic, focused
- `0.7`: Balanced (default)
- `1`: Creative, varied
### Max Tokens
Maximum response length:
- Varies by model
- Higher = longer responses
- Consider context limits
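One way to "consider context limits" is to cap the response budget by whatever the context window has left after the prompt. A rough sketch; the 4-characters-per-token estimate is a heuristic, not a real tokenizer:

```typescript
// Rough response-token budget: context window minus an estimate of prompt tokens.
// The chars/4 heuristic is approximate; actual tokenization varies by model.
function responseBudget(contextLimit: number, prompt: string, requestedMax: number): number {
  const promptTokens = Math.ceil(prompt.length / 4);
  const remaining = contextLimit - promptTokens;
  return Math.max(0, Math.min(requestedMax, remaining));
}

// An 8K-context model with a ~30,000-character prompt cannot honor
// max_tokens = 2048 in full:
const budget = responseBudget(8192, "x".repeat(30000), 2048); // → 692
```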
### System Prompt
Instructions for the AI:
```
You are a customer support agent for Acme Corp.
- Answer product questions
- Help with issues
- Be polite and professional
```

## Best Practices
### 1. Choose the Right Model
| Use Case | Recommended |
|---|---|
| Simple chat | Llama 8B |
| Complex reasoning | GPT-4 / Claude 3 |
| Fast responses | Mistral 7B |
| Cost-effective | Workers AI |
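The table above can be expressed as a simple lookup when configuring agents programmatically. The keys and the fallback below are illustrative, and where the table suggests GPT-4 / Claude 3 for complex reasoning, a Workers-AI-only deployment might substitute the 70B Llama model instead:

```typescript
// Illustrative defaults derived from the table above.
const MODEL_DEFAULTS: Record<string, string> = {
  "simple-chat": "@cf/meta/llama-3.1-8b-instruct",
  "complex-reasoning": "@cf/meta/llama-3.1-70b-instruct",
  "fast-responses": "@cf/mistral/mistral-7b-instruct-v0.1",
};

function pickModel(useCase: string): string {
  // Fall back to the general-purpose 8B model for unknown use cases.
  return MODEL_DEFAULTS[useCase] ?? "@cf/meta/llama-3.1-8b-instruct";
}
```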
### 2. Write Clear Prompts
Be specific in system prompts:
- Define the role
- Set boundaries
- Specify format
### 3. Monitor Usage
Track token consumption and costs.
### 4. Test Before Production
Verify responses with test conversations.
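A minimal smoke test might send a few canned prompts and check that every response is non-empty. The `sendMessage` parameter here is a hypothetical placeholder for however your agent exposes its chat entry point:

```typescript
// Hypothetical smoke test: `sendMessage` stands in for your agent's chat call.
type SendMessage = (prompt: string) => Promise<string>;

async function smokeTest(sendMessage: SendMessage): Promise<string[]> {
  const prompts = ["Hello!", "What can you help me with?"];
  const failures: string[] = [];
  for (const prompt of prompts) {
    const reply = await sendMessage(prompt);
    if (!reply || reply.trim().length === 0) {
      failures.push(`empty response for: ${prompt}`);
    }
  }
  return failures; // empty array means the smoke test passed
}
```

Run it against a staging deployment before promoting any model or prompt change to production.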