AI Provider Configuration
Choose and configure the AI backend for your agents
Overview
AgentOp supports three AI providers for powering your agents. Each provider offers different capabilities, pricing models, and deployment options. This guide will help you choose the right provider and configure it properly for your use case.
Provider Comparison
| Feature | OpenAI | Anthropic | Local WebLLM |
|---|---|---|---|
| API Key Required | Yes | Yes | No |
| Cost | Pay per token | Pay per token | Free |
| Privacy | Data sent to OpenAI | Data sent to Anthropic | 100% local, no data sent |
| Internet Required | Yes (API calls) | Yes (API calls) | First download only |
| Model Size | N/A (cloud) | N/A (cloud) | 4-8GB download |
| Response Speed | Fast (API latency) | Fast (API latency) | Depends on device |
| Function Calling | Full support | Full support | Full support via LangChain.js |
| Best For | Production apps, complex tasks | Long context, detailed responses | Privacy, offline, no costs |
OpenAI Provider
Overview
OpenAI provides the GPT series of models, known for strong general-purpose capabilities and broad knowledge. Best suited for production applications requiring fast, reliable AI responses.
Supported Models
- GPT-4o: Most capable model with vision, function calling, and structured outputs
- GPT-4o-mini: Fast, affordable model for lighter tasks (recommended for most users)
- GPT-4-turbo: Previous generation flagship, still highly capable
- GPT-3.5-turbo: Legacy model, still useful for simple tasks at lower cost
Getting an API Key
- Visit platform.openai.com
- Create an account (requires phone verification)
- Add billing information (credit card required)
- Navigate to API Keys section
- Click "Create new secret key"
- Copy the key (shown only once!)
- Set usage limits for security
Security Best Practice
Create separate API keys for each agent and set monthly spending limits. This limits damage if a key is compromised.
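One common way to honor this practice is to give each agent its own environment variable and fail fast when the key is missing. This is a general sketch, not part of the AgentOp API; the `require_api_key` helper and the variable names are illustrative:

```python
import os

def require_api_key(env_var: str) -> str:
    """Read an API key from the environment, failing loudly if unset.

    Keeping keys in environment variables (rather than in source code)
    makes per-agent keys easy to rotate and revoke independently.
    """
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(
            f"{env_var} is not set; create a dedicated key for this agent "
            "and export it before running."
        )
    return key

# Example: each agent reads its own dedicated key
# support_key = require_api_key("OPENAI_API_KEY_SUPPORT_AGENT")
```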
Configuration in AgentOp
When creating your agent, select "OpenAI" as the provider:
```python
from langchain_openai import ChatOpenAI

# Configure model
llm = ChatOpenAI(
    model="gpt-4o-mini",
    temperature=0.7,
    # API key handled by AgentOp encryption system
)
```
Cost Estimation
- GPT-4o-mini: ~$0.15 per million input tokens, ~$0.60 per million output tokens
- GPT-4o: ~$2.50 per million input tokens, ~$10.00 per million output tokens
- GPT-3.5-turbo: ~$0.50 per million input tokens, ~$1.50 per million output tokens
Note: Prices are approximate and change over time. Check OpenAI's pricing page for current rates.
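As a rough sanity check, per-request cost is just token counts multiplied by the per-million-token rates. A quick sketch using the approximate GPT-4o-mini rates above:

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_rate: float, output_rate: float) -> float:
    """Estimate API cost in dollars, given rates per million tokens."""
    return (input_tokens / 1_000_000) * input_rate \
         + (output_tokens / 1_000_000) * output_rate

# A session with 50K input and 10K output tokens on GPT-4o-mini
# at the approximate rates above ($0.15 / $0.60 per million tokens):
cost = estimate_cost(50_000, 10_000, 0.15, 0.60)
print(f"${cost:.4f}")  # → $0.0135
```

The same function works for any provider's rates; swap in the Claude numbers from the Anthropic section to compare.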
Pros
- Excellent general-purpose performance
- Fast response times via API
- Large ecosystem and community support
- Strong function calling capabilities
- Vision support in GPT-4o models
Cons
- Requires API key and billing setup
- Pay-per-use pricing can add up
- Data sent to OpenAI servers (privacy consideration)
- Internet connection required
- Rate limits apply
Anthropic Provider (Claude)
Overview
Anthropic's Claude models are known for their strong instruction-following, safety features, and ability to handle very long contexts. Excellent choice for detailed responses and document analysis.
Supported Models
- Claude 3.5 Sonnet: Most capable model with excellent reasoning (recommended)
- Claude 3 Opus: Previous flagship, extremely capable but slower and more expensive
- Claude 3 Haiku: Fast, cost-effective for simpler tasks
Getting an API Key
- Visit console.anthropic.com
- Create an account
- Add billing information
- Navigate to API Keys section
- Click "Create Key"
- Name your key and copy it
- Set usage limits as needed
Configuration in AgentOp
When creating your agent, select "Anthropic" as the provider:
```python
from langchain_anthropic import ChatAnthropic

# Configure model
llm = ChatAnthropic(
    model="claude-3-5-sonnet-20241022",
    temperature=0.7,
    max_tokens=4096,
    # API key handled by AgentOp encryption system
)
```
Cost Estimation
- Claude 3.5 Sonnet: ~$3.00 per million input tokens, ~$15.00 per million output tokens
- Claude 3 Haiku: ~$0.25 per million input tokens, ~$1.25 per million output tokens
- Claude 3 Opus: ~$15.00 per million input tokens, ~$75.00 per million output tokens
Note: Prices are approximate. Check Anthropic's pricing page for current rates.
Pros
- Excellent at following complex instructions
- Very large context windows (up to 200K tokens)
- Strong safety and ethical alignment
- Detailed, well-reasoned responses
- Good for document analysis and summarization
Cons
- More expensive than OpenAI for similar capabilities
- Requires API key and billing
- Data sent to Anthropic servers
- Internet connection required
- Smaller ecosystem than OpenAI
Local WebLLM Provider
Overview
WebLLM enables running AI models directly in the browser using WebGPU. No API keys, no costs, and complete privacy. Models download once and run entirely on the user's device.
Supported Models
- Hermes-2-Pro-Mistral-7B: 7B parameter model with function calling support (recommended)
- Hermes-3-Llama-3.1-8B: Hermes-tuned Llama 3.1, 8B parameters (q4f16 / q4f32 variants)
- Hermes-2-Pro-Llama-3-8B: Hermes-tuned Llama 3, 8B parameters (q4f16 / q4f32 variants)
- Llama-3.1-8B-Instruct: Meta's Llama 3.1, 8B parameters (q4f16 / q4f32 variants)
- Llama-3.1-70B-Instruct: Meta's Llama 3.1, 70B parameters (q3f16 / q4f16 variants)
The available models may expand over time as new options are added to the platform.
WebGPU Requirement
WebLLM requires WebGPU support. Works in Chrome 113+, Edge 113+, and recent Safari. Firefox support is experimental. Check webgpureport.org to verify your browser.
No Configuration Required
Simply select "Local (WebLLM)" as your provider when creating an agent. No API key needed!
- WebLLM uses LangChain.js (not Python LangChain) for inference; your Python code defines the tools, which are called from JavaScript
- Model selection happens in the browser UI
- Default model: `Hermes-2-Pro-Mistral-7B-q4f16_1-MLC`, which supports native function calling via the OpenAI API format
First-Time Setup
When a user opens your agent for the first time:
- They select a model from available options
- The model downloads (4-8GB depending on model)
- Model is cached in browser for future use
- Agent is ready to use offline!
Large Download Size
Models are 4-8GB in size. Warn users about the download on first use. After downloading, the model is cached and works offline.
Performance Considerations
- Desktop/Laptop: Good performance with modern GPUs
- High-end mobile: Works but may be slower
- Low-end devices: May struggle with larger models
- Speed: 5-30 tokens/second depending on device and model
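The speed range above translates directly into response latency; a quick back-of-the-envelope sketch:

```python
def generation_seconds(tokens: int, tokens_per_second: float) -> float:
    """Rough wall-clock time to generate a response locally."""
    return tokens / tokens_per_second

# A 500-token answer across the quoted 5-30 tokens/second range:
for tps in (5, 15, 30):
    print(f"{tps:>2} tok/s -> {generation_seconds(500, tps):.0f} s")
```

In other words, the same answer that streams back in seconds from a cloud API can take well over a minute on a slow device, which is worth factoring into agent design (shorter responses, smaller models).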
Pros
- Completely free - no API costs
- 100% private - data never leaves device
- Works offline after initial download
- No API key management
- Full function calling support via LangChain.js
- No rate limits
Cons
- Large initial download (4-8GB)
- Requires WebGPU-capable browser
- Performance varies by device
- Smaller models produce lower-quality output than GPT-4-class or Claude models
- Limited to browser environment
- May not work on older/low-end devices
API Key Security
AgentOp uses client-side encryption to protect your API keys when embedding them in HTML files:
How It Works
- When you download an agent, you're prompted to create an encryption password
- Your API key is encrypted using AES-256 encryption in your browser
- Only the encrypted key is embedded in the HTML file
- When opening the agent, users enter the password to decrypt the key
- Decryption happens entirely client-side - no keys sent to servers
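The exact key-derivation scheme isn't specified here, but password-based AES-256 designs typically stretch the password with a key-derivation function such as PBKDF2 before using the result as the encryption key. A minimal standard-library sketch of that derivation step (the iteration count and salt size are illustrative, not AgentOp's actual parameters):

```python
import hashlib
import os

def derive_key(password: str, salt: bytes, iterations: int = 600_000) -> bytes:
    """Derive a 256-bit key from a password with PBKDF2-HMAC-SHA256.

    The derived key (never the raw password) serves as the AES-256 key;
    the random salt is stored alongside the ciphertext so the same key
    can be re-derived at decryption time.
    """
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode(), salt, iterations, dklen=32
    )

salt = os.urandom(16)
key = derive_key("correct horse battery staple", salt)
assert len(key) == 32  # 32 bytes = 256 bits, suitable for AES-256
```

Because derivation is deterministic for a given password and salt, the browser can re-derive the same key from the user's password without any server round trip.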
Security Best Practices
- Use a strong, unique encryption password
- Create separate API keys for each agent
- Set spending limits on all API keys
- Regularly rotate API keys
- Never share agents with embedded keys publicly
- Consider using Local WebLLM for public agents
Choosing the Right Provider
Use OpenAI if you need:
- Production-ready, reliable AI
- Fast response times
- Vision capabilities (GPT-4o)
- Broad general knowledge
- Cost-effective solutions (GPT-4o-mini)
Use Anthropic if you need:
- Very long context handling (200K tokens)
- Extremely detailed responses
- Document analysis and summarization
- Strong ethical alignment
- Complex instruction following
Use Local WebLLM if you need:
- Zero API costs
- Complete privacy and data control
- Offline functionality
- No API key management
- Public distribution without key exposure
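The guidance above can be condensed into a toy decision helper. This is purely illustrative; a real choice weighs cost, output quality, and deployment constraints together:

```python
def pick_provider(needs_privacy: bool = False,
                  needs_offline: bool = False,
                  needs_long_context: bool = False) -> str:
    """Map the headline requirements above to a suggested provider."""
    if needs_privacy or needs_offline:
        return "Local WebLLM"   # zero cost, data stays on device
    if needs_long_context:
        return "Anthropic"      # up to 200K-token context windows
    return "OpenAI"             # fast, production-ready default

print(pick_provider(needs_privacy=True))  # → Local WebLLM
```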
Switching Providers
You can change providers for any agent by editing it and selecting a different provider option. Your Python code may need minor adjustments to use provider-specific LangChain classes.