AgentOp

AI Provider Configuration

Choose and configure the AI backend for your agents

Overview

AgentOp supports three AI providers for powering your agents. Each provider offers different capabilities, pricing models, and deployment options. This guide will help you choose the right provider and configure it properly for your use case.

Provider Comparison

| Feature | OpenAI | Anthropic | Local WebLLM |
|---|---|---|---|
| API Key Required | Yes | Yes | No |
| Cost | Pay per token | Pay per token | Free |
| Privacy | Data sent to OpenAI | Data sent to Anthropic | 100% local, no data sent |
| Internet Required | Yes (API calls) | Yes (API calls) | First download only |
| Model Size | N/A (cloud) | N/A (cloud) | 4-8GB download |
| Response Speed | Fast (API latency) | Fast (API latency) | Depends on device |
| Function Calling | Full support | Full support | Full support via LangChain.js |
| Best For | Production apps, complex tasks | Long context, detailed responses | Privacy, offline, no costs |

OpenAI Provider

Overview

OpenAI provides the GPT series of models, known for strong general-purpose capabilities and broad knowledge. It is best suited for production applications that require fast, reliable AI responses.

Supported Models

  • GPT-4o: Most capable model with vision, function calling, and structured outputs
  • GPT-4o-mini: Fast, affordable model for lighter tasks (recommended for most users)
  • GPT-4-turbo: Previous generation flagship, still highly capable
  • GPT-3.5-turbo: Legacy model, still useful for simple tasks at lower cost

Getting an API Key

  1. Visit platform.openai.com
  2. Create an account (requires phone verification)
  3. Add billing information (credit card required)
  4. Navigate to API Keys section
  5. Click "Create new secret key"
  6. Copy the key (shown only once!)
  7. Set usage limits for security

Security Best Practice

Create separate API keys for each agent and set monthly spending limits. This limits damage if a key is compromised.
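If you run your agent code locally outside AgentOp's encryption system, a common complementary practice is to keep the key out of source code entirely and read it from an environment variable. A minimal sketch (the `OPENAI_API_KEY` variable name is the conventional one; the helper itself is hypothetical, not part of AgentOp):

```python
import os

def load_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Read an API key from the environment, failing loudly if it is missing."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"{env_var} is not set; export it before starting the agent.")
    return key
```

Failing at startup when the variable is missing is deliberate: it surfaces misconfiguration immediately instead of producing confusing authentication errors later.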

Configuration in AgentOp

When creating your agent, select "OpenAI" as the provider:

from langchain_openai import ChatOpenAI

# Configure model
llm = ChatOpenAI(
    model="gpt-4o-mini",
    temperature=0.7,
    # API key handled by AgentOp encryption system
)

Cost Estimation

  • GPT-4o-mini: ~$0.15 per million input tokens, ~$0.60 per million output tokens
  • GPT-4o: ~$2.50 per million input tokens, ~$10.00 per million output tokens
  • GPT-3.5-turbo: ~$0.50 per million input tokens, ~$1.50 per million output tokens

Note: Prices are approximate and change over time. Check OpenAI's pricing page for current rates.
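As a rough sanity check, per-request cost is simply (input tokens × input rate) + (output tokens × output rate), with rates expressed per million tokens. A small helper illustrating the arithmetic (the function is hypothetical; verify rates against the provider's pricing page):

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_rate: float, output_rate: float) -> float:
    """Estimate request cost in USD; rates are USD per million tokens."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000

# Example: a GPT-4o-mini call with 2,000 input and 500 output tokens
# at ~$0.15 / ~$0.60 per million tokens costs about $0.0006.
cost = estimate_cost(2_000, 500, 0.15, 0.60)
```

Because the rates are parameters, the same helper works for any provider's pricing, including the Anthropic rates listed later in this guide.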

Pros

  • Excellent general-purpose performance
  • Fast response times via API
  • Large ecosystem and community support
  • Strong function calling capabilities
  • Vision support in GPT-4o models

Cons

  • Requires API key and billing setup
  • Pay-per-use pricing can add up
  • Data sent to OpenAI servers (privacy consideration)
  • Internet connection required
  • Rate limits apply

Anthropic Provider (Claude)

Overview

Anthropic's Claude models are known for their strong instruction following, safety features, and ability to handle very long contexts. They are an excellent choice for detailed responses and document analysis.

Supported Models

  • Claude 3.5 Sonnet: Most capable model with excellent reasoning (recommended)
  • Claude 3 Opus: Previous flagship, extremely capable but slower and more expensive
  • Claude 3 Haiku: Fast, cost-effective for simpler tasks

Getting an API Key

  1. Visit console.anthropic.com
  2. Create an account
  3. Add billing information
  4. Navigate to API Keys section
  5. Click "Create Key"
  6. Name your key and copy it
  7. Set usage limits as needed

Configuration in AgentOp

When creating your agent, select "Anthropic" as the provider:

from langchain_anthropic import ChatAnthropic

# Configure model
llm = ChatAnthropic(
    model="claude-3-5-sonnet-20241022",
    temperature=0.7,
    max_tokens=4096,
    # API key handled by AgentOp encryption system
)

Cost Estimation

  • Claude 3.5 Sonnet: ~$3.00 per million input tokens, ~$15.00 per million output tokens
  • Claude 3 Haiku: ~$0.25 per million input tokens, ~$1.25 per million output tokens
  • Claude 3 Opus: ~$15.00 per million input tokens, ~$75.00 per million output tokens

Note: Prices are approximate. Check Anthropic's pricing page for current rates.

Pros

  • Excellent at following complex instructions
  • Very large context windows (up to 200K tokens)
  • Strong safety and ethical alignment
  • Detailed, well-reasoned responses
  • Good for document analysis and summarization

Cons

  • More expensive than OpenAI for similar capabilities
  • Requires API key and billing
  • Data sent to Anthropic servers
  • Internet connection required
  • Smaller ecosystem than OpenAI

Local WebLLM Provider

Overview

WebLLM enables running AI models directly in the browser using WebGPU. No API keys, no costs, and complete privacy. Models download once and run entirely on the user's device.

Supported Models

  • Hermes-2-Pro-Mistral-7B: 7B parameter model with function calling support (recommended)
  • Hermes-3-Llama-3.1-8B: Hermes-tuned Llama 3.1, 8B parameters (q4f16 / q4f32 variants)
  • Hermes-2-Pro-Llama-3-8B: Hermes-tuned Llama 3, 8B parameters (q4f16 / q4f32 variants)
  • Llama-3.1-8B-Instruct: Meta's Llama 3.1, 8B parameters (q4f16 / q4f32 variants)
  • Llama-3.1-70B-Instruct: Meta's Llama 3.1, 70B parameters (q3f16 / q4f16 variants)

The available models may expand over time as new options are added to the platform.

WebGPU Requirement

WebLLM requires WebGPU support, which is available in Chrome 113+, Edge 113+, and recent versions of Safari; Firefox support is experimental. Check webgpureport.org to verify your browser.

No Configuration Required

Simply select "Local (WebLLM)" as your provider when creating an agent. No API key needed!

# WebLLM uses LangChain.js (NOT Python LangChain) for inference
# Your Python code defines the tools, which are called from JavaScript
# Model selection happens in the browser UI

# Default model: Hermes-2-Pro-Mistral-7B-q4f16_1-MLC
# This model supports native function calling via OpenAI API format

First-Time Setup

When a user opens your agent for the first time:

  1. They select a model from the available options
  2. The model downloads (4-8GB depending on the model)
  3. The model is cached in the browser for future use
  4. The agent is ready to use, even offline

Large Download Size

Models are 4-8GB in size. Warn users about the download on first use. After downloading, the model is cached and works offline.

Performance Considerations

  • Desktop/Laptop: Good performance with modern GPUs
  • High-end mobile: Works but may be slower
  • Low-end devices: May struggle with larger models
  • Speed: 5-30 tokens/second depending on device and model
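At these speeds, generation time is just output tokens divided by throughput, which is worth estimating so you can warn users on slow devices. A quick sketch of the arithmetic (hypothetical helper, not part of AgentOp):

```python
def estimated_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Rough wall-clock time to generate a response on-device."""
    return output_tokens / tokens_per_second

# A 300-token reply takes ~60 s at 5 tok/s but only ~10 s at 30 tok/s.
slow = estimated_seconds(300, 5)    # 60.0
fast = estimated_seconds(300, 30)   # 10.0
```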

Pros

  • Completely free - no API costs
  • 100% private - data never leaves device
  • Works offline after initial download
  • No API key management
  • Full function calling support via LangChain.js
  • No rate limits

Cons

  • Large initial download (4-8GB)
  • Requires WebGPU-capable browser
  • Performance varies by device
  • Smaller models produce lower-quality output than GPT-4 or Claude
  • Limited to browser environment
  • May not work on older/low-end devices

API Key Security

AgentOp uses client-side encryption to protect your API keys when embedding them in HTML files:

How It Works

  1. When you download an agent, you're prompted to create an encryption password
  2. Your API key is encrypted using AES-256 encryption in your browser
  3. Only the encrypted key is embedded in the HTML file
  4. When opening the agent, users enter the password to decrypt the key
  5. Decryption happens entirely client-side - no keys are ever sent to a server
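The password-to-key step in a scheme like this is typically a key-derivation function such as PBKDF2, which stretches a password plus a random salt into a 256-bit AES key. A sketch of that step using only the Python standard library (the iteration count and salt handling here are illustrative assumptions, not AgentOp's exact parameters):

```python
import hashlib
import os

def derive_aes256_key(password: str, salt: bytes, iterations: int = 600_000) -> bytes:
    """Derive a 32-byte (256-bit) key from a password with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)

salt = os.urandom(16)   # stored alongside the ciphertext; it need not be secret
key = derive_aes256_key("correct horse battery staple", salt)
assert len(key) == 32   # the right size for AES-256
```

The same password and salt always produce the same key, which is what lets the downloaded HTML file decrypt the embedded API key without contacting any server.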

Security Best Practices

  • Use a strong, unique encryption password
  • Create separate API keys for each agent
  • Set spending limits on all API keys
  • Regularly rotate API keys
  • Never share agents with embedded keys publicly
  • Consider using Local WebLLM for public agents

Choosing the Right Provider

Use OpenAI if you need:

  • Production-ready, reliable AI
  • Fast response times
  • Vision capabilities (GPT-4o)
  • Broad general knowledge
  • Cost-effective solutions (GPT-4o-mini)

Use Anthropic if you need:

  • Very long context handling (200K tokens)
  • Extremely detailed responses
  • Document analysis and summarization
  • Strong ethical alignment
  • Complex instruction following

Use Local WebLLM if you need:

  • Zero API costs
  • Complete privacy and data control
  • Offline functionality
  • No API key management
  • Public distribution without key exposure

Switching Providers

You can change providers for any agent by editing it and selecting a different provider option. Your Python code may need minor adjustments to use provider-specific LangChain classes.
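One way to keep those adjustments to a single line is to record each provider's LangChain class path and default model in one mapping. A hedged sketch (the helper is hypothetical; the module, class, and model names come from the examples earlier in this guide):

```python
# Provider name -> (LangChain module, class name, default model)
PROVIDERS = {
    "openai": ("langchain_openai", "ChatOpenAI", "gpt-4o-mini"),
    "anthropic": ("langchain_anthropic", "ChatAnthropic", "claude-3-5-sonnet-20241022"),
    # "local" (WebLLM) runs in the browser via LangChain.js, not Python.
}

def provider_config(name: str) -> tuple[str, str, str]:
    """Look up the module, class name, and default model for a provider."""
    try:
        return PROVIDERS[name]
    except KeyError:
        raise ValueError(f"Unknown provider: {name!r}") from None
```

With this in place, switching providers means changing one string in your agent configuration rather than editing import statements throughout the code.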

Next Steps