Agent Creation
Complete guide to building and customizing AI agents
Overview
Agents in AgentOp are autonomous AI systems that combine large language models with custom Python code, allowing them to perform tasks, use tools, and interact with users. This guide covers everything you need to know about creating powerful agents.
Agent Architecture
Each agent consists of several key components:
- Template: Provides HTML structure, default code, and UI styling
- Python Code: Your custom agent logic (runs via Pyodide in browser)
- Prompt Configuration: System prompt and user prompt template
- AI Provider: OpenAI, Anthropic, or local WebLLM
- Python Packages: Dependencies from Pyodide or PyPI
- Metadata: Name, description, tags, visibility settings
Creating an Agent
1. Select a Template
Templates provide starting points with pre-configured HTML, CSS, and example code:
- Browse templates at /templates/
- Preview template UI and functionality
- Select one when creating your agent
2. Write Python Code
Your agent's Python code defines its behavior. Here's a basic structure:
```python
import json

# Define async tool functions your agent can call.
# Use 'async def' — Pyodide executes these asynchronously inside the browser.
# The docstring becomes the tool's description sent to the LLM.
async def get_current_weather(location: str) -> str:
    """Get the current weather for a location."""
    # Your implementation here (e.g. call a weather API via js.fetch)
    return f"Weather in {location}: Sunny, 72°F"

async def search_web(query: str) -> str:
    """Search the web for up-to-date information about the given query."""
    # Your implementation here
    return f"Search results for: {query}"

# Export tool schemas in OpenAI function-calling format.
# The JavaScript↔Python bridge uses these to dispatch LLM tool calls to your functions.
def get_tool_schemas():
    """Return tool schemas for the JS↔Python bridge."""
    return [
        {
            "type": "function",
            "function": {
                "name": "get_current_weather",
                "description": "Get the current weather for a location.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {"type": "string", "description": "City and country, e.g. 'London, UK'"}
                    },
                    "required": ["location"]
                }
            }
        },
        {
            "type": "function",
            "function": {
                "name": "search_web",
                "description": "Search the web for up-to-date information about the given query.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "query": {"type": "string", "description": "The search query"}
                    },
                    "required": ["query"]
                }
            }
        }
    ]

# AgentOp handles the rest: model loading, prompt assembly, tool dispatch,
# and response streaming. No LangChain boilerplate needed in your Python code.
```
💡 AgentOp handles orchestration automatically
You don't need to create an AgentExecutor or wire up prompts manually.
AgentOp injects the full LangChain (Python or LangChain.js) infrastructure around your
code automatically. Define your async tool functions and
get_tool_schemas() — the platform handles model loading, prompt assembly,
memory, tool dispatch, and response streaming.
3. Configure Prompts
Prompts guide your agent's behavior:
System Prompt
Defines the agent's personality, role, and capabilities:
```
You are a helpful weather assistant that can check weather conditions
and provide recommendations. Always be friendly and informative.

When users ask about weather, use the get_current_weather function.
If they ask about other topics, let them know your specialty is weather.
```
User Prompt Template
Formats user input before sending to the LLM:
```
User question: {input}

Please provide a helpful response. If you need to use a tool, explain what you're doing.
```
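The template substitution behaves like Python's `str.format`, with the user's message bound to the `{input}` placeholder. A minimal sketch (hypothetical; the platform's actual templating code may differ):

```python
# Simplified sketch of user-prompt templating (hypothetical; the real
# AgentOp implementation may differ).
USER_PROMPT_TEMPLATE = (
    "User question: {input}\n\n"
    "Please provide a helpful response. If you need to use a tool, "
    "explain what you're doing."
)

def render_user_prompt(user_input: str) -> str:
    """Fill the {input} placeholder with the user's message."""
    return USER_PROMPT_TEMPLATE.format(input=user_input)
```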
Few-Shot Examples (Optional)
Provide example interactions to guide the agent's responses:
```json
[
  {
    "input": "What's the weather in San Francisco?",
    "output": "Let me check that for you. [calls get_current_weather('San Francisco')] The weather in San Francisco is sunny and 72°F."
  }
]
```
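Few-shot examples are commonly folded into the message list as alternating user/assistant turns ahead of the real query. An illustrative sketch (the exact wiring inside AgentOp may differ):

```python
def build_messages(system_prompt: str, few_shot: list, user_input: str) -> list:
    """Prepend few-shot examples as user/assistant turns (illustrative sketch)."""
    messages = [{"role": "system", "content": system_prompt}]
    for ex in few_shot:
        messages.append({"role": "user", "content": ex["input"]})
        messages.append({"role": "assistant", "content": ex["output"]})
    messages.append({"role": "user", "content": user_input})
    return messages

msgs = build_messages(
    "You are a helpful weather assistant.",
    [{"input": "What's the weather in San Francisco?",
      "output": "Sunny and 72°F."}],
    "And in Seattle?",
)
```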
4. Select Provider
Choose how your agent will generate responses:
OpenAI
- Models: GPT-4o-mini (default, recommended), GPT-4o, GPT-4-turbo, GPT-3.5-turbo (legacy)
- Best for: Production applications, complex reasoning, fast responses
- Requires: OpenAI API key
- LangChain packages are auto-injected by AgentOp — no manual setup needed
Anthropic (Claude)
- Models: Claude 3.5 Sonnet (default, recommended), Claude 3 Opus, Claude 3 Haiku
- Best for: Long context (200K tokens), detailed responses, document analysis
- Requires: Anthropic API key
- LangChain packages are auto-injected by AgentOp — no manual setup needed
Local (WebLLM)
- Models: Hermes-2-Pro-Mistral-7B, Llama-3.1-8B, others
- Best for: Privacy, no API costs, offline use
- Requires: Modern browser with WebGPU support
- Note: Models are 4-8GB and download on first use
5. Add Python Packages
Extend your agent's capabilities with Python packages:
Pyodide Built-in Packages
Pre-compiled packages that load quickly:
- Data Science: numpy, pandas, scipy, scikit-learn
- Visualization: matplotlib, bokeh
- Parsing: beautifulsoup4, lxml, regex
- Utilities: pillow, pyyaml, pytz, sqlite3
PyPI Packages
Any pure-Python package from PyPI (installed via micropip at runtime):
```json
{
  "pypi_packages": {
    "python-dateutil": ">=2.8.0",
    "pyyaml": "*"
  }
}
```
💡 LangChain packages are auto-injected
You do not need to add langchain, langchain_openai, or
langchain_anthropic to your packages — AgentOp adds the correct versions
automatically based on your chosen provider.
⚠️ Package Compatibility
Only pure-Python packages work with Pyodide. Packages with C extensions (except those pre-compiled for Pyodide) won't work. Check the Pyodide package list for built-in support.
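One quick way to check a package before adding it: pure-Python wheels on PyPI have filenames ending in `none-any.whl`, while wheels containing compiled extensions carry platform tags such as `manylinux` or `win_amd64`. A small sketch of that heuristic:

```python
def is_pure_python_wheel(wheel_filename: str) -> bool:
    """Pure-Python wheels carry the 'none-any' platform tag."""
    return wheel_filename.endswith("-none-any.whl")

# Pure-Python wheel: installable via micropip
is_pure_python_wheel("python_dateutil-2.9.0.post0-py2.py3-none-any.whl")  # → True
# Compiled wheel: only works if Pyodide ships a pre-built version
is_pure_python_wheel("numpy-2.1.0-cp312-cp312-manylinux2014_x86_64.whl")  # → False
```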
6. Set Metadata
- Name: Clear, descriptive (e.g., "Weather Assistant Bot")
- Description: Explain what your agent does (Markdown supported)
- Tags: Categorize for discovery (e.g., "weather", "chatbot", "assistant")
- Visibility: Public (marketplace) or Private (only you)
- Allow Forks: Let others create copies
Advanced Features
Conversation Memory
AgentOp handles conversation memory automatically — you do not need to
write any memory management code. When a template has conversation_memory_enabled
turned on, the JavaScript layer (AgentOpLangChain) maintains a sliding window
of recent messages and sends them to the LLM with each new query.
You can configure the memory size via the template's max_memory_messages setting
(default: 10). To clear conversation history at runtime from Python, call the JS bridge:
```python
# Clear the conversation history from Python
from js import window
window.agentOpLangChain.clearHistory()
```
💡 Memory is per-session
Conversation memory lives in the browser tab. Closing or refreshing the page resets it. Memory persists across multiple messages within the same session.
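The sliding-window behavior can be pictured as a bounded deque (an illustrative Python sketch; the real implementation lives in the JavaScript AgentOpLangChain class):

```python
from collections import deque

class SlidingWindowMemory:
    """Keep only the most recent N messages (mirrors max_memory_messages)."""

    def __init__(self, max_messages: int = 10):
        self._messages = deque(maxlen=max_messages)

    def add(self, role: str, content: str) -> None:
        # The oldest message is dropped automatically once the window is full
        self._messages.append({"role": role, "content": content})

    def history(self) -> list:
        return list(self._messages)

memory = SlidingWindowMemory(max_messages=4)
for i in range(6):
    memory.add("user", f"message {i}")
# history() now holds only the 4 most recent messages
```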
Custom Tools with Multiple Parameters
Tools can accept multiple parameters with defaults. Use Python type hints and
a detailed docstring so the LLM knows how to call them. Export a matching
schema in get_tool_schemas():
```python
import json
from urllib.parse import quote

async def search_web(query: str, max_results: int = 5) -> str:
    """Search the web for current information about a topic.

    Args:
        query: The search term or question to look up.
        max_results: Maximum number of results to return (1-10).
    """
    # Example: call an external search API via js.fetch
    # (URL-encode the query so spaces and special characters are safe)
    from js import fetch
    resp = await fetch(f"https://api.example.com/search?q={quote(query)}&limit={max_results}")
    data = await resp.json()
    return json.dumps({"results": data.to_py()})

async def calculate(expression: str) -> str:
    """Evaluate a mathematical expression and return the result."""
    try:
        result = eval(expression)  # Note: eval executes arbitrary Python; restrict inputs as appropriate
        return str(result)
    except Exception as e:
        return f"Error: {e}"

def get_tool_schemas():
    return [
        {
            "type": "function",
            "function": {
                "name": "search_web",
                "description": "Search the web for current information about a topic.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "query": {"type": "string", "description": "The search term or question"},
                        "max_results": {"type": "integer", "description": "Max results (1-10, default 5)"}
                    },
                    "required": ["query"]
                }
            }
        },
        {
            "type": "function",
            "function": {
                "name": "calculate",
                "description": "Evaluate a mathematical expression and return the result.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "expression": {"type": "string", "description": "Math expression, e.g. '2 + 2 * 3'"}
                    },
                    "required": ["expression"]
                }
            }
        }
    ]
```
Calling JavaScript APIs from Python
Because your Python code runs inside Pyodide, you can access browser APIs via the
js module. This is useful for fetching external data, manipulating the DOM,
or interacting with the template's JavaScript code:
```python
from js import fetch, document, window
import json

async def fetch_data(url: str) -> str:
    """Fetch JSON data from a URL."""
    response = await fetch(url)
    data = await response.json()
    return json.dumps(data.to_py())

async def update_ui(html_content: str) -> str:
    """Update a DOM element with new HTML content."""
    element = document.getElementById("results")
    if element:
        element.innerHTML = html_content
        return "UI updated"
    return "Element not found"
```
Streaming Responses
Streaming is handled automatically by AgentOp's JavaScript layer. For cloud providers
(OpenAI, Anthropic), the AgentOpLangChain class streams tokens to the UI
in real time. For local WebLLM, the WebLLMAgentManager streams inference
output token by token.
You do not need to write any streaming code in your Python tools. The platform takes care of streaming the LLM's response to the user interface. Your tool functions return a plain string result — the LLM incorporates it and the infrastructure streams the final answer.
Error Handling in Tools
AgentOp catches exceptions from your Python tool functions and passes the error message back to the LLM so it can inform the user. Return a descriptive error string rather than raising exceptions where possible:
```python
async def analyze_csv(file_content: str) -> str:
    """Analyze CSV data and return summary statistics."""
    try:
        import pandas as pd
        from io import StringIO

        df = pd.read_csv(StringIO(file_content))
        summary = df.describe().to_string()
        return f"Dataset: {len(df)} rows, {len(df.columns)} columns\n\n{summary}"
    except Exception as e:
        return f"Could not analyze CSV: {e}"
```
For cloud providers, the tool-calling loop runs up to 3 iterations. If a tool returns an error, the LLM sees that error and can try a different approach or inform the user. For local WebLLM, queries use single-shot execution (one tool call per query).
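The bounded loop can be sketched in Python (a hypothetical simplification; the real loop runs in AgentOp's JavaScript layer, and the `llm` callable here stands in for the actual provider client):

```python
import asyncio

async def run_tool_loop(llm, tools, messages, max_iterations=3):
    """Call the LLM, dispatch requested tools, and feed results back,
    stopping at a final answer or after max_iterations rounds."""
    for _ in range(max_iterations):
        reply = await llm(messages)
        if reply.get("tool_call") is None:
            return reply["content"]  # final answer, no tool needed
        call = reply["tool_call"]
        result = await tools[call["name"]](**call["arguments"])
        messages.append({"role": "tool", "name": call["name"], "content": result})
    return "Stopped after reaching the tool-call limit."

# Demo with a stand-in LLM that requests one tool call, then answers.
async def fake_llm(messages):
    if any(m.get("role") == "tool" for m in messages):
        return {"content": "It is sunny in Paris.", "tool_call": None}
    return {"content": "",
            "tool_call": {"name": "get_weather", "arguments": {"location": "Paris"}}}

async def get_weather(location: str) -> str:
    return f"Sunny in {location}"

answer = asyncio.run(run_tool_loop(
    fake_llm, {"get_weather": get_weather},
    [{"role": "user", "content": "Weather in Paris?"}],
))
```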
Testing Your Agent
Local Testing
- Download your agent as HTML
- Open in a browser
- Test various inputs and edge cases
- Check console for errors (F12)
Testing Checklist
- ✅ Agent responds to normal queries
- ✅ Tools are called correctly
- ✅ Error messages are user-friendly
- ✅ Conversation memory works (if enabled)
- ✅ Agent stays within its domain/role
- ✅ Performance is acceptable
Best Practices
Prompt Engineering
- Be specific about the agent's role and limitations
- Provide clear instructions for tool use
- Include examples of desired behavior
- Test prompts iteratively
Tool Design
- Keep tool names and descriptions clear and concise
- Use Python type hints for parameters so the auto-generated schema is accurate
- Return descriptive error strings rather than raising exceptions
- Return structured data (JSON) when possible for richer LLM responses
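As a sketch of how type hints can drive a schema, the following generates an OpenAI-style tool schema from a function signature (illustrative only; AgentOp's own schema handling may differ, and an explicit `get_tool_schemas()` always works):

```python
import inspect

# Map Python annotations to JSON Schema types (fallback: "string")
PY_TO_JSON = {str: "string", int: "integer", float: "number", bool: "boolean"}

def schema_from_function(fn) -> dict:
    """Build an OpenAI-style tool schema from type hints and the docstring."""
    doc = (fn.__doc__ or "").strip()
    props, required = {}, []
    for name, param in inspect.signature(fn).parameters.items():
        props[name] = {"type": PY_TO_JSON.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # parameters without defaults are required
    return {
        "type": "function",
        "function": {
            "name": fn.__name__,
            "description": doc.splitlines()[0] if doc else "",
            "parameters": {"type": "object", "properties": props, "required": required},
        },
    }

async def search_web(query: str, max_results: int = 5) -> str:
    """Search the web for current information about a topic."""
    return ""

schema = schema_from_function(search_web)
```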
Performance
- Minimize package dependencies (smaller download size)
- Use Pyodide built-ins when available
- Cache expensive computations
- For local models, warn users about download size
Security
- Never hardcode API keys in Python code
- Validate user inputs before processing
- Be cautious with tools that access external APIs
- Consider rate limiting for public agents
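As an example of input validation, the `calculate` tool shown earlier uses `eval`, which executes arbitrary Python. A safer variant restricts input to arithmetic by walking the AST (a sketch, not a complete sandbox):

```python
import ast
import operator

# Whitelist of allowed arithmetic operators
OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.Pow: operator.pow, ast.USub: operator.neg,
}

def safe_eval(expression: str) -> float:
    """Evaluate arithmetic only; reject names, calls, and attribute access."""
    def visit(node):
        if isinstance(node, ast.Expression):
            return visit(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](visit(node.left), visit(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in OPS:
            return OPS[type(node.op)](visit(node.operand))
        raise ValueError("Disallowed expression")
    return visit(ast.parse(expression, mode="eval"))

safe_eval("2 + 2 * 3")  # → 8
```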
Publishing and Sharing
Make Your Agent Public
- Edit your agent
- Set "Visibility" to Public
- Ensure description and tags are complete
- Save changes
Allow Forking
Enable "Allow Forks" to let others learn from and build upon your agent. Forks create independent copies that others can modify.
Promote Your Agent
- Share the agent URL on social media
- Write a blog post about your agent's use case
- Add it to relevant collections in the marketplace
- Engage with users who fork or rate your agent