Connecting OpenClaw to OpenRouter: The Ultimate Guide
Last verified: 2024-06-15 UTC
If you’ve ever wanted to run powerful, local-friendly AI models without locking yourself into expensive cloud APIs—or if you’ve struggled to integrate open tools into a cohesive workflow—you’re in the right place.
OpenClaw and OpenRouter are two open-friendly platforms that solve different parts of the AI tooling puzzle. OpenClaw is a modular framework for building AI-powered applications with strong privacy and control in mind. OpenRouter is a unified API layer that gives you access to dozens of open-weight and proprietary models—including Llama, Mistral, Claude, and more—with pricing and performance transparency.
Connecting them unlocks a flexible, cost-efficient, and privacy-conscious AI stack. This guide walks you through why this integration matters, how it works under the hood, and what you can realistically build with it—no hype, no fluff, just practical, tested insights.
Why Connect OpenClaw to OpenRouter?
Many developers face a tough choice: use a cloud API (like OpenAI or Anthropic) for convenience and power, or run local models (like Llama 3 or Phi-3) for control but with trade-offs in speed, quality, or setup effort.
OpenClaw leans toward the local-first side: it’s built to run on your hardware, supports modular plugins, and emphasizes user sovereignty. But as of early 2024, its native model support is limited—mostly focused on smaller, open models or self-hosted instances.
OpenRouter fills that gap. It acts like a “model marketplace” with a single, consistent API. You can point OpenClaw at OpenRouter’s endpoint, switch between models with one config change, and even mix open and proprietary models in the same workflow—all while keeping usage transparent and budget-controlled.
This isn’t just about convenience. It’s about flexibility. Want to prototype with a high-end model like Claude 3.5 Sonnet, then deploy with a lightweight Llama 3 variant? Done. Need to run inference on an air-gapped server but want to test against real-world outputs first? Use OpenRouter for testing, then switch to local inference in production.
And if privacy is a concern—especially for sensitive data or internal tools—this setup lets you keep control of your keys, logs, and data flow.
What You Need Before You Begin
Before diving into configuration, gather these:
- An OpenRouter account with an API key (openrouter.ai)
- OpenClaw installed (v0.9.0 or newer recommended)
- A working model endpoint—either via OpenRouter or a local model you want to test alongside it
- Basic familiarity with JSON and REST-style APIs (we’ll keep it simple)
OpenClaw’s configuration is YAML-based, so you’ll be editing a config.yaml file—likely in ~/.openclaw/config.yaml or your project’s .openclaw/ directory.
💡 Tip: If you’re new to OpenClaw, we cover its core architecture—including how plugins interact with LLM backends—in the OpenClaw data scraping plugins guide.
Step-by-Step: Connecting OpenClaw to OpenRouter
1. Get Your OpenRouter API Key
- Sign up at openrouter.ai
- Navigate to Settings → API Keys
- Create a new key (name it something memorable, like openclaw-integration)
- Copy the key. Do not commit it to version control.
2. Configure OpenClaw to Use OpenRouter
Open your config.yaml and add (or replace) the llm section:
```yaml
llm:
  provider: openrouter
  api_key: "${OPENROUTER_API_KEY}"  # Or paste the key directly (not recommended)
  model: "meta-llama/llama-3-8b-instruct:free"
  base_url: "https://openrouter.ai/api/v1"
  temperature: 0.7
  max_tokens: 1024
```
Environment variables (like ${OPENROUTER_API_KEY}) are safer than hardcoding. Set the variable in your shell:
```shell
export OPENROUTER_API_KEY="sk-or-..."
```
Or use a .env file (and load it before running OpenClaw).
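If you'd rather not add a dependency just for this, a minimal .env loader is a few lines of Python. This is a sketch (the function name is ours, and python-dotenv is the more common production choice):

```python
import os

def load_env_file(path=".env"):
    """Minimal .env loader: KEY=value lines; blank lines and '#' comments ignored.
    Existing environment variables are not overwritten."""
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # Strip optional surrounding double quotes from the value
            os.environ.setdefault(key.strip(), value.strip().strip('"'))
```

Call load_env_file() before constructing any OpenClaw clients so the ${OPENROUTER_API_KEY} reference in config.yaml resolves.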
3. Test the Connection
Run a simple prompt in the OpenClaw CLI:
```shell
openclaw prompt "Explain quantum entanglement in one sentence."
```
If you see a coherent response and the log shows provider=openrouter, you’re live.
If not, check:
- The API key is set and valid
- base_url matches exactly: https://openrouter.ai/api/v1
- Your model string uses OpenRouter’s naming format (more below)
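To rule out OpenClaw itself, you can hit OpenRouter's OpenAI-compatible endpoint directly. Here is a standard-library sketch; build_chat_request is a helper defined here for illustration, not part of OpenClaw:

```python
import json
import os
import urllib.request

def build_chat_request(api_key, model, prompt,
                       base_url="https://openrouter.ai/api/v1"):
    """Construct the POST request for OpenRouter's chat completions endpoint."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

if __name__ == "__main__":
    req = build_chat_request(
        os.environ.get("OPENROUTER_API_KEY", ""),
        "meta-llama/llama-3-8b-instruct:free",
        "ping",
    )
    # Uncomment to actually send (requires network and a valid key):
    # print(urllib.request.urlopen(req).read().decode())
    print(req.full_url)
```

If a direct call like this succeeds but OpenClaw still fails, the problem is in your config, not your key.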
4. Choosing the Right Model
OpenRouter uses model IDs like meta-llama/llama-3-8b-instruct:free or anthropic/claude-3.5-sonnet. The format is:
<provider>/<model-name>:<pricing-tier>
- :free = free tier (rate-limited, usually older models)
- :beta, :standard, :pro = paid tiers with higher throughput
- Some models omit the tier (e.g., microsoft/phi-3-mini-4k-instruct); these default to free or pay-as-you-go
| Model | Type | Best For | OpenRouter ID |
|---|---|---|---|
| Llama 3 8B | Open-weight | Lightweight chat, local fallback | meta-llama/llama-3-8b-instruct:free |
| Mistral 7B | Open-weight | Fast, English-first tasks | mistralai/mistral-7b-instruct:free |
| Phi-3 Mini | Open-weight | Education, reasoning | microsoft/phi-3-mini-4k-instruct |
| Claude 3.5 Sonnet | Proprietary | Complex reasoning, writing | anthropic/claude-3.5-sonnet |
| Gemma 2 9B | Open-weight | Multilingual, code | google/gemma-2-9b-it:free |
⚠️ Caution: Free-tier models often have strict rate limits. For production use, consider :standard or :pro tiers; even at $0.05/1M tokens, they’re cheaper than many alternatives.
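The ID format above is easy to validate programmatically before a request ever leaves your machine. A small parser (an illustrative helper, not an OpenClaw API):

```python
def parse_model_id(model_id):
    """Parse '<provider>/<model-name>[:<pricing-tier>]' into its parts.

    Returns (provider, name, tier); tier is None when omitted,
    matching IDs like 'microsoft/phi-3-mini-4k-instruct'.
    """
    provider, _, rest = model_id.partition("/")
    if not provider or not rest:
        raise ValueError(f"expected provider/model, got {model_id!r}")
    name, _, tier = rest.partition(":")
    return provider, name, tier or None
```

Running it on the IDs from the table: parse_model_id("meta-llama/llama-3-8b-instruct:free") yields ("meta-llama", "llama-3-8b-instruct", "free"), and a tier-less ID returns None for the tier.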
How OpenClaw Handles OpenRouter Responses
OpenClaw treats any LLM backend (including OpenRouter) as a response stream. That means it can process tokens in real time—ideal for chat interfaces, streaming logs, or incremental output.
This is where OpenClaw’s process streaming responses design shines: it lets you build reactive UIs, partial-result caching, or even trigger downstream actions (like saving intermediate outputs) as they arrive.
Internally, OpenClaw uses the sse-client-py library to subscribe to the /chat/completions stream endpoint. The flow looks like this:
- OpenClaw sends a POST to https://openrouter.ai/api/v1/chat/completions
- OpenRouter opens a Server-Sent Events (SSE) connection
- OpenClaw receives data: {...} chunks, parses them, and pipes tokens to your plugin or CLI
You don’t need to manage this unless you’re building a custom plugin—but knowing it’s streaming helps explain why performance feels snappy even on modest hardware.
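To make that flow concrete, here is a sketch of how those data: lines decode into tokens. It assumes the OpenAI-compatible chat.completion.chunk format that OpenRouter streams; OpenClaw's actual internals will differ:

```python
import json

def parse_sse_tokens(lines):
    """Extract content tokens from SSE 'data: {...}' lines.

    Non-data lines (comments, keep-alives) are skipped, and the
    'data: [DONE]' sentinel ends the stream.
    """
    tokens = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        # Each chunk carries a partial message in choices[0].delta
        delta = json.loads(payload)["choices"][0].get("delta", {})
        if "content" in delta:
            tokens.append(delta["content"])
    return tokens
```

A stream of three chunks carrying "Hel" and "lo" followed by [DONE] would yield ["Hel", "lo"], which OpenClaw's streaming layer then pipes to your plugin or CLI as it arrives.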
Real-World Use Cases
Let’s look at three concrete examples where OpenClaw + OpenRouter adds real value.
1. Privacy-First Research Assistant
Imagine building a tool that helps you summarize research papers, but you don’t want your PDFs (or extracts) sent to a public cloud.
Here’s how it works:
- OpenClaw extracts text from PDFs using data scraping plugins
- It chunks and filters content locally
- Only the final summary prompt is sent to OpenRouter
- Results are saved back to your local DB
This hybrid approach keeps sensitive data on your machine while leveraging high-quality models for nuanced tasks.
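The local chunking step in that pipeline can be as simple as a sliding window over the extracted text. A sketch (the window and overlap sizes are illustrative; tune them to your model's context limit):

```python
def chunk_text(text, max_chars=2000, overlap=200):
    """Split text into overlapping character chunks, entirely locally.

    The overlap preserves context across chunk boundaries so a
    summary prompt doesn't lose sentences cut in half.
    """
    if overlap >= max_chars:
        raise ValueError("overlap must be smaller than max_chars")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        start += max_chars - overlap
    return chunks
```

Only the chunks you explicitly choose to summarize ever reach OpenRouter; everything else stays on disk.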
2. Text Adventure Generator (With Dynamic AI)
Want to create interactive fiction that adapts to player choices? OpenClaw’s text adventure framework uses state-aware prompts and streaming to generate scenes in real time.
By switching to OpenRouter, you can:
- Start with a fast, free model (Llama 3 8B) for basic narrative
- Swap to Claude 3.5 Sonnet for complex puzzles or emotional nuance
- Add a “budget watchdog” that alerts you when token usage hits a limit
The config change is one line in config.yaml. No code changes needed.
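If you do want the budget watchdog mentioned above in code, it can be a small counter your plugin calls after each completion. The class below is hypothetical, not an OpenClaw built-in:

```python
class BudgetWatchdog:
    """Track token usage against a budget and flag threshold crossings."""

    def __init__(self, limit_tokens, warn_at=0.8):
        self.limit = limit_tokens
        self.warn_at = warn_at  # fraction of budget that triggers a warning
        self.used = 0

    def record(self, tokens):
        """Record token usage; return 'ok', 'warning', or 'over_budget'."""
        self.used += tokens
        if self.used >= self.limit:
            return "over_budget"
        if self.used >= self.limit * self.warn_at:
            return "warning"
        return "ok"
```

Wire record() to the usage numbers each completion reports, and switch back to a free-tier model (or pause the game) when it returns "over_budget".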
3. Local AI Team with Shared Model Pool
Teams can run OpenClaw on individual machines but point to a shared OpenRouter API key (with usage quotas). This lets everyone experiment without deploying local GPUs—while still keeping the final deployment local.
For teams invested in open-source ethics or right-to-repair principles, this hybrid model respects both practicality and principles. You can read more about this philosophy in the OpenClaw right-to-repair movement post.
Security & Privacy Considerations
Even with OpenRouter, you’re not fully off the hook for security. Here’s what to watch:
- API key exposure: Never hardcode keys in public repos. Use .env + .gitignore.
- Prompt leakage: OpenRouter logs requests (per their privacy policy). For truly sensitive work, consider self-hosting models like Llama 3 via Ollama or vLLM and pointing OpenClaw there instead.
- Rate limiting: OpenRouter may throttle free-tier usage. For production apps, use paid tiers or implement exponential backoff in your plugins.
- Model bias & hallucination: These aren’t solved by routing through OpenRouter. Always validate outputs—especially for legal, medical, or financial use cases.
OpenClaw doesn’t auto-log prompts unless you enable logging: verbose in config. That’s intentional: it gives you control over data flow.
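For the rate-limiting point above, an exponential-backoff wrapper is a few lines. In this sketch the .status == 429 attribute is an assumed error shape; adapt the check to whatever exceptions your HTTP client actually raises:

```python
import random
import time

def with_backoff(call, max_retries=5, base_delay=0.5):
    """Retry `call` on rate-limit errors with exponential backoff plus jitter.

    Assumes throttling errors expose a `.status` attribute equal to 429
    (hypothetical shape); anything else is re-raised immediately.
    """
    for attempt in range(max_retries):
        try:
            return call()
        except Exception as exc:
            if getattr(exc, "status", None) != 429 or attempt == max_retries - 1:
                raise
            # Double the delay each attempt; jitter avoids thundering herds
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
```

Usage: wrap the completion call, e.g. with_backoff(lambda: client.complete(prompt)).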
Troubleshooting Common Issues
| Symptom | Likely Cause | Fix |
|---|---|---|
| 401 Unauthorized | Invalid or missing API key | Verify key, check env var is loaded |
| 429 Too Many Requests | Free-tier rate limit hit | Switch to a :standard model or wait ~1 min |
| model not found | Wrong model ID format | Use provider/model:tier, e.g., meta-llama/llama-3-8b-instruct:free |
| Slow first response | Model cold start | Try :pro tier or warm the model with a dummy prompt |
| No streaming output | Plugin doesn’t support streams | Use stream: true in your plugin config (see streaming guide) |
💡 Pro tip: Enable debug: true in config.yaml to see raw HTTP requests/responses. It’s invaluable for diagnosing model or auth issues.
Advanced: Switching Models Dynamically
One of OpenClaw’s strengths is runtime flexibility. You don’t have to pick one model per project—you can switch based on task complexity.
Here’s a minimal example in a Python plugin:
```python
from openclaw import LLMClient

# Initialize client with default model
client = LLMClient()

# For simple tasks: use free Llama 3
simple_prompt = "Summarize this in 5 words."
response = client.complete(simple_prompt, model="meta-llama/llama-3-8b-instruct:free")

# For nuanced tasks: use Claude 3.5 Sonnet
complex_prompt = "Write a 3-act story about a time traveler who can’t return home."
response = client.complete(complex_prompt, model="anthropic/claude-3.5-sonnet")
```
The model parameter is optional—OpenClaw uses the default from config.yaml if omitted.
This pattern lets you:
- Keep costs low for high-volume, low-stakes tasks
- Preserve quality for creative or critical work
- A/B test models in production with minimal code churn
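A naive task router makes this pattern concrete. The length threshold and keyword list below are placeholder heuristics for illustration, not anything OpenClaw ships:

```python
def pick_model(prompt, complex_keywords=("story", "analyze", "explain why")):
    """Route simple prompts to a free model and nuanced ones to a premium model.

    The heuristics (prompt length, keyword hints) are illustrative;
    replace them with whatever signal fits your workload.
    """
    lowered = prompt.lower()
    if len(prompt) > 400 or any(k in lowered for k in complex_keywords):
        return "anthropic/claude-3.5-sonnet"
    return "meta-llama/llama-3-8b-instruct:free"
```

You would then pass the result as the model argument to client.complete(), keeping the routing decision in one place.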
Cost Comparison: OpenRouter vs. Direct Cloud APIs
Let’s compare pricing for 1 million input tokens and 1 million output tokens:
| Provider | Model | Input (per 1M) | Output (per 1M) |
|---|---|---|---|
| OpenRouter (Llama 3 8B) | meta-llama/llama-3-8b-instruct:free | $0 | $0 |
| OpenRouter (Llama 3 70B) | meta-llama/llama-3-70b-instruct:standard | $0.59 | $0.79 |
| OpenRouter (Claude 3.5 Sonnet) | anthropic/claude-3.5-sonnet | $3.00 | $15.00 |
| OpenAI (GPT-4o) | gpt-4o | $5.00 | $15.00 |
⚠️ OpenRouter’s pricing is updated dynamically and may change. Always verify before committing to a production budget.
For hobbyists and small teams, the free tier is a game-changer. For enterprises, the price/performance ratio of models like Llama 3 70B often beats GPT-4 for specific tasks—especially when you factor in latency and data sovereignty.
Community & Ecosystem: Where OpenClaw Shines
OpenClaw isn’t just a tool—it’s part of a broader movement toward user-owned AI. Its plugin-first design, open governance, and emphasis on repairability set it apart from monolithic frameworks.
The community governance model ensures that features like OpenRouter integration aren’t dictated by a single company. Instead, proposals, RFCs, and contributions are vetted by users who rely on the platform daily.
This means:
- Plugins are vetted for privacy and security
- Model integrations follow open standards (e.g., OpenAI-compatible API)
- Roadmaps reflect real-world needs, not just venture priorities
If you’re tired of “black box” AI tools that lock you in, this ecosystem is worth exploring.
Building Your First Plugin: A Quick Example
Let’s build a tiny OpenRouter-powered plugin that translates text into three languages. This demonstrates both API integration and streaming.
- Create plugins/translate.py:

```python
from openclaw.plugin import Plugin
from openclaw import LLMClient

class TranslatePlugin(Plugin):
    name = "translate"
    description = "Translate text to Spanish, French, and German"

    def setup(self):
        self.client = LLMClient()

    def run(self, text: str):
        prompt = (
            f"Translate '{text}' into Spanish, French, and German. "
            "Return only the three translations, one per line, labeled."
        )
        response = self.client.complete(prompt, model="mistralai/mistral-7b-instruct:free")
        return response.strip()
```
- Run it:
```shell
openclaw run translate "Hello, how are you?"
```
Output:

```
Spanish: Hola, ¿cómo estás?
French: Bonjour, comment ça va ?
German: Hallo, wie geht es dir?
```
This plugin can be extended to stream responses line by line, or to fallback to a local model if OpenRouter is unreachable—showing how the architecture supports resilience.
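That fallback idea can be expressed as a tiny wrapper. Here primary and fallback are just callables taking a prompt, e.g. closures around two differently configured LLMClient instances (hypothetical wiring):

```python
def complete_with_fallback(prompt, primary, fallback):
    """Try the primary completion callable; on any error, use the fallback.

    In practice `primary` might wrap an OpenRouter-backed client and
    `fallback` a local Ollama/vLLM-backed one.
    """
    try:
        return primary(prompt)
    except Exception:
        return fallback(prompt)
```

A narrower except clause (timeouts, connection errors) is usually better in production; catching everything is only for illustration.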
Final Thoughts: Why This Integration Matters
Connecting OpenClaw to OpenRouter isn’t just a technical tweak—it’s a statement of intent. It shows you can have the best of both worlds:
- Flexibility: Switch models like changing gears
- Cost control: Use free tiers for prototyping, paid for scale
- Privacy: Keep sensitive data local, use the cloud only where needed
- Sustainability: Avoid vendor lock-in and over-reliance on proprietary stacks
Whether you’re building a text adventure, a research assistant, or a production-grade assistant API, this combo gives you room to grow—without sacrificing control.
If you’ve tried this setup, we’d love to hear how it went. The OpenClaw community governance page explains how to contribute ideas, bugs, or plugins to the ecosystem.
Now go build something that’s truly yours.
FAQ
Can I use OpenRouter with local models in OpenClaw?
Yes—OpenRouter can act as a fallback. For example, configure your plugin to try a local model first, then switch to OpenRouter if it times out or fails.
Is OpenRouter free to use?
OpenRouter has a free tier for some models, but most high-quality models (e.g., Claude 3.5, Llama 3 70B) require payment. You pay only for what you use, and pricing is transparent per token.
Does OpenClaw support streaming with OpenRouter?
Yes. OpenClaw’s streaming support works out of the box with OpenRouter’s SSE-compliant API. Just ensure your plugin or CLI command is set to consume streams.
How do I avoid hitting rate limits on free models?
Use the :standard or :pro tier for consistent throughput. For experimentation, add delays (time.sleep(0.5)) between requests, or batch prompts.
Can I use multiple models in one workflow?
Absolutely. OpenClaw lets you instantiate multiple LLMClient instances with different models—ideal for routing tasks (e.g., summarization → translation → code generation).
Is this setup secure for enterprise use?
With proper key management, rate limiting, and prompt sanitization, yes. But never assume any external API is fully private. For maximum control, consider self-hosting with vLLM or Ollama and use OpenRouter only for testing.