Connecting OpenClaw to OpenRouter: The Ultimate Guide
Last verified: 2024-06-15 UTC
If you’ve ever wanted to run powerful, local-friendly AI models without locking yourself into expensive cloud APIs—or if you’ve struggled to integrate open tools into a cohesive workflow—you’re in the right place.
OpenClaw and OpenRouter are two open-friendly platforms that solve different parts of the AI tooling puzzle. OpenClaw is a modular framework for building AI-powered applications with strong privacy and control in mind. OpenRouter is a unified API layer that gives you access to dozens of open-weight and proprietary models—including Llama, Mistral, Claude, and more—with pricing and performance transparency.
Connecting them unlocks a flexible, cost-efficient, and privacy-conscious AI stack. This guide walks you through why this integration matters, how it works under the hood, and what you can realistically build with it—no hype, no fluff, just practical, tested insights.
Why Connect OpenClaw to OpenRouter?
Many developers face a tough choice: use a cloud API (like OpenAI or Anthropic) for convenience and power, or run local models (like Llama 3 or Phi-3) for control but with trade-offs in speed, quality, or setup effort.
OpenClaw leans toward the local-first side: it’s built to run on your hardware, supports modular plugins, and emphasizes user sovereignty. But as of early 2024, its native model support is limited—mostly focused on smaller, open models or self-hosted instances.
OpenRouter fills that gap. It acts like a “model marketplace” with a single, consistent API. You can point OpenClaw at OpenRouter’s endpoint, switch between models with one config change, and even mix open and proprietary models in the same workflow—all while keeping usage transparent and budget-controlled.
This isn’t just about convenience. It’s about flexibility. Want to prototype with a high-end model like Claude 3.5 Sonnet, then deploy with a lightweight Llama 3 variant? Done. Need to run inference on an air-gapped server but want to test against real-world outputs first? Use OpenRouter for testing, then switch to local inference in production.
And if privacy is a concern—especially for sensitive data or internal tools—this setup lets you keep control of your keys, logs, and data flow.
What You Need Before You Begin
Before diving into configuration, gather these:
- An OpenRouter account with an API key (openrouter.ai)
- OpenClaw installed (v0.9.0 or newer recommended)
- A working model endpoint—either via OpenRouter or a local model you want to test alongside it
- Basic familiarity with JSON and REST-style APIs (we’ll keep it simple)
OpenClaw’s configuration is YAML-based, so you’ll be editing a config.yaml file—likely in ~/.openclaw/config.yaml or your project’s .openclaw/ directory.
💡 Tip: If you’re new to OpenClaw, we cover its core architecture—including how plugins interact with LLM backends—in the OpenClaw data scraping plugins guide.
Step-by-Step: Connecting OpenClaw to OpenRouter
1. Get Your OpenRouter API Key
- Sign up at openrouter.ai
- Navigate to Settings → API Keys
- Create a new key (name it something memorable, like openclaw-integration)
- Copy the key. Do not commit it to version control.
2. Configure OpenClaw to Use OpenRouter
Open your config.yaml and add (or replace) the llm section:
```yaml
llm:
  provider: openrouter
  api_key: "${OPENROUTER_API_KEY}"  # Or paste the key directly (not recommended)
  model: "meta-llama/llama-3-8b-instruct:free"
  base_url: "https://openrouter.ai/api/v1"
  temperature: 0.7
  max_tokens: 1024
```
Environment variables (like ${OPENROUTER_API_KEY}) are safer than hardcoding. Set the variable in your shell:
```shell
export OPENROUTER_API_KEY="sk-or-..."
```
Or use a .env file (and load it before running OpenClaw).
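If you'd rather not add a dependency just for this, a minimal .env loader is a few lines of Python. This is a sketch (the function name is ours, and python-dotenv is the more common production choice):

```python
import os

def load_env_file(path=".env"):
    """Minimal .env loader: KEY=value lines; blank lines and '#' comments ignored.
    Existing environment variables are not overwritten."""
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # Strip optional surrounding double quotes from the value
            os.environ.setdefault(key.strip(), value.strip().strip('"'))
```

Call load_env_file() before constructing any OpenClaw clients so the ${OPENROUTER_API_KEY} reference in config.yaml resolves.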
3. Test the Connection
Run a simple prompt in the OpenClaw CLI:
```shell
openclaw prompt "Explain quantum entanglement in one sentence."
```
If you see a coherent response and the log shows provider=openrouter, you’re live.
If not, check:
- The API key is set and valid
- base_url matches exactly: https://openrouter.ai/api/v1
- Your model string uses OpenRouter’s naming format (more below)
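To rule out OpenClaw itself, you can hit OpenRouter's OpenAI-compatible endpoint directly. Here is a standard-library sketch; build_chat_request is a helper defined here for illustration, not part of OpenClaw:

```python
import json
import os
import urllib.request

def build_chat_request(api_key, model, prompt,
                       base_url="https://openrouter.ai/api/v1"):
    """Construct the POST request for OpenRouter's chat completions endpoint."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

if __name__ == "__main__":
    req = build_chat_request(
        os.environ.get("OPENROUTER_API_KEY", ""),
        "meta-llama/llama-3-8b-instruct:free",
        "ping",
    )
    # Uncomment to actually send (requires network and a valid key):
    # print(urllib.request.urlopen(req).read().decode())
    print(req.full_url)
```

If a direct call like this succeeds but OpenClaw still fails, the problem is in your config, not your key.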
4. Choosing the Right Model
OpenRouter uses model IDs like meta-llama/llama-3-8b-instruct:free or anthropic/claude-3.5-sonnet. The format is:
<provider>/<model-name>:<pricing-tier>
- :free = free tier (rate-limited, usually older models)
- :beta, :standard, :pro = paid tiers with higher throughput
- Some models omit the tier (e.g., microsoft/phi-3-mini-4k-instruct); these default to free or pay-as-you-go
| Model | Type | Best For | OpenRouter ID |
|---|---|---|---|
| Llama 3 8B | Open-weight | Lightweight chat, local fallback | meta-llama/llama-3-8b-instruct:free |
| Mistral 7B | Open-weight | Fast, English-first tasks | mistralai/mistral-7b-instruct:free |
| Phi-3 Mini | Open-weight | Education, reasoning | microsoft/phi-3-mini-4k-instruct |
| Claude 3.5 Sonnet | Proprietary | Complex reasoning, writing | anthropic/claude-3.5-sonnet |
| Gemma 2 9B | Open-weight | Multilingual, code | google/gemma-2-9b-it:free |
⚠️ Caution: Free-tier models often have strict rate limits. For production use, consider :standard or :pro tiers; even at $0.05/1M tokens, they’re cheaper than many alternatives.
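The ID format above is easy to validate programmatically before a request ever leaves your machine. A small parser (an illustrative helper, not an OpenClaw API):

```python
def parse_model_id(model_id):
    """Parse '<provider>/<model-name>[:<pricing-tier>]' into its parts.

    Returns (provider, name, tier); tier is None when omitted,
    matching IDs like 'microsoft/phi-3-mini-4k-instruct'.
    """
    provider, _, rest = model_id.partition("/")
    if not provider or not rest:
        raise ValueError(f"expected provider/model, got {model_id!r}")
    name, _, tier = rest.partition(":")
    return provider, name, tier or None
```

Running it on the IDs from the table: parse_model_id("meta-llama/llama-3-8b-instruct:free") yields ("meta-llama", "llama-3-8b-instruct", "free"), and a tier-less ID returns None for the tier.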
How OpenClaw Handles OpenRouter Responses
OpenClaw treats any LLM backend (including OpenRouter) as a response stream. That means it can process tokens in real time—ideal for chat interfaces, streaming logs, or incremental output.
This is where OpenClaw’s process streaming responses design shines: it lets you build reactive UIs, partial-result caching, or even trigger downstream actions (like saving intermediate outputs) as they arrive.
Internally, OpenClaw uses the sse-client-py library to subscribe to the /chat/completions stream endpoint. The flow looks like this:
- OpenClaw sends a POST to https://openrouter.ai/api/v1/chat/completions
- OpenRouter opens a Server-Sent Events (SSE) connection
- OpenClaw receives data: {...} chunks, parses them, and pipes tokens to your plugin or CLI
You don’t need to manage this unless you’re building a custom plugin—but knowing it’s streaming helps explain why performance feels snappy even on modest hardware.
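To make that flow concrete, here is a sketch of how those data: lines decode into tokens. It assumes the OpenAI-compatible chat.completion.chunk format that OpenRouter streams; OpenClaw's actual internals will differ:

```python
import json

def parse_sse_tokens(lines):
    """Extract content tokens from SSE 'data: {...}' lines.

    Non-data lines (comments, keep-alives) are skipped, and the
    'data: [DONE]' sentinel ends the stream.
    """
    tokens = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        # Each chunk carries a partial message in choices[0].delta
        delta = json.loads(payload)["choices"][0].get("delta", {})
        if "content" in delta:
            tokens.append(delta["content"])
    return tokens
```

A stream of three chunks carrying "Hel" and "lo" followed by [DONE] would yield ["Hel", "lo"], which OpenClaw's streaming layer then pipes to your plugin or CLI as it arrives.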
Real-World Use Cases
Let’s look at three concrete examples where OpenClaw + OpenRouter adds real value.
1. Privacy-First Research Assistant
Imagine building a tool that helps you summarize research papers, but you don’t want your PDFs (or extracts) sent to a public cloud.
Here’s how it works:
- OpenClaw extracts text from PDFs using data scraping plugins
- It chunks and filters content locally
- Only the final summary prompt is sent to OpenRouter
- Results are saved back to your local DB
This hybrid approach keeps sensitive data on your machine while leveraging high-quality models for nuanced tasks.
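The local chunking step in that pipeline can be as simple as a sliding window over the extracted text. A sketch (the window and overlap sizes are illustrative; tune them to your model's context limit):

```python
def chunk_text(text, max_chars=2000, overlap=200):
    """Split text into overlapping character chunks, entirely locally.

    The overlap preserves context across chunk boundaries so a
    summary prompt doesn't lose sentences cut in half.
    """
    if overlap >= max_chars:
        raise ValueError("overlap must be smaller than max_chars")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        start += max_chars - overlap
    return chunks
```

Only the chunks you explicitly choose to summarize ever reach OpenRouter; everything else stays on disk.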
2. Text Adventure Generator (With Dynamic AI)
Want to create interactive fiction that adapts to player choices? OpenClaw’s text adventure framework uses state-aware prompts and streaming to generate scenes in real time.
By switching to OpenRouter, you can:
- Start with a fast, free model (Llama 3 8B) for basic narrative
- Swap to Claude 3.5 Sonnet for complex puzzles or emotional nuance
- Add a “budget watchdog” that alerts you when token usage hits a limit
The config change is one line in config.yaml. No code changes needed.
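If you do want the budget watchdog mentioned above in code, it can be a small counter your plugin calls after each completion. The class below is hypothetical, not an OpenClaw built-in:

```python
class BudgetWatchdog:
    """Track token usage against a budget and flag threshold crossings."""

    def __init__(self, limit_tokens, warn_at=0.8):
        self.limit = limit_tokens
        self.warn_at = warn_at  # fraction of budget that triggers a warning
        self.used = 0

    def record(self, tokens):
        """Record token usage; return 'ok', 'warning', or 'over_budget'."""
        self.used += tokens
        if self.used >= self.limit:
            return "over_budget"
        if self.used >= self.limit * self.warn_at:
            return "warning"
        return "ok"
```

Wire record() to the usage numbers each completion reports, and switch back to a free-tier model (or pause the game) when it returns "over_budget".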
3. Local AI Team with Shared Model Pool
Teams can run OpenClaw on individual machines but point to a shared OpenRouter API key (with usage quotas). This lets everyone experiment without deploying local GPUs—while still keeping the final deployment local.
For teams invested in open-source ethics or right-to-repair principles, this hybrid model respects both practicality and principles. You can read more about this philosophy in the OpenClaw right-to-repair movement post.
Security & Privacy Considerations
Even with OpenRouter, you’re not fully off the hook for security. Here’s what to watch:
- API key exposure: Never hardcode keys in public repos. Use .env + .gitignore.
- Prompt leakage: OpenRouter logs requests (per their privacy policy). For truly sensitive work, consider self-hosting models like Llama 3 via Ollama or vLLM and pointing OpenClaw there instead.
- Rate limiting: OpenRouter may throttle free-tier usage. For production apps, use paid tiers or implement exponential backoff in your plugins.
- Model bias & hallucination: These aren’t solved by routing through OpenRouter. Always validate outputs—especially for legal, medical, or financial use cases.
OpenClaw doesn’t auto-log prompts unless you enable logging: verbose in config. That’s intentional: it gives you control over data flow.
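For the rate-limiting point above, an exponential-backoff wrapper is a few lines. In this sketch the .status == 429 attribute is an assumed error shape; adapt the check to whatever exceptions your HTTP client actually raises:

```python
import random
import time

def with_backoff(call, max_retries=5, base_delay=0.5):
    """Retry `call` on rate-limit errors with exponential backoff plus jitter.

    Assumes throttling errors expose a `.status` attribute equal to 429
    (hypothetical shape); anything else is re-raised immediately.
    """
    for attempt in range(max_retries):
        try:
            return call()
        except Exception as exc:
            if getattr(exc, "status", None) != 429 or attempt == max_retries - 1:
                raise
            # Double the delay each attempt; jitter avoids thundering herds
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
```

Usage: wrap the completion call, e.g. with_backoff(lambda: client.complete(prompt)).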
Troubleshooting Common Issues
| Symptom | Likely Cause | Fix |
|---|---|---|
| 401 Unauthorized | Invalid or missing API key | Verify key, check env var is loaded |
| 429 Too Many Requests | Free-tier rate limit hit | Switch to a :standard model or wait ~1 min |
| model not found | Wrong model ID format | Use provider/model:tier, e.g., meta-llama/llama-3-8b-instruct:free |
| Slow first response | Model cold start | Try :pro tier or warm the model with a dummy prompt |
| No streaming output | Plugin doesn’t support streams | Use stream: true in your plugin config (see streaming guide) |
💡 Pro tip: Enable debug: true in config.yaml to see raw HTTP requests/responses. It’s invaluable for diagnosing model or auth issues.
Advanced: Switching Models Dynamically
One of OpenClaw’s strengths is runtime flexibility. You don’t have to pick one model per project—you can switch based on task complexity.
Here’s a minimal example in a Python plugin:
```python
from openclaw import LLMClient

# Initialize client with default model
client = LLMClient()

# For simple tasks: use free Llama 3
simple_prompt = "Summarize this in 5 words."
response = client.complete(simple_prompt, model="meta-llama/llama-3-8b-instruct:free")

# For nuanced tasks: use Claude 3.5 Sonnet
complex_prompt = "Write a 3-act story about a time traveler who can’t return home."
response = client.complete(complex_prompt, model="anthropic/claude-3.5-sonnet")
```
The model parameter is optional—OpenClaw uses the default from config.yaml if omitted.
This pattern lets you:
- Keep costs low for high-volume, low-stakes tasks
- Preserve quality for creative or critical work
- A/B test models in production with minimal code churn
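A naive task router makes this pattern concrete. The length threshold and keyword list below are placeholder heuristics for illustration, not anything OpenClaw ships:

```python
def pick_model(prompt, complex_keywords=("story", "analyze", "explain why")):
    """Route simple prompts to a free model and nuanced ones to a premium model.

    The heuristics (prompt length, keyword hints) are illustrative;
    replace them with whatever signal fits your workload.
    """
    lowered = prompt.lower()
    if len(prompt) > 400 or any(k in lowered for k in complex_keywords):
        return "anthropic/claude-3.5-sonnet"
    return "meta-llama/llama-3-8b-instruct:free"
```

You would then pass the result as the model argument to client.complete(), keeping the routing decision in one place.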
Cost Comparison: OpenRouter vs. Direct Cloud APIs
Let’s compare pricing for 1 million input tokens and 1 million output tokens:
| Provider | Model | Input (per 1M) | Output (per 1M) |
|---|---|---|---|
| OpenRouter (Llama 3 8B) | meta-llama/llama-3-8b-instruct:free | $0 | $0 |
| OpenRouter (Llama 3 70B) | meta-llama/llama-3-70b-instruct:standard | $0.59 | $0.79 |
| OpenRouter (Claude 3.5 Sonnet) | anthropic/claude-3.5-sonnet | $3.00 | $15.00 |
| OpenAI (GPT-4o) | gpt-4o | $5.00 | $15.00 |
⚠️ OpenRouter’s pricing is updated dynamically and may change. Always verify before committing to a production budget.
For hobbyists and small teams, the free tier is a game-changer. For enterprises, the price/performance ratio of models like Llama 3 70B often beats GPT-4 for specific tasks—especially when you factor in latency and data sovereignty.
Community & Ecosystem: Where OpenClaw Shines
OpenClaw isn’t just a tool—it’s part of a broader movement toward user-owned AI. Its plugin-first design, open governance, and emphasis on repairability set it apart from monolithic frameworks.
The community governance model ensures that features like OpenRouter integration aren’t dictated by a single company. Instead, proposals, RFCs, and contributions are vetted by users who rely on the platform daily.
This means:
- Plugins are vetted for privacy and security
- Model integrations follow open standards (e.g., OpenAI-compatible API)
- Roadmaps reflect real-world needs, not just venture priorities
If you’re tired of “black box” AI tools that lock you in, this ecosystem is worth exploring.
Building Your First Plugin: A Quick Example
Let’s build a tiny OpenRouter-powered plugin that translates text into three languages. This demonstrates both API integration and streaming.
- Create plugins/translate.py:

```python
from openclaw.plugin import Plugin
from openclaw import LLMClient

class TranslatePlugin(Plugin):
    name = "translate"
    description = "Translate text to Spanish, French, and German"

    def setup(self):
        self.client = LLMClient()

    def run(self, text: str):
        prompt = (
            f"Translate '{text}' into Spanish, French, and German. "
            "Return only the three translations, one per line, labeled."
        )
        response = self.client.complete(prompt, model="mistralai/mistral-7b-instruct:free")
        return response.strip()
```
- Run it:
```shell
openclaw run translate "Hello, how are you?"
```
Output:

```
Spanish: Hola, ¿cómo estás?
French: Bonjour, comment ça va ?
German: Hallo, wie geht es dir?
```
This plugin can be extended to stream responses line by line, or to fallback to a local model if OpenRouter is unreachable—showing how the architecture supports resilience.
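That fallback idea can be expressed as a tiny wrapper. Here primary and fallback are just callables taking a prompt, e.g. closures around two differently configured LLMClient instances (hypothetical wiring):

```python
def complete_with_fallback(prompt, primary, fallback):
    """Try the primary completion callable; on any error, use the fallback.

    In practice `primary` might wrap an OpenRouter-backed client and
    `fallback` a local Ollama/vLLM-backed one.
    """
    try:
        return primary(prompt)
    except Exception:
        return fallback(prompt)
```

A narrower except clause (timeouts, connection errors) is usually better in production; catching everything is only for illustration.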
Final Thoughts: Why This Integration Matters
Connecting OpenClaw to OpenRouter isn’t just a technical tweak—it’s a statement of intent. It shows you can have the best of both worlds:
- Flexibility: Switch models like changing gears
- Cost control: Use free tiers for prototyping, paid for scale
- Privacy: Keep sensitive data local, use the cloud only where needed
- Sustainability: Avoid vendor lock-in and over-reliance on proprietary stacks
Whether you’re building a text adventure, a research assistant, or a production-grade assistant API, this combo gives you room to grow—without sacrificing control.
If you’ve tried this setup, we’d love to hear how it went. The OpenClaw community governance page explains how to contribute ideas, bugs, or plugins to the ecosystem.
Now go build something that’s truly yours.
FAQ
Can I use OpenRouter with local models in OpenClaw?
Yes—OpenRouter can act as a fallback. For example, configure your plugin to try a local model first, then switch to OpenRouter if it times out or fails.
Is OpenRouter free to use?
OpenRouter has a free tier for some models, but most high-quality models (e.g., Claude 3.5, Llama 3 70B) require payment. You pay only for what you use, and pricing is transparent per token.
Does OpenClaw support streaming with OpenRouter?
Yes. OpenClaw’s streaming support works out of the box with OpenRouter’s SSE-compliant API. Just ensure your plugin or CLI command is set to consume streams.
How do I avoid hitting rate limits on free models?
Use the :standard or :pro tier for consistent throughput. For experimentation, add delays (time.sleep(0.5)) between requests, or batch prompts.
Can I use multiple models in one workflow?
Absolutely. OpenClaw lets you instantiate multiple LLMClient instances with different models—ideal for routing tasks (e.g., summarization → translation → code generation).
Is this setup secure for enterprise use?
With proper key management, rate limiting, and prompt sanitization, yes. But never assume any external API is fully private. For maximum control, consider self-hosting with vLLM or Ollama and use OpenRouter only for testing.