Running Local AI Models with OpenClaw + Ollama
Eliminate API costs entirely by running an open-source LLM on your VPS. Zero tokens billed, complete privacy, full offline capability.
How Much RAM Does Each Model Need?
Running a model locally needs significantly more RAM than regular OpenClaw usage, so size your VPS for the model you plan to run.
Llama 3.2 3B: Fast, lightweight. Good for simple tasks. Runs on Hostinger KVM 2.
Llama 3.1 8B: Balanced quality. Recommended starting point. Needs KVM 3+.
Mistral 7B: Strong reasoning, efficient. Good Claude alternative for many tasks.
Llama 3.1 70B: Near GPT-4 quality. Requires Contabo VPS 40 or dedicated server.
Always add ~2 GB for OpenClaw’s Gateway + Docker overhead on top of the model’s requirement. A 7B model needing 8 GB actually needs 10 GB total VPS RAM for reliable operation.
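The sizing rule above is plain addition, sketched here in Python. The function name is illustrative and the 2 GB default comes from the rule of thumb in this guide, not from any OpenClaw or Ollama tooling.

```python
def total_vps_ram_gb(model_ram_gb: float, overhead_gb: float = 2.0) -> float:
    """Return the VPS RAM to provision: model requirement plus overhead.

    overhead_gb defaults to the ~2 GB this guide suggests for
    OpenClaw's Gateway and Docker; adjust it if your setup differs.
    """
    if model_ram_gb <= 0 or overhead_gb < 0:
        raise ValueError("model_ram_gb must be positive and overhead_gb non-negative")
    return model_ram_gb + overhead_gb


if __name__ == "__main__":
    # The guide's example: a 7B model that needs 8 GB on its own.
    print(total_vps_ram_gb(8))  # 10.0
```

Treat the result as a floor when picking a plan: choose the next VPS tier at or above it.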
Install Ollama on Your VPS
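Once Ollama is installed and its server is running, one way to confirm it is reachable is to query its local HTTP API. The sketch below assumes Ollama is listening on its default address (port 11434 on localhost) and uses the `/api/tags` endpoint, which lists the models pulled so far. The helper names are illustrative.

```python
import json
import urllib.request

# Assumption: Ollama is running on the same machine with its default port.
OLLAMA_URL = "http://localhost:11434"


def parse_model_names(tags_payload: dict) -> list[str]:
    """Extract model names from the JSON body returned by GET /api/tags."""
    return [model["name"] for model in tags_payload.get("models", [])]


def list_local_models(base_url: str = OLLAMA_URL, timeout: float = 5.0) -> list[str]:
    """Ask the Ollama server which models are available locally."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return parse_model_names(json.load(resp))


if __name__ == "__main__":
    try:
        print("Ollama is up; local models:", list_local_models())
    except OSError as exc:  # covers connection refused, timeouts, URLError
        print("Could not reach Ollama:", exc)
```

An empty list means the server is running but no model has been pulled yet.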
Connect OpenClaw to Ollama
Add to your .env file:
Use Ollama for routine tasks (scheduling checks, simple Q&A, private document processing) and Claude Sonnet for complex reasoning. Set OpenClaw to route based on task complexity. This gives you the best of both worlds: privacy and cost savings for simple tasks, maximum quality for hard ones.
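The routing idea can be illustrated with a small Python sketch. This is not OpenClaw's actual routing configuration, which this guide does not show. The task categories, model labels, and function name are all hypothetical, and sending anything flagged as private to the local model is an assumption drawn from the privacy goal above.

```python
# Hypothetical task categories, mirroring the routine tasks named above.
ROUTINE_TASKS = {"scheduling_check", "simple_qa", "private_document"}

# Hypothetical labels; use whatever identifiers your own setup expects.
LOCAL_MODEL = "ollama/llama3.1:8b"
CLOUD_MODEL = "claude-sonnet"


def choose_model(task_type: str, contains_private_data: bool = False) -> str:
    """Pick a backend: local for routine or private work, cloud otherwise."""
    if contains_private_data or task_type in ROUTINE_TASKS:
        return LOCAL_MODEL
    return CLOUD_MODEL


if __name__ == "__main__":
    for task in ("simple_qa", "multi_step_planning"):
        print(task, "->", choose_model(task))
```

Defaulting unknown task types to the cloud model favors answer quality. Flip the default to the local model if cost or privacy matters more.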
Local Models Running
Zero API costs for routine tasks. Combine with Gemini Flash for complex requests at minimal cost.
