The OpenClaw Stack: Running a Personal AI Agent Locally

How to build a self-hosted AI agent that actually does things — using OpenClaw, Ollama, and your own hardware.

Most AI assistants are read-only. They can answer questions, write text, generate images — but they can’t take actions. OpenClaw changes that.

OpenClaw is an agent runtime that connects a language model to real tools: file systems, shell commands, web browsing, messaging, calendars, and more. Run it locally with Ollama as the backend and you have a fully private, agentic AI that can actually do things — without sending a single byte to the cloud.

The stack

Ollama (inference)
  └── serves local model via OpenAI-compatible API
OpenClaw (agent runtime)
  └── routes prompts → model
  └── executes tools (files, shell, browser, messages)
  └── manages memory (daily notes, long-term MEMORY.md)
Telegram / Web UI (interface)
  └── how you talk to your agent

Why local inference for agents?

With a cloud API, every tool call goes through a third-party server. Your file paths, command outputs, calendar events, and message content are all visible to the API provider.

With local inference, the model runs on your hardware. Tool outputs never leave your machine. The agent can access sensitive context — medical notes, financial data, private messages — without privacy risk.
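
You can see this for yourself: Ollama binds to localhost by default, so every model call the agent makes is a plain HTTP request to your own machine.

# Ollama listens on 127.0.0.1:11434 by default; nothing leaves the box
curl http://localhost:11434/api/tags   # lists the models installed locally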

Getting started

1. Install Ollama and pull a capable model

curl -fsSL https://ollama.com/install.sh | sh
ollama pull llama3.1:70b   # best quality, needs 48 GB VRAM
ollama pull qwen2.5:32b    # excellent for agents, needs 20 GB VRAM
ollama pull phi4           # great reasoning, needs 10 GB VRAM

For agentic tasks, you want a model that’s strong at tool use and instruction following. Llama 3.1 and Qwen 2.5 are top choices.
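
A quick smoke test before wiring up OpenClaw: run a one-shot prompt directly through Ollama and confirm the model responds.

# Sanity check: one-shot prompt against a pulled model
ollama run qwen2.5:32b "List three shell commands for finding large files."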

2. Install OpenClaw

Visit openclaw.ai for the latest install instructions. OpenClaw connects to Ollama via its OpenAI-compatible API (http://localhost:11434/v1).
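
Before pointing OpenClaw at it, verify the endpoint is up. The config keys below are illustrative, not OpenClaw's exact schema; check its docs for the real names.

# Confirm Ollama's OpenAI-compatible API is reachable
curl http://localhost:11434/v1/models

# Then point OpenClaw at it — key names here are illustrative:
#   api_base: http://localhost:11434/v1
#   model:    qwen2.5:32b
#   api_key:  ollama   # Ollama ignores the key, but OpenAI-style clients require one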

3. Configure your agent

OpenClaw uses markdown files for memory and persona:

  • SOUL.md — who the agent is
  • USER.md — who you are, your preferences
  • MEMORY.md — long-term context
  • memory/YYYY-MM-DD.md — daily session logs

These files give the agent continuity across restarts — something most AI tools completely lack.
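
Seeding them is just writing markdown. A minimal persona file might look like this (contents illustrative; put it wherever OpenClaw expects its workspace files):

# Seed the agent's persona — the text itself is up to you
cat > SOUL.md <<'EOF'
You are a calm, terse personal assistant.
You prefer doing over explaining. Confirm before destructive actions.
EOF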

What can a local agent actually do?

  • Read and write files on your machine
  • Run shell commands, scripts, git operations
  • Browse the web and fetch page content
  • Send Telegram messages, check your calendar
  • Manage long-term memory across sessions
  • Trigger workflows, run code, manipulate data

The difference from a chatbot is that it executes. You say “summarize my notes from this week and send me a Telegram message with the highlights” — and it does exactly that.
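
Under the hood this is function calling: the runtime sends the model a list of tool schemas, and the model replies with a structured call instead of prose. A sketch against Ollama’s OpenAI-compatible endpoint (list_files is a hypothetical tool invented for illustration; OpenClaw registers its own tool definitions):

# Ask the model to pick a tool; 'list_files' is a made-up schema
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5:32b",
    "messages": [{"role": "user", "content": "What is in my notes folder?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "list_files",
        "description": "List files in a directory",
        "parameters": {
          "type": "object",
          "properties": {"path": {"type": "string"}},
          "required": ["path"]
        }
      }
    }]
  }'
# The response carries tool_calls; the runtime executes them and loops back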

Hardware requirements

For a usable local agent:

  • Minimum: RTX 3090 (24 GB) + Llama 3.1 8B at Q4 — fast, capable, handles most tasks
  • Recommended: 48 GB VRAM (e.g., 2× RTX 3090) + Llama 3.1 70B at Q4_K_M — near frontier-level reasoning
  • Budget: 32+ GB RAM + Phi-4 or Llama 3.2 — CPU inference, slower but workable
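
Where do these numbers come from? A rough sizing rule: a Q4-quantized model needs about 0.5–0.6 bytes per parameter for weights, plus a few gigabytes for KV cache and runtime overhead. So 8B ≈ 4–5 GB, 32B ≈ 18–20 GB, and 70B ≈ 35–42 GB — which is why the 70B tier calls for 48 GB of VRAM.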

See the hardware picks for buy links.

Privacy trade-off

No cloud API means no remote data leakage. The trade-off is capability — a local 70B model is good, but not yet at the level of frontier cloud models.

That gap is closing fast. And for many agent tasks — personal productivity, file management, home automation, private research — a well-configured local 70B model is more than sufficient today.

The sovereign AI stack is real. Run it yourself.