The OpenClaw Stack: Running a Personal AI Agent Locally

How to build a self-hosted AI agent that actually does things — using OpenClaw, Ollama, and your own hardware.

Most AI assistants are read-only. They can answer questions, write text, generate images — but they can’t take actions. OpenClaw changes that.

OpenClaw is an agent runtime that connects a language model to real tools: file systems, shell commands, web browsing, messaging, calendars, and more. Run it locally with Ollama as the backend and you have a fully private, agentic AI that can actually do things — without sending a single byte to the cloud.

The stack

Ollama (inference)
  └── serves local model via OpenAI-compatible API
OpenClaw (agent runtime)
  └── routes prompts → model
  └── executes tools (files, shell, browser, messages)
  └── manages memory (daily notes, long-term MEMORY.md)
Telegram / Web UI (interface)
  └── how you talk to your agent

Why local inference for agents?

With a cloud API, every tool call goes through a third-party server. Your file paths, command outputs, calendar events, and message content are all visible to the API provider.

With local inference, the model runs on your hardware. Tool outputs never leave your machine. The agent can access sensitive context — medical notes, financial data, private messages — without privacy risk.
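
You can see this for yourself: Ollama binds to localhost by default, so every model call the agent makes is a plain HTTP request to your own machine.

# Ollama listens on 127.0.0.1:11434 by default; nothing leaves the box
curl http://localhost:11434/api/tags   # lists the models installed locally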

Getting started

1. Install Ollama and pull a capable model

curl -fsSL https://ollama.com/install.sh | sh
ollama pull llama3.1:70b   # best quality, needs 48 GB VRAM
ollama pull qwen2.5:32b    # excellent for agents, needs 20 GB VRAM
ollama pull phi4           # great reasoning, needs 10 GB VRAM

For agentic tasks, you want a model that’s strong at tool use and instruction following. Llama 3.1 and Qwen 2.5 are top choices.
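
A quick smoke test before wiring up OpenClaw: run a one-shot prompt directly through Ollama and confirm the model responds.

# Sanity check: one-shot prompt against a pulled model
ollama run qwen2.5:32b "List three shell commands for finding large files."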

2. Install OpenClaw

Visit openclaw.ai for the latest install instructions. OpenClaw connects to Ollama via its OpenAI-compatible API (http://localhost:11434/v1).
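
Before pointing OpenClaw at it, verify the endpoint is up. The config keys below are illustrative, not OpenClaw's exact schema; check its docs for the real names.

# Confirm Ollama's OpenAI-compatible API is reachable
curl http://localhost:11434/v1/models

# Then point OpenClaw at it — key names here are illustrative:
#   api_base: http://localhost:11434/v1
#   model:    qwen2.5:32b
#   api_key:  ollama   # Ollama ignores the key, but OpenAI-style clients require one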

3. Configure your agent

OpenClaw uses markdown files for memory and persona:

  • SOUL.md — who the agent is
  • USER.md — who you are, your preferences
  • MEMORY.md — long-term context
  • memory/YYYY-MM-DD.md — daily session logs

These files give the agent continuity across restarts — something most AI tools completely lack.
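
Seeding them is just writing markdown. A minimal persona file might look like this (contents illustrative; put it wherever OpenClaw expects its workspace files):

# Seed the agent's persona — the text itself is up to you
cat > SOUL.md <<'EOF'
You are a calm, terse personal assistant.
You prefer doing over explaining. Confirm before destructive actions.
EOF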

What can a local agent actually do?

  • Read and write files on your machine
  • Run shell commands, scripts, git operations
  • Browse the web and fetch page content
  • Send Telegram messages, check your calendar
  • Manage long-term memory across sessions
  • Trigger workflows, run code, manipulate data

The difference from a chatbot is that it executes. You say “summarize my notes from this week and send me a Telegram message with the highlights” — and it does exactly that.
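
Under the hood this is function calling: the runtime sends the model a list of tool schemas, and the model replies with a structured call instead of prose. A sketch against Ollama’s OpenAI-compatible endpoint (list_files is a hypothetical tool invented for illustration; OpenClaw registers its own tool definitions):

# Ask the model to pick a tool; 'list_files' is a made-up schema
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5:32b",
    "messages": [{"role": "user", "content": "What is in my notes folder?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "list_files",
        "description": "List files in a directory",
        "parameters": {
          "type": "object",
          "properties": {"path": {"type": "string"}},
          "required": ["path"]
        }
      }
    }]
  }'
# The response carries tool_calls; the runtime executes them and loops back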

Hardware requirements

For a usable local agent:

  • Minimum: RTX 3090 (24 GB) + Llama 3.1 8B at Q4 — fast, capable, handles most tasks
  • Recommended: 48 GB VRAM (e.g., 2× RTX 3090) + Llama 3.1 70B at Q4_K_M — near frontier-level reasoning
  • Budget: 32+ GB RAM + Phi-4 or Llama 3.2 — CPU inference, slower but workable
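
Where do these numbers come from? A rough sizing rule: a Q4-quantized model needs about 0.5–0.6 bytes per parameter for weights, plus a few gigabytes for KV cache and runtime overhead. So 8B ≈ 4–5 GB, 32B ≈ 18–20 GB, and 70B ≈ 35–42 GB — which is why the 70B tier calls for 48 GB of VRAM.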

See the hardware picks for buy links.

Privacy trade-off

No cloud API means no remote data leakage. The trade-off is capability — a local 70B model is good, but not yet at the level of frontier cloud models.

That gap is closing fast. And for many agent tasks — personal productivity, file management, home automation, private research — a well-configured local 70B model is more than sufficient today.

The sovereign AI stack is real. Run it yourself.