Your Local AI Can Finally Do Things
You've got Ollama running. You can ask it questions, get summaries, generate code. That's useful, but the model is working inside a closed box: it only knows what you type at it.
Model Context Protocol (MCP) changes that. It's an open standard that gives your local AI model a standardized way to reach outside its training data and actually interact with the world - your files, your web browser, your databases, your APIs.
The problem MCP solves
Before MCP, every AI tool that wanted to give a model access to, say, the filesystem had to build that integration from scratch in a custom way. A different tool wanted web access - built differently again. Nothing was compatible. Every model, every tool, every app was its own island.
MCP is the USB port for AI tools. One standard protocol, any model, any tool. Build a filesystem MCP server once and it works with Ollama, with Claude Desktop, with VS Code, with your own scripts. Build an Ollama integration once and it works with any MCP server.
How it works
MCP has two sides:
- MCP Servers - small programs that expose tools (read a file, fetch a URL, run a SQL query). You run these locally.
- MCP Clients - the host application that connects your AI model to those servers. The client routes tool calls between the model and the servers.
When you ask a question, the client tells the model what tools are available. The model decides which tools to call, the client executes them, and the results come back into the conversation. The model uses those results to form its final answer.
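To make the hand-off concrete, here is a minimal sketch of how a client might translate a tool description from an MCP server into the shape Ollama's chat API accepts. MCP servers describe each tool with a name, a description, and an inputSchema; Ollama expects the same JSON Schema under "parameters". The helper name to_ollama_tool and the sample read_file description are illustrative assumptions, not mcphost's actual code.

```python
def to_ollama_tool(mcp_tool: dict) -> dict:
    """Convert one MCP tool description into an Ollama /api/chat tool entry."""
    return {
        "type": "function",
        "function": {
            "name": mcp_tool["name"],
            "description": mcp_tool.get("description", ""),
            # MCP calls the JSON Schema "inputSchema"; Ollama calls it "parameters".
            "parameters": mcp_tool.get(
                "inputSchema", {"type": "object", "properties": {}}
            ),
        },
    }


# Hypothetical tool description, shaped like one entry in an MCP tool listing.
sample_mcp_tool = {
    "name": "read_file",
    "description": "Read the contents of a file",
    "inputSchema": {
        "type": "object",
        "properties": {"path": {"type": "string"}},
        "required": ["path"],
    },
}

ollama_tool = to_ollama_tool(sample_mcp_tool)
print(ollama_tool["function"]["name"])
```

The client does this once per tool at startup, then sends the converted list along with every chat request.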
Everything runs on your machine. The tools run locally. Ollama runs locally. No data leaves your network unless you configure a tool that specifically fetches external content.
What this actually enables
- "Summarize the PDF in my Documents folder" - model reads the actual file
- "What's on the OpenClaw Sanctuary homepage right now?" - model fetches the live page
- "How many orders came in this week?" - model queries your SQLite database
- "Create a summary of my notes from the last three days" - model reads your notes directory
- "Check if my server is responding" - model makes an HTTP request and reports back
This is the difference between an AI that answers questions and an AI that does work.
For a chat bot, a command like !summary report.pdf or !status stops being a lookup and starts being an actual agent action. That's what this tutorial builds toward.
What you'll need
- Ollama installed and running - see Ollama Basics if you haven't done this yet
- A tool-calling model pulled: ollama pull qwen3:8b is the recommended starting point
- Node.js 18 or newer (for the npm-based MCP servers)
- Python 3.10+ if you're doing the OpenClaw integration section
- About 45 minutes
Which models support tool calling?
Not every Ollama model handles tool calling well. The reliable ones in 2026:
- qwen3:8b - recommended. Strong tool calling, 8GB RAM, fast enough for interactive use
- qwen3:14b - better accuracy on complex multi-step tool use, needs 16GB RAM
- gemma4:12b - 86% tool calling accuracy, also adds vision. Needs 16GB RAM.
- phi4 - solid tool calling, good at structured output, 16GB RAM
- llama3.2:3b - works for simple tool calls, fast on 8GB machines
Reasoning models like deepseek-r1:7b are not optimized for tool calling - they think through problems in prose, not structured function calls. Use qwen3 variants for anything involving tools.
Getting the Pieces Ready
You need three things: a tool-calling model in Ollama, Node.js to run the MCP servers, and mcphost to connect them. This section gets all three in place.
Step 1 - Confirm your model supports tool calling
Pull qwen3:8b if you haven't already:
ollama pull qwen3:8b
Quick test - if this returns a JSON object with a tool_calls field, you're good:
curl -s http://localhost:11434/api/chat -d '{
"model": "qwen3:8b",
"messages": [{"role": "user", "content": "What is 2 + 2?"}],
"tools": [{"type": "function", "function": {"name": "calculator", "description": "Calculate a math expression", "parameters": {"type": "object", "properties": {"expression": {"type": "string"}}, "required": ["expression"]}}}],
"stream": false
}' | python3 -m json.tool | grep -A5 "tool_calls\|content"
You'll either see a tool_calls array (model decided to use the tool) or a
direct content response (model answered without using the tool). Both are correct
behavior.
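If you would rather inspect the reply in Python than through grep, the sketch below classifies a parsed /api/chat response as either a tool call or a direct answer. The function name describe_reply is an assumption, and the two sample replies are hand-written stand-ins shaped like Ollama's response, not captured output.

```python
import json


def describe_reply(reply: dict) -> str:
    """Summarize whether the model requested a tool or answered directly."""
    message = reply.get("message", {})
    calls = message.get("tool_calls") or []
    if calls:
        function = calls[0]["function"]
        arguments = function.get("arguments", {})
        # Arguments may arrive as a dict or as a JSON-encoded string.
        if isinstance(arguments, str):
            arguments = json.loads(arguments)
        return f"tool call: {function['name']}({json.dumps(arguments, sort_keys=True)})"
    return f"direct answer: {message.get('content', '')}"


# Hand-written sample replies (assumptions for illustration only).
tool_reply = {
    "message": {
        "role": "assistant",
        "content": "",
        "tool_calls": [
            {"function": {"name": "calculator", "arguments": {"expression": "2 + 2"}}}
        ],
    }
}
direct_reply = {"message": {"role": "assistant", "content": "4"}}

print(describe_reply(tool_reply))
print(describe_reply(direct_reply))
```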
Step 2 - Install Node.js
Check if you already have it:
node --version
You need version 18 or newer. If you get "command not found" or a version below 18:
# Debian/Ubuntu
curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash -
sudo apt-get install -y nodejs
# macOS (Homebrew)
brew install node
Confirm after install: node --version should show v20.x.x or newer.
Step 3 - Test an MCP server
The reference filesystem server ships as an npm package. Before wiring anything together, confirm it runs on your machine:
npx -y @modelcontextprotocol/server-filesystem /tmp
You should see something like Filesystem MCP Server running on stdio. Hit
Ctrl+C - that's all you need to confirm it works. The first run downloads the package;
subsequent runs use the cache.
Step 4 - Install mcphost
mcphost is the client that connects Ollama to your MCP servers. It's a single binary - download it, make it executable, done.
Go to the mcphost releases page and download the binary for your platform:
# Example for Linux amd64 - check the releases page for the exact asset name for your platform
curl -L https://github.com/mark3labs/mcphost/releases/latest/download/mcphost_linux_amd64 -o mcphost
chmod +x mcphost
sudo mv mcphost /usr/local/bin/
# Alternative if you have Go installed
go install github.com/mark3labs/mcphost@latest
Verify it installed:
mcphost --help
Alternatively, the mcp Python SDK lets you write your own client in a few dozen lines. The OpenClaw integration section later in this guide takes a programmatic Python approach. Install both if you want the interactive mcphost experience and the programmatic Python integration.
Give Your Model Access to Your Files
The filesystem MCP server is the right starting point. It's the most immediately useful tool, it's fully local, and once it's working you'll understand the pattern for everything else.
Create your MCP config file
mcphost reads a config file that tells it which MCP servers to connect to. Create
mcp.json anywhere on your machine - your home directory is fine:
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/server-filesystem",
        "/home/youruser/Documents"
      ]
    }
  }
}
Replace /home/youruser/Documents with an actual path on your machine. The
filesystem server only has access to paths you explicitly list here - anything outside is
off limits. On macOS use /Users/yourname/Documents.
Don't point the filesystem server at / or /home. Give it access to one folder you control - Documents, a project directory, a notes folder. You can always add more paths later by adding them to the args array.
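A quick sanity check on the config can catch an overly broad or mistyped path before you start the client. The sketch below is one way to do that, assuming the config layout shown above. The function name check_mcp_config and the list of "too broad" paths (with /Users added as the macOS equivalent of /home) are assumptions for illustration.

```python
import os

FILESYSTEM_PACKAGE = "@modelcontextprotocol/server-filesystem"
# Paths treated as too broad; adjust to taste (assumption, not part of MCP).
TOO_BROAD = {"/", "/home", "/Users"}


def check_mcp_config(config: dict) -> list[str]:
    """Return warnings about the directories granted to filesystem servers."""
    warnings = []
    for name, server in config.get("mcpServers", {}).items():
        args = server.get("args", [])
        if FILESYSTEM_PACKAGE not in args:
            continue  # not a filesystem server entry
        # Everything after the package name is an allowed directory.
        paths = args[args.index(FILESYSTEM_PACKAGE) + 1:]
        if not paths:
            warnings.append(f"{name}: no allowed directories listed")
        for path in paths:
            if os.path.normpath(path) in TOO_BROAD:
                warnings.append(f"{name}: {path} is too broad")
            elif not os.path.isdir(path):
                warnings.append(f"{name}: {path} does not exist")
    return warnings


example = {
    "mcpServers": {
        "filesystem": {"command": "npx", "args": ["-y", FILESYSTEM_PACKAGE, "/"]}
    }
}
for warning in check_mcp_config(example):
    print(warning)
```

In practice you would load the dictionary with json.load from your mcp.json and run the check before launching mcphost.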
Start mcphost
mcphost --model ollama/qwen3:8b --config ~/mcp.json
mcphost connects to your running Ollama instance, launches the filesystem server, and gives you an interactive prompt. You should see something like:
Connected to MCP server: filesystem
Tools available: read_file, write_file, list_directory, create_directory, move_file, search_files, get_file_info
Model: ollama/qwen3:8b
>
Try it
Type these at the prompt and watch the model use actual tools to respond:
> What files are in my Documents folder?
The model calls list_directory on the path you configured. The result comes
back as actual directory contents, and the model formats a response using that real data.
> Read the file notes.txt and summarize what it's about
The model calls read_file, gets the actual file contents, and summarizes them.
If the file doesn't exist it tells you that too - it's not making anything up.
> Search for any files containing the word "budget" in my Documents
search_files runs a search across the allowed directory and returns matching files.
What just happened under the hood
When you type a question, mcphost sends the conversation to Ollama along with a list of available tools (their names, descriptions, and parameter schemas). Ollama decides whether to answer directly or call a tool.
If it calls a tool, mcphost executes the call against the MCP server, gets the result, and feeds it back to Ollama as a tool response in the conversation. Ollama then formulates its final answer using that result. This loop can run multiple times in one response if the model needs to call multiple tools.
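The loop described above can be sketched in a few lines of Python. This is a simplified illustration, not mcphost's implementation: run_tool_loop, fake_chat, and the handlers dictionary are hypothetical names, and fake_chat is a scripted stand-in for a POST to Ollama's /api/chat so the example runs offline.

```python
import json
from typing import Callable


def run_tool_loop(chat: Callable[[list], dict], handlers: dict,
                  messages: list, max_rounds: int = 5) -> str:
    """Call the model, run any tools it requests, and repeat until it answers in text."""
    for _ in range(max_rounds):
        message = chat(messages)
        messages.append(message)
        calls = message.get("tool_calls") or []
        if not calls:
            return message.get("content", "")  # final answer, no tools requested
        for call in calls:
            name = call["function"]["name"]
            arguments = call["function"].get("arguments", {})
            if isinstance(arguments, str):
                arguments = json.loads(arguments)
            handler = handlers.get(name)
            result = handler(**arguments) if handler else f"unknown tool: {name}"
            # Feed the tool output back into the conversation for the next round.
            messages.append({"role": "tool", "content": str(result)})
    return "stopped: too many tool rounds"


def fake_chat(messages: list) -> dict:
    """Scripted model: request list_directory once, then answer from its result."""
    if messages[-1]["role"] == "tool":
        return {"role": "assistant",
                "content": f"Your folder contains: {messages[-1]['content']}"}
    return {"role": "assistant", "content": "", "tool_calls": [
        {"function": {"name": "list_directory", "arguments": {"path": "/docs"}}}]}


handlers = {"list_directory": lambda path: "notes.txt, budget.csv"}
answer = run_tool_loop(
    fake_chat, handlers,
    [{"role": "user", "content": "What files are in my Documents folder?"}],
)
print(answer)
```

In a real client, the chat callable would send the messages and tool list to Ollama, and each handler would forward the call to the matching MCP server. The max_rounds cap is a design choice here to stop a model that keeps requesting tools.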
Your files never leave your machine. The model running in Ollama processes everything locally. The MCP server runs locally. Nothing touches the internet unless you add a server that explicitly fetches external content (covered in the next section).
Web, Databases, and Beyond
Filesystem access is useful. Pair it with web fetching and database access and your local AI starts handling work that used to require custom integrations for every task.
The fetch server - live web content
The fetch MCP server lets your model retrieve any URL and read its content. The server strips HTML and returns clean text, so the model isn't wading through markup.
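Add it to your mcp.json. Unlike the filesystem server, the reference fetch server is published as a Python package (mcp-server-fetch) rather than on npm, so it is launched with uvx from the uv toolchain; install uv first if you don't have it:
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/home/youruser/Documents"]
    },
    "fetch": {
      "command": "uvx",
      "args": ["mcp-server-fetch"]
    }
  }
}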
Restart mcphost and try:
> What's the current Ollama version? Check the ollama.com releases page.
> Summarize the top three items from Hacker News right now.
> Fetch openclawsanctuary.com and tell me what this site is about.
The model fetches the live page, reads the actual content, and reports back. This is current data - whatever is on the page right now, not training data from months ago.
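To see why cleaning the page before handing it to the model matters, here is a small standard-library sketch that reduces HTML to readable text. It illustrates the general idea only and is not how the fetch server is implemented; TextExtractor and html_to_text are hypothetical names.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping script, style, and noscript blocks."""

    SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside a skipped block.
        if not self._skip_depth and data.strip():
            self._chunks.append(data.strip())

    def text(self) -> str:
        return " ".join(self._chunks)


def html_to_text(html: str) -> str:
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


page = ("<html><head><style>p{color:red}</style><script>var x=1;</script></head>"
        "<body><h1>Hello</h1><p>Local <b>AI</b> tools.</p></body></html>")
print(html_to_text(page))
```

Stripping markup this way means more of the model's context window goes to actual content instead of tags and scripts.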
The SQLite server - query your data
If you have any SQLite databases on your machine - exported from another app, a local project database, anything - the SQLite server lets your model query them in plain English. The reference SQLite server, like the fetch server, is a Python package (mcp-server-sqlite) launched with uvx:
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/home/youruser/Documents"]
    },
    "fetch": {
      "command": "uvx",
      "args": ["mcp-server-fetch"]
    },
    "database": {
      "command": "uvx",
      "args": ["mcp-server-sqlite", "--db-path", "/home/youruser/mydata.db"]
    }
  }
}
> What tables are in this database?
> How many records were added this week?
> Show me the five most recent entries with their details.
The model reads the schema, writes the appropriate SQL, runs it, and gives you a readable answer. You don't write SQL - you ask in plain English and the model figures out the query.
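For a sense of what happens behind those prompts, the sketch below runs the kind of SQL a model might generate, using Python's built-in sqlite3 module against an in-memory database. The orders table, its columns, and the sample rows are hypothetical; your own schema will differ.

```python
import sqlite3


def list_tables(conn: sqlite3.Connection) -> list[str]:
    """Answer 'What tables are in this database?' from SQLite's catalog."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]


# Build a throwaway database with a hypothetical orders table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT, created_at TEXT)")
conn.executemany(
    "INSERT INTO orders (item, created_at) VALUES (?, date('now', ?))",
    [("keyboard", "-1 days"), ("monitor", "-3 days"), ("cable", "-30 days")],
)

tables = list_tables(conn)

# 'How many orders came in this week?' becomes a date-filtered COUNT.
recent = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE created_at >= date('now', '-7 days')"
).fetchone()[0]

print(tables, recent)
```

The model's job is to pick the right table and filter from the schema; the server's job is simply to run the statement and return the rows.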
More servers worth knowing about
The official MCP server list at github.com/modelcontextprotocol/servers has everything Anthropic maintains. A few practical ones:
@modelcontextprotocol/server-github - read issues, PRs, file contents from
your repos. Useful for "what open issues do I have this week" or summarizing a PR without
leaving your terminal.
@modelcontextprotocol/server-memory - gives the model a persistent knowledge
graph it can write to and read from across sessions. Ask it to "remember that the project
deadline is July 15" and it'll know next time.
@modelcontextprotocol/server-postgres - same idea as SQLite but for a live
Postgres database. Read-only by default. Good for data analysis on your own database.
@modelcontextprotocol/server-puppeteer - headless browser control. The model
can navigate to URLs, click elements, fill forms. More advanced, but powerful for
automation tasks.
Keep your config organized
As you add more servers, your mcp.json grows. A few tips that save headaches:
- Keep a separate mcp.json per project or context - one for general use, one for a specific project with access to its database and files
- Be specific with filesystem paths - the server only has access to what you list
- Run mcphost --config ~/mcp-project.json to use a project-specific config
- Name your servers descriptively in the config - "notes-dir" is clearer than "filesystem2"
Give Your Bot Real-World Capabilities
mcphost gives you an interactive session for personal use. Your OpenClaw bot gives other people (or your whole Discord server) access to those same capabilities through chat commands. This section wires them together.
The architecture
When someone in Discord types a command like !fetch https://example.com, OpenClaw
receives it, passes it to a skill, and the skill handles the rest. The skill calls Ollama's API
directly using tool calling - the same mechanism MCP uses under the hood.
You don't need mcphost running for this. OpenClaw skills use the Ollama API directly and handle the tool execution themselves. This approach is simpler to deploy and keeps everything in one Python process that you already control.
Update your model to a tool-calling capable one
In your OpenClaw config, make sure you're using a model that handles tool calling well. If
you're currently on llama3.2 or a reasoning model, switch to qwen3:
# Use a model with reliable tool calling
model: "ollama/qwen3:8b"
# On 16GB+ RAM for better multi-step tool use:
# model: "ollama/qwen3:14b"
See Ollama Advanced for the full OpenClaw config reference.
A practical skill: !fetch
This skill lets users in Discord type !fetch https://some-url.com and get back
a summary of that page - fetched live by your local Ollama model. Nothing goes to an external
AI service.
Create this file at ~/.openclaw/workspace/skills/fetch_skill.py:
import json

import requests

SKILL = {
    "trigger": "!fetch",
    "description": "Fetch a URL and summarize its content using local AI"
}


def run(args, context):
    url = args.strip()
    if not url.startswith(("http://", "https://")):
        return "Usage: !fetch <url> (must start with http or https)"

    # Fetch the page content
    try:
        resp = requests.get(url, timeout=15, headers={"User-Agent": "Mozilla/5.0"})
        resp.raise_for_status()
        # Keep the first 8000 chars - enough for most pages, fits comfortably in context
        content = resp.text[:8000]
    except requests.RequestException as e:
        return f"Could not fetch {url}: {e}"

    # Define a structured-output tool the model can use to return its summary
    tools = [
        {
            "type": "function",
            "function": {
                "name": "summarize_content",
                "description": "Summarize the provided web page content",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "summary": {
                            "type": "string",
                            "description": "A clear 3-5 sentence summary of the content"
                        },
                        "key_points": {
                            "type": "array",
                            "items": {"type": "string"},
                            "description": "Up to 3 key takeaways from the page"
                        }
                    },
                    "required": ["summary"]
                }
            }
        }
    ]

    # Ask Ollama to summarize
    try:
        response = requests.post(
            "http://localhost:11434/api/chat",
            json={
                "model": "qwen3:8b",
                "messages": [
                    {
                        "role": "user",
                        "content": f"Summarize this web page content:\n\nURL: {url}\n\nContent:\n{content}"
                    }
                ],
                "tools": tools,
                "stream": False
            },
            timeout=60
        )
        response.raise_for_status()
        result = response.json()
        msg = result.get("message", {})

        # If the model used the structured tool, extract the summary from its arguments
        if msg.get("tool_calls"):
            args_raw = msg["tool_calls"][0]["function"].get("arguments", {})
            if isinstance(args_raw, str):
                args_raw = json.loads(args_raw)
            summary = args_raw.get("summary", "")
            points = args_raw.get("key_points", [])
            output = f"**{url}**\n\n{summary}"
            if points:
                output += "\n\n**Key points:**\n" + "\n".join(f"- {p}" for p in points)
            return output

        # Otherwise use the direct content response
        return f"**{url}**\n\n{msg.get('content', 'No response generated.')}"
    except Exception as e:
        return f"Error calling Ollama: {e}"
A practical skill: !readfile
This lets users trigger a file read from your server. You control which directory is allowed. Useful for reading logs, status files, or shared notes from Discord.
import os

import requests

SKILL = {
    "trigger": "!readfile",
    "description": "Read a file from the allowed directory and summarize it"
}

# Only files inside this directory can be read - change to your path
ALLOWED_DIR = os.path.expanduser("~/Documents/bot-readable")


def run(args, context):
    filename = args.strip().lstrip("/")
    if not filename:
        return "Usage: !readfile <filename> (reads from the bot-readable directory)"

    # Security: resolve the full path and confirm it stays inside ALLOWED_DIR.
    # commonpath compares whole path components, so a sibling directory such as
    # "bot-readable-old" is rejected (a plain startswith check would accept it).
    base_dir = os.path.realpath(ALLOWED_DIR)
    full_path = os.path.realpath(os.path.join(base_dir, filename))
    if os.path.commonpath([base_dir, full_path]) != base_dir:
        return "Access denied: path is outside the allowed directory."
    if not os.path.isfile(full_path):
        return f"File not found: {filename}"

    try:
        with open(full_path, "r", encoding="utf-8") as f:
            content = f.read(10000)  # cap at 10,000 characters
    except Exception as e:
        return f"Could not read file: {e}"

    # Summarize with Ollama
    try:
        response = requests.post(
            "http://localhost:11434/api/chat",
            json={
                "model": "qwen3:8b",
                "messages": [
                    {
                        "role": "user",
                        "content": f"Summarize this file in 3-5 sentences:\n\nFilename: {filename}\n\n{content}"
                    }
                ],
                "stream": False
            },
            timeout=60
        )
        response.raise_for_status()
        return response.json()["message"]["content"]
    except Exception as e:
        return f"Error calling Ollama: {e}"
Create the directory before using this: mkdir -p ~/Documents/bot-readable.
Drop any files you want the bot to be able to read into that folder. Nothing outside it is
accessible.
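The containment check is the part of this skill most worth testing in isolation. The sketch below pulls it into a standalone helper and runs it on a few inputs; is_inside and the /srv/bot-readable path are hypothetical names, and the directory does not need to exist for the check to work. It compares whole path components with os.path.commonpath, because a plain string-prefix comparison would also accept a sibling directory whose name merely starts with the allowed one.

```python
import os


def is_inside(base_dir: str, user_path: str) -> bool:
    """Return True only if user_path resolves to a location under base_dir."""
    base = os.path.realpath(base_dir)
    # An absolute user_path replaces base in the join, and is then rejected below.
    target = os.path.realpath(os.path.join(base, user_path))
    return os.path.commonpath([base, target]) == base


BASE = "/srv/bot-readable"  # hypothetical allowed directory
for candidate in ["notes.txt", "logs/today.txt",
                  "../bot-readable-old/secret.txt", "/etc/passwd"]:
    print(candidate, is_inside(BASE, candidate))
```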
These skills give your Discord bot the ability to read files and fetch URLs from your server. A few things to have in place:
- Lock down which channels or users can trigger these commands in your OpenClaw config
- Use path validation (the !readfile skill above does this) - never build a file path from user input without checking it
- For !fetch, consider a URL allowlist if you don't want users fetching arbitrary URLs
- Run your OpenClaw instance on a user account with limited filesystem permissions
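A URL allowlist for !fetch can be a short function that runs before any request is made. The sketch below is one possible shape; url_allowed and the hosts in ALLOWED_HOSTS are placeholders you would replace with your own. It matches on the parsed hostname rather than on the raw string, so lookalike addresses such as example.com.evil.net are rejected.

```python
from urllib.parse import urlparse

# Hypothetical allowlist - replace with the sites your bot may fetch.
ALLOWED_HOSTS = {"example.com", "ollama.com"}


def url_allowed(url: str, allowed_hosts=ALLOWED_HOSTS) -> bool:
    """Allow only http(s) URLs whose host is an allowed domain or a subdomain of one."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    host = (parsed.hostname or "").lower()
    return any(
        host == allowed or host.endswith("." + allowed)
        for allowed in allowed_hosts
    )


for candidate in ["https://example.com/page",
                  "https://example.com.evil.net/",
                  "http://localhost:11434/api/tags"]:
    print(candidate, url_allowed(candidate))
```

Rejecting localhost and other unlisted hosts also keeps chat users from using the bot to probe services on your own network.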
Connecting to a running mcphost instance
If you want your OpenClaw bot to use the full MCP server ecosystem (all the servers you
configured in mcp.json), the cleanest approach is to run mcphost as a background
service and have your skills call it via subprocess when needed.
For most use cases, the direct Ollama tool calling shown above is simpler and more reliable. But if you find yourself duplicating tool logic across multiple skills, that's the point where running a dedicated MCP server (or writing a custom MCP server) starts paying off. Anthropic's MCP Python SDK is the right tool for that - it lets you write a full MCP server or client in Python.
Where to Go From Here
You've got a local AI model with real tools, and optionally a Discord/Telegram bot that can use them. Here's what to build on next.
What you've accomplished
- ✓ Confirmed Ollama tool calling works with qwen3:8b
- ✓ Installed Node.js and the MCP server packages
- ✓ Connected mcphost to Ollama with filesystem access
- ✓ Added web fetch and SQLite database access
- ✓ Built OpenClaw skills that use Ollama tool calling from Discord
Run it 24/7 on dedicated hardware
An always-on MCP setup - where your model has persistent access to your files, databases, and web - needs hardware that stays on. Your laptop is fine for personal use, but if you want your OpenClaw bot running continuously with MCP capabilities, a dedicated mini PC is the practical answer.
A 32GB Ryzen 7 mini PC runs qwen3:8b comfortably with headroom for other services. It draws less power than a desktop, sits quietly on a shelf, and handles Ollama + OpenClaw + mcphost simultaneously without breaking a sweat.
Build your own MCP server
The official MCP servers cover the common cases. When you need something specific to your setup - a server that talks to your home automation API, queries a specific internal service, or reads a proprietary data format - you build your own.
Anthropic's MCP Python SDK makes this straightforward. A minimal MCP server is about 30 lines of Python. Define your tools with their parameter schemas, implement the handlers, and any MCP client (including mcphost) can connect to it.
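As a rough idea of the size involved, here is a sketch of a tiny server built with the SDK's FastMCP helper. It assumes the mcp package is installed; the tool count_log_levels, the server name log-tools, and the --serve flag are all invented for this example. The SDK import sits inside main() so the tool function can be tried on its own without the SDK present.

```python
import sys
from collections import Counter


def count_log_levels(text: str) -> dict[str, int]:
    """Count how many lines mention each common log level."""
    counts = Counter()
    for line in text.splitlines():
        for level in ("ERROR", "WARNING", "INFO"):
            if level in line:
                counts[level] += 1
                break  # count each line once, under the first level matched
    return dict(counts)


def main() -> None:
    # Requires the mcp package (the MCP Python SDK).
    from mcp.server.fastmcp import FastMCP

    server = FastMCP("log-tools")
    server.tool()(count_log_levels)  # register the function as an MCP tool
    server.run()                     # serves over stdio by default


# The --serve flag is a convention made up for this sketch.
if __name__ == "__main__" and "--serve" in sys.argv:
    main()
```

To use it from mcphost, you would add an entry to mcp.json whose command is python3 and whose args are the path to this file followed by --serve.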
Explore next
- → Ollama Advanced - tune your model for better tool calling accuracy (num_ctx, temperature settings matter here)
- → Run Gemma 4 Locally - if you want vision input alongside tool calling, gemma4:12b does both
- → Local Coding Assistant - Continue.dev also supports MCP now, so your VS Code AI can use the same servers
- → MCP Server Registry - the full list of official servers
- → MCP Specification - the full protocol spec if you're building your own server
- → mcphost on GitHub - release notes and configuration reference
Most people running local AI are still using it as a glorified search box. You've wired your model into your actual environment - it can read your files, browse the web, and talk to your databases. And if you did the OpenClaw section, your Discord server has access to all of that. That's a local AI agent setup that's actually doing work.