Ollama + MCP

Connect your local Ollama model to real tools via MCP - read files, browse the web, query databases. Your AI stops answering and starts doing. Extend your OpenClaw bot with the same capabilities.

45 min | Intermediate | Free | Updated June 2026

Your Local AI Can Finally Do Things

You've got Ollama running. You can ask it questions, get summaries, generate code. That's useful - but it's also working from a closed box. The model only knows what you type at it.

Model Context Protocol (MCP) changes that. It's an open standard that gives your local AI model a standardized way to reach outside its training data and actually interact with the world - your files, your web browser, your databases, your APIs.

The problem MCP solves

Before MCP, every AI tool that wanted to give a model access to, say, the filesystem had to build that integration from scratch in a custom way. A different tool wanted web access - built differently again. Nothing was compatible. Every model, every tool, every app was its own island.

MCP is the USB port for AI tools. One standard protocol, any model, any tool. Build a filesystem MCP server once and it works with Ollama, with Claude Desktop, with VS Code, with your own scripts. Build an Ollama integration once and it works with any MCP server.

How it works

MCP has two sides:

  • MCP Servers - small programs that expose tools (read a file, fetch a URL, run a SQL query). You run these locally.
  • MCP Clients - the host application that connects your AI model to those servers. The client routes tool calls between the model and the servers.

When you ask a question, the client tells the model what tools are available. The model decides which tools to call, the client executes them, and the results come back into the conversation. The model uses those results to form its final answer.

Everything runs on your machine. The tools run locally. Ollama runs locally. No data leaves your network unless you configure a tool that specifically fetches external content.

What this actually enables

  • "Summarize the PDF in my Documents folder" - model reads the actual file
  • "What's on the OpenClaw Sanctuary homepage right now?" - model fetches the live page
  • "How many orders came in this week?" - model queries your SQLite database
  • "Create a summary of my notes from the last three days" - model reads your notes directory
  • "Check if my server is responding" - model makes an HTTP request and reports back

This is the difference between an AI that answers questions and an AI that does work.

Where OpenClaw comes in: Your OpenClaw bot on Discord or Telegram currently works from training data and what users type in chat. Once Ollama is wired to MCP tools, your bot can read files from your server, fetch live data from the web, query a local database - all triggered by a message in Discord. A command like !summary report.pdf or !status stops being a lookup and starts being an actual agent action. That's what this tutorial builds toward.

What you'll need

Prerequisites
  • Ollama installed and running - see Ollama Basics if you haven't done this yet
  • A tool-calling model pulled: ollama pull qwen3:8b is the recommended starting point
  • Node.js 18 or newer (for the npm-based MCP servers, such as the filesystem server)
  • Python 3.10+ if you're doing the OpenClaw integration section
  • About 45 minutes

Which models support tool calling?

Not every Ollama model handles tool calling well. The reliable ones in 2026:

  • qwen3:8b - recommended. Strong tool calling, 8GB RAM, fast enough for interactive use
  • qwen3:14b - better accuracy on complex multi-step tool use, needs 16GB RAM
  • gemma4:12b - 86% tool calling accuracy, also adds vision. Needs 16GB RAM.
  • phi4 - solid tool calling, good at structured output, 16GB RAM
  • llama3.2:3b - works for simple tool calls, fast on 8GB machines

Reasoning models like deepseek-r1:7b are not optimized for tool calling - they think through problems in prose, not structured function calls. Use qwen3 variants for anything involving tools.

Getting the Pieces Ready

You need three things: a tool-calling model in Ollama, Node.js to run the MCP servers, and mcphost to connect them. This section gets all three in place.

Step 1 - Confirm your model supports tool calling

Pull qwen3:8b if you haven't already:

Terminal
ollama pull qwen3:8b

Quick test - send a chat request that offers the model a calculator tool, then inspect the response:

Terminal
curl -s http://localhost:11434/api/chat -d '{
  "model": "qwen3:8b",
  "messages": [{"role": "user", "content": "What is 2 + 2?"}],
  "tools": [{"type": "function", "function": {"name": "calculator", "description": "Calculate a math expression", "parameters": {"type": "object", "properties": {"expression": {"type": "string"}}, "required": ["expression"]}}}],
  "stream": false
}' | python3 -m json.tool | grep -A5 "tool_calls\|content"

You'll either see a tool_calls array (model decided to use the tool) or a direct content response (model answered without using the tool). Both are correct behavior.
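
If you'd rather check this from Python, the two outcomes can be told apart with a small helper. This is a minimal sketch: describe_reply is a hypothetical name, the sample payloads are written by hand, and the code assumes the response shape described above (a message object holding either tool_calls or content).

```python
def describe_reply(response: dict) -> str:
    """Classify a non-streaming /api/chat response (assumed shape: {"message": {...}})."""
    message = response.get("message", {})
    calls = message.get("tool_calls") or []
    if calls:
        names = [c.get("function", {}).get("name", "?") for c in calls]
        return "tool call: " + ", ".join(names)
    if message.get("content"):
        return "direct answer"
    return "empty reply"


# Hand-written sample payloads that mirror the two outcomes
tool_reply = {
    "message": {
        "role": "assistant",
        "content": "",
        "tool_calls": [
            {"function": {"name": "calculator", "arguments": {"expression": "2 + 2"}}}
        ],
    }
}
text_reply = {"message": {"role": "assistant", "content": "2 + 2 = 4"}}

print(describe_reply(tool_reply))  # tool call: calculator
print(describe_reply(text_reply))  # direct answer
```

Feed it the parsed JSON from the curl command above; anything other than "empty reply" means the model handled the request.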

Step 2 - Install Node.js

Check if you already have it:

Terminal
node --version

You need version 18 or newer. If you get "command not found" or a version below 18:

Terminal (Linux / Debian-based)
curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash -
sudo apt-get install -y nodejs

Terminal (macOS)
brew install node

Confirm after install: node --version should show v20.x.x or newer.

Step 3 - Test an MCP server

Many MCP servers, including the filesystem server used here, ship as npm packages. Before wiring anything together, confirm the filesystem server runs on your machine:

Terminal
npx -y @modelcontextprotocol/server-filesystem /tmp

You should see something like Filesystem MCP Server running on stdio. Hit Ctrl+C - that's all you need to confirm it works. The first run downloads the package; subsequent runs use the cache.

Step 4 - Install mcphost

mcphost is the client that connects Ollama to your MCP servers. It's a single binary - download it, make it executable, done.

Go to the mcphost releases page and download the binary for your platform:

Terminal (Linux x86_64)
# Downloads the newest release. Asset names can change between releases, so confirm the exact file name for your platform on the releases page.
curl -L https://github.com/mark3labs/mcphost/releases/latest/download/mcphost_linux_amd64 -o mcphost
chmod +x mcphost
sudo mv mcphost /usr/local/bin/

Terminal (macOS - if you have Go installed)
go install github.com/mark3labs/mcphost@latest

Verify it installed:

Terminal
mcphost --help

Prefer a pure Python approach? If you'd rather skip the Go binary, Anthropic's mcp Python SDK lets you write your own client in a few dozen lines. The OpenClaw integration section later in this guide takes a related programmatic route: Python skills that call the Ollama API directly. Install both if you want the interactive mcphost experience and the programmatic Python integration.

Give Your Model Access to Your Files

The filesystem MCP server is the right starting point. It's the most immediately useful tool, it's fully local, and once it's working you'll understand the pattern for everything else.

Create your MCP config file

mcphost reads a config file that tells it which MCP servers to connect to. Create mcp.json anywhere on your machine - your home directory is fine:

~/mcp.json
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/server-filesystem",
        "/home/youruser/Documents"
      ]
    }
  }
}

Replace /home/youruser/Documents with an actual path on your machine. The filesystem server only has access to paths you explicitly list here - anything outside is off limits. On macOS use /Users/yourname/Documents.

Start with a specific folder, not your root. Don't point it at / or /home. Give it access to one folder you control - Documents, a project directory, a notes folder. You can always add more paths later by adding them to the args array.
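
If you'd like to generate the config instead of hand-editing it, a short script can enforce the "specific folder only" rule for you. This is a sketch under assumptions: build_config and the TOO_BROAD set are hypothetical names, and the example directories are placeholders.

```python
import json
from pathlib import Path

# Paths treated as too broad to expose (an assumption for this sketch; extend as needed)
TOO_BROAD = {Path(p).resolve() for p in ("/", "/home", "/Users")}


def build_config(allowed_dirs: list[str]) -> dict:
    """Return an mcp.json-style dict that exposes only the given directories."""
    resolved = []
    for raw in allowed_dirs:
        path = Path(raw).expanduser().resolve()
        if path in TOO_BROAD:
            raise ValueError(f"refusing to expose broad path: {path}")
        resolved.append(str(path))
    return {
        "mcpServers": {
            "filesystem": {
                "command": "npx",
                "args": ["-y", "@modelcontextprotocol/server-filesystem", *resolved],
            }
        }
    }


# Placeholder directories; swap in your own
config = build_config(["/tmp/notes", "/tmp/projects/demo"])
print(json.dumps(config, indent=2))
```

Redirect the output to ~/mcp.json, or write it with Path.write_text, once the printed paths look right.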

Start mcphost

Terminal
mcphost --model ollama:qwen3:8b --config ~/mcp.json

mcphost connects to your running Ollama instance, launches the filesystem server, and gives you an interactive prompt. It names models as provider:model, hence ollama:qwen3:8b. You should see something like:

Output
Connected to MCP server: filesystem
Tools available: read_file, write_file, list_directory, create_directory, move_file, search_files, get_file_info
Model: ollama:qwen3:8b

> 

Try it

Type these at the prompt and watch the model use actual tools to respond:

mcphost prompt
> What files are in my Documents folder?

The model calls list_directory on the path you configured. The result comes back as actual directory contents, and the model formats a response using that real data.

mcphost prompt
> Read the file notes.txt and summarize what it's about

The model calls read_file, gets the actual file contents, and summarizes them. If the file doesn't exist it tells you that too - it's not making anything up.

mcphost prompt
> Search for any files containing the word "budget" in my Documents

search_files runs a search across the allowed directory and returns matching files.

What just happened under the hood

When you type a question, mcphost sends the conversation to Ollama along with a list of available tools (their names, descriptions, and parameter schemas). Ollama decides whether to answer directly or call a tool.

If it calls a tool, mcphost executes the call against the MCP server, gets the result, and feeds it back to Ollama as a tool response in the conversation. Ollama then formulates its final answer using that result. This loop can run multiple times in one response if the model needs to call multiple tools.
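
The loop described above can be sketched in a few lines of Python. This is an illustration, not mcphost's actual code: run_tool_loop, fake_chat, and the handlers dict are hypothetical names, and the model is replaced by a stand-in function so the sketch runs without Ollama. In a real client, chat would POST to http://localhost:11434/api/chat and the handlers would forward calls to MCP servers.

```python
import json
from typing import Callable


def run_tool_loop(chat: Callable[[list, list], dict],
                  tools: list,
                  handlers: dict,
                  messages: list,
                  max_rounds: int = 5) -> str:
    """Keep calling the model until it answers without requesting a tool."""
    for _ in range(max_rounds):
        message = chat(messages, tools)["message"]
        messages.append(message)
        calls = message.get("tool_calls") or []
        if not calls:
            return message.get("content", "")
        for call in calls:
            fn = call["function"]
            args = fn.get("arguments", {})
            if isinstance(args, str):  # arguments may arrive as JSON text
                args = json.loads(args)
            handler = handlers.get(fn["name"])
            result = handler(**args) if handler else f"unknown tool: {fn['name']}"
            # Feed the result back so the model can use it on the next round
            messages.append({"role": "tool", "content": str(result)})
    return "Stopped: too many tool rounds."


# Stand-in for the model: it requests one tool call, then answers
# using the tool result it was handed.
def fake_chat(messages: list, tools: list) -> dict:
    if messages[-1]["role"] == "tool":
        return {"message": {"role": "assistant",
                            "content": f"Found: {messages[-1]['content']}"}}
    return {"message": {"role": "assistant", "content": "",
                        "tool_calls": [{"function": {"name": "list_directory",
                                                     "arguments": {"path": "/docs"}}}]}}


handlers = {"list_directory": lambda path: "notes.txt, budget.csv"}
answer = run_tool_loop(fake_chat, [], handlers,
                       [{"role": "user", "content": "What files are in /docs?"}])
print(answer)  # Found: notes.txt, budget.csv
```

The max_rounds cap is a design choice for the sketch: it stops a model that keeps requesting tools from looping forever.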

Your files never leave your machine. The model running in Ollama processes everything locally. The MCP server runs locally. Nothing touches the internet unless you add a server that explicitly fetches external content (covered in the next section).

Your local model just read a real file on your machine. No cloud. No API key. No subscription. That's the core of what MCP gives you - the model stops being a static knowledge base and starts being an active participant in your actual environment.

Web, Databases, and Beyond

Filesystem access is useful. Pair it with web fetching and database access and your local AI starts handling work that used to require custom integrations for every task.

The fetch server - live web content

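The fetch MCP server lets your model retrieve any URL and read its content. The server converts HTML to markdown before returning it, so the model isn't wading through raw markup.

Unlike the filesystem server, the reference fetch server is a Python package (mcp-server-fetch) rather than an npm package. The config below launches it with uvx, which assumes you have uv installed; pip install mcp-server-fetch followed by python -m mcp_server_fetch is the documented alternative.

Add it to your mcp.json:

~/mcp.json
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/home/youruser/Documents"]
    },
    "fetch": {
      "command": "uvx",
      "args": ["mcp-server-fetch"]
    }
  }
}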

Restart mcphost and try:

mcphost prompt
> What's the current Ollama version? Check the ollama.com releases page.
> Summarize the top three items from Hacker News right now.
> Fetch openclawsanctuary.com and tell me what this site is about.

The model fetches the live page, reads the actual content, and reports back. This is current data - whatever is on the page right now, not training data from months ago.

The fetch server has no rate limiting or access controls. It can request any URL your machine can reach - including internal network addresses. Be deliberate about where you run this. On a home mini PC it's fine. On a server with access to internal services, think about what you're exposing.
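
If you wrap fetching in your own code, one way to guard against internal addresses is to resolve the host first and refuse anything private, loopback, or link-local. This is a minimal sketch, not a feature of the fetch server: is_internal_url and resolve_ips are hypothetical names, and the check does not defend against a host that resolves differently on a later request.

```python
import ipaddress
import socket
from urllib.parse import urlparse


def resolve_ips(host: str) -> list[str]:
    """Resolve a hostname to its IP addresses using the system resolver."""
    return sorted({info[4][0] for info in socket.getaddrinfo(host, None)})


def is_internal_url(url: str, resolver=resolve_ips) -> bool:
    """True if the URL points at a loopback, private, link-local, or reserved address."""
    host = urlparse(url).hostname
    if not host:
        return True  # unparseable URLs are treated as unsafe
    try:
        addresses = resolver(host)
    except OSError:
        return True  # unresolvable hosts are treated as unsafe
    for addr in addresses:
        try:
            ip = ipaddress.ip_address(addr)
        except ValueError:
            return True
        if ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved:
            return True
    return False


# A resolver that returns the host unchanged, so literal IPs need no DNS lookup
literal = lambda host: [host]
print(is_internal_url("http://192.168.1.10/admin", resolver=literal))  # True
print(is_internal_url("http://127.0.0.1:11434/", resolver=literal))    # True
```

The resolver parameter exists so the check can be exercised without network access; in normal use you would call is_internal_url(url) and let it use the system resolver.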

The SQLite server - query your data

If you have any SQLite databases on your machine - exported from another app, a local project database, anything - the SQLite server lets your model query them in plain English. Like the fetch server, the reference SQLite server is a Python package (mcp-server-sqlite), so the config below launches it with uvx and assumes uv is installed.

~/mcp.json
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/home/youruser/Documents"]
    },
    "fetch": {
      "command": "uvx",
      "args": ["mcp-server-fetch"]
    },
    "database": {
      "command": "uvx",
      "args": ["mcp-server-sqlite", "--db-path", "/home/youruser/mydata.db"]
    }
  }
}

mcphost prompt
> What tables are in this database?
> How many records were added this week?
> Show me the five most recent entries with their details.

The model reads the schema, writes the appropriate SQL, runs it, and gives you a readable answer. You don't write SQL - you ask in plain English and the model figures out the query.
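
To make that concrete, here is roughly what the two steps look like with Python's built-in sqlite3 module. This is a sketch of the idea, not the SQLite server's implementation: list_tables and read_query are hypothetical helpers, and the orders table is invented for the example.

```python
import sqlite3


def list_tables(conn: sqlite3.Connection) -> list[str]:
    """Return the names of all tables, the first thing a model needs to see."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


def read_query(conn: sqlite3.Connection, sql: str) -> list[tuple]:
    """Run a single SELECT statement and return its rows."""
    if not sql.lstrip().lower().startswith("select"):
        raise ValueError("only SELECT statements are allowed in this sketch")
    return conn.execute(sql).fetchall()


# Throwaway in-memory database with a hypothetical "orders" table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, placed_on TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders (placed_on, total) VALUES (?, ?)",
    [("2026-06-01", 19.5), ("2026-06-03", 42.0), ("2026-05-20", 8.25)],
)

print(list_tables(conn))  # ['orders']
# The kind of SQL a model might write for "How many orders came in since June 1?"
print(read_query(conn, "SELECT COUNT(*) FROM orders WHERE placed_on >= '2026-06-01'"))  # [(2,)]
```

Restricting the helper to SELECT is a deliberate choice for the sketch: it keeps a model-written query from modifying data.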

More servers worth knowing about

The official MCP server list at github.com/modelcontextprotocol/servers has everything Anthropic maintains. A few practical ones:

  • GitHub (repo access): @modelcontextprotocol/server-github - read issues, PRs, file contents from your repos. Useful for "what open issues do I have this week" or summarizing a PR without leaving your terminal.
  • Memory (persistent context): @modelcontextprotocol/server-memory - gives the model a persistent knowledge graph it can write to and read from across sessions. Ask it to "remember that the project deadline is July 15" and it'll know next time.
  • PostgreSQL (production databases): @modelcontextprotocol/server-postgres - same idea as SQLite but for a live Postgres database. Read-only by default. Good for data analysis on your own database.
  • Puppeteer (browser automation): @modelcontextprotocol/server-puppeteer - headless browser control. The model can navigate to URLs, click elements, fill forms. More advanced, but powerful for automation tasks.

Keep your config organized

As you add more servers, your mcp.json grows. A few tips that save headaches:

  • Keep a separate mcp.json per project or context - one for general use, one for a specific project with access to its database and files
  • Be specific with filesystem paths - the server only has access to what you list
  • Run mcphost --config ~/mcp-project.json to use a project-specific config
  • Name your servers descriptively in the config - "notes-dir" is clearer than "filesystem2"

Give Your Bot Real-World Capabilities

mcphost gives you an interactive session for personal use. Your OpenClaw bot gives other people (or your whole Discord server) access to those same capabilities through chat commands. This section wires them together.

The architecture

When someone in Discord types a command like !fetch https://example.com, OpenClaw receives it, passes it to a skill, and the skill handles the rest. The skill calls Ollama's API directly using tool calling - the same mechanism mcphost relies on under the hood.

You don't need mcphost running for this. OpenClaw skills use the Ollama API directly and handle the tool execution themselves. This approach is simpler to deploy and keeps everything in one Python process that you already control.
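
The routing step can be pictured as a lookup from trigger to skill function. The sketch below is an assumption about the general shape, not OpenClaw's actual dispatcher: dispatch and echo_skill are hypothetical, and only the run(args, context) signature is taken from the skills shown later in this section.

```python
from typing import Callable, Optional


def dispatch(message: str,
             skills: dict[str, Callable[[str, dict], str]],
             context: Optional[dict] = None) -> Optional[str]:
    """Route a chat message to the skill whose trigger matches its first word."""
    trigger, _, args = message.strip().partition(" ")
    run = skills.get(trigger)
    if run is None:
        return None  # not a command; let normal chat handling take over
    return run(args, context or {})


# Hypothetical skill using the same run(args, context) signature
def echo_skill(args: str, context: dict) -> str:
    return f"echo: {args}"


skills = {"!echo": echo_skill}
print(dispatch("!echo hello world", skills))  # echo: hello world
print(dispatch("just chatting", skills))      # None
```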

Update your model to a tool-calling capable one

In your OpenClaw config, make sure you're using a model that handles tool calling well. If you're currently on llama3.2 or a reasoning model, switch to qwen3:

OpenClaw config (config.yaml or equivalent)
# Use a model with reliable tool calling
model: "ollama/qwen3:8b"

# On 16GB+ RAM for better multi-step tool use:
# model: "ollama/qwen3:14b"

See Ollama Advanced for the full OpenClaw config reference.

A practical skill: !fetch

This skill lets users in Discord type !fetch https://some-url.com and get back a summary of that page - fetched live by your local Ollama model. Nothing goes to an external AI service.

Create this file at ~/.openclaw/workspace/skills/fetch_skill.py:

~/.openclaw/workspace/skills/fetch_skill.py
import requests
import json

SKILL = {
    "trigger": "!fetch",
    "description": "Fetch a URL and summarize its content using local AI"
}

def run(args, context):
    url = args.strip()
    if not url.startswith(("http://", "https://")):
        return "Usage: !fetch <url>  (must start with http:// or https://)"

    # Fetch the page content
    try:
        resp = requests.get(url, timeout=15, headers={"User-Agent": "Mozilla/5.0"})
        resp.raise_for_status()
        # Keep only the first 8000 characters of the raw response (HTML tags included) - fits comfortably in context
        content = resp.text[:8000]
    except requests.RequestException as e:
        return f"Could not fetch {url}: {e}"

    # Define a structured-output tool so the model can return its summary as named fields
    tools = [
        {
            "type": "function",
            "function": {
                "name": "summarize_content",
                "description": "Summarize the provided web page content",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "summary": {
                            "type": "string",
                            "description": "A clear 3-5 sentence summary of the content"
                        },
                        "key_points": {
                            "type": "array",
                            "items": {"type": "string"},
                            "description": "Up to 3 key takeaways from the page"
                        }
                    },
                    "required": ["summary"]
                }
            }
        }
    ]

    # Ask Ollama to summarize
    try:
        response = requests.post(
            "http://localhost:11434/api/chat",
            json={
                "model": "qwen3:8b",
                "messages": [
                    {
                        "role": "user",
                        "content": f"Summarize this web page content:\n\nURL: {url}\n\nContent:\n{content}"
                    }
                ],
                "tools": tools,
                "stream": False
            },
            timeout=60
        )
        response.raise_for_status()  # surface HTTP errors from Ollama instead of parsing an error body
        result = response.json()
        msg = result.get("message", {})

        # If model used the structured tool, extract from it
        if msg.get("tool_calls"):
            args_raw = msg["tool_calls"][0]["function"].get("arguments", {})
            if isinstance(args_raw, str):
                args_raw = json.loads(args_raw)
            summary = args_raw.get("summary", "")
            points = args_raw.get("key_points", [])
            output = f"**{url}**\n\n{summary}"
            if points:
                output += "\n\n**Key points:**\n" + "\n".join(f"- {p}" for p in points)
            return output

        # Otherwise use the direct content response
        return f"**{url}**\n\n{msg.get('content', 'No response generated.')}"

    except Exception as e:
        return f"Error calling Ollama: {e}"
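
One limitation of the skill as written: resp.text is raw HTML, so part of the 8000-character budget goes to tags, scripts, and styles. A possible refinement, using only the standard library, is to strip markup before truncating. This is a sketch; TextExtractor and html_to_text are hypothetical names, and real pages may need a more robust parser.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of script and style blocks."""

    SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self._chunks.append(data.strip())

    def text(self) -> str:
        return " ".join(self._chunks)


def html_to_text(html: str, limit: int = 8000) -> str:
    """Strip tags from an HTML string and cap the result at `limit` characters."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()[:limit]


sample = ("<html><head><style>p{color:red}</style></head><body>"
          "<h1>Hello</h1><p>Local <b>AI</b> tools.</p>"
          "<script>alert(1)</script></body></html>")
print(html_to_text(sample))  # Hello Local AI tools.
```

In the skill, that would mean replacing content = resp.text[:8000] with content = html_to_text(resp.text).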

A practical skill: !readfile

This lets users trigger a file read from your server. You control which directory is allowed. Useful for reading logs, status files, or shared notes from Discord.

~/.openclaw/workspace/skills/readfile_skill.py
import os
import requests

SKILL = {
    "trigger": "!readfile",
    "description": "Read a file from the allowed directory and summarize it"
}

# Only files inside this directory can be read - change to your path
ALLOWED_DIR = os.path.expanduser("~/Documents/bot-readable")

def run(args, context):
    filename = args.strip().lstrip("/")
    if not filename:
        return "Usage: !readfile <filename>  (reads from the bot-readable directory)"

    # Security: resolve full path and confirm it stays inside ALLOWED_DIR
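    allowed_root = os.path.realpath(ALLOWED_DIR)
    full_path = os.path.realpath(os.path.join(allowed_root, filename))
    # Compare whole path components: a plain startswith() check would also
    # accept sibling directories such as "bot-readable-old"
    if os.path.commonpath([allowed_root, full_path]) != allowed_root:
        return "Access denied: path is outside the allowed directory."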

    if not os.path.isfile(full_path):
        return f"File not found: {filename}"

    try:
        with open(full_path, "r", encoding="utf-8") as f:
            content = f.read(10000)  # cap at 10,000 characters
    except Exception as e:
        return f"Could not read file: {e}"

    # Summarize with Ollama
    try:
        response = requests.post(
            "http://localhost:11434/api/chat",
            json={
                "model": "qwen3:8b",
                "messages": [
                    {
                        "role": "user",
                        "content": f"Summarize this file in 3-5 sentences:\n\nFilename: {filename}\n\n{content}"
                    }
                ],
                "stream": False
            },
            timeout=60
        )
        response.raise_for_status()  # surface HTTP errors from Ollama instead of a KeyError below
        return response.json()["message"]["content"]
    except Exception as e:
        return f"Error calling Ollama: {e}"

Create the directory before using this: mkdir -p ~/Documents/bot-readable. Drop any files you want the bot to be able to read into that folder. Nothing outside it is accessible.

Security note on file and web access from a bot.

These skills give your Discord bot the ability to read files and fetch URLs from your server. A few things to have in place:

  • Lock down which channels or users can trigger these commands in your OpenClaw config
  • Use path validation (the !readfile skill above does this) - never build a file path from user input without checking it
  • For !fetch, consider a URL allowlist if you don't want users fetching arbitrary URLs
  • Run your OpenClaw instance on a user account with limited filesystem permissions
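
For the URL allowlist mentioned above, a host check along these lines is one option. It is a minimal sketch: ALLOWED_HOSTS holds placeholder domains and is_allowed is a hypothetical helper you would call at the top of the !fetch skill's run function.

```python
from urllib.parse import urlparse

# Placeholder domains; replace with the sites your users may fetch
ALLOWED_HOSTS = {"example.com", "docs.example.org"}


def is_allowed(url: str, allowed_hosts=ALLOWED_HOSTS) -> bool:
    """True if the URL is http(s) and its host is an allowed domain or a subdomain of one."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    host = (parsed.hostname or "").lower()
    return any(host == a or host.endswith("." + a) for a in allowed_hosts)


print(is_allowed("https://example.com/page"))       # True
print(is_allowed("https://blog.example.com/post"))  # True
print(is_allowed("https://example.com.evil.net/"))  # False
print(is_allowed("ftp://example.com/file"))         # False
```

Matching on the parsed hostname rather than on the raw string is what keeps look-alike URLs such as example.com.evil.net out.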

Connecting to a running mcphost instance

If you want your OpenClaw bot to use the full MCP server ecosystem (all the servers you configured in mcp.json), one option is to have your skills invoke mcphost as a subprocess in non-interactive mode and capture its output, rather than keeping an interactive session open. Check mcphost --help for the flags your version supports.

For most use cases, the direct Ollama tool calling shown above is simpler and more reliable. But if you find yourself duplicating tool logic across multiple skills, that's the point where running a dedicated MCP server (or writing a custom MCP server) starts paying off. Anthropic's MCP Python SDK is the right tool for that - it lets you write a full MCP server or client in Python.

What you've built: A Discord/Telegram bot that can fetch live web content and read files from your server on demand. It runs entirely on your hardware, the AI is local, and you control exactly what it has access to.

Where to Go From Here

You've got a local AI model with real tools, and optionally a Discord/Telegram bot that can use them. Here's what to build on next.

What you've accomplished

  • ✓ Confirmed Ollama tool calling works with qwen3:8b
  • ✓ Installed Node.js and the MCP server packages
  • ✓ Connected mcphost to Ollama with filesystem access
  • ✓ Added web fetch and SQLite database access
  • ✓ Built OpenClaw skills that use Ollama tool calling from Discord

Run it 24/7 on dedicated hardware

An always-on MCP setup - where your model has persistent access to your files, databases, and web - needs hardware that stays on. Your laptop is fine for personal use, but if you want your OpenClaw bot running continuously with MCP capabilities, a dedicated mini PC is the practical answer.

A 32GB Ryzen 7 mini PC runs qwen3:8b comfortably with headroom for other services. It draws less power than a desktop, sits quietly on a shelf, and handles Ollama + OpenClaw + mcphost simultaneously without breaking a sweat.

Ryzen 7 6800H Mini PC (32GB) on Amazon → Affiliate link - costs you nothing extra, helps keep these guides free and updated.

Build your own MCP server

The official MCP servers cover the common cases. When you need something specific to your setup - a server that talks to your home automation API, queries a specific internal service, or reads a proprietary data format - you build your own.

Anthropic's MCP Python SDK makes this straightforward. A minimal MCP server is about 30 lines of Python. Define your tools with their parameter schemas, implement the handlers, and any MCP client (including mcphost) can connect to it.
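
As a rough idea of the size involved, here is a minimal server sketch built on the SDK's FastMCP helper. It assumes the SDK is installed (pip install mcp); the word_count tool, the text-tools server name, and the --serve flag are all invented for the example. The tool is a plain function, so it can be tried without starting the server.

```python
import sys


def word_count(text: str) -> int:
    """Count whitespace-separated words in a piece of text."""
    return len(text.split())


def serve() -> None:
    # Imported here so the tool function stays usable without the SDK installed
    from mcp.server.fastmcp import FastMCP

    server = FastMCP("text-tools")  # server name is arbitrary
    server.tool()(word_count)       # register the function as an MCP tool
    server.run()                    # defaults to the stdio transport


# Only start the server when explicitly asked to (hypothetical flag for this sketch)
if __name__ == "__main__" and "--serve" in sys.argv:
    serve()

print(word_count("read my notes"))  # 3
```

To connect it, you would add an entry to mcp.json with python3 as the command and the script path plus --serve as the args, the same pattern as the other servers in this guide.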

You've built something genuinely useful.

Most people running local AI are still using it as a glorified search box. You've wired your model into your actual environment - it can read your files, browse the web, and talk to your databases. And if you did the OpenClaw section, your Discord server has access to all of that. That's a local AI agent setup that's actually doing work.