Open WebUI

Set up your own private ChatGPT using Open WebUI and Ollama. A polished chat interface running on your hardware - conversation history, document upload, multi-user support, zero subscriptions.

45 min Beginner Free Updated March 2026

Your Own Private ChatGPT

You've got Ollama running. You can pull models, chat in the terminal, hit the REST API. That's great. But let's be honest - typing into a terminal isn't exactly how most people want to talk to an AI.

What if you had something that looked and felt like ChatGPT, but ran entirely on your own hardware? No subscription. No data leaving your network. No usage limits. Just you, your models, and a clean interface in your browser.

That's exactly what Open WebUI gives you.

What is Open WebUI?

Open WebUI is a self-hosted web interface for local AI models. It connects directly to Ollama and gives you a polished chat experience in your browser. Think ChatGPT's interface, but pointing at your own models instead of OpenAI's servers.

It's one of the most popular open source AI projects right now - over 124,000 stars on GitHub and millions of Docker pulls. People use it daily as their primary AI chat tool.

Why bother?

The terminal is fine for quick questions, but Open WebUI gives you things that actually matter for daily use:

  • Conversation history - your chats are saved and searchable
  • Multiple models - switch between models mid-conversation with a dropdown
  • Document upload - drag in a PDF and ask questions about it
  • Multi-user - your family or team each get their own accounts
  • Mobile friendly - works great on your phone's browser too

All of this runs on your machine. Nothing phones home. Nothing gets logged by a third party.

What you'll need

Prerequisites
  • Ollama installed and running with at least one model pulled (see Ollama Basics)
  • Docker installed (we'll cover this in the next section if you don't have it)
  • A machine with 16GB+ RAM (the Ryzen 7 6800H with 32GB handles this with room to spare)
  • About 10 minutes of your time

What you'll have at the end

A fully working ChatGPT-style interface running in your browser, connected to your local Ollama models, with conversation history and document upload. You'll wonder why you ever paid for a ChatGPT subscription.

How this fits into Book III: This tutorial is an optional add-on. The core path is Ollama Basics then Ollama Advanced (where you wire local models into OpenClaw). Open WebUI gives you a separate web chat interface for your Ollama models - useful on its own, but not required for OpenClaw integration.

Getting Docker Ready

Open WebUI runs as a Docker container. If you already have Docker installed from an earlier tutorial, skip ahead to "Run Open WebUI" below. If not, this takes about two minutes.

Install Docker

If you're on Linux (including your mini PC or EC2 instance), run:

🖥️ Terminal
curl -fsSL https://get.docker.com | sh

Then add your user to the docker group so you don't need sudo every time:

🖥️ Terminal
sudo usermod -aG docker $USER

Log out and back in for the group change to take effect.
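You can confirm the group change took effect before moving on. A quick sketch using `id -nG`, which lists the current user's group names:

```shell
# Check whether the current shell session is in the docker group yet.
IN_DOCKER_GROUP=no
if id -nG | grep -qw docker; then
  IN_DOCKER_GROUP=yes
fi
echo "In docker group: $IN_DOCKER_GROUP"
```

If it still says `no` after logging back in, running `newgrp docker` starts a subshell with the new group applied.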

On macOS, download Docker Desktop instead. It's a regular app install.

Verify Docker is running

🖥️ Terminal
docker --version

You should see something like:

🖥️ Output
Docker version 27.x.x, build xxxxxxx

If you get a "command not found" or "permission denied" error, make sure you logged out and back in after adding yourself to the docker group.

Run Open WebUI

One command. That's it:

🖥️ Terminal
docker run -d \
  --name open-webui \
  --network host \
  -v open-webui:/app/backend/data \
  --restart unless-stopped \
  ghcr.io/open-webui/open-webui:main

Here's what each part does:

  • -d - runs in the background (detached mode)
  • --name open-webui - gives the container a name you can reference later
  • --network host - shares your machine's network with the container (this is the important one - see below)
  • -v open-webui:/app/backend/data - saves your data (chats, settings) in a Docker volume so it survives restarts
  • --restart unless-stopped - auto-starts when your machine boots

The first run will take a minute or two while Docker downloads the image. After that, it starts in seconds.

Why --network host matters

This is worth understanding because it trips people up. Docker containers are isolated - they run in their own little world with their own networking. By default, a container can't see anything running on your machine. It doesn't know Ollama exists.

But Ollama is running directly on your machine (not in Docker), listening on localhost:11434. So we need to bridge that gap.

The --network host flag tells Docker: "don't isolate the network - share the host machine's network stack." That way, when Open WebUI tries to reach localhost:11434, it actually connects to Ollama running on your machine. Your models, your conversations, your data - all staying local.
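You can sanity-check that bridge from the host side. Ollama's `/api/tags` endpoint returns the installed models - it's the same call Open WebUI makes to populate its dropdown. A quick probe (the `--max-time` flag keeps it from hanging if nothing is listening):

```shell
# Ask Ollama's tags endpoint whether anything is listening on 11434.
OLLAMA_STATUS=down
if curl -sf --max-time 3 http://localhost:11434/api/tags >/dev/null 2>&1; then
  OLLAMA_STATUS=up
fi
echo "Ollama on localhost:11434: $OLLAMA_STATUS"
```

If this prints `down`, Open WebUI won't see any models either - get Ollama running first.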

No models showing up?

If you open Open WebUI and the model dropdown is empty, that's the connection to Ollama not working. Check two things:

  • Is Ollama actually running? Run ollama list in your terminal. If it errors out, start Ollama first.
  • In Open WebUI, go to Settings > Connections and make sure the Ollama URL is set to http://localhost:11434. Hit the refresh icon next to it.

macOS users: use port mapping instead.

--network host doesn't work the same way on macOS because Docker Desktop runs inside a Linux VM. Use this command instead:

🖥️ Terminal (macOS)
docker run -d \
  --name open-webui \
  -p 3000:8080 \
  -v open-webui:/app/backend/data \
  -e OLLAMA_BASE_URL=http://host.docker.internal:11434 \
  --restart unless-stopped \
  ghcr.io/open-webui/open-webui:main

The host.docker.internal address is Docker's way of saying "the machine running this container." Access the UI at http://localhost:3000.

Verify it's running

🖥️ Terminal
docker ps

You should see a container named open-webui with status "Up." Give it about 30 seconds after the first start, then open your browser.
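Beyond `docker ps`, you can confirm the web server itself is answering. A small probe sketch (port 8080 assumes the host-networking setup above; it retries a few times to allow for startup):

```shell
# Probe the Open WebUI port a few times, then report whether it answered.
UI_STATUS=down
for attempt in 1 2 3; do
  if curl -sf --max-time 2 -o /dev/null http://localhost:8080; then
    UI_STATUS=up
    break
  fi
  sleep 1
done
echo "Open WebUI on localhost:8080: $UI_STATUS"
```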

Say Hello to Your Private AI

Open WebUI is running. Time to open it up and have your first conversation with a local model through a real interface - not a terminal.

Step 1 - Open the interface

In your browser, go to:

🌐 Browser
http://localhost:8080

If you used the port mapping option from the previous section, use http://localhost:3000 instead.

Step 2 - Create your admin account

The first person to sign up becomes the admin. This is you. Pick a name, email, and password. This account is stored locally in the Docker volume - it's not sent anywhere.

Don't skip the password.

Even though this is running on your local network, use a real password. If you ever expose this to your home network for other devices, you'll want that protection in place.

Step 3 - Pick a model

At the top of the chat window, you'll see a model selector dropdown. Because we set up the Docker container with --network host (or the OLLAMA_BASE_URL variable on macOS), Open WebUI can talk to Ollama running on your machine and automatically sees every model you've already pulled.

If you followed the Ollama Basics tutorial, you should see your models listed here. Pick whichever one you want to chat with - llama3.2 or mistral are solid general-purpose choices.

You can also pull new models from right here. Go to Settings > Models, type a model name (like qwen2.5-coder), and Open WebUI will tell Ollama to download it. No need to go back to the terminal.

Step 4 - Send your first message

Type something in the chat box and hit Enter. Try something simple like "explain how a car engine works in three sentences" or "write me a short poem about coffee."

Watch the response stream in, word by word. That's your local model generating text on your own hardware. No API call to OpenAI. No token charges. No data leaving your machine.

What just happened

Here's the flow, start to finish:

  • You typed a message in your browser
  • Open WebUI sent it to Ollama (running on localhost:11434)
  • Ollama loaded the model into memory and generated a response
  • The response streamed back through Open WebUI to your browser

Everything happened on your machine. The conversation is saved in your local Docker volume. Close the tab, come back tomorrow, and it'll still be there.
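That whole round trip can be reproduced by hand - Open WebUI is essentially doing this on your behalf. A sketch against Ollama's `/api/generate` endpoint (it assumes you have llama3.2 pulled, and only fires the request if Ollama is actually reachable):

```shell
# The chat message boils down to a POST like this one.
REQUEST='{"model": "llama3.2", "prompt": "Say hello in five words.", "stream": false}'
if curl -sf --max-time 3 http://localhost:11434/api/tags >/dev/null 2>&1; then
  curl -s http://localhost:11434/api/generate -d "$REQUEST"
else
  echo "Ollama is not running - skipping the request"
fi
```

With `"stream": false` you get one JSON object back instead of the word-by-word stream the browser shows.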

You've got a working private AI chat.

Seriously, that's it. You now have something that does 90% of what a ChatGPT subscription does, running entirely on hardware you own. The next sections cover the features that make daily use even better.

The Right Model for the Job

One of the best things about running your own AI is choice. You're not locked into one model. You can pull different models for different tasks and switch between them whenever you want.

Pull models from the interface

You can pull new models directly from Open WebUI without touching the terminal. Go to Settings > Models and type in a model name from the Ollama model library. Click the download button and it'll pull in the background.

Or if you prefer the terminal (sometimes faster for big models), just pull it with Ollama directly:

🖥️ Terminal
ollama pull llama3.2
ollama pull codellama
ollama pull mistral

Open WebUI picks up new models automatically. No restart needed.

Switch models mid-conversation

Click the model name at the top of your chat and pick a different one from the dropdown. The conversation history stays, but the new model takes over from that point. This is great for comparing how different models handle the same question.

Which model for what

Here's a practical starting point based on common tasks:

  • General chat and questions - llama3.2 or mistral (fast, good all-around)
  • Writing help - llama3.2 or neural-chat (more creative, natural-sounding)
  • Code generation - qwen2.5-coder or deepseek-coder-v2 (purpose-built for code)
  • Quick simple tasks - phi3 or gemma2:2b (tiny, instant responses)
  • Complex reasoning - phi4 or gemma3:12b (14B-class models that fit in 32GB and punch way above their weight)

Start small, scale up.

Don't jump straight to the biggest model. A 7-8B parameter model handles most everyday tasks just fine and responds much faster. Only reach for larger models when you actually need deeper reasoning or more nuanced output.

Hardware reality check

Model size directly affects speed and memory use. On the recommended Ryzen 7 6800H with 32GB RAM (CPU-only, no dedicated GPU):

  • 1-3B models (phi3, gemma2:2b) - near-instant responses, ~20-30 tokens/sec
  • 7-8B models (llama3.2, mistral) - good speed, ~10-15 tokens/sec. The sweet spot for daily use.
  • 13B models - noticeably slower (~5-10 tokens/sec) but more capable. Worth it for complex questions.
  • 30B+ models - pushing it on 32GB RAM. A 30B quantized model will use most of your memory and run at ~3-5 tokens/sec. 70B models need 40GB+ and aren't practical on our recommended hardware.

If a model feels sluggish, try a smaller one. The jump from 3B to 7B is meaningful. The jump from 7B to 13B is less dramatic for everyday questions.
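The memory figures above follow a rough rule of thumb: a 4-bit quantized model needs about half a byte per parameter, plus roughly 20% overhead for the context cache and runtime buffers. A sketch of that arithmetic (the 0.5 bytes/param and 20% figures are ballpark assumptions, not exact numbers):

```shell
# Rough RAM estimate for a 4-bit quantized model:
# ~0.5 bytes per parameter, plus ~20% for KV cache and runtime overhead.
PARAMS_BILLIONS=7
EST_GB=$(awk -v p="$PARAMS_BILLIONS" 'BEGIN { printf "%.1f", p * 0.5 * 1.2 }')
echo "A ${PARAMS_BILLIONS}B model at 4-bit needs roughly ${EST_GB}GB of RAM"
```

That lands around 4GB for a 7B model, which matches what you'll see `ollama ps` report in practice.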

More Than Just Chat

Open WebUI does a lot more than basic chat. Here are the features worth knowing about right away - the ones you'll actually use on a daily basis.

Conversation history

Every conversation is automatically saved in the sidebar. Come back hours, days, or weeks later and pick up where you left off. You can rename conversations, pin important ones, and delete the ones you don't need.

Since your data lives in a Docker volume on your machine, your conversation history is as safe as your hard drive. No cloud sync, no third-party access.

Document upload

Drag a PDF, text file, or document directly into the chat window. Open WebUI will read it and let you ask questions about the content. This is surprisingly useful for things like:

  • Summarizing a long report
  • Finding specific information in a manual
  • Asking questions about a research paper
  • Getting a quick overview of a contract or legal document

For more advanced document work (searching across many documents, building a knowledge base), check out the Document Q&A with RAG tutorial.

System prompts and personas

You can set a system prompt that shapes how the model responds. Go to a conversation's settings and add instructions like "You are a helpful coding assistant. Always provide examples in Python" or "Explain things simply, like I'm a beginner."

You can also create saved presets for different use cases - one for coding help, one for writing feedback, one for brainstorming. Switch between them depending on what you're doing.

Web search (optional)

Open WebUI can optionally search the web to supplement model responses. This helps when you need current information that the model wasn't trained on. You can configure this in Settings > Web Search using free search APIs like SearXNG (which you can also self-host).

This is optional. Without it, the model works purely from its training data and any documents you upload.

Keyboard shortcuts

A few shortcuts that speed things up:

  • Ctrl + Shift + O - new chat
  • Ctrl + Shift + S - toggle sidebar
  • Ctrl + Shift + Backspace - delete current chat
  • / in the message box - browse available commands and shortcuts

Check the Open WebUI docs for the full list. They add new features regularly.

Let Others Use It Too

One of the nicest things about running your own AI server is sharing it. Your partner, kids, roommates, or coworkers can all have their own accounts with separate conversation histories. No extra cost, no extra subscriptions.

Create additional accounts

As admin, go to Admin Panel > Users. You can either create accounts yourself or change the signup settings to let people register on their own.

By default, new signups require admin approval. This is a good setting to keep - you probably don't want random people on your network creating accounts.

User roles

Open WebUI has two roles:

  • Admin - can manage users, change settings, pull models, see everything
  • User - can chat, upload documents, and manage their own conversations

Give everyone the "User" role unless they need to manage the system. Each user gets their own separate conversation history that other users can't see.

Access from other devices

By default, Open WebUI listens on all network interfaces. That means anyone on your local network can access it using your machine's IP address instead of localhost.

Find your machine's local IP:

🖥️ Terminal
hostname -I | awk '{print $1}'

Then from any other device on your network (phone, tablet, another computer), open:

🌐 Browser (other device)
http://192.168.1.XXX:8080

Replace 192.168.1.XXX with your actual IP. Bookmark it on your phone for quick access.
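Putting the two pieces together, here's a tiny helper that prints the exact address to bookmark (Linux only - `hostname -I` doesn't exist on macOS - and port 8080 assumes the host-networking setup from earlier):

```shell
# Print the LAN address other devices should use for Open WebUI.
IP=$(hostname -I 2>/dev/null | awk '{print $1}')
LAN_URL="http://${IP:-YOUR-IP-HERE}:8080"
echo "Open WebUI on your network: $LAN_URL"
```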

This is for your home network only.

Don't expose Open WebUI to the internet without proper security (reverse proxy, HTTPS, authentication). On your home network behind your router, you're fine. But opening port 8080 to the world is asking for trouble.

Where to Go From Here

You've got a private AI chat running on your own hardware. Let's make sure it stays running and talk about what to explore next.

Keeping Open WebUI updated

The project moves fast - new features land regularly. To update:

🖥️ Terminal
docker pull ghcr.io/open-webui/open-webui:main
docker stop open-webui
docker rm open-webui
docker run -d \
  --name open-webui \
  --network host \
  -v open-webui:/app/backend/data \
  --restart unless-stopped \
  ghcr.io/open-webui/open-webui:main

Your conversations and settings are stored in the Docker volume (open-webui), so they survive the container being replaced. Nothing gets lost.
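Those four commands can live in a small script so future updates are one step. A sketch that assumes the container name and flags used in this tutorial, and does nothing if Docker isn't available:

```shell
#!/bin/sh
# update-open-webui.sh - pull the latest image and recreate the container.
# The open-webui volume keeps chats and settings across the swap.
IMAGE=ghcr.io/open-webui/open-webui:main
NAME=open-webui
if command -v docker >/dev/null 2>&1 && docker info >/dev/null 2>&1; then
  docker pull "$IMAGE"
  docker stop "$NAME" 2>/dev/null
  docker rm "$NAME" 2>/dev/null
  docker run -d \
    --name "$NAME" \
    --network host \
    -v open-webui:/app/backend/data \
    --restart unless-stopped \
    "$IMAGE"
else
  echo "Docker is not available - nothing updated"
fi
```

On macOS, swap the `--network host` line for the `-p 3000:8080` and `OLLAMA_BASE_URL` flags shown earlier.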

Docker Compose (recommended for long-term use)

If you want a cleaner setup that's easier to maintain, create a docker-compose.yml file:

🖥️ docker-compose.yml
services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    network_mode: host
    volumes:
      - open-webui:/app/backend/data
    restart: unless-stopped

volumes:
  open-webui:

Then start it with:

🖥️ Terminal
docker compose up -d

And update later with docker compose pull && docker compose up -d. Much cleaner than remembering the long docker run command.

What you've accomplished

  • ✓ Installed Docker and ran Open WebUI
  • ✓ Created your admin account
  • ✓ Had your first conversation with a local model through a real interface
  • ✓ Learned how to manage and switch between models
  • ✓ Set up multi-user access for your household

Open WebUI tutorial complete.

You're running a private, self-hosted AI chat. No subscriptions, no data collection, no limits. This is what self-hosted AI is all about.