Your Own Private ChatGPT
You've got Ollama running. You can pull models, chat in the terminal, hit the REST API. That's great. But let's be honest - typing into a terminal isn't exactly how most people want to talk to an AI.
What if you had something that looked and felt like ChatGPT, but ran entirely on your own hardware? No subscription. No data leaving your network. No usage limits. Just you, your models, and a clean interface in your browser.
That's exactly what Open WebUI gives you.
What is Open WebUI?
Open WebUI is a self-hosted web interface for local AI models. It connects directly to Ollama and gives you a polished chat experience in your browser. Think ChatGPT's interface, but pointing at your own models instead of OpenAI's servers.
It's one of the most popular open source AI projects right now - over 124,000 stars on GitHub and millions of Docker pulls. People use it daily as their primary AI chat tool.
Why bother?
The terminal is fine for quick questions, but Open WebUI gives you things that actually matter for daily use:
- Conversation history - your chats are saved and searchable
- Multiple models - switch between models mid-conversation with a dropdown
- Document upload - drag in a PDF and ask questions about it
- Multi-user - your family or team each get their own accounts
- Mobile friendly - works great on your phone's browser too
All of this runs on your machine. Nothing phones home. Nothing gets logged by a third party.
What you'll need
- Ollama installed and running with at least one model pulled (see Ollama Basics)
- Docker installed (we'll cover this in the next section if you don't have it)
- A machine with 16GB+ RAM (the Ryzen 7 6800H with 32GB handles this with room to spare)
- About 10 minutes of your time
What you'll have at the end
A fully working ChatGPT-style interface running in your browser, connected to your local Ollama models, with conversation history and document upload. You'll wonder why you ever paid for a ChatGPT subscription.
Getting Docker Ready
Open WebUI runs as a Docker container. If you already have Docker installed from an earlier tutorial, skip ahead to "Run Open WebUI" below. If not, this takes about two minutes.
Install Docker
If you're on Linux (including your mini PC or EC2 instance), run:
curl -fsSL https://get.docker.com | sh
Then add your user to the docker group so you don't need sudo every time:
sudo usermod -aG docker $USER
Log out and back in for the group change to take effect.
On macOS, download Docker Desktop instead. It's a regular app install.
Verify Docker is running
docker --version
You should see something like:
Docker version 27.x.x, build xxxxxxx
If you get a "command not found" or "permission denied" error, make sure you logged out and back in after adding yourself to the docker group.
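If you want one more sanity check that the Docker daemon itself is healthy, the classic smoke test is the hello-world image:
docker run --rm hello-world
It prints a short confirmation message and exits; the --rm flag cleans up the test container afterwards.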
Run Open WebUI
One command. That's it:
docker run -d \
--name open-webui \
--network host \
-v open-webui:/app/backend/data \
--restart unless-stopped \
ghcr.io/open-webui/open-webui:main
Here's what each part does:
- -d - runs in the background (detached mode)
- --name open-webui - gives the container a name you can reference later
- --network host - shares your machine's network with the container (this is the important one - see below)
- -v open-webui:/app/backend/data - saves your data (chats, settings) in a Docker volume so it survives restarts
- --restart unless-stopped - restarts the container automatically when Docker starts, such as after a reboot, unless you stopped it yourself
The first run will take a minute or two while Docker downloads the image. After that, it starts in seconds.
Why --network host matters
This is worth understanding because it trips people up. Docker containers are isolated - they run in their own little world with their own networking. By default, a container can't see anything running on your machine. It doesn't know Ollama exists.
But Ollama is running directly on your machine (not in Docker), listening on
localhost:11434. So we need to bridge that gap.
The --network host flag tells Docker: "don't isolate the network - share the
host machine's network stack." That way, when Open WebUI tries to reach
localhost:11434, it actually connects to Ollama running on your machine.
Your models, your conversations, your data - all staying local.
If you open Open WebUI and the model dropdown is empty, that's the connection to Ollama not working. Check two things:
- Is Ollama actually running? Run ollama list in your terminal. If it errors out, start Ollama first.
- In Open WebUI, go to Settings > Connections and make sure the Ollama URL is set to http://localhost:11434. Hit the refresh icon next to it.
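You can also test the connection straight from the terminal. Ollama's /api/tags endpoint returns a JSON list of every model you've pulled (assuming Ollama is on its default port) - if this works but the dropdown is still empty, the problem is on the Open WebUI side:
curl http://localhost:11434/api/tags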
--network host doesn't work the same way on macOS because Docker Desktop
runs inside a Linux VM. Use this command instead:
docker run -d \
--name open-webui \
-p 3000:8080 \
-v open-webui:/app/backend/data \
-e OLLAMA_BASE_URL=http://host.docker.internal:11434 \
--restart unless-stopped \
ghcr.io/open-webui/open-webui:main
The host.docker.internal address is Docker's way of saying "the machine
running this container." Access the UI at http://localhost:3000.
Verify it's running
docker ps
You should see a container named open-webui with status "Up." Give it about 30 seconds
after the first start, then open your browser.
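If the container is up but the page won't load, the startup logs usually tell you why:
docker logs -f open-webui
Press Ctrl+C to stop following the logs - it won't stop the container.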
Say Hello to Your Private AI
Open WebUI is running. Time to open it up and have your first conversation with a local model through a real interface - not a terminal.
Step 1 - Open the interface
In your browser, go to:
http://localhost:8080
If you used the port mapping option from the previous section, use http://localhost:3000 instead.
Step 2 - Create your admin account
The first person to sign up becomes the admin. This is you. Pick a name, email, and password. This account is stored locally in the Docker volume - it's not sent anywhere.
Even though this is running on your local network, use a real password. If you ever expose this to your home network for other devices, you'll want that protection in place.
Step 3 - Pick a model
At the top of the chat window, you'll see a model selector dropdown. Because we set up the
Docker container with --network host (or the OLLAMA_BASE_URL variable
on macOS), Open WebUI can talk to Ollama running on your machine and automatically sees every
model you've already pulled.
If you followed the Ollama Basics tutorial, you should see
your models listed here. Pick whichever one you want to chat with - llama3.2 or
mistral are solid general-purpose choices.
You can also pull new models from right here. Go to Settings > Models, type a
model name (like qwen2.5-coder), and Open WebUI will tell Ollama to download it.
No need to go back to the terminal.
Step 4 - Send your first message
Type something in the chat box and hit Enter. Try something simple like "explain how a car engine works in three sentences" or "write me a short poem about coffee."
Watch the response stream in, word by word. That's your local model generating text on your own hardware. No API call to OpenAI. No token charges. No data leaving your machine.
What just happened
Here's the flow, start to finish:
- You typed a message in your browser
- Open WebUI sent it to Ollama (running on localhost:11434)
- Ollama loaded the model into memory and generated a response
- The response streamed back through Open WebUI to your browser
Everything happened on your machine. The conversation is saved in your local Docker volume. Close the tab, come back tomorrow, and it'll still be there.
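If you're curious, you can replay that middle step yourself. Open WebUI is ultimately making calls against Ollama's REST API; here's a rough hand-rolled equivalent, assuming you've pulled llama3.2 (swap in any model from ollama list):
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Explain how a car engine works in three sentences.",
  "stream": false
}'
With "stream": false you get the whole answer back as one JSON object instead of the word-by-word stream the browser shows you.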
Seriously, that's it. You now have something that does 90% of what a ChatGPT subscription does, running entirely on hardware you own. The next sections cover the features that make daily use even better.
The Right Model for the Job
One of the best things about running your own AI is choice. You're not locked into one model. You can pull different models for different tasks and switch between them whenever you want.
Pull models from the interface
You can pull new models directly from Open WebUI without touching the terminal. Go to Settings > Models and type in a model name from the Ollama model library. Click the download button and it'll pull in the background.
Or if you prefer the terminal (sometimes faster for big models), just pull it with Ollama directly:
ollama pull llama3.2
ollama pull codellama
ollama pull mistral
Open WebUI picks up new models automatically. No restart needed.
Switch models mid-conversation
Click the model name at the top of your chat and pick a different one from the dropdown. The conversation history stays, but the new model takes over from that point. This is great for comparing how different models handle the same question.
Which model for what
Here's a practical starting point based on common tasks:
- General chat and questions - llama3.2 or mistral (fast, good all-around)
- Writing help - llama3.2 or neural-chat (more creative, natural-sounding)
- Code generation - qwen2.5-coder or deepseek-coder-v2 (purpose-built for code)
- Quick simple tasks - phi3 or gemma2:2b (tiny, instant responses)
- Complex reasoning - phi4 or gemma3:12b (12-14B models that fit in 32GB and punch well above their weight)
Don't jump straight to the biggest model. A 7-8B parameter model handles most everyday tasks just fine and responds much faster. Only reach for larger models when you actually need deeper reasoning or more nuanced output.
Hardware reality check
Model size directly affects speed and memory use. On the recommended Ryzen 7 6800H with 32GB RAM (CPU-only, no dedicated GPU):
- 1-3B models (llama3.2, phi3, gemma2:2b) - near-instant responses, ~20-30 tokens/sec
- 7-8B models (llama3.1, mistral) - good speed, ~10-15 tokens/sec. The sweet spot for daily use.
- 13B models - noticeably slower (~5-10 tokens/sec) but more capable. Worth it for complex questions.
- 30B+ models - pushing it on 32GB RAM. A 30B quantized model will use most of your memory and run at ~3-5 tokens/sec. 70B models need 40GB+ and aren't practical on our recommended hardware.
If a model feels sluggish, try a smaller one. The jump from 3B to 7B is meaningful. The jump from 7B to 13B is less dramatic for everyday questions.
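To see exactly how much memory a model is using once it's loaded, ask Ollama directly:
ollama ps
It lists each loaded model along with its size in memory and how long until Ollama unloads it.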
More Than Just Chat
Open WebUI does a lot more than basic chat. Here are the features worth knowing about right away - the ones you'll actually use on a daily basis.
Conversation history
Every conversation is automatically saved in the sidebar. Come back hours, days, or weeks later and pick up where you left off. You can rename conversations, pin important ones, and delete the ones you don't need.
Since your data lives in a Docker volume on your machine, your conversation history is as safe as your hard drive. No cloud sync, no third-party access.
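If you're curious where that data actually sits on disk, Docker will tell you - the Mountpoint path varies by system:
docker volume inspect open-webui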
Document upload
Drag a PDF, text file, or document directly into the chat window. Open WebUI will read it and let you ask questions about the content. This is surprisingly useful for things like:
- Summarizing a long report
- Finding specific information in a manual
- Asking questions about a research paper
- Getting a quick overview of a contract or legal document
For more advanced document work (searching across many documents, building a knowledge base), check out the Document Q&A with RAG tutorial.
System prompts and personas
You can set a system prompt that shapes how the model responds. Go to a conversation's settings and add instructions like "You are a helpful coding assistant. Always provide examples in Python" or "Explain things simply, like I'm a beginner."
You can also create saved presets for different use cases - one for coding help, one for writing feedback, one for brainstorming. Switch between them depending on what you're doing.
Web search (optional)
Open WebUI can optionally search the web to supplement model responses. This helps when you need current information that the model wasn't trained on. You can configure this in Settings > Web Search using free search APIs like SearXNG (which you can also self-host).
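As a rough sketch, self-hosting SearXNG is itself a one-line Docker run. The host port 8888 here is an arbitrary choice, and be aware that Open WebUI needs SearXNG's JSON output format enabled, which may mean editing SearXNG's settings file:
docker run -d --name searxng -p 8888:8080 --restart unless-stopped searxng/searxng:latest
Then point Settings > Web Search at your SearXNG URL (http://localhost:8888 in this example).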
This is optional. Without it, the model works purely from its training data and any documents you upload.
Keyboard shortcuts
A few shortcuts that speed things up:
- Ctrl + Shift + O - new chat
- Ctrl + Shift + S - toggle sidebar
- Ctrl + Shift + Backspace - delete current chat
- / in the message box - browse available commands and shortcuts
Check the Open WebUI docs for the full list. They add new features regularly.
Let Others Use It Too
One of the nicest things about running your own AI server is sharing it. Your partner, kids, roommates, or coworkers can all have their own accounts with separate conversation histories. No extra cost, no extra subscriptions.
Create additional accounts
As admin, go to Admin Panel > Users. You can either create accounts yourself or change the signup settings to let people register on their own.
By default, new signups require admin approval. This is a good setting to keep - you probably don't want random people on your network creating accounts.
User roles
Open WebUI has two roles:
- Admin - can manage users, change settings, pull models, see everything
- User - can chat, upload documents, and manage their own conversations
Give everyone the "User" role unless they need to manage the system. Each user gets their own separate conversation history that other users can't see.
Access from other devices
By default, Open WebUI listens on all network interfaces. That means anyone on your local network can access it using your machine's IP address instead of localhost.
Find your machine's local IP:
hostname -I | awk '{print $1}'
Then from any other device on your network (phone, tablet, another computer), open:
http://192.168.1.XXX:8080
Replace 192.168.1.XXX with your actual IP. Bookmark it on your phone for quick access.
Don't expose Open WebUI to the internet without proper security (reverse proxy, HTTPS, authentication). On your home network behind your router, you're fine. But opening port 8080 to the world is asking for trouble.
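If you run a host firewall like ufw, one way to keep access LAN-only looks like this - adjust the subnet to match your own network, since 192.168.1.0/24 is just a common default:
sudo ufw allow from 192.168.1.0/24 to any port 8080 proto tcp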
Where to Go From Here
You've got a private AI chat running on your own hardware. Let's make sure it stays running and talk about what to explore next.
Keeping Open WebUI updated
The project moves fast - new features land regularly. To update:
docker pull ghcr.io/open-webui/open-webui:main
docker stop open-webui
docker rm open-webui
docker run -d \
--name open-webui \
--network host \
-v open-webui:/app/backend/data \
--restart unless-stopped \
ghcr.io/open-webui/open-webui:main
Your conversations and settings are stored in the Docker volume (open-webui), so
they survive the container being replaced. Nothing gets lost.
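If you want extra insurance before an update, you can snapshot the volume to a tarball first. This is the standard Docker volume-backup pattern; the filename is an arbitrary choice:
docker run --rm -v open-webui:/data -v "$(pwd)":/backup alpine tar czf /backup/open-webui-backup.tar.gz -C /data .
That spins up a throwaway Alpine container, mounts your volume and your current directory, and writes the archive next to wherever you ran the command.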
Docker Compose (recommended for long-term use)
If you want a cleaner setup that's easier to maintain, create a docker-compose.yml file:
version: '3.8'
services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    network_mode: host
    volumes:
      - open-webui:/app/backend/data
    restart: unless-stopped

volumes:
  open-webui:
Then start it with:
docker compose up -d
And update later with docker compose pull && docker compose up -d. Much cleaner
than remembering the long docker run command.
What you've accomplished
- ✓ Installed Docker and ran Open WebUI
- ✓ Created your admin account
- ✓ Had your first conversation with a local model through a real interface
- ✓ Learned how to manage and switch between models
- ✓ Set up multi-user access for your household
Explore next
- → Document Q&A with RAG - build a local system that can search and answer questions about your own documents
- → Local Coding Assistant - use your local models to power VS Code autocomplete and AI chat
- → Open WebUI documentation - explore advanced features like pipelines, functions, and tools
- → Open WebUI on GitHub - star the project, follow updates, report issues
- → Ollama Advanced - tune your models for better performance and quality
You're running a private, self-hosted AI chat. No subscriptions, no data collection, no limits. This is what self-hosted AI is all about.