In April 2026, Ollama shipped a single command that gives you a fully local, free alternative to Cursor Pro and GitHub Copilot: ollama launch opencode. OpenCode is an open-source terminal AI coding agent with 140,000 GitHub stars and support for 75+ model providers — and running it through Ollama means your code never leaves your machine, and your monthly bill stays at zero. This guide walks through both setup methods, how to pick the right model for your hardware, and the one configuration step most guides miss that will break your agent if you skip it.
What Is OpenCode?
OpenCode is an open-source AI coding agent built by the SST (Serverless Stack) team. Unlike Cursor, which is a full IDE fork, or GitHub Copilot, which lives inside your existing editor, OpenCode runs in a terminal and works against your project directory directly. It reads your files, plans changes, executes code, and iterates — without requiring a cloud subscription or sending anything to an external API when you run it with Ollama.
Key things that make it different from the competition:
- Completely model-agnostic — 75+ providers supported via models.dev. Use Ollama, Anthropic, OpenAI, or any OpenAI-compatible endpoint.
- Zero data storage — OpenCode does not store your code or conversation history on any server
- LSP integration — Language Server Protocol support means it understands your codebase structure, not just raw file text
- Plan and Build modes — review proposed changes before execution, or let it run autonomously
- MCP support — integrates with Model Context Protocol servers for extended tool use
- Free with local models — £0/month versus £16/month for Cursor Pro or £9/month for GitHub Copilot Individual
If you already use Ollama for chat or API access, adding OpenCode takes about five minutes. If you are comparing options, see the Ollama VS Code guide for a comparison of coding integrations that work inside an editor rather than in a terminal.
What Is ollama launch?
The ollama launch command is a new Ollama feature (introduced in version 0.15) that installs and configures supported tools in a single step, with Ollama pre-wired as the provider. No JSON config files, no environment variables, no manual provider setup. The four currently supported tools are:
- ollama launch opencode: OpenCode AI coding agent
- ollama launch claude: Claude Code CLI
- ollama launch codex: OpenAI Codex CLI
- ollama launch droid: Droid agent
For OpenCode, ollama launch handles installation, provider configuration, and model selection in one go. This is the fastest path to a working setup and the one this guide leads with.
Prerequisites
Before starting, make sure you have:
- Ollama 0.15 or later: run ollama --version to check. See how to update Ollama if you are on an older build.
- At least 8 GB RAM: 16 GB recommended for a comfortable experience
- A compatible coding model pulled: the setup step will prompt you to choose one
- Node.js 18+ (for manual install only): not required for ollama launch
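If you want to script the version check from the list above, the following is a minimal sketch. It assumes that ollama --version prints a line containing "version is X.Y.Z"; parse_ollama_version and version_ge are hypothetical helper names, not Ollama commands.

```shell
#!/bin/sh
# Sketch: verify the installed Ollama meets the 0.15 minimum.
# parse_ollama_version and version_ge are illustrative helpers, not Ollama commands.

# Extract the first dotted number that follows "version is" on stdin.
# Assumes output shaped like "ollama version is 0.15.2".
parse_ollama_version() {
    sed -n 's/.*version is \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Succeed if dotted version $1 is greater than or equal to $2.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, "."); nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            xi = (i <= na) ? x[i] + 0 : 0   # missing fields count as 0
            yi = (i <= nb) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# Only run the check when the ollama binary is on PATH.
if command -v ollama >/dev/null 2>&1; then
    installed=$(ollama --version 2>&1 | parse_ollama_version)
    if version_ge "${installed:-0}" 0.15; then
        echo "Ollama $installed supports ollama launch"
    else
        echo "Ollama ${installed:-unknown} is older than 0.15; update first"
    fi
fi
```

The comparison is done field by field so that 0.9 is treated as older than 0.15, which a plain string comparison would get wrong.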
Method 1: One-Command Setup with ollama launch opencode
This is the recommended path for most users. With Ollama running, open a terminal and run:
ollama launch opencode
Ollama will:
- Download and install OpenCode if it is not already present
- Create a provider configuration pointing at your local Ollama instance
- Present a model selection menu from your currently pulled models
- Launch OpenCode in the current directory
If you want to configure without immediately launching (useful for servers or automated setups), use the --config flag:
ollama launch opencode --config
This writes the configuration to ~/.config/opencode/opencode.jsonc without opening the agent. You can then launch OpenCode manually with opencode when ready.
Before you run your first prompt: do not skip the context window step below. The default Ollama context of 4096 tokens will break OpenCode’s tool calls and agent loops almost immediately.
Method 2: Manual Configuration (Full Control)
If you want to define specific models, use a non-standard Ollama port, or integrate OpenCode into a larger setup, manual configuration gives you complete control.
Install OpenCode
curl -fsSL https://opencode.ai/install | bash
Or via npm:
npm install -g opencode-ai
Create the provider config
Create or edit ~/.config/opencode/opencode.jsonc:
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "ollama": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Ollama (local)",
      "options": {
        "baseURL": "http://localhost:11434/v1"
      },
      "models": {
        "qwen2.5-coder:7b": {
          "name": "Qwen 2.5 Coder 7B"
        },
        "deepseek-coder-v2:16b": {
          "name": "DeepSeek Coder V2 16B"
        }
      }
    }
  }
}
Important: OpenCode does not hot-reload its configuration. After any config change you must quit the application fully and restart it — not just open a new session.
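Before restarting OpenCode, it can help to confirm that the baseURL from the config actually answers. This is a minimal sketch, assuming Ollama's OpenAI-compatible API serves a model listing at /v1/models and that curl is installed. models_url and the OLLAMA_BASE_URL variable are hypothetical names used only for this example.

```shell
#!/bin/sh
# Sketch: check that the baseURL from opencode.jsonc is reachable.
# models_url is an illustrative helper, not part of OpenCode or Ollama.

# Build the model-listing URL from a base URL, tolerating a trailing slash.
models_url() {
    printf '%s/models\n' "${1%/}"
}

# OLLAMA_BASE_URL is a made-up override variable for this sketch.
BASE_URL="${OLLAMA_BASE_URL:-http://localhost:11434/v1}"

# Only attempt the request when curl is available.
if command -v curl >/dev/null 2>&1; then
    if curl -fsS --max-time 5 "$(models_url "$BASE_URL")" >/dev/null 2>&1; then
        echo "Ollama is answering at $BASE_URL"
    else
        echo "No response from $BASE_URL; check that Ollama is running"
    fi
fi
```

If you run Ollama on a non-standard port, set the same URL here and in the config so the two cannot drift apart.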
Launch OpenCode
Navigate to your project directory and run:
opencode
Select the Ollama provider and your configured model. OpenCode opens as a TUI (terminal user interface) showing your project context in the left panel and the chat interface on the right.
Choosing the Right Model for Your Hardware
Model selection is the most impactful decision for OpenCode performance. Coding agents need strong tool-calling capability and a large context window — not all models deliver both. Here is a practical decision table by VRAM:
| VRAM / RAM | Recommended Model | HumanEval Score | Notes |
|---|---|---|---|
| 8 GB GPU / 16 GB RAM | qwen2.5-coder:7b (~4.7 GB) | ~88% | Minimum viable; good for single-file edits and simple tasks |
| 12 GB GPU / 24 GB RAM | devstral-small (24B) | ~90% | Better reasoning and tool use; Mistral’s coding specialist |
| 16–24 GB GPU | qwen2.5-coder:32b | 88.4% | Beats GPT-4 on HumanEval; best quality per VRAM in this range |
| Apple Silicon 36 GB+ | qwen2.5-coder:32b or qwen3 | 88%+ | Fast inference on M3 Pro/Max; unified memory handles 32B comfortably |
| CPU only (16 GB+ RAM) | qwen2.5-coder:7b (Q4) | ~85% | Usable but slow — expect 10–30 seconds per response |
If you want to experiment with reasoning-capable models for complex debugging, Ollama thinking mode works with Qwen3 — though the latency trade-off is significant for fast agent loops. For most OpenCode workflows, a dedicated coding model without thinking mode will be faster and more practical.
Pull your chosen model before launching OpenCode:
ollama pull qwen2.5-coder:7b
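The table above can be turned into a small lookup if you provision several machines. This is only a sketch: recommend_model is a hypothetical helper, the thresholds simply mirror the table rows, and the model names are copied from the table, so the exact registry tags may differ.

```shell
#!/bin/sh
# Sketch: map available GPU memory (whole GB) to the table's suggestions.
# recommend_model is an illustrative helper; thresholds mirror the table rows.
recommend_model() {
    vram_gb=$1
    if [ "$vram_gb" -ge 16 ]; then
        echo "qwen2.5-coder:32b"
    elif [ "$vram_gb" -ge 12 ]; then
        echo "devstral-small"      # name as written in the table; the registry tag may differ
    else
        echo "qwen2.5-coder:7b"    # also the CPU-only fallback (pass 0)
    fi
}

# Example: print the suggestion for a 12 GB card.
recommend_model 12
```

You could then pull the result directly, for example with ollama pull "$(recommend_model 8)".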
Fixing the Context Window — The Step Most Guides Skip
This is the single most important configuration step and the most commonly missed. Ollama’s default context window is 4096 tokens. OpenCode’s agent loops use tool calls that consume context quickly — a typical multi-file edit will hit the 4096 limit within the first few turns, causing tool calls to fail or the agent to lose track of earlier context.
The recommended minimum context for OpenCode is 64K tokens (65,536, the value used in the Modelfile). Set this by creating a custom Modelfile:
FROM qwen2.5-coder:7b
PARAMETER num_ctx 65536
Build and use the extended-context variant:
ollama create qwen-coder-64k -f Modelfile
ollama run qwen-coder-64k
Then reference qwen-coder-64k as your model in OpenCode’s provider config or select it from the ollama launch opencode menu.
If you are on constrained hardware, start with 16,384 tokens as a minimum — this is enough for most single-file tasks. Increase to 32,768 or 65,536 if you find the agent losing context on multi-file operations.
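If you expect to try several context sizes, generating the Modelfile from a script saves retyping. The following is a minimal sketch; write_modelfile and the qwen-coder-16k name are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Sketch: generate a Modelfile for a chosen base model and context size.
# write_modelfile is an illustrative helper, not an Ollama command.
write_modelfile() {
    base_model=$1
    num_ctx=$2
    out_file=$3
    printf 'FROM %s\nPARAMETER num_ctx %s\n' "$base_model" "$num_ctx" > "$out_file"
}

# Example: a 16,384-token variant for constrained hardware.
MODELFILE="${TMPDIR:-/tmp}/Modelfile.qwen-coder-16k"
write_modelfile "qwen2.5-coder:7b" 16384 "$MODELFILE"
cat "$MODELFILE"
```

Once the file looks right, build the variant with ollama create qwen-coder-16k -f "$MODELFILE" and select that name in OpenCode.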
For more on how Ollama handles API requests and parameters, see the Ollama REST API developer guide.
Testing Your Setup
Once OpenCode is running with a model and an appropriate context window, try these prompts to confirm everything is working:
Explain what this project does based on the files in the current directory.
If OpenCode successfully reads and summarises your files, the file-reading tool call is working. Then test a write operation:
Add a README.md to this project with a brief description and a list of dependencies from package.json.
If OpenCode proposes and writes the file correctly, your agent is fully functional. If it stalls, produces empty output, or drops context mid-task, the most likely cause is an insufficient context window — revisit the Modelfile step above.
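To rule out the context window quickly, you can inspect the parameters stored with the model. This is a sketch that assumes ollama show --parameters prints one "name value" pair per line, including num_ctx when it has been set. ctx_at_least is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: confirm a model was built with a large enough num_ctx.
# ctx_at_least is an illustrative helper, not an Ollama command.

# Read "name value" parameter lines on stdin; succeed if num_ctx >= $1.
ctx_at_least() {
    awk -v min="$1" '
        $1 == "num_ctx" { found = 1; ok = ($2 + 0 >= min + 0) }
        END { if (found && ok) exit 0; exit 1 }
    '
}

MODEL="qwen-coder-64k"   # the variant name used in the Modelfile step

# Only query Ollama when the binary is on PATH.
if command -v ollama >/dev/null 2>&1; then
    if ollama show --parameters "$MODEL" 2>/dev/null | ctx_at_least 16384; then
        echo "$MODEL has num_ctx of at least 16384"
    else
        echo "$MODEL has no num_ctx set, or it is below 16384"
    fi
fi
```

A model with no num_ctx line falls back to the Ollama default, so the check treats a missing value as a failure.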
OpenCode Modes: Plan vs Build
OpenCode operates in two modes that control how autonomously it executes changes:
Plan mode — OpenCode analyses the task and proposes a detailed plan before touching any files. You review the plan and approve it before execution begins. This is the safer option for large refactors, unfamiliar codebases, or any task where you want to verify the approach first.
Build mode — OpenCode executes autonomously without pausing for approval. Useful for well-defined, low-risk tasks where you trust the model and want maximum speed. Treat this similarly to running a script with write access to your project — use it on tasks where the worst case is a quick git checkout.
Switch between modes within the OpenCode TUI using the mode selector at the top of the interface. Most users start in Plan mode until they have a feel for how their chosen model approaches their codebase.
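Because Build mode writes to your project without asking, it is worth checking that git can undo the result before you start. The following is a minimal sketch, assuming the project is a git repository; worktree_clean is a hypothetical helper and not an OpenCode feature.

```shell
#!/bin/sh
# Sketch: check for a clean git working tree before an autonomous run.
# worktree_clean is an illustrative helper, not an OpenCode feature.
worktree_clean() {
    # --porcelain prints one line per changed or untracked file; empty means clean.
    [ -z "$(git -C "${1:-.}" status --porcelain 2>/dev/null)" ] &&
        git -C "${1:-.}" rev-parse --is-inside-work-tree >/dev/null 2>&1
}

# Only run the check when git is installed.
if command -v git >/dev/null 2>&1; then
    if worktree_clean .; then
        echo "Working tree is clean; agent changes can be reviewed and reverted with git"
    else
        echo "Uncommitted changes or no git repository; commit or stash before Build mode"
    fi
fi
```

Starting from a clean tree means git diff shows only what the agent changed.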
Troubleshooting Common Issues
Tool calls fail immediately or the agent loops without progress
Almost always a context window problem. The default 4096 tokens is not enough for agent loops. Apply the Modelfile fix in the context window section above and restart OpenCode.
Model selection menu shows no models
Run ollama list to confirm you have models pulled. If models are present but not appearing in OpenCode’s menu, they may have been defined in a manual config file that conflicts with the ollama launch configuration. Check ~/.config/opencode/opencode.jsonc for conflicts.
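If you want to script that check, the sketch below looks for one model name in the ollama list output. It assumes the first line is a header and the first column holds the full name including its tag; model_pulled is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: check whether a specific model appears in `ollama list`.
# model_pulled is an illustrative helper, not an Ollama command.

# Read `ollama list` output on stdin; succeed if a model named $1 is present.
# Assumes a header row, then the model name (with tag) in column one.
model_pulled() {
    awk -v m="$1" '
        NR > 1 && $1 == m { found = 1 }
        END { if (found) exit 0; exit 1 }
    '
}

WANTED="qwen2.5-coder:7b"

# Only query Ollama when the binary is on PATH.
if command -v ollama >/dev/null 2>&1; then
    if ollama list 2>/dev/null | model_pulled "$WANTED"; then
        echo "$WANTED is available locally"
    else
        echo "$WANTED is not pulled; run: ollama pull $WANTED"
    fi
fi
```

The match is exact, so use the same tag that appears in your OpenCode config.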
OpenCode doesn’t see changes after editing config
OpenCode does not hot-reload. Quit fully with Ctrl+C or the quit command and restart from scratch — do not just open a new session tab.
Slow responses on CPU-only setups
This is expected on CPU inference. Reduce the context window (try 16,384), switch to a smaller quantised model (Q4_K_M), or consider running on a machine with a dedicated GPU. The qwen2.5-coder:7b Q4 quantisation gives the best speed-quality balance for CPU-only environments.
ollama launch command not found
You are on an older version of Ollama. The launch subcommand requires 0.15 or later. Update with your system package manager or re-download from ollama.com and retry.
OpenCode vs Cursor vs GitHub Copilot
| | OpenCode + Ollama | Cursor Pro | GitHub Copilot |
|---|---|---|---|
| Cost | £0/month | ~£16/month | £9/month |
| Data privacy | Fully local, zero egress | Code sent to cloud | Code sent to GitHub/MS |
| Model choice | Any Ollama-compatible model | Fixed model set | Fixed model set |
| IDE integration | Terminal only (VS Code ext in beta) | Full IDE fork | Native IDE plugins |
| Autocomplete | No (agent only) | Yes | Yes |
| Offline use | Fully offline | No | No |
| Open source | Yes (MIT) | No | No |
OpenCode with Ollama is the right choice if privacy, cost, or offline capability matters. Cursor wins if you want deep IDE integration and inline autocomplete. GitHub Copilot is the path of least resistance if your team is already on GitHub Enterprise. For a deeper look at running Ollama with coding tools, see the Ollama with GitHub Copilot CLI guide.