In April 2026, Ollama shipped a single command that gives you a fully local, free alternative to Cursor Pro and GitHub Copilot: ollama launch opencode. OpenCode is an open-source terminal AI coding agent with 140,000 GitHub stars and support for 75+ model providers — and running it through Ollama means your code never leaves your machine, and your monthly bill stays at zero. This guide walks through both setup methods, how to pick the right model for your hardware, and the one configuration step most guides miss that will break your agent if you skip it.

What Is OpenCode?

OpenCode is an open-source AI coding agent built by the SST (Serverless Stack) team. Unlike Cursor, which is a full IDE fork, or GitHub Copilot, which lives inside your existing editor, OpenCode runs in a terminal and works against your project directory directly. It reads your files, plans changes, executes code, and iterates — without requiring a cloud subscription or sending anything to an external API when you run it with Ollama.

Key things that make it different from the competition:

Completely model-agnostic — 75+ providers supported via models.dev. Use Ollama, Anthropic, OpenAI, or any OpenAI-compatible endpoint.
Zero data storage — OpenCode does not store your code or conversation history on any server
LSP integration — Language Server Protocol support means it understands your codebase structure, not just raw file text
Plan and Build modes — review proposed changes before execution, or let it run autonomously
MCP support — integrates with Model Context Protocol servers for extended tool use
Free with local models — £0/month versus £16/month for Cursor Pro or £9/month for GitHub Copilot Individual

If you already use Ollama for chat or API access, adding OpenCode takes about five minutes. If you would rather work inside an editor than in a terminal, the Ollama VS Code guide covers the coding integrations that run there.

What Is ollama launch?

The ollama launch command is an Ollama feature, available from version 0.15 onwards, that installs and configures supported tools in a single step, with Ollama pre-wired as the provider. No JSON config files, no environment variables, no manual provider setup. The four currently supported tools are:

ollama launch opencode — OpenCode AI coding agent
ollama launch claude — Claude Code CLI
ollama launch codex — OpenAI Codex CLI
ollama launch droid — Droid agent

For OpenCode, ollama launch handles installation, provider configuration, and model selection in one go. This is the fastest path to a working setup and the one this guide leads with.

Prerequisites

Before starting, make sure you have:

Ollama 0.15 or later — run ollama --version to check. See how to update Ollama if you are on an older build.
At least 8 GB RAM — 16 GB recommended for a comfortable experience
A compatible coding model pulled — the setup step will prompt you to choose one
Node.js 18+ (for manual install only) — not required for ollama launch
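The version requirement can be checked from a script. The following is a minimal sketch, not an official Ollama utility: version_ge is a hypothetical helper name, it assumes a sort that supports -V (GNU coreutils and recent macOS), and it assumes that ollama --version prints a dotted version number somewhere in its output.

```shell
# Returns success when version $1 is greater than or equal to version $2.
# Assumes `sort -V` (version sort) is available.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Only query Ollama when it is installed, so the script is safe to run anywhere.
if command -v ollama >/dev/null 2>&1; then
  # Take the first dotted version number found in the `ollama --version` output.
  installed="$(ollama --version 2>/dev/null | grep -oE '[0-9]+(\.[0-9]+)+' | head -n 1)"
  if version_ge "${installed:-0}" "0.15"; then
    echo "Ollama ${installed} meets the 0.15 requirement"
  else
    echo "Ollama ${installed:-unknown} is too old; update to 0.15 or later"
  fi
else
  echo "ollama not found on PATH"
fi
```

Comparing with sort -V rather than plain string comparison matters here because 0.9 sorts after 0.15 alphabetically but is the older release.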

Method 1: One-Command Setup with ollama launch opencode

This is the recommended path for most users. With Ollama running, open a terminal and run:

ollama launch opencode

Ollama will:

Download and install OpenCode if it is not already present
Create a provider configuration pointing at your local Ollama instance
Present a model selection menu from your currently pulled models
Launch OpenCode in the current directory

If you want to configure without immediately launching (useful for servers or automated setups), use the --config flag:

ollama launch opencode --config

This writes the configuration to ~/.config/opencode/opencode.jsonc without opening the agent. You can then launch OpenCode manually with opencode when ready.
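On a server or in an automated setup it can help to confirm that the file was actually written before launching. This is a rough sketch under stated assumptions: check_opencode_config is a hypothetical helper, the path is the one given above, and the check for an "ollama" key assumes the generated file names the provider the same way the manual configuration in this guide does.

```shell
# Reports whether the config file at $1 exists and mentions an "ollama" provider.
# Returns 0 when both hold, 1 when the file is missing, 2 when the key is absent.
check_opencode_config() {
  config_file="$1"
  if [ ! -f "$config_file" ]; then
    echo "missing: $config_file"
    return 1
  fi
  if grep -q '"ollama"' "$config_file"; then
    echo "ok: ollama provider found in $config_file"
  else
    echo "warning: no ollama provider entry in $config_file"
    return 2
  fi
}

# Path taken from this guide; adjust it if your install writes the file elsewhere.
check_opencode_config "$HOME/.config/opencode/opencode.jsonc" || true
```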

Before you run your first prompt: do not skip the context window step below. The default Ollama context of 4096 tokens will break OpenCode’s tool calls and agent loops almost immediately.

Method 2: Manual Configuration (Full Control)

If you want to define specific models, use a non-standard Ollama port, or integrate OpenCode into a larger setup, manual configuration gives you complete control.

Install OpenCode

curl -fsSL https://opencode.ai/install | bash

Or via npm:

npm install -g opencode-ai

Create the provider config

Create or edit ~/.config/opencode/opencode.jsonc:

{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "ollama": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Ollama (local)",
      "options": {
        "baseURL": "http://localhost:11434/v1"
      },
      "models": {
        "qwen2.5-coder:7b": {
          "name": "Qwen 2.5 Coder 7B"
        },
        "deepseek-coder-v2:16b": {
          "name": "DeepSeek Coder V2 16B"
        }
      }
    }
  }
}

Important: OpenCode does not hot-reload its configuration. After any config change you must quit the application fully and restart it — not just open a new session.
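Before restarting OpenCode, it is worth checking that the baseURL in the config actually answers. The sketch below assumes curl is installed and uses the model-list route of Ollama's OpenAI-compatible API; endpoint_ok is a hypothetical helper name, not part of either tool.

```shell
# Succeeds when the OpenAI-compatible endpoint at $1 answers a model-list request.
endpoint_ok() {
  curl -fsS --max-time 5 "$1/models" >/dev/null 2>&1
}

# Same value as "baseURL" in the provider config; change the port if yours differs.
base_url="http://localhost:11434/v1"

if endpoint_ok "$base_url"; then
  echo "Ollama is reachable at $base_url"
else
  echo "No response from $base_url; start Ollama or check the port"
fi
```

If the check fails while Ollama is running, the likely culprit is a non-standard port or host in the baseURL value.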

Launch OpenCode

Navigate to your project directory and run:

opencode

Select the Ollama provider and your configured model. OpenCode opens as a TUI (terminal user interface) showing your project context in the left panel and the chat interface on the right.

Choosing the Right Model for Your Hardware

Model selection is the most impactful decision for OpenCode performance. Coding agents need strong tool-calling capability and a large context window — not all models deliver both. Here is a practical decision table by VRAM:

VRAM / RAM	Recommended Model	HumanEval Score	Notes
8 GB GPU / 16 GB RAM	qwen2.5-coder:7b (~4.7 GB)	~88%	Minimum viable; good for single-file edits and simple tasks
12 GB GPU / 24 GB RAM	devstral-small (24B)	~90%	Better reasoning and tool use; Mistral’s coding specialist
16–24 GB GPU	qwen2.5-coder:32b	88.4%	Beats GPT-4 on HumanEval; best quality per VRAM in this range
Apple Silicon 36 GB+	qwen2.5-coder:32b or qwen3	88%+	Fast inference on M3 Pro/Max; unified memory handles 32B comfortably
CPU only (16 GB+ RAM)	qwen2.5-coder:7b (Q4)	~85%	Usable but slow — expect 10–30 seconds per response

If you want to experiment with reasoning-capable models for complex debugging, Ollama thinking mode works with Qwen3 — though the latency trade-off is significant for fast agent loops. For most OpenCode workflows, a dedicated coding model without thinking mode will be faster and more practical.

Pull your chosen model before launching OpenCode:

ollama pull qwen2.5-coder:7b
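The decision table above can be expressed as a small helper for setup scripts. This is only an illustration of the table's thresholds: recommend_model is a hypothetical function, the model names are copied from the table as written, and the input is assumed to be whole gigabytes of GPU memory (0 for CPU-only machines).

```shell
# Prints the model this guide's table suggests for a given amount of VRAM in GB.
# Pass 0 for CPU-only machines.
recommend_model() {
  vram_gb="$1"
  # Reject empty or non-numeric input before the integer comparisons below.
  case "$vram_gb" in
    ''|*[!0-9]*) echo "usage: recommend_model <vram-in-gb>" >&2; return 1 ;;
  esac
  if [ "$vram_gb" -ge 16 ]; then
    echo "qwen2.5-coder:32b"
  elif [ "$vram_gb" -ge 12 ]; then
    echo "devstral-small"
  else
    echo "qwen2.5-coder:7b"
  fi
}

# Print the pull command rather than running it, since pulls can be many gigabytes.
echo "ollama pull $(recommend_model 8)"
```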

Fixing the Context Window — The Step Most Guides Skip

This is the single most important configuration step and the most commonly missed. Ollama’s default context window is 4096 tokens. OpenCode’s agent loops use tool calls that consume context quickly — a typical multi-file edit will hit the 4096 limit within the first few turns, causing tool calls to fail or the agent to lose track of earlier context.

The recommended minimum context for OpenCode is 64K tokens (65,536). Set this by creating a custom Modelfile:

FROM qwen2.5-coder:7b
PARAMETER num_ctx 65536

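Build the extended-context variant, then confirm that the parameter was applied. The new model is created locally from the base model, so there is nothing to pull:

ollama create qwen-coder-64k -f Modelfile
ollama show qwen-coder-64k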

Then reference qwen-coder-64k as your model in OpenCode’s provider config or select it from the ollama launch opencode menu.
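If you expect to try several context sizes, the Modelfile can be generated rather than edited by hand. The following sketch writes the same two-line Modelfile shown above; write_modelfile is a hypothetical helper, and the final step only prints the ollama create command instead of running it.

```shell
# Writes a Modelfile that extends base model $1 with a context window of $2 tokens.
# $3 is the output path and defaults to ./Modelfile.
write_modelfile() {
  base_model="$1"
  num_ctx="$2"
  out_file="${3:-Modelfile}"
  cat > "$out_file" <<EOF
FROM $base_model
PARAMETER num_ctx $num_ctx
EOF
}

# Generate into a scratch directory so an existing Modelfile is not overwritten.
workdir="$(mktemp -d)"
write_modelfile "qwen2.5-coder:7b" 65536 "$workdir/Modelfile"
cat "$workdir/Modelfile"

# Printed, not executed: run this yourself once the file looks right.
echo "ollama create qwen-coder-64k -f $workdir/Modelfile"
```

Swapping 65536 for 16384 or 32768 gives the smaller variants suggested for constrained hardware; use a different model name for each so they can coexist.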

If you are on constrained hardware, start with 16,384 tokens as a minimum — this is enough for most single-file tasks. Increase to 32,768 or 65,536 if you find the agent losing context on multi-file operations.

For more on how Ollama handles API requests and parameters, see the Ollama REST API developer guide.

Testing Your Setup

Once OpenCode is running with a model and an appropriate context window, try these prompts to confirm everything is working:

Explain what this project does based on the files in the current directory.

If OpenCode successfully reads and summarises your files, the file-reading tool call is working. Then test a write operation:

Add a README.md to this project with a brief description and a list of dependencies from package.json.

If OpenCode proposes and writes the file correctly, your agent is fully functional. If it stalls, produces empty output, or drops context mid-task, the most likely cause is an insufficient context window — revisit the Modelfile step above.

OpenCode Modes: Plan vs Build

OpenCode operates in two modes that control how autonomously it executes changes:

Plan mode — OpenCode analyses the task and proposes a detailed plan before touching any files. You review the plan and approve it before execution begins. This is the safer option for large refactors, unfamiliar codebases, or any task where you want to verify the approach first.

Build mode — OpenCode executes autonomously without pausing for approval. Useful for well-defined, low-risk tasks where you trust the model and want maximum speed. Treat this similarly to running a script with write access to your project — use it on tasks where the worst case is a quick git checkout.

Switch between modes within the OpenCode TUI using the mode selector at the top of the interface. Most users start in Plan mode until they have a feel for how their chosen model approaches their codebase.
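Because Build mode relies on git as the safety net, a quick pre-flight check that the working tree is clean makes the "worst case is a git checkout" assumption hold. This is a minimal sketch that assumes git is installed; tree_is_clean is a hypothetical helper and is not something OpenCode runs for you.

```shell
# Succeeds when directory $1 is inside a git work tree with no uncommitted
# or untracked changes.
tree_is_clean() {
  [ -z "$(git -C "$1" status --porcelain 2>/dev/null)" ] &&
    git -C "$1" rev-parse --is-inside-work-tree >/dev/null 2>&1
}

if tree_is_clean .; then
  echo "Working tree is clean; Build mode changes can be reverted with git"
else
  echo "Commit or stash first (or this directory is not a git repository)"
fi
```

Starting from a clean tree means everything the agent touches shows up in git status and git diff, so nothing it changed gets mixed up with your own unfinished work.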

Troubleshooting Common Issues

Tool calls fail immediately or the agent loops without progress

Almost always a context window problem. The default 4096 tokens is not enough for agent loops. Apply the Modelfile fix in the context window section above and restart OpenCode.

Models do not appear in the selection menu

Run ollama list to confirm you have models pulled. If models are present but not appearing in OpenCode’s menu, they may have been defined in a manual config file that conflicts with the ollama launch configuration. Check ~/.config/opencode/opencode.jsonc for conflicts.
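To check for a specific model rather than reading the list by eye, the output of ollama list can be filtered on its first column. The sketch below assumes the usual layout of a header row followed by one row per model with the name first; model_in_list is a hypothetical helper. It also assumes that the model key in your OpenCode config has to match that name exactly, tag included.

```shell
# Reads `ollama list` output on stdin; succeeds when model $1 is in the NAME column.
model_in_list() {
  # Skip the header row, keep the first column, then look for an exact match.
  awk 'NR > 1 { print $1 }' | grep -Fxq "$1"
}

wanted="qwen2.5-coder:7b"
if command -v ollama >/dev/null 2>&1; then
  if ollama list 2>/dev/null | model_in_list "$wanted"; then
    echo "$wanted is pulled"
  else
    echo "$wanted is not pulled; run: ollama pull $wanted"
  fi
else
  echo "ollama not found on PATH"
fi
```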

OpenCode doesn’t see changes after editing config

OpenCode does not hot-reload. Quit fully with Ctrl+C or the quit command and restart from scratch — do not just open a new session tab.

Slow responses on CPU-only setups

This is expected on CPU inference. Reduce the context window (try 16,384), switch to a smaller quantised model (Q4_K_M), or consider running on a machine with a dedicated GPU. The qwen2.5-coder:7b Q4 quantisation gives the best speed-quality balance for CPU-only environments.

ollama launch command not found

You are on an older version of Ollama. The launch subcommand requires 0.15 or later. Update with your system package manager or re-download from ollama.com and retry.

OpenCode vs Cursor vs GitHub Copilot

	OpenCode + Ollama	Cursor Pro	GitHub Copilot
Cost	£0/month	~£16/month	£9/month
Data privacy	Fully local, zero egress	Code sent to cloud	Code sent to GitHub/Microsoft
Model choice	Any Ollama-compatible model	Fixed model set	Fixed model set
IDE integration	Terminal only (VS Code ext in beta)	Full IDE fork	Native IDE plugins
Autocomplete	No (agent only)	Yes	Yes
Offline use	Fully offline	No	No
Open source	Yes (MIT)	No	No

OpenCode with Ollama is the right choice if privacy, cost, or offline capability matters. Cursor wins if you want deep IDE integration and inline autocomplete. GitHub Copilot is the path of least resistance if your team is already on GitHub Enterprise. For a deeper look at running Ollama with coding tools, see the Ollama with GitHub Copilot CLI guide.

Ollama + OpenCode: Free Local AI Coding Agent Setup
