How to Use Ollama with GitHub Copilot CLI

GitHub’s Copilot CLI is a terminal-based coding agent that works directly with your repositories: reading issues, referencing pull requests, running tasks in parallel across your codebase, and making changes through your editor. Until now it relied on GitHub’s own cloud infrastructure. Ollama has added support for running it against models served locally, which gives developers more control over where their code goes.

What is GitHub Copilot CLI?

The Copilot CLI is different from the Copilot autocomplete you see inside VS Code or other editors. It’s a terminal agent — meaning it works in your command line, understands your whole repository, and can plan and execute multi-step tasks rather than just completing lines of code.

The key capabilities are:

  • GitHub context awareness — you can reference issues and pull requests by number (e.g. tell me about issue #15291) and the agent will pull in the comments, diffs, and current status as context for whatever you’re working on
  • Parallel subagents — using the /fleet command, the CLI can break a task into multiple subagents that work across your codebase simultaneously, then bring the results together for you to review and commit
  • Policy compliance — if your organisation uses GitHub Business or Enterprise, the Copilot CLI runs within your existing access policies, keeping repository data and secrets inside your usual guardrails

Why run it through Ollama?

The standard Copilot CLI sends your code and context to GitHub’s cloud. For many developers that’s fine, but there are good reasons to want a local alternative:

  • Privacy — your code never leaves your machine. This matters for proprietary projects, client work, or anything under NDA
  • Cost — running inference locally through Ollama has no per-token cost once your hardware is set up
  • Control — you choose which model runs underneath, including options like Kimi K2.6 for more complex tasks or a lighter model for quick queries
  • Offline use — once models are downloaded, inference doesn’t depend on an internet connection or a third-party service being available, although features that pull in GitHub issues and pull requests still need network access
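
If offline use matters to you, it may help to confirm that a model is already on disk before you disconnect. The sketch below assumes the usual tabular output of ollama list, with a header row and the model name in the first column. The helper names and the model name example-model:latest are placeholders made up for this example, not recommendations.

```shell
#!/usr/bin/env bash
# Sketch: check whether a model name appears in `ollama list` output.
# Assumption: row 1 is a header and column 1 of each later row is the model name.

# Reads `ollama list`-style text on stdin; succeeds if the model in $1 is listed.
model_in_list() {
  awk -v want="$1" 'NR > 1 && $1 == want { found = 1 } END { exit found ? 0 : 1 }'
}

# Hypothetical helper: succeeds if the model is already downloaded locally.
model_is_downloaded() {
  ollama list | model_in_list "$1"
}
```

With these defined, something like model_is_downloaded example-model:latest || ollama pull example-model:latest would fetch the model only when it is missing.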

How to set it up

You’ll need Ollama installed first — if you haven’t done that yet, follow our Ollama installation guide. Once Ollama is running, starting the Copilot CLI integration is a single command:

ollama launch copilot

That’s it. Ollama handles downloading the necessary components and starting the agent. From there, you use it from your terminal in the same way you’d use the GitHub-hosted version — the difference is that inference is running locally through Ollama rather than in GitHub’s cloud.
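
If the launch command fails, two likely causes are that Ollama is not on your PATH or that the local server is not responding. The following is a minimal preflight sketch, assuming the server listens on Ollama’s default address of http://localhost:11434. The function names and the OLLAMA_URL variable are illustrative and not part of Ollama itself.

```shell
#!/usr/bin/env bash
# Sketch of a preflight check to run before `ollama launch copilot`.
# Assumption: the Ollama server listens on its default address, localhost:11434.

# Illustrative variable for this sketch only; not an Ollama setting.
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

# Succeeds if the named command is available on the PATH.
have_command() {
  command -v "$1" >/dev/null 2>&1
}

# Succeeds if the server answers on its model-listing endpoint.
server_is_up() {
  curl --silent --fail --max-time 3 "$OLLAMA_URL/api/tags" >/dev/null
}

# Prints a short diagnosis and returns non-zero if something is missing.
preflight() {
  if ! have_command ollama; then
    echo "ollama not found on PATH; install it first" >&2
    return 1
  fi
  if ! server_is_up; then
    echo "Ollama server not reachable at $OLLAMA_URL" >&2
    return 1
  fi
  echo "Ollama looks ready"
}
```

Running preflight && ollama launch copilot would then start the agent only when both checks pass.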

Using Copilot CLI in practice

Once running, you interact with the CLI through natural language in your terminal. Some practical examples of what it can handle:

  • “Summarise the changes in PR #42 and tell me if there are any obvious issues” — pulls in the PR diff and comments as context
  • “Refactor the auth module to use the new token format from issue #108” — reads the issue, understands the change needed, and makes it across the relevant files
  • /fleet “Add input validation to all API endpoints” — splits the task into parallel subagents, one per endpoint, and coordinates the results

The /fleet command is particularly useful for changes that touch multiple files in a consistent way, such as adding logging, updating imports, or renaming a function across a codebase. These are tasks that would otherwise mean opening every file manually.
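
For comparison, here is roughly what one of those changes, renaming a function across a codebase, looks like when scripted by hand with standard tools. This is a sketch of the manual alternative, not of how /fleet works internally. The function name rename_symbol is made up for illustration, and it assumes identifiers contain only letters, digits, and underscores. A plain text substitution like this will also rewrite longer names that happen to contain the old one, which is the kind of judgement an agent can be asked to apply.

```shell
#!/usr/bin/env bash
# Sketch: replace every occurrence of an identifier in files under a directory.
# Usage: rename_symbol OLD NEW [DIR]
# Assumption: OLD and NEW contain only letters, digits, and underscores.
rename_symbol() {
  old="$1"; new="$2"; dir="${3:-.}"

  # Collect the files that mention the old name before editing anything.
  files=$(grep -rlF --exclude-dir=.git -- "$old" "$dir") || return 0  # nothing to change

  printf '%s\n' "$files" | while IFS= read -r file; do
    # -i.bak works with both GNU and BSD sed; remove the backup afterwards.
    sed -i.bak "s/$old/$new/g" "$file" && rm -f "$file.bak"
  done
}
```

For example, rename_symbol parse_token parse_auth_token src would rewrite every match under src, where both identifiers and the directory are hypothetical.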

How this compares to other Ollama coding setups

If you’ve been using Ollama with VS Code or running Ollama with Python for code assistance, the Copilot CLI sits above those in terms of autonomy. The VS Code integration completes code as you type; the Python API lets you build your own tools. The Copilot CLI is an agent that can plan and execute — you give it a goal and it works out the steps.
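
To make the contrast concrete, the lower-level route means sending each prompt yourself and deciding what to do with the answer. The sketch below calls Ollama’s local HTTP API with curl rather than Python, assuming the server is on its default address. The function names and the model name in the usage note are placeholders.

```shell
#!/usr/bin/env bash
# Sketch: send one prompt to a local Ollama server over its HTTP API.

# Build a JSON request body for the /api/generate endpoint.
# Assumption: model and prompt contain no double quotes, backslashes, or
# newlines, because this sketch does no JSON escaping.
generate_payload() {
  printf '{"model": "%s", "prompt": "%s", "stream": false}' "$1" "$2"
}

# Send a single prompt and print the raw JSON response.
ask_ollama() {
  curl --silent --fail http://localhost:11434/api/generate \
    -d "$(generate_payload "$1" "$2")"
}
```

A call such as ask_ollama example-model "Explain what a mutex is" returns one JSON object whose response field holds the generated text. Everything beyond that, such as reading files, planning steps, or applying edits, is left for you to build, which is the gap an agent like the Copilot CLI fills.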

It’s also worth comparing to Hermes Agent, a self-improving agent from Nous Research. Hermes is designed around persistent memory and building knowledge over sessions. The Copilot CLI is more focused on repository-level tasks with GitHub context, so it’s a better fit if your work is tied to a specific codebase and GitHub workflow.

Who should use this?

The Ollama + Copilot CLI combination makes most sense for:

  • Developers already using GitHub who want a terminal agent without sending code to the cloud
  • Teams with privacy or compliance requirements around source code
  • Anyone doing large-scale refactors or consistent changes across many files
  • Developers experimenting with agentic workflows who want to start with a well-defined tool rather than building from scratch

If you’re newer to Ollama and want to understand the broader landscape before diving into agents, our introduction to Ollama is a good place to start.