Running AI code assistance locally with Ollama and VS Code gives you GitHub Copilot-style autocomplete and chat — without sending your code to any external server. This guide covers two main approaches: the Continue extension (recommended) and Cline.
Why Run AI Code Assistance Locally?
Cloud-based tools like GitHub Copilot work well, but they send your code to external servers. For private codebases, client work, or environments with strict data policies, a fully local setup is preferable. Ollama handles the model; VS Code extensions handle the integration.
Prerequisites
Install Ollama and pull a good coding model:
ollama pull llama3.1
# or for dedicated code models:
ollama pull deepseek-coder-v2
ollama pull qwen2.5-coder:7b
See the best Ollama models for coding for a full comparison. Make sure Ollama is running (ollama serve).
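Before configuring any extension, it helps to confirm the server is actually reachable. A quick sketch using Ollama's `/api/tags` endpoint (which lists pulled models and only answers when the server is up on its default port, 11434):

```shell
# Quick health check: /api/tags responds only when the Ollama server is up.
status=$(curl -s http://localhost:11434/api/tags > /dev/null && echo "running" || echo "not running")
echo "Ollama is $status"
```

If this reports "not running", start the server with `ollama serve` before continuing.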
Option 1: Continue (Recommended)
Continue is the most mature open-source AI coding assistant for VS Code. It supports chat, inline autocomplete, and slash commands — all pointing at your local Ollama instance.
Installation
- Open VS Code
- Go to Extensions (Ctrl+Shift+X / Cmd+Shift+X)
- Search for Continue and install it
Configuration
After installation, Continue will open a config file at ~/.continue/config.json. Update it to use Ollama:
{
  "models": [
    {
      "title": "Llama 3.1",
      "provider": "ollama",
      "model": "llama3.1",
      "apiBase": "http://localhost:11434"
    },
    {
      "title": "DeepSeek Coder V2",
      "provider": "ollama",
      "model": "deepseek-coder-v2",
      "apiBase": "http://localhost:11434"
    }
  ],
  "tabAutocompleteModel": {
    "title": "Qwen2.5 Coder",
    "provider": "ollama",
    "model": "qwen2.5-coder:7b",
    "apiBase": "http://localhost:11434"
  }
}
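After editing, it's worth sanity-checking the file before reloading VS Code, since a stray trailing comma will make Continue silently fall back to defaults. One way to validate it, assuming python3 is on your PATH, is Python's built-in json.tool:

```shell
# Exits nonzero if the file is missing or is not valid JSON
result=$(python3 -m json.tool ~/.continue/config.json > /dev/null 2>&1 && echo "valid" || echo "invalid or missing")
echo "config.json is $result"
```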
Using Continue
- Chat panel: Click the Continue icon in the sidebar, or press Ctrl+Shift+L (Cmd+Shift+L on Mac)
- Inline chat: Select code, press Ctrl+Shift+J to ask a question about it
- Autocomplete: Starts suggesting as you type once a tab autocomplete model is configured
- Slash commands: Type /edit, /comment, or /test in the chat panel
Useful Slash Commands
/edit refactor this function to use async/await
/comment add docstrings to all functions
/test write unit tests for this class
/share export this conversation
Option 2: Cline
Cline (formerly Claude Dev) is an agentic coding assistant that can read and edit files, run terminal commands, and complete multi-step tasks. It’s more autonomous than Continue.
Installation and Setup
- Install the Cline extension from the VS Code marketplace
- Open Cline settings and set the provider to OpenAI Compatible
- Set the base URL to http://localhost:11434/v1
- Set the API key to ollama (any string works)
- Set the model name to your pulled model (e.g. llama3.1)
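You can smoke-test the same OpenAI-compatible endpoint Cline will use with a direct request. A sketch, assuming llama3.1 is already pulled (the Bearer token can be any string — Ollama ignores it, but some clients insist on sending one):

```shell
# Hit Ollama's OpenAI-compatible chat endpoint, the one Cline talks to
resp=$(curl -s http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ollama" \
  -d '{"model": "llama3.1", "messages": [{"role": "user", "content": "Say hello"}]}')
echo "${resp:-no response - is Ollama running?}"
```

If this returns a JSON completion, Cline's settings above will work with the same values.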
Which Model to Use for What
| Task | Recommended Model | Command |
|---|---|---|
| Chat and explanation | Llama 3.1 8B | ollama pull llama3.1 |
| Code generation | DeepSeek Coder V2 16B | ollama pull deepseek-coder-v2 |
| Tab autocomplete | Qwen2.5 Coder 7B | ollama pull qwen2.5-coder:7b |
| Lightweight / fast | Phi-3 Mini | ollama pull phi3:mini |
Performance Tips
- Use a dedicated autocomplete model — smaller models (3B-7B) respond faster for tab completion; save the larger models for chat
- GPU acceleration — if you have an NVIDIA or AMD GPU, Ollama will use it automatically for much faster responses
- Keep Ollama running in the background — it starts as a daemon by default, so VS Code extensions can connect any time
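One tuning knob worth knowing here is the OLLAMA_KEEP_ALIVE environment variable, which controls how long a model stays loaded in memory after its last request (the default is a few minutes). Raising it avoids cold-start latency when you pause between completions — a sketch, with 1h as an illustrative value:

```shell
# Keep models resident longer after their last request to avoid reload delays
export OLLAMA_KEEP_ALIVE=1h
echo "models will stay loaded for $OLLAMA_KEEP_ALIVE after last use"
# restart the server so the setting takes effect: ollama serve
```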
Troubleshooting
Continue can’t connect to Ollama: Check that Ollama is running with ollama list. If not, run ollama serve.
Slow autocomplete: Switch to a smaller model like qwen2.5-coder:1.5b for faster suggestions.
Model not found error: Make sure you’ve pulled the model — ollama pull model-name — before referencing it in your config.
Next Steps
Once you have Ollama working in VS Code, consider calling Ollama from Python to build your own tools, or set up Ollama in Docker for a portable development environment.
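As a starting point for building your own tools, here is a minimal script-level call against Ollama's native /api/generate endpoint — the same API a Python client would wrap. Setting "stream": false returns the whole completion as a single JSON object:

```shell
# One-shot generation request against Ollama's native API (assumes llama3.1 is pulled)
body='{"model": "llama3.1", "prompt": "Explain recursion in one sentence.", "stream": false}'
resp=$(curl -s http://localhost:11434/api/generate -d "$body")
echo "${resp:-no response - is Ollama running?}"
```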