Ollama makes running large language models locally on your own hardware remarkably straightforward — and Windows support has matured significantly. Whether you want to experiment with Llama 3.2, Mistral, or Gemma without sending data to a cloud service, this guide walks you through every step of getting Ollama installed and running on Windows 10 or Windows 11. No prior Linux experience required, no complex configuration files, and no subscription fees.
System Requirements
Minimum Requirements
- Operating System: Windows 10 (version 1903 or later) or Windows 11
- RAM: 8 GB minimum — 16 GB recommended for 7B parameter models
- Storage: At least 10 GB of free disk space per model you plan to download
- CPU: Any modern 64-bit processor (x86_64)
Recommended for Best Performance
- RAM: 16–32 GB for running larger models comfortably
- GPU: NVIDIA GPU with 6 GB+ VRAM (CUDA 11.8 or later), or AMD GPU with ROCm support
- Storage: SSD rather than HDD — models load significantly faster from solid-state storage
A GPU is optional but makes a dramatic difference. Running a 7B parameter model on CPU alone is usable but slow; the same model with GPU acceleration responds several times faster.
Step 1: Download the Ollama Installer
- Open your browser and navigate to ollama.com
- Click the Download button on the homepage
- Select Windows from the platform options if not already selected
- The file OllamaSetup.exe will begin downloading
Always download Ollama directly from the official site at ollama.com. Avoid third-party download mirrors.
Step 2: Run the Installer
- Locate OllamaSetup.exe in your Downloads folder and double-click it
- If Windows Defender SmartScreen shows a warning, click More info then Run anyway — this is a standard warning for newly released software
- If a User Account Control (UAC) prompt appears, click Yes
- The installer extracts files and sets up Ollama automatically — this takes under a minute
- Once complete, Ollama starts automatically and a small icon appears in your system tray
The installer places Ollama’s files in %LOCALAPPDATA%\Programs\Ollama and adds the ollama command to your system PATH.
Step 3: Verify the Installation
- Press Win + R, type cmd, and press Enter to open Command Prompt
- Type the following command and press Enter:
ollama --version
You should see output similar to ollama version 0.x.x. If you receive a “‘ollama’ is not recognized” error, close your terminal and reopen it — PATH changes sometimes require a fresh session.
Step 4: Pull Your First Model
Llama 3.2 (the 3B parameter version) is an excellent starting point — it’s fast, capable, and runs well even without a dedicated GPU.
ollama pull llama3.2
This downloads approximately 2 GB. Models are stored in %USERPROFILE%\.ollama\models by default. Other popular models to try:
- ollama pull mistral — Mistral 7B, strong general-purpose model
- ollama pull gemma3 — Google’s Gemma 3 model
- ollama pull phi4 — Microsoft’s Phi-4, very capable for its size
- ollama pull llama3.2:1b — Llama 3.2 1B, very fast on CPU
Step 5: Run a Model
ollama run llama3.2
After a few seconds, you’ll see a >>> prompt. Type any message and press Enter. To exit, type /bye.
You can also send a single prompt without entering interactive mode:
ollama run llama3.2 "Explain what a neural network is in simple terms"
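If you want to script these one-shot prompts from your own code, a minimal sketch looks like the following (it assumes the ollama binary is on your PATH; the helper name is my own, not part of Ollama):

```python
import subprocess

def build_run_command(model: str, prompt: str) -> list[str]:
    # Assemble the argv for a one-shot, non-interactive "ollama run" call.
    return ["ollama", "run", model, prompt]

# Requires Ollama installed and the model pulled; uncomment to actually run it:
# result = subprocess.run(build_run_command("llama3.2", "Explain recursion briefly"),
#                         capture_output=True, text=True)
# print(result.stdout)
```

Passing the prompt as a single argv element avoids any shell-quoting issues with spaces or quotes in the prompt.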
Running Ollama as a Background Service
By default, Ollama installs itself as a startup application and the API server (on http://localhost:11434) is available whenever Windows boots. Useful commands:
- Start the server manually:
ollama serve
- Check running models:
ollama ps
- List downloaded models:
ollama list
- Remove a model:
ollama rm llama3.2
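To check from code whether the background server is actually listening on port 11434, one option is a plain TCP probe. A minimal sketch, assuming the default host and port:

```python
import socket

def ollama_server_running(host: str = "localhost", port: int = 11434) -> bool:
    # True if something accepts TCP connections on the given host and port.
    try:
        with socket.create_connection((host, port), timeout=1.0):
            return True
    except OSError:
        return False

# Prints True while the Ollama tray app (or "ollama serve") is running:
# print(ollama_server_running())
```

Note this only confirms the port is open, not that the listener is Ollama; for a definitive check, request http://localhost:11434 in a browser and look for the "Ollama is running" response.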
GPU Acceleration Setup
NVIDIA GPU (CUDA)
Ollama automatically detects NVIDIA GPUs if the correct drivers are installed. You do not need to install CUDA separately.
- Ensure your NVIDIA drivers are up to date — download from nvidia.com/drivers
- Driver must support CUDA 11.8 or later (any driver released after mid-2022)
- After updating, restart your PC
- Run ollama run llama3.2 and confirm GPU use with ollama ps in a second terminal
AMD GPU (ROCm)
- Check that your AMD GPU is supported — Radeon RX 6000 series and newer generally work best
- Install the latest AMD Software: Adrenalin Edition drivers from amd.com/en/support
- Download and install ROCm for Windows from AMD’s developer site
- Restart your PC — Ollama will attempt to use the AMD GPU automatically
Troubleshooting Common Windows Issues
‘ollama’ Is Not Recognized as a Command
- Close all terminal windows and open a fresh Command Prompt or PowerShell
- If the problem persists, search for Environment Variables in Start, open Edit the system environment variables, then check that %LOCALAPPDATA%\Programs\Ollama is in the Path entry
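You can also confirm programmatically whether the Ollama directory made it onto PATH. A sketch (dir_on_path is a hypothetical helper, not part of Ollama):

```python
import os

def dir_on_path(directory: str, path_value: str) -> bool:
    # Normalize case and separators (Windows PATH is case-insensitive),
    # then test membership against each PATH entry.
    target = os.path.normcase(os.path.normpath(directory))
    entries = (os.path.normcase(os.path.normpath(p))
               for p in path_value.split(os.pathsep) if p)
    return target in entries

# On Windows, expand the variable and check the live environment:
# ollama_dir = os.path.expandvars(r"%LOCALAPPDATA%\Programs\Ollama")
# print(dir_on_path(ollama_dir, os.environ.get("PATH", "")))
```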
Windows Defender Blocking Ollama
- Open Windows Security → Virus and threat protection → Protection history
- If Ollama was quarantined, select it and choose Allow on device
- Add an exclusion for %LOCALAPPDATA%\Programs\Ollama to prevent future false positives
Model Download Failing or Stalling
- Re-run the same ollama pull command — it resumes from where it left off
- Check whether a firewall or corporate proxy is blocking outbound connections
- Temporarily disable VPN software if you’re using one
CUDA Errors or GPU Not Detected
- Update your NVIDIA drivers to the latest version and restart your PC
- Open the Ollama log file at %LOCALAPPDATA%\Ollama\ for detailed error messages
- Models too large for your GPU’s VRAM will automatically fall back to CPU — this is normal
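When digging through those logs, a small tail helper saves scrolling. A sketch; the exact file names inside %LOCALAPPDATA%\Ollama\ (such as server.log) may vary by version, so check the directory first:

```python
from pathlib import Path

def tail_log(path: str, lines: int = 20) -> list[str]:
    # Return the last `lines` lines of a log file for quick inspection.
    text = Path(path).read_text(encoding="utf-8", errors="replace")
    return text.splitlines()[-lines:]

# Example on Windows (uncomment to run against a real log file):
# import os
# for line in tail_log(os.path.expandvars(r"%LOCALAPPDATA%\Ollama\server.log"), 30):
#     print(line)
```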
What to Do Next
With Ollama installed and a model running, you have a fully local AI inference stack on your Windows machine. From here, explore Open WebUI for a browser-based chat interface, connect Ollama to development tools like VS Code extensions, or call the local API at http://localhost:11434/api from your own applications. The Ollama model library continues to grow — check ollama.com/library regularly for new models suited to coding, summarisation, translation, and other specific tasks.
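As a starting point for calling the local API, here is a sketch of a non-streaming request to the /api/generate endpoint using only the Python standard library (the payload fields follow Ollama’s documented API; the helper name is my own):

```python
import json
import urllib.request

API_URL = "http://localhost:11434/api/generate"

def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    # Non-streaming request: the server replies with a single JSON object
    # whose "response" field holds the full completion.
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        API_URL,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# With the Ollama server running and llama3.2 pulled:
# req = build_generate_request("llama3.2", "Explain what a neural network is")
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["response"])
```

Setting "stream": False keeps the example simple; omit it and the server instead streams one JSON object per generated token.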


