Ollama makes running large language models locally on your own hardware remarkably straightforward — and Windows support has matured significantly. Whether you want to experiment with Llama 3.2, Mistral, or Gemma without sending data to a cloud service, this guide walks you through every step of getting Ollama installed and running on Windows 10 or Windows 11. No prior Linux experience required, no complex configuration files, and no subscription fees.
System Requirements
Minimum Requirements
- Operating System: Windows 10 (version 1903 or later) or Windows 11
- RAM: 8 GB minimum — 16 GB recommended for 7B parameter models
- Storage: At least 10 GB of free disk space per model you plan to download
- CPU: Any modern 64-bit processor (x86_64)
Recommended for Best Performance
- RAM: 16–32 GB for running larger models comfortably
- GPU: NVIDIA GPU with 6 GB+ VRAM (CUDA 11.8 or later), or AMD GPU with ROCm support
- Storage: SSD rather than HDD — models load significantly faster from solid-state storage
A GPU is optional but makes a dramatic difference. Running a 7B parameter model on CPU alone is usable but slow; the same model with GPU acceleration responds several times faster.
Step 1: Download the Ollama Installer
- Open your browser and navigate to ollama.com
- Click the Download button on the homepage
- Select Windows from the platform options if not already selected
- The file OllamaSetup.exe will begin downloading
Always download Ollama directly from the official site at ollama.com. Avoid third-party download mirrors.
Step 2: Run the Installer
- Locate OllamaSetup.exe in your Downloads folder and double-click it
- If Windows Defender SmartScreen shows a warning, click More info then Run anyway — this is a standard warning for newly released software
- If a User Account Control (UAC) prompt appears, click Yes
- The installer extracts files and sets up Ollama automatically — this takes under a minute
- Once complete, Ollama starts automatically and a small icon appears in your system tray
The installer places Ollama’s files in %LOCALAPPDATA%\Programs\Ollama and adds the ollama command to your system PATH.
Step 3: Verify the Installation
- Press Win + R, type cmd, and press Enter to open Command Prompt
- Type the following command and press Enter:
ollama --version
You should see output similar to ollama version 0.x.x. If you receive a “‘ollama’ is not recognized” error, close your terminal and reopen it — PATH changes sometimes require a fresh session.
Step 4: Pull Your First Model
Llama 3.2 (the 3B parameter version) is an excellent starting point — it’s fast, capable, and runs well even without a dedicated GPU.
ollama pull llama3.2
This downloads approximately 2 GB. Models are stored in %USERPROFILE%\.ollama\models by default. Other popular models to try:
- ollama pull mistral — Mistral 7B, strong general-purpose model
- ollama pull gemma3 — Google’s Gemma 3 model
- ollama pull phi4 — Microsoft’s Phi-4, very capable for its size
- ollama pull llama3.2:1b — Llama 3.2 1B, very fast on CPU
Step 5: Run a Model
ollama run llama3.2
After a few seconds, you’ll see a >>> prompt. Type any message and press Enter. To exit, type /bye.
You can also send a single prompt without entering interactive mode:
ollama run llama3.2 "Explain what a neural network is in simple terms"
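If you want to script these one-shot prompts from your own code, a minimal sketch looks like the following (it assumes the ollama binary is on your PATH; the helper name is my own, not part of Ollama):

```python
import subprocess

def build_run_command(model: str, prompt: str) -> list[str]:
    # Assemble the argv for a one-shot, non-interactive "ollama run" call.
    return ["ollama", "run", model, prompt]

# Requires Ollama installed and the model pulled; uncomment to actually run it:
# result = subprocess.run(build_run_command("llama3.2", "Explain recursion briefly"),
#                         capture_output=True, text=True)
# print(result.stdout)
```

Passing the prompt as a single argv element avoids any shell-quoting issues with spaces or quotes in the prompt.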
Running Ollama as a Background Service
By default, Ollama installs itself as a startup application and the API server (on http://localhost:11434) is available whenever Windows boots. Useful commands:
- Start the server manually:
ollama serve
- Check running models:
ollama ps
- List downloaded models:
ollama list
- Remove a model:
ollama rm llama3.2
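To check from code whether the background server is actually listening on port 11434, one option is a plain TCP probe. A minimal sketch, assuming the default host and port:

```python
import socket

def ollama_server_running(host: str = "localhost", port: int = 11434) -> bool:
    # True if something accepts TCP connections on the given host and port.
    try:
        with socket.create_connection((host, port), timeout=1.0):
            return True
    except OSError:
        return False

# Prints True while the Ollama tray app (or "ollama serve") is running:
# print(ollama_server_running())
```

Note this only confirms the port is open, not that the listener is Ollama; for a definitive check, request http://localhost:11434 in a browser and look for the "Ollama is running" response.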
GPU Acceleration Setup
NVIDIA GPU (CUDA)
Ollama automatically detects NVIDIA GPUs if the correct drivers are installed. You do not need to install CUDA separately.
- Ensure your NVIDIA drivers are up to date — download from nvidia.com/drivers
- Driver must support CUDA 11.8 or later (any driver released after mid-2022)
- After updating, restart your PC
- Run ollama run llama3.2 and confirm GPU use with ollama ps in a second terminal
AMD GPU (ROCm)
- Check that your AMD GPU is supported — Radeon RX 6000 series and newer generally work best
- Install the latest AMD Software: Adrenalin Edition drivers from amd.com/en/support
- Download and install ROCm for Windows from AMD’s developer site
- Restart your PC — Ollama will attempt to use the AMD GPU automatically
Troubleshooting Common Windows Issues
‘ollama’ Is Not Recognized as a Command
- Close all terminal windows and open a fresh Command Prompt or PowerShell
- If the problem persists, search for Environment Variables in Start, open Edit the system environment variables, then check that %LOCALAPPDATA%\Programs\Ollama is in the Path entry
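You can also confirm programmatically whether the Ollama directory made it onto PATH. A sketch (dir_on_path is a hypothetical helper, not part of Ollama):

```python
import os

def dir_on_path(directory: str, path_value: str) -> bool:
    # Normalize case and separators (Windows PATH is case-insensitive),
    # then test membership against each PATH entry.
    target = os.path.normcase(os.path.normpath(directory))
    entries = (os.path.normcase(os.path.normpath(p))
               for p in path_value.split(os.pathsep) if p)
    return target in entries

# On Windows, expand the variable and check the live environment:
# ollama_dir = os.path.expandvars(r"%LOCALAPPDATA%\Programs\Ollama")
# print(dir_on_path(ollama_dir, os.environ.get("PATH", "")))
```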
Windows Defender Blocking Ollama
- Open Windows Security → Virus and threat protection → Protection history
- If Ollama was quarantined, select it and choose Allow on device
- Add an exclusion for %LOCALAPPDATA%\Programs\Ollama to prevent future false positives
Model Download Failing or Stalling
- Re-run the same ollama pull command — it resumes from where it left off
- Check whether a firewall or corporate proxy is blocking outbound connections
- Temporarily disable VPN software if you’re using one
CUDA Errors or GPU Not Detected
- Update your NVIDIA drivers to the latest version and restart your PC
- Open the Ollama log file at %LOCALAPPDATA%\Ollama\ for detailed error messages
- Models too large for your GPU’s VRAM will automatically fall back to CPU — this is normal
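When digging through those logs, a small tail helper saves scrolling. A sketch; the exact file names inside %LOCALAPPDATA%\Ollama\ (such as server.log) may vary by version, so check the directory first:

```python
from pathlib import Path

def tail_log(path: str, lines: int = 20) -> list[str]:
    # Return the last `lines` lines of a log file for quick inspection.
    text = Path(path).read_text(encoding="utf-8", errors="replace")
    return text.splitlines()[-lines:]

# Example on Windows (uncomment to run against a real log file):
# import os
# for line in tail_log(os.path.expandvars(r"%LOCALAPPDATA%\Ollama\server.log"), 30):
#     print(line)
```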
What to Do Next
With Ollama installed and a model running, you have a fully local AI inference stack on your Windows machine. From here, explore Open WebUI for a browser-based chat interface, connect Ollama to development tools like VS Code extensions, or call the local API at http://localhost:11434/api from your own applications. The Ollama model library continues to grow — check ollama.com/library regularly for new models suited to coding, summarisation, translation, and other specific tasks.
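As a starting point for calling the local API, here is a sketch of a non-streaming request to the /api/generate endpoint using only the Python standard library (the payload fields follow Ollama’s documented API; the helper name is my own):

```python
import json
import urllib.request

API_URL = "http://localhost:11434/api/generate"

def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    # Non-streaming request: the server replies with a single JSON object
    # whose "response" field holds the full completion.
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        API_URL,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# With the Ollama server running and llama3.2 pulled:
# req = build_generate_request("llama3.2", "Explain what a neural network is")
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["response"])
```

Setting "stream": False keeps the example simple; omit it and the server instead streams one JSON object per generated token.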


