DeepSeek R1 is one of the most significant open-source AI models released in 2025. It’s a reasoning model — like OpenAI’s o1 — that thinks through problems step by step before answering, making it dramatically better at maths, coding, and logic than standard chat models. And you can run it entirely locally using Ollama.
What Makes DeepSeek R1 Different?
Most LLMs generate an answer directly. DeepSeek R1 uses chain-of-thought reasoning — it works through the problem internally before committing to a response. You can actually see this process as the model outputs its thinking in <think> tags before giving the final answer.
The full DeepSeek R1 model has 671 billion parameters and matches OpenAI’s o1 on benchmark tests for maths and coding. The distilled versions (7B to 70B) bring much of that reasoning capability to consumer hardware.
DeepSeek R1 Model Sizes Available on Ollama
| Model | RAM/VRAM needed | Best for |
|---|---|---|
| deepseek-r1:1.5b | ~2GB | Very low-spec hardware, quick tests |
| deepseek-r1:7b | ~5GB | Most users — good balance |
| deepseek-r1:8b | ~6GB | Llama 3 distil — slightly stronger than 7B |
| deepseek-r1:14b | ~10GB | Better reasoning, needs 12GB+ VRAM |
| deepseek-r1:32b | ~20GB | Strong performance, high-end GPU |
| deepseek-r1:70b | ~40GB | Near full-model quality, workstation GPU |
Not sure which size to pick? See our full guide: Which DeepSeek R1 model size should you use?
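As a rough rule of thumb, the table above can be encoded as a small helper. This is a hypothetical sketch (the function name and tier cut-offs are ours, taken from the approximate RAM figures listed), not part of Ollama itself:

```python
# Hypothetical helper: suggest a deepseek-r1 tag from available RAM/VRAM,
# using the approximate memory figures from the table above.
def suggest_model(ram_gb):
    """Return the largest deepseek-r1 tag that fits in ram_gb of memory."""
    # (approx. GB needed, Ollama tag), largest first
    tiers = [
        (40, 'deepseek-r1:70b'),
        (20, 'deepseek-r1:32b'),
        (10, 'deepseek-r1:14b'),
        (6, 'deepseek-r1:8b'),
        (5, 'deepseek-r1:7b'),
        (2, 'deepseek-r1:1.5b'),
    ]
    for needed, tag in tiers:
        if ram_gb >= needed:
            return tag
    return None  # below even the 1.5B requirement

print(suggest_model(16))  # → deepseek-r1:14b
```

Note these figures are approximate: quantisation level and context length both affect real memory use, so leave yourself headroom.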
Installing Ollama
If you haven’t installed Ollama yet:
```shell
# macOS / Linux
curl -fsSL https://ollama.com/install.sh | sh

# Windows: download the installer from ollama.com
```
Pulling and Running DeepSeek R1
```shell
# Default (7B — recommended starting point)
ollama pull deepseek-r1

# Specific sizes
ollama pull deepseek-r1:1.5b
ollama pull deepseek-r1:14b
ollama pull deepseek-r1:32b
ollama pull deepseek-r1:70b
```
Run it interactively:
```shell
ollama run deepseek-r1
```
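You can also pass a prompt as an argument for a one-shot answer instead of opening the interactive session:

```shell
ollama run deepseek-r1 "What is 17 multiplied by 23?"
```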
What the Output Looks Like
Unlike standard models, DeepSeek R1 shows its reasoning before answering. Here’s an example:
```
> What is 17 multiplied by 23?

<think>
I need to calculate 17 × 23.
17 × 20 = 340
17 × 3 = 51
340 + 51 = 391
</think>

17 multiplied by 23 is 391.
```
The <think> block is the model’s internal reasoning. The final answer comes after. For complex problems, this thinking phase can be quite long — that’s normal and is what makes the model more accurate.
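Because the reasoning always arrives inside a `<think>...</think>` pair, it is easy to separate from the answer programmatically. Here is a small sketch (the helper name is ours, not part of any library) that splits a response into its two parts:

```python
import re

# Hypothetical helper: split a DeepSeek R1 response into its <think>
# reasoning and the final answer that follows it.
def split_response(text):
    match = re.search(r'<think>(.*?)</think>', text, flags=re.DOTALL)
    if not match:
        return '', text.strip()  # no thinking block present
    thinking = match.group(1).strip()
    answer = text[match.end():].strip()
    return thinking, answer

raw = "<think>17 × 20 = 340, 17 × 3 = 51, 340 + 51 = 391</think>\n17 multiplied by 23 is 391."
thinking, answer = split_response(raw)
print(answer)  # → 17 multiplied by 23 is 391.
```

Keeping both parts is useful for debugging: when the model gets something wrong, the reasoning usually shows where it went astray.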
Using DeepSeek R1 via the API
Once pulled, you can call it from your applications via Ollama’s REST API or the Python library. See the full guide on how to use Ollama with Python.
```python
import ollama

response = ollama.chat(
    model='deepseek-r1',
    messages=[{
        'role': 'user',
        'content': 'Write a Python function to check if a number is prime.'
    }]
)
print(response['message']['content'])
```
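The same request can be sent to the REST API directly with curl, assuming Ollama is running on its default port (11434):

```shell
curl http://localhost:11434/api/chat -d '{
  "model": "deepseek-r1",
  "messages": [
    {"role": "user", "content": "Write a Python function to check if a number is prime."}
  ],
  "stream": false
}'
```

With `"stream": false` the endpoint returns a single JSON object whose answer (including the `<think>` block) is under `message.content`; omit it to receive the response as a stream of JSON chunks instead.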
To strip the thinking tags from the output in your application:
```python
import re

def strip_thinking(text):
    return re.sub(r'<think>.*?</think>', '', text, flags=re.DOTALL).strip()

clean_response = strip_thinking(response['message']['content'])
```
Using DeepSeek R1 with Open WebUI
For a browser-based chat interface with DeepSeek R1, install Open WebUI alongside Ollama. The thinking process is displayed in a collapsible section, similar to how it appears in ChatGPT’s o1 interface. See the Docker guide for the full Open WebUI setup.
Privacy — Why Local Matters for DeepSeek
DeepSeek's hosted service (the chat.deepseek.com app and its API) raised privacy concerns when it emerged that user data is processed on servers in China. Running DeepSeek R1 locally via Ollama means no data leaves your machine: your prompts, code, and documents stay entirely private. See: Is DeepSeek R1 safe to run locally?
What DeepSeek R1 Excels At
- Maths and logic — the chain-of-thought reasoning gives it a significant edge over standard models. See DeepSeek R1 for maths and reasoning.
- Coding — strong at generating, debugging, and explaining code. See DeepSeek R1 for coding.
- Complex multi-step tasks — anything that benefits from careful, structured thinking
Compared to Other Models
- DeepSeek R1 vs Llama 3.1 — how the two compare on real tasks
- DeepSeek R1 vs ChatGPT — can it replace your ChatGPT subscription?


