
DeepSeek R1 vs Llama 3.1 on Ollama

DeepSeek R1 and Llama 3.1 are two of the best open-source models you can run locally with Ollama — but they’re built for different things. Llama 3.1 is a strong all-round model; DeepSeek R1 is a reasoning specialist. Here’s how they compare in practice.

The Key Difference

Llama 3.1 is a standard instruction-tuned model. It generates answers quickly and handles a wide range of tasks well — chat, coding, summarisation, writing.

DeepSeek R1 is a reasoning model. Before answering, it thinks through the problem step by step, outputting its reasoning in <think> tags. This makes it slower but significantly more accurate on problems that require logic, calculation, or multi-step planning.
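For example, here is a hypothetical, abbreviated R1-style response (not actual model output) and one quick way to strip the reasoning when you only want the final answer:

```shell
# Illustrative R1-style output: reasoning in <think> tags, then the answer
response='<think>12 * 13 = 156, plus 4 is 160.</think>The result is 160.'

# Strip the <think>...</think> block to keep only the final answer.
# sed works here because the example is a single line; multi-line
# reasoning blocks need a multi-line tool such as perl or python.
answer=$(printf '%s' "$response" | sed 's/<think>.*<\/think>//')
echo "$answer"
```

Some clients and UIs hide the thinking block for you, but when scripting against raw output it is worth handling explicitly.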

Head-to-Head Comparison

Feature            | DeepSeek R1 7B               | Llama 3.1 8B
-------------------|------------------------------|---------------------------
Model type         | Reasoning (chain-of-thought) | Standard instruction-tuned
Response speed     | Slower (thinks first)        | Faster
Maths / logic      | Excellent                    | Good
Coding             | Excellent                    | Very good
General chat       | Good                         | Excellent
Summarisation      | Good                         | Excellent
Context window     | 128K tokens                  | 128K tokens
RAM needed (4-bit) | ~5GB                         | ~6GB
Shows reasoning    | Yes (<think> tags)           | No
Ollama command     | ollama pull deepseek-r1      | ollama pull llama3.1

Benchmark Comparison

Benchmark                | DeepSeek R1 7B | Llama 3.1 8B
-------------------------|----------------|-------------
MATH-500 (maths)         | ~92%           | ~84%
GSM8K (maths reasoning)  | ~91%           | ~84%
HumanEval (coding)       | ~82%           | ~72%
MMLU (general knowledge) | ~70%           | ~73%

Benchmarks are approximate and vary by quantisation level and test methodology.

Coding Tasks

Both models are strong at coding, but DeepSeek R1 tends to produce more correct solutions on complex algorithmic problems because it reasons through the logic before writing code. Llama 3.1 is faster and still very capable for everyday coding tasks.

For a full breakdown, see DeepSeek R1 for coding and the best Ollama models for coding.

Maths and Reasoning

This is where DeepSeek R1 clearly wins. Its chain-of-thought approach means it works through multi-step problems rather than pattern-matching to an answer. For anything involving calculation, proof, or logical deduction, R1 is the better tool. See DeepSeek R1 for maths and reasoning.

General Chat and Writing

Llama 3.1 wins here. It’s more conversational, responds faster, and doesn’t spend time “thinking” when a question doesn’t require it. DeepSeek R1 can feel over-engineered for simple questions — it’ll sometimes write three paragraphs of internal reasoning before answering “what’s the capital of France?”

Speed

Llama 3.1 is noticeably faster because it generates the answer directly. DeepSeek R1 produces additional thinking tokens before the response, which adds latency. On a typical query, R1 might take 2-3x as long. For real-time chat this is noticeable; for batch processing it matters less.
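The overhead scales with the number of thinking tokens R1 emits. A back-of-envelope sketch (the token counts and throughput below are illustrative assumptions, not measurements):

```shell
# Rough latency model: both models decode at the same rate on the same
# hardware, but R1 also emits reasoning tokens before the answer.
tokens_per_sec=30    # assumed decode speed on a typical laptop
answer_tokens=150    # assumed length of the visible answer
think_tokens=400     # assumed reasoning overhead for R1

llama_secs=$(( answer_tokens / tokens_per_sec ))
r1_secs=$(( (answer_tokens + think_tokens) / tokens_per_sec ))
echo "llama3.1: ~${llama_secs}s, deepseek-r1: ~${r1_secs}s"
```

With these numbers R1 takes over three times as long; the real ratio depends on how much the prompt makes it "think".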

When to Use DeepSeek R1

  • Maths problems, proofs, or calculation tasks
  • Complex coding challenges where correctness matters more than speed
  • Multi-step reasoning or planning tasks
  • Problems where you want to see the working, not just the answer

When to Use Llama 3.1

  • Everyday chat and Q&A
  • Writing, editing, and summarisation
  • Situations where response speed matters
  • General-purpose use where you switch between many task types

Can You Run Both?

Yes — Ollama lets you keep multiple models installed and switch between them with a single command. Many users keep both pulled and pick one based on the task:

ollama pull deepseek-r1
ollama pull llama3.1

# Use R1 for hard problems
ollama run deepseek-r1

# Use Llama for general chat
ollama run llama3.1
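You can also automate the choice with a small wrapper that routes prompts by keyword. The function name and keyword list below are a hypothetical sketch — adjust them to your own workload:

```shell
# Route maths/coding-flavoured prompts to R1, everything else to Llama.
# This is a crude keyword heuristic, not a classifier.
pick_model() {
  case "$1" in
    *prove*|*calculate*|*algorithm*|*debug*) echo "deepseek-r1" ;;
    *) echo "llama3.1" ;;
  esac
}

pick_model "calculate compound interest on 1000 over 5 years"  # deepseek-r1
pick_model "write a friendly email to my landlord"             # llama3.1

# Then run the chosen model, e.g.:
#   ollama run "$(pick_model "$prompt")" "$prompt"
```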

See the full setup guide: How to run DeepSeek R1 on Ollama.
