DeepSeek R1 for Maths and Reasoning on Ollama

DeepSeek R1 was built to compete with OpenAI’s o1 reasoning model — and maths is where that competition is most visible. Its chain-of-thought reasoning makes it substantially better at mathematical problems than standard LLMs of the same size. Here’s what it can do and how to use it effectively.

Why Reasoning Models Are Better at Maths

Standard LLMs struggle with maths because they generate answers token by token without backtracking or checking. They often reach a plausible-sounding answer that’s mathematically wrong.

DeepSeek R1 uses chain-of-thought reasoning — it explicitly works through the problem in <think> tags before committing to an answer. This means it:

  • Breaks problems into steps
  • Checks intermediate results
  • Backtracks when it spots an error in its own reasoning
  • Shows the full working, not just the answer

On MATH-500 (a challenging maths benchmark), the 7B distilled model (DeepSeek-R1-Distill-Qwen-7B) scores around 92% in DeepSeek's published results, well above what general-purpose models of similar size, such as Llama 3.1 8B, typically achieve.

Setup

ollama pull deepseek-r1
ollama run deepseek-r1

For harder maths problems, the 14B or 32B models (tags deepseek-r1:14b and deepseek-r1:32b) are worth it if your hardware allows. See which model size to choose.

Example: Multi-Step Arithmetic

> A train travels 120 miles at 60 mph, then 80 miles at 40 mph. 
  What is the average speed for the whole journey?

<think>
Average speed = total distance / total time

Time for first leg: 120 / 60 = 2 hours
Time for second leg: 80 / 40 = 2 hours
Total time: 4 hours
Total distance: 200 miles

Average speed: 200 / 4 = 50 mph

Note: the simple average (60 + 40) / 2 also gives 50 mph here, but only
because both legs take the same time (2 hours each). The correct method
is total distance / total time.
</think>

The average speed for the whole journey is 50 mph.
Total distance: 200 miles. Total time: 4 hours. Average = 200/4 = 50 mph.
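
If you want to confirm the model's arithmetic independently, the same method fits in a few lines of Python. This is a minimal sketch; the function name average_speed is illustrative and not part of Ollama or DeepSeek.

```python
def average_speed(legs):
    """Return the average speed for a list of (distance, speed) legs."""
    total_distance = sum(distance for distance, _ in legs)
    total_time = sum(distance / speed for distance, speed in legs)
    return total_distance / total_time

# The journey from the example: 120 miles at 60 mph, then 80 miles at 40 mph.
print(average_speed([(120, 60), (80, 40)]))  # 50.0
```

If the legs took different amounts of time, for instance 120 miles at each speed, the result would be 48 mph rather than the simple mean of 50.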

What DeepSeek R1 Can Help With

Word problems and applied maths

A company's revenue grew by 15% in year 1, then fell by 10% in year 2, 
then grew by 20% in year 3. Starting from £100,000, what is the final revenue?
Show all working.
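
A quick way to check the model's answer to this prompt is to apply the percentage changes yourself. The sketch below assumes a hypothetical helper, apply_changes, and uses Decimal to avoid floating-point rounding on currency.

```python
from decimal import Decimal

def apply_changes(start, percent_changes):
    """Apply successive percentage changes (e.g. 15, -10, 20) to a starting value."""
    value = Decimal(start)
    for pct in percent_changes:
        value *= 1 + Decimal(pct) / 100
    return value

final = apply_changes(100_000, [15, -10, 20])
print(f"£{final:,.2f}")  # £124,200.00
```

The multipliers are 1.15, 0.90 and 1.20, whose product is 1.242, so the changes do not simply add up to a 25% gain.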

Algebra

Solve for x: 3x² - 7x + 2 = 0
Show the steps using the quadratic formula.
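
To check the roots the model reports, the quadratic formula can be evaluated directly. This is a sketch for real roots only; solve_quadratic is an illustrative name, not a library function.

```python
import math

def solve_quadratic(a, b, c):
    """Return the real roots of ax^2 + bx + c = 0 in ascending order (assumes a != 0)."""
    discriminant = b * b - 4 * a * c
    if discriminant < 0:
        return []  # no real roots
    root = math.sqrt(discriminant)
    # A set removes the duplicate when the discriminant is zero.
    return sorted({(-b - root) / (2 * a), (-b + root) / (2 * a)})

print(solve_quadratic(3, -7, 2))
```

For this equation the discriminant is 49 - 24 = 25, so the roots are x = 1/3 and x = 2.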

Probability

A bag contains 4 red balls and 6 blue balls. If you draw 3 balls without 
replacement, what is the probability of drawing exactly 2 red balls?
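
This is a hypergeometric probability, so the model's answer can be verified by counting combinations. A minimal sketch, with prob_exactly_k_red as an illustrative helper name:

```python
from fractions import Fraction
from math import comb

def prob_exactly_k_red(red, blue, draws, k):
    """Probability of exactly k red balls in `draws` draws without replacement."""
    favourable = comb(red, k) * comb(blue, draws - k)
    total = comb(red + blue, draws)
    return Fraction(favourable, total)

print(prob_exactly_k_red(4, 6, 3, 2))  # 3/10
```

There are 6 ways to pick 2 of the 4 red balls and 6 ways to pick 1 blue ball, out of 120 possible draws of 3 from 10.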

Logic puzzles

Three people — Alice, Bob, and Carol — each make one statement:
Alice: "Bob is lying."
Bob: "Carol is lying."
Carol: "Alice and Bob are both lying."
If exactly one person is telling the truth, who is it?
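
Puzzles this small can be brute-forced, which gives you a reference answer to compare with the model's reasoning. The sketch below treats each person as either truthful or lying and keeps only the assignments where every statement matches its speaker's status; the function name is illustrative.

```python
from itertools import product

def consistent_assignments():
    """Yield (alice, bob, carol) truth values consistent with all three statements."""
    for alice, bob, carol in product([True, False], repeat=3):
        statements_hold = (
            alice == (not bob)                          # Alice: "Bob is lying."
            and bob == (not carol)                      # Bob: "Carol is lying."
            and carol == ((not alice) and (not bob))    # Carol: "Alice and Bob are both lying."
        )
        exactly_one_truthful = [alice, bob, carol].count(True) == 1
        if statements_hold and exactly_one_truthful:
            yield alice, bob, carol

print(list(consistent_assignments()))  # [(False, True, False)]
```

Only one assignment survives: Bob is telling the truth, and Alice and Carol are lying.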

Explaining concepts

Explain the intuition behind the Monty Hall problem. 
Why does switching doors give a 2/3 probability of winning?
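
The 2/3 figure can also be confirmed by enumerating the nine equally likely combinations of car position and first pick. This is a sketch for the standard three-door game, where the host always opens a goat door; win_probabilities is an illustrative name.

```python
from fractions import Fraction
from itertools import product

def win_probabilities():
    """Return (stay, switch) win probabilities for the three-door game."""
    cases = list(product(range(3), repeat=2))  # (car, first_pick), equally likely
    stay_wins = sum(1 for car, pick in cases if pick == car)
    # The host opens a goat door, leaving one other closed door,
    # so switching wins exactly when the first pick was wrong.
    switch_wins = sum(1 for car, pick in cases if pick != car)
    return Fraction(stay_wins, len(cases)), Fraction(switch_wins, len(cases))

stay, switch = win_probabilities()
print(stay, switch)  # 1/3 2/3
```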

Checking Your Own Maths

One of the most practical uses is verifying calculations:

Check my working for this compound interest calculation:
Principal: £5,000
Rate: 3.5% per year
Time: 10 years
Formula: A = P(1 + r/n)^(nt) where n = 12 (monthly compounding)

My answer: £7,063.47
Is this correct?
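
Because the model can still make arithmetic slips, it is worth recomputing a figure like this yourself before trusting its verdict. The sketch below evaluates the formula from the prompt directly; compound_amount is an illustrative helper, and the rate is passed as a decimal.

```python
def compound_amount(principal, annual_rate, years, periods_per_year):
    """A = P(1 + r/n)^(nt), with the rate given as a decimal (0.035 for 3.5%)."""
    periods = periods_per_year * years
    return principal * (1 + annual_rate / periods_per_year) ** periods

claimed = 7063.47
computed = compound_amount(5000, 0.035, 10, 12)
print(f"computed £{computed:,.2f}, claimed £{claimed:,.2f}, "
      f"difference £{abs(computed - claimed):,.2f}")
```

If the two figures differ by more than a rounding error, the working deserves a second look, whichever way the model rules.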

Using DeepSeek R1 for Maths via Python

For automated maths checking in your own applications, see how to use Ollama with Python. A simple example:

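import ollama

def solve(problem):
    """Send a maths problem to DeepSeek R1 and return the reply text."""
    response = ollama.chat(
        model='deepseek-r1',
        messages=[{
            # The system prompt asks for step-by-step working.
            'role': 'system',
            'content': 'You are a precise mathematical assistant. Show all working.'
        }, {
            'role': 'user',
            'content': problem
        }]
    )
    # The reply text may include the model's <think>...</think> reasoning.
    return response['message']['content']

result = solve("What is the sum of the first 100 natural numbers?")
print(result)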
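
In an application you often want only the final answer, without the reasoning that R1 writes inside <think> tags. The sketch below separates the two with a regular expression. It assumes the reply contains at most one <think> block, and the sample string is hand-written for illustration, not real model output.

```python
import re

THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def split_reasoning(text):
    """Return (reasoning, answer) from a reply that may contain a <think> block."""
    match = THINK_BLOCK.search(text)
    reasoning = match.group(1).strip() if match else ""
    answer = THINK_BLOCK.sub("", text).strip()
    return reasoning, answer

# Hand-written sample reply in the shape R1 typically produces.
sample = "<think>1 + 2 + ... + 100 = 100 * 101 / 2 = 5050</think>\nThe sum is 5050."
reasoning, answer = split_reasoning(sample)
print(answer)  # The sum is 5050.
```

You could pass the string returned by solve through split_reasoning and log the reasoning separately from the answer you show to users.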

Limitations

DeepSeek R1 is strong at reasoning but is not a computer algebra system. For symbolic manipulation, calculus, or high-precision numerical work, dedicated tools like Python’s SymPy or WolframAlpha are more reliable. R1 is best thought of as a very capable tutor that can explain, verify, and guide — not a replacement for a CAS.

Compared to Other Models

For a broader look at models suited to mathematical tasks, see best Ollama models for maths. For the comparison with Llama 3.1, see DeepSeek R1 vs Llama 3.1.

Getting Started

See the full setup guide: How to run DeepSeek R1 on Ollama.
