How to Run Phi-4 on Ollama (Microsoft’s Best Small Model)

Phi-4 is Microsoft’s latest small language model and one of the most impressive models available in Ollama for its size. At just 14 billion parameters, it outperforms much larger models on reasoning and STEM tasks, making it ideal if you want high-quality output without needing a high-end machine. Here’s how to get it running.

What is Phi-4?

Phi-4 was released by Microsoft in December 2024 and builds on the Phi series known for exceptional quality at small sizes. Unlike models trained on massive web crawls, Phi-4 was trained on carefully curated, high-quality data — which is why it performs so well on reasoning and logic tasks despite being relatively small.

At 14B parameters, Phi-4 achieves benchmark scores comparable to models two to three times its size, a strong result for a model that fits on machines with limited VRAM or RAM.

System Requirements

  • RAM: Minimum 16 GB (with around 9 GB free for the model), 32 GB recommended for comfortable use
  • VRAM (GPU): 12 GB for full GPU acceleration; without a suitable GPU, the model runs on CPU using system RAM instead
  • Storage: ~9 GB for the model download

How to Install Phi-4 in Ollama

With Ollama installed, open a terminal and run:

ollama pull phi4

The download is around 9 GB. Once complete, run it with:

ollama run phi4
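Beyond the interactive terminal, Ollama also serves a local REST API (by default at http://localhost:11434) that you can call from scripts. Here is a minimal sketch in Python using only the standard library; the endpoint and field names follow Ollama's /api/generate API, and the commented-out section assumes the Ollama server is running locally with phi4 already pulled:

```python
import json
import urllib.request

def build_generate_request(model: str, prompt: str) -> dict:
    """Build the JSON body for Ollama's /api/generate endpoint."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for one complete response instead of a token stream
    }

payload = build_generate_request("phi4", "Why is the sky blue? Answer in one sentence.")

# Uncomment to send the request to a locally running Ollama server:
# req = urllib.request.Request(
#     "http://localhost:11434/api/generate",
#     data=json.dumps(payload).encode("utf-8"),
#     headers={"Content-Type": "application/json"},
# )
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["response"])

print(json.dumps(payload))
```

With `"stream": False` the server returns a single JSON object whose `response` field holds the full answer; leave streaming on if you want tokens as they are generated.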

What is Phi-4 Good At?

Phi-4 excels in specific areas:

  • Reasoning and logic — chain-of-thought problems, puzzles, and multi-step reasoning are where Phi-4 really stands out
  • Mathematics — one of the best small models for solving maths problems step by step
  • STEM questions — science, technology and engineering explanations are detailed and accurate
  • Coding — solid coding assistant, particularly for Python and common algorithms
  • Concise answers — tends to give focused, direct responses without unnecessary padding

Where Phi-4 is less strong: very long documents, creative writing, and multilingual tasks outside of English.

Phi-4 vs Llama 3.1 8B — Which is Better?

For reasoning-heavy tasks like maths, logic problems and step-by-step analysis, Phi-4 14B wins comfortably. For general conversation, creative tasks and broader knowledge, Llama 3.1 8B is more versatile. If your primary use is technical or analytical work, Phi-4 is the better choice despite needing more RAM.

Phi-4 vs Qwen2.5 14B

Both are excellent 14B models. Qwen2.5 14B is stronger on multilingual tasks and long context, while Phi-4 14B edges ahead on pure reasoning and mathematics. For coding, use Qwen2.5-Coder rather than either base model.

Example Prompts to Try

Phi-4 responds well to structured prompts. Try these to see it at its best:

Solve this step by step: A train leaves London at 9am travelling at 80mph. Another train leaves Birmingham (113 miles away) at 9:30am travelling at 100mph toward London. At what time do they meet?
Explain the difference between TCP and UDP as if I'm a network engineer who needs a quick refresher, not a beginner.
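You don't have to type prompts like these into the interactive session. Ollama's CLI also accepts a prompt as an argument for a one-shot answer, which is handy for scripting; this assumes phi4 has already been pulled as shown above:

```shell
# One-shot mode: print a single response and exit instead of opening the REPL
ollama run phi4 "Solve this step by step: A train leaves London at 9am travelling at 80mph. Another train leaves Birmingham (113 miles away) at 9:30am travelling at 100mph toward London. At what time do they meet?"
```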

Tips for Best Results with Phi-4

  • Ask it to “think step by step” for complex problems — it follows this instruction reliably
  • It works well with structured prompts and specific constraints
  • For long documents, break them into sections rather than pasting everything at once
  • If it’s running slowly, check GPU usage: Ollama GPU Not Detected Fix
