
Ollama for Beginners: A Complete Getting Started Guide


If you have heard about Ollama but are not sure where to start, this guide is for you. By the time you finish reading, you will have Ollama installed, your first AI model running, and a clear understanding of what it can do and how to use it effectively. No prior experience with AI or machine learning is required.

What Is Ollama, in Plain English?

Ollama is a free program that lets you run AI assistants directly on your computer. These assistants — called large language models — can hold conversations, answer questions, help you write, explain complex topics, summarise documents, and assist with coding.

The key difference from tools like ChatGPT is that everything happens on your own machine. When you type a question, it does not get sent to a company’s server. The AI processes it locally, gives you an answer, and nothing leaves your computer. This means your conversations are private, it works without an internet connection once set up, and there is no monthly fee.

What You Need

Ollama runs on Windows, Mac, and Linux. The minimum requirements to get started:

  • Operating system: Windows 10 or 11, macOS 11 Big Sur or later, or a modern Linux distribution
  • RAM: 8 GB is the recommended minimum for running 7B models. 16 GB gives comfortable headroom. You can run smaller models (1B–3B) on 4–6 GB of RAM.
  • Storage: AI models are large files. Plan for at least 5–10 GB of free disk space to store a couple of models.
  • GPU (optional but helpful): A modern graphics card speeds things up enormously. Without a GPU, models still run, just more slowly.

Step 1: Install Ollama

On Windows

Download the installer from ollama.com. Run the downloaded file and follow the installation wizard. Ollama will install and start running in the background — you will see its icon in the system tray near the clock.

On Mac

Download the Mac app from ollama.com. Open the downloaded file and drag Ollama to your Applications folder. Launch it from Applications. You will see the Ollama icon in your menu bar at the top right of the screen.

On Linux

Open a terminal and run:

curl -fsSL https://ollama.com/install.sh | sh

The script installs Ollama and sets it up to run automatically in the background.

Verify the Installation

Open a terminal (on Mac: press Cmd+Space and type “Terminal”; on Windows: press Win+R, type “cmd”, press Enter) and run:

ollama --version

If you see a version number, Ollama is installed correctly.
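You can also confirm that the background service is up. By default, Ollama listens on port 11434 on your own machine, and its root endpoint replies with a short status message:

```shell
# Check that the Ollama background service is listening (default port 11434)
curl http://localhost:11434
```

If the service is running, you should see a short plain-text reply such as "Ollama is running". If the connection is refused, start the Ollama app (or, on Linux, the ollama service) and try again.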

Step 2: Download Your First Model

Ollama does not come with any AI models pre-installed. You need to download one. Models are identified by a name and optionally a size (1b, 3b, 7b, etc. — the “b” stands for billions of parameters, which roughly indicates capability and size).

A great first model is Llama 3.2, an open-weight model from Meta. For most computers, the 3B version strikes a good balance between speed and capability:

ollama pull llama3.2:3b

This downloads the model from Ollama’s library. It is about 2 GB, so it may take a few minutes depending on your internet connection. You will see a progress bar as it downloads.

Llama 3.2 is available in 1B and 3B sizes. If you have 8 GB or more of RAM and want higher-quality responses, try a larger model such as Llama 3.1 8B:

ollama pull llama3.1:8b

(If you omit the size suffix, as in ollama pull llama3.2, Ollama downloads the default variant of that model — 3B in Llama 3.2’s case.)

Step 3: Start Chatting

Once the model is downloaded, start a conversation:

ollama run llama3.2:3b

You will see a prompt appear. Type your message and press Enter. The model will respond.

>>> Hello! What can you help me with today?

I'm happy to help with a wide range of tasks! I can answer questions, help you 
write or edit text, explain complex topics in simple terms, assist with coding 
problems, brainstorm ideas, summarise information, translate text, and much more.

What would you like to work on?

To end the conversation and return to your terminal, type /bye and press Enter.

Basic Commands to Know

These are the Ollama commands you will use most often:

  • ollama pull model-name — Download a model from the library
  • ollama run model-name — Start a chat with a model (downloads it first if needed)
  • ollama list — See all models you have downloaded
  • ollama rm model-name — Delete a model and free up disk space
  • ollama show model-name — See information about a model

Inside a chat session:

  • /bye — Exit the chat
  • /clear — Clear the conversation history and start fresh
  • Ctrl+C — Stop the model’s current response immediately
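Putting these together, a typical session for trying out a model and cleaning up afterwards might look like this (the model name is just an example):

```shell
# Download a small model and confirm it arrived
ollama pull llama3.2:1b
ollama list

# Inspect its details (parameter count, context length, licence)
ollama show llama3.2:1b

# Chat with it (type /bye to exit), then free the disk space when done
ollama run llama3.2:1b
ollama rm llama3.2:1b
```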

Choosing Which Model to Use

Ollama’s library contains dozens of models. The right choice depends on what you want to do and how much RAM your computer has:

  • General chat and questions — llama3.2:3b — about 4 GB of RAM
  • Writing assistance — llama3.2:3b or qwen2.5:7b — 8 GB
  • Coding help — qwen2.5-coder:7b or codellama — 8 GB
  • Very fast responses on low-RAM machines — llama3.2:1b or qwen2.5:1.5b — 2–3 GB
  • High-quality answers — llama3.1:8b (8 GB) or qwen2.5:14b (12 GB or more)

Browse all available models at ollama.com/library.

What Can You Actually Do With It?

Here are some practical things to try once you have a model running:

Summarise a long document

>>> Here is a long article. Please summarise it in 5 bullet points:
[paste your text here]

Explain something complex

>>> Explain how neural networks work, as if I am 12 years old

Help with writing

>>> I need to write a professional email declining a meeting. 
Can you write a draft? The meeting is about a project I am 
not involved in.

Translate text

>>> Translate the following text into French: "I would like to 
book a table for two at 7pm on Saturday."

Brainstorm ideas

>>> I am starting a small online business selling homemade candles. 
Give me 10 marketing ideas for social media.

Getting a Friendlier Interface (Optional)

The terminal interface works but is not as polished as ChatGPT. If you prefer a browser-based chat interface with a proper UI, you can set up Open WebUI — a free, open-source web app that connects to Ollama and provides a full chat experience with conversation history, model switching, and file uploads.

If you have Docker installed, Open WebUI can be up and running with a single command:

docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data --name open-webui --restart always \
  ghcr.io/open-webui/open-webui:main

Then open http://localhost:3000 in your browser. If you do not have Docker, the Open WebUI website has alternative installation instructions.

Common Questions From Beginners

Is it safe to use?

Yes. Ollama runs entirely on your machine, and nothing you type is sent over the internet. You only need a connection to download Ollama itself and the models; after that, it is no different from running any other local application.

Will it slow down my computer?

Running a model does use CPU or GPU resources, and you may notice your fan spin up. Once you exit the model, resources are released. Ollama itself (the background service) uses almost no resources when no model is loaded.

Can I run it while doing other things?

Yes, but expect your computer to be slower while a model is actively generating a response. For background tasks like browsing the web, the impact is usually minimal between responses.

The responses are slow — is something wrong?

Slow responses usually mean the model is running on CPU rather than GPU, or the model is too large for your available RAM. Try a smaller model variant (1b or 3b instead of 7b). If you have a supported NVIDIA or AMD GPU, Ollama will use it automatically, which makes a substantial speed difference.
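To check where a model is actually running, recent versions of Ollama include a ps command that lists currently loaded models; its PROCESSOR column shows whether a model is sitting on the GPU, the CPU, or split between the two:

```shell
# While a model is loaded (during or shortly after a chat),
# list loaded models and whether they run on GPU or CPU
ollama ps
```

If the output shows the model running mostly on CPU despite you having a supported GPU, the model may be too large for your video memory — a smaller variant is the usual fix.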

How do I know which model to start with?

If in doubt, start with llama3.2:3b. It is small enough to run on most computers, fast enough for real-time conversation, and capable enough for everyday tasks. You can always download and try other models later — they are free and easy to delete if you change your mind.

Next Steps

Once you are comfortable with the basics, there is much more to explore:

  • Try different models — each model has different strengths. Code-focused models like qwen2.5-coder are significantly better at programming tasks than general chat models.
  • Use Ollama with VS Code — the Continue extension brings local AI code completion directly into your editor.
  • Set up Open WebUI — for a much more comfortable long-form chat experience with history and file uploads.
  • Learn about the API — Ollama exposes a local HTTP API that lets you call models from your own scripts and applications.
  • Customise models with Modelfiles — create your own model configurations with custom system prompts and parameter settings.

Local AI with Ollama is worth the small time investment to set up. Once it is running, you have a capable, private AI assistant available whenever you need it, at no ongoing cost.
