What is Hermes Agent and How Does It Work with Ollama?

Most AI assistants start fresh every time you open them. You explain your project, your preferences and your context, and then you do it all again next session. Hermes Agent, built by Nous Research and available through Ollama from version 0.21, takes a different approach. It learns from your conversations, builds on what it already knows about you, and extends its own capabilities over time. That persistent, accumulating memory is what sets it apart from most local AI tools.

What is Hermes Agent?

Hermes is a self-improving AI agent designed to run locally through Ollama. The “self-improving” part isn’t marketing language — it refers to a specific set of behaviours that distinguish it from a standard chat model:

  • It creates skills from experience — when Hermes successfully completes a task, it can store that approach as a reusable skill, so next time it faces a similar problem it draws on what worked rather than starting from scratch
  • It improves skills during use — if a skill doesn’t work perfectly, Hermes refines it rather than simply failing or giving a different answer
  • It maintains memory across sessions — Hermes builds a model of who you are over time: your preferences, your working style, your projects. This persists between conversations, so it doesn’t ask you the same questions repeatedly
  • It searches its own past conversations — rather than forgetting everything when a session ends, Hermes can retrieve relevant context from previous sessions when it’s useful

How is this different from a normal chatbot?

The easiest way to understand the difference is to think about what happens after a few weeks of use. With a standard chat model — whether that’s a local Ollama model or a cloud tool like ChatGPT — each session is independent. The model has no record of what you discussed yesterday, what tasks you work on regularly, or how you prefer things done.

With Hermes, that changes. If you spend a month using it for Python development, it accumulates knowledge of your codebase patterns, your preferred style, and the kinds of tasks you do. It becomes more useful over time rather than staying at the same baseline.

This is also different from simply using a long system prompt or saving notes manually. Hermes handles the accumulation and retrieval automatically — you don’t manage the memory, it does.
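For contrast, the manual approach might look something like the sketch below: you keep a notes file yourself and prepend it to every question. The notes path, the build_prompt helper and the llama3.2 model name are placeholders chosen for illustration, not anything Hermes or Ollama prescribes.

```shell
#!/bin/sh
# Manual alternative, shown for contrast: you maintain the notes file yourself.
# NOTES_FILE and build_prompt are hypothetical names used only in this sketch.
NOTES_FILE="${NOTES_FILE:-$HOME/.assistant-notes.md}"

# Prepend saved notes (if the file exists) to a question, producing one prompt.
build_prompt() {
    if [ -f "$NOTES_FILE" ]; then
        printf 'Context about me and my project:\n%s\n\nQuestion: %s\n' \
            "$(cat "$NOTES_FILE")" "$1"
    else
        printf 'Question: %s\n' "$1"
    fi
}

# Example use (needs a locally pulled model; llama3.2 is only an example name):
# ollama run llama3.2 "$(build_prompt 'How should I structure my tests?')"
```

Every update to that file is your job, which is the bookkeeping Hermes is designed to take over.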

Getting started with Hermes on Ollama

Hermes is available from Ollama 0.21 onwards. If you don’t have Ollama installed, start with our Ollama installation guide. To launch Hermes:

ollama launch hermes

Ollama will download the necessary components and start the agent. From your first session, Hermes begins building its model of you and your work. The more you use it, the more useful it becomes.
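Because Hermes needs Ollama 0.21 or later, it can be worth checking your installed version before launching. The following is a minimal sketch, assuming that ollama --version prints a line containing a dotted version number and that your sort supports the -V (version sort) flag. The helper names version_at_least and extract_version are our own.

```shell
#!/bin/sh
# Succeeds (exit 0) if version $1 is greater than or equal to version $2.
# Assumes `sort -V` (version sort) is available on this system.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Pull the first dotted version number out of whatever text arrives on stdin.
extract_version() {
    grep -Eo '[0-9]+\.[0-9]+(\.[0-9]+)?' | head -n 1
}

# Only attempt the check when the ollama binary is on the PATH.
if command -v ollama >/dev/null 2>&1; then
    installed="$(ollama --version 2>/dev/null | extract_version)"
    if [ -n "$installed" ] && version_at_least "$installed" "0.21"; then
        echo "Ollama $installed is new enough for Hermes"
    else
        echo "Could not confirm Ollama 0.21 or later (found: ${installed:-unknown})"
    fi
fi
```

If the check reports an older version, update Ollama first and then run the launch command above.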

Running Hermes with a more powerful model

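By default, Hermes runs with a standard model. You can significantly increase its capability by running it with Kimi K2.6, which is designed for complex, long-horizon tasks. Note that the :cloud suffix in the model name indicates a model served from Ollama’s cloud rather than your own machine, so this option is not fully local: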

ollama launch hermes --model kimi-k2.6:cloud

This gives Hermes the underlying intelligence of a state-of-the-art coding and agent model, while keeping the persistent memory and self-improvement framework that makes Hermes distinctive. For serious development work, this combination is worth trying.

What Hermes is good at

The self-improving, memory-persistent design makes Hermes particularly well suited to:

  • Ongoing projects — anything you return to regularly. Hermes builds up context about the project so you don’t need to re-explain it
  • Personalised assistance — if you have preferences about how things are done (code style, explanation depth, format), Hermes learns and applies them
  • Repetitive task types — if you do similar things often (writing documentation, reviewing code, summarising meeting notes), Hermes develops reusable approaches that get faster and more accurate
  • Long-running work — tasks that span multiple sessions, where continuity matters

Is it genuinely useful or just clever marketing?

The honest answer is that it depends on what you use it for. If you open Ollama once a week for one-off questions, the memory and skill-building features won’t have much to work with. The more consistently you use Hermes for a specific domain (development, writing, research, analysis), the more the accumulated knowledge compounds into something genuinely useful.

It’s also worth being realistic about what “self-improving” means here. Hermes doesn’t rewrite its own weights or change at a model level. What it improves is its library of skills and its knowledge of you — essentially a structured, automatically-maintained memory that informs how it approaches new tasks. That’s a meaningful capability, but it’s different from the model itself getting smarter.
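To make that distinction concrete, here is a deliberately simplified toy in shell. It is not how Hermes is implemented, and we are not describing its real storage format; the directory layout and the save_skill and recall_skill helpers are invented purely for illustration. The point is only that the “improvement” lives in stored data that is read back and refined, while the model itself stays the same.

```shell
#!/bin/sh
# Conceptual toy only: NOT Hermes's actual storage. All names are hypothetical.
SKILL_DIR="${SKILL_DIR:-$(mktemp -d)}"

# Save the notes for a named skill, overwriting (refining) any earlier version.
save_skill() {
    printf '%s\n' "$2" > "$SKILL_DIR/$1.txt"
}

# Print the stored notes for a skill; return non-zero if none exist yet.
recall_skill() {
    [ -f "$SKILL_DIR/$1.txt" ] && cat "$SKILL_DIR/$1.txt"
}

# First encounter: nothing is stored, so the approach is worked out and saved.
recall_skill "changelog" || save_skill "changelog" "group commits by type"

# Later session: the stored approach is reused, then refined after feedback.
recall_skill "changelog"
save_skill "changelog" "group commits by type; link issue numbers"
```

Nothing in this toy touches a model at all: only the files change between sessions, which is the sense in which a skill library can improve while the underlying weights do not.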

How it compares to other Ollama agents

Ollama’s ecosystem now includes a few different agent options. GitHub Copilot CLI is focused specifically on repository-level coding tasks with GitHub integration. Hermes is a more general-purpose agent designed around long-term memory and personal use.

If your work is primarily tied to a specific codebase and GitHub workflow, the Copilot CLI is probably the better fit. If you want a personal AI assistant that gets more useful over time across a range of tasks, Hermes is the more interesting option.

For a broader look at what’s available, our guide to the best Ollama models for coding covers the model landscape in more detail.
