How to build a local Copilot AI using Ollama

Build a privacy-focused Copilot AI using Ollama

Ollama is an open-source framework that allows users to run large language models (LLMs) locally on their machine. It's designed to be easy to use, efficient, and scalable, making it a good option for developers and organizations that want to deploy AI models into production.
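To give a feel for what Ollama provides: once it is running, it exposes a local REST API on port 11434. The query below is illustrative only; it assumes a model named `llama3` has already been pulled (e.g. with `ollama pull llama3`), so substitute whichever model you have installed.

```shell
# Query a locally running Ollama server over its REST API.
# Requires `ollama serve` to be running and the named model to be pulled.
curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
```

Because everything runs on localhost, prompts and responses never leave your machine, which is the privacy advantage this article builds on.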

MacCopilot is a Copilot AI for macOS that integrates conveniently with the desktop and supports LLMs from multiple providers.

In this article, we'll show you how to build a local Copilot AI using Ollama and MacCopilot, and run the powerful MiniCPM-Llama3-V-2.5 model.

Install Ollama

Use prebuilt binary

Download model

In case you don't want to build Ollama and download models via the ollama CLI, there is a prebuilt Ollama App that bundles Ollama and a model management UI together, making it easy to set everything up.

First, download the Ollama App from the release page:

Download Ollama Downloader UI App

Then open the Ollama app and download the model

Download MiniCPM-Llama3-V-2.5 model


After downloading the model, we need to quit the Ollama app and run the model.

Open a Terminal and start the server:

ollama serve

Then, in a second Terminal, run the model:

ollama run hhao/openbmb-minicpm-llama3-v-2_5:q4_K_M

Now we are ready to chat.
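MiniCPM-Llama3-V-2.5 is a multimodal model, and the Ollama API accepts base64-encoded images in an `images` array. The sketch below assumes a running server and uses a placeholder image path (`screenshot.png`) and prompt; adjust both to your own files.

```shell
# Send an image to the vision model via the Ollama REST API.
# `screenshot.png` is a placeholder; the model must already be pulled.
IMG=$(base64 < screenshot.png | tr -d '\n')
curl http://localhost:11434/api/generate -d "{
  \"model\": \"hhao/openbmb-minicpm-llama3-v-2_5:q4_K_M\",
  \"prompt\": \"Describe this image.\",
  \"images\": [\"$IMG\"],
  \"stream\": false
}"
```

Stripping newlines from the base64 output keeps the JSON payload on valid single-line string values.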

Build manually

The following example shows how to build Ollama with MiniCPM-Llama3-V-2.5 support.

The OpenBMB team is preparing a PR for the official Ollama repository. For now, we need to run MiniCPM-Llama3-V 2.5 with the OpenBMB fork.

Install Requirements

  • cmake version 3.24 or higher
  • go version 1.22 or higher
  • gcc version 11.4.0 or higher
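
Before building, it can help to confirm the toolchain is actually on your PATH. This is just a sketch; install any missing tool with your package manager and check the reported versions against the list above.

```shell
# Report any required build tool that is not installed.
for tool in cmake go gcc; do
  command -v "$tool" >/dev/null 2>&1 || echo "missing: $tool"
done
```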

Setup the Code

Prepare both the OpenBMB llama.cpp fork and the OpenBMB Ollama fork: clone the Ollama fork, then clone the llama.cpp fork into its llm directory.

git clone -b minicpm-v2.5 https://github.com/OpenBMB/ollama.git
cd ollama/llm
git clone -b minicpm-v2.5 https://github.com/OpenBMB/llama.cpp.git
cd ../


Here we give a macOS example. See the developer guide for more platforms.

brew install go cmake gcc

Optionally enable debugging and more verbose logging:

# At build time
export CGO_CFLAGS="-g"

# At runtime
export OLLAMA_DEBUG=1

Get the required libraries and build the native LLM code:

go generate ./...

Build ollama:

go build .

Start the server:

./ollama serve
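
Before running the model, you can confirm the freshly built server is listening. The `/api/tags` endpoint lists installed models; an empty `models` array just means nothing has been pulled yet.

```shell
# Check that the Ollama server from the previous step is up.
curl -s http://localhost:11434/api/tags
```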

In another Terminal, run the model:

ollama run hhao/openbmb-minicpm-llama3-v-2_5:q4_K_M

Ollama will pull the modelfile and start an interactive chat.

Config on MacCopilot

Add an Ollama-format model, and select MiniCPM-Llama3-V-2.5.

Add Ollama Vision model in MacCopilot

Now we can chat with MiniCPM-Llama3-V-2.5 via Ollama.

Chat with local Copilot AI