How to build a local Copilot AI using Ollama
Ollama is an open-source framework that lets you run large language models (LLMs) locally on your own machine. It's designed to be easy to use, efficient, and scalable, making it a good option for developers and organizations that want to deploy AI models in production.
MacCopilot is a Copilot AI for macOS that lets you interact with your desktop conveniently and integrates support for LLMs from multiple platforms.
In this article, we'll show you how to build a local Copilot AI using Ollama and MacCopilot, and run the powerful MiniCPM-Llama3-V-2.5 model.
Install Ollama
Use prebuilt binary
Download model
In case you don't want to build Ollama or download models via the ollama CLI, there is a prebuilt Ollama App that bundles Ollama and a model management UI together, making it easy to set everything up.
First, download the Ollama App from the release page.
Then open the Ollama app and download the model.
Running
After the model has downloaded, quit the Ollama app and run the model from the terminal.
Open a terminal and run the following (ollama serve stays in the foreground, so run the second command in another terminal):
ollama serve
ollama run hhao/openbmb-minicpm-llama3-v-2_5:q4_K_M
Now we are ready to chat.
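If you prefer to test the model over HTTP instead of the interactive prompt, Ollama also exposes a REST API on its default port 11434. A minimal sketch using the model tag pulled above:
# send a single, non-streaming generation request to the local server
curl http://localhost:11434/api/generate -d '{
  "model": "hhao/openbmb-minicpm-llama3-v-2_5:q4_K_M",
  "prompt": "Describe what Ollama does in one sentence.",
  "stream": false
}'
With "stream": false the server returns one JSON object whose response field contains the model's answer.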
Build manually
The following example shows how to build Ollama with MiniCPM-Llama3-V-2.5 support.
The OpenBMB team is preparing a PR for the official Ollama repository. For now, we need to run MiniCPM-Llama3-V 2.5 from the OpenBMB fork.
Install Requirements
- cmake version 3.24 or higher
- go version 1.22 or higher
- gcc version 11.4.0 or higher
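Before building, you can confirm the installed toolchain meets these minimums:
# print the installed versions and compare against the requirements above
cmake --version
go version
gcc --version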
Set up the code
Clone the OpenBMB Ollama fork, then clone its llama.cpp fork inside it:
git clone -b minicpm-v2.5 https://github.com/OpenBMB/ollama.git
cd ollama/llm
git clone -b minicpm-v2.5 https://github.com/OpenBMB/llama.cpp.git
cd ../
Build
Here we give a macOS example. See the developer guide for other platforms.
brew install go cmake gcc
Optionally enable debugging and more verbose logging:
# At build time
export CGO_CFLAGS="-g"
# At runtime
export OLLAMA_DEBUG=1
Get the required libraries and build the native LLM code:
go generate ./...
Build ollama:
go build .
Start the server:
./ollama serve
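To confirm the server is up, query it from another terminal (11434 is Ollama's default port):
# the root endpoint answers with "Ollama is running"
curl http://localhost:11434/
# list the models available locally as JSON
curl http://localhost:11434/api/tags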
Running
In another terminal, run:
./ollama run hhao/openbmb-minicpm-llama3-v-2_5:q4_K_M
Ollama will pull the model and its Modelfile, then start an interactive chat session.
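Since MiniCPM-Llama3-V 2.5 is a vision-language model, you can also include an image path directly in the prompt; ./screenshot.png below is only a placeholder for any local image:
# ask the model about a local image by referencing its path in the prompt
./ollama run hhao/openbmb-minicpm-llama3-v-2_5:q4_K_M "What is in this image? ./screenshot.png"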
Configure MacCopilot
Add a model in the Ollama API format and select MiniCPM-Llama3-V-2.5.
Now we can chat with MiniCPM-Llama3-V-2.5 via Ollama.
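Under the hood, MacCopilot's Ollama integration presumably talks to the same HTTP API that ollama serve exposes. A minimal sketch of an equivalent chat request, assuming the default base URL http://localhost:11434 and using a placeholder for the base64-encoded image:
# send a chat message with an attached image to the local Ollama server
curl http://localhost:11434/api/chat -d '{
  "model": "hhao/openbmb-minicpm-llama3-v-2_5:q4_K_M",
  "stream": false,
  "messages": [
    {
      "role": "user",
      "content": "Describe this screenshot.",
      "images": ["<base64-encoded image>"]
    }
  ]
}'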