In this blog post, I will show you how to use AI models locally, without needing the OpenAI API or other cloud-based AI services.
Ollama is a tool designed for running open-source large language models (LLMs) like Llama 2 and Code Llama directly on a user’s device. It packages model weights, configurations, and data into a single unit managed by a Modelfile, optimizing for efficient GPU use. This makes Ollama a good platform for developers and AI enthusiasts who want to deploy language models in applications such as chatbots, summarization tools, and creative writing aids. The platform is extensible, keeps your data private, and is free to use, and it is available for macOS, Linux and Windows.
With Ollama, you can easily use powerful AI models such as llama2, mistral and many others. The advantage is that you don’t share your data with a third party, and you can use the models offline. This makes it a good fit for enterprise use cases where your data can’t be shared with others.
Some models available on Ollama:
llama2: to generate human-like text based on the input prompt.
openchat: to build a chat like ChatGPT (it is reported to score higher on some benchmarks).
llava: to describe images.
…
Ollama is also extensible: you can add your own trained models. That will be the topic of a future blog post.
Go to the Ollama download page and download the version that fits your OS.
For me, it’s the macOS version.
Let’s click on the Download for macOS button.
Wait for the download to finish.
Unzip the zip file Ollama-darwin.zip.
It contains the Ollama application.
Click on it and authorize the app.
Move the app to the Applications folder.
Et voilà, Ollama is installed on your machine.
To run Ollama, click on the app in the Applications folder.
Then open a terminal and type ollama list.
ollama list
NAME ID SIZE MODIFIED
llama2:latest 78e26419b446 3.8 GB 5 days ago
You should see the installed models. In my case, only llama2 is installed so far.
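If you want to read that list from a script instead of by eye, here is a minimal Python sketch. It assumes the column layout shown above (NAME, ID, a two-token SIZE such as "3.8 GB", then MODIFIED); the names InstalledModel, parse_ollama_list and installed_models are my own and not part of Ollama.

```python
import subprocess
from typing import NamedTuple


class InstalledModel(NamedTuple):
    name: str
    model_id: str
    size: str
    modified: str


def parse_ollama_list(output: str) -> list[InstalledModel]:
    """Parse the text printed by `ollama list` into structured rows."""
    models = []
    for line in output.strip().splitlines()[1:]:  # skip the header row
        parts = line.split()
        if len(parts) < 5:
            continue  # ignore blank or unexpected lines
        models.append(
            InstalledModel(
                name=parts[0],
                model_id=parts[1],
                size=" ".join(parts[2:4]),     # assumed to be "<number> <unit>"
                modified=" ".join(parts[4:]),  # e.g. "5 days ago"
            )
        )
    return models


def installed_models() -> list[InstalledModel]:
    """Run `ollama list` and parse its output (needs ollama on the PATH)."""
    result = subprocess.run(
        ["ollama", "list"], capture_output=True, text=True, check=True
    )
    return parse_ollama_list(result.stdout)
```

Calling installed_models() on a machine where Ollama is installed should give one InstalledModel per line of the listing. The parsing is kept in its own function so it can be tried on a pasted sample without running the CLI.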
Let’s run the llama2 model.
ollama run llama2
>>> Send a message (/? for help)
Let’s see the available commands.
>>> /?
Available Commands:
/set Set session variables
/show Show model information
/load <model> Load a session or model
/save <model> Save your current session
/bye Exit
/?, /help Help for a command
/? shortcuts Help for keyboard shortcuts
Use """ to begin a multi-line message.
>>> Send a message (/? for help)
Let’s try the /show command.
>>> /show
Available Commands:
/show info Show details for this model
/show license Show model license
/show modelfile Show Modelfile for this model
/show parameters Show parameters for this model
/show system Show system message
/show template Show prompt template
>>> /show info
Model details:
Family llama
Parameter Size 7B
Quantization Level Q4_0
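The parameter size and quantization level go a long way toward explaining the 3.8 GB shown by ollama list. Here is a rough back-of-the-envelope sketch in Python. It rests on two assumptions of mine: "7B" is taken at face value as 7e9 parameters, and Q4_0 is taken to pack 32 weights into 18 bytes (16 bytes of 4-bit values plus a 2-byte scale), which is about 4.5 bits per weight. Real model files also hold metadata and may store some tensors differently, so treat the result as an estimate only.

```python
def estimate_model_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Rough on-disk size in gigabytes (10^9 bytes) of a quantized model."""
    total_bytes = n_params * bits_per_weight / 8
    return total_bytes / 1e9


# Assumption: a Q4_0 block holds 32 weights in 18 bytes -> 4.5 bits per weight.
Q4_0_BITS_PER_WEIGHT = 18 * 8 / 32

# Assumption: "7B" means exactly 7e9 parameters.
estimate = estimate_model_size_gb(7e9, Q4_0_BITS_PER_WEIGHT)
print(f"Estimated size: {estimate:.2f} GB")
```

The estimate lands a little under 4 GB, in the same ballpark as the size reported for llama2:latest. The same weights at 16 bits each would need about 14 GB, which is why quantized models are much easier to run on a laptop.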
Let’s quit the ollama command line.
>>> /bye
Even though we have left the ollama command line, the server is still running and can be used with a REST client. That will be the topic of my next blog post.
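As a small preview, here is a minimal Python sketch of such a REST call using only the standard library. It assumes the server listens on its default address, http://localhost:11434, and that the /api/generate endpoint with "stream" set to false returns a single JSON object whose "response" field holds the text. The helper names build_generate_request and generate are mine.

```python
import json
import urllib.request

# Assumption: default address of the local Ollama server.
OLLAMA_URL = "http://localhost:11434"


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str) -> str:
    """Send the prompt to the local server and return the generated text."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request) as response:
        body = json.loads(response.read())
    return body["response"]
```

With the Ollama app running, generate("llama2", "Tell me a chuck norris fact") should return the model's answer as a string. If the server is not running, urlopen raises a URLError instead.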
To download models, you can use the ollama pull command.
For example, let’s download the mistral model.
ollama pull mistral:latest
Now it should appear in the list of available models.
ollama list
NAME ID SIZE MODIFIED
llama2:latest 78e26419b446 3.8 GB 5 days ago
mistral:latest 61e88e884507 4.1 GB 6 seconds ago
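The pull and list steps can be combined so that a model is only downloaded when it is missing. The Python sketch below is one way to do that, assuming the ollama command is on your PATH and that a name without a tag refers to the latest tag. The helpers with_default_tag, is_installed and ensure_model are hypothetical names of mine.

```python
import subprocess


def with_default_tag(model: str) -> str:
    """Return the model name with an explicit tag, defaulting to 'latest'."""
    return model if ":" in model else f"{model}:latest"


def is_installed(model: str, list_output: str) -> bool:
    """Check whether `model` appears in the NAME column of `ollama list` output."""
    rows = list_output.splitlines()[1:]  # skip the header row
    names = [row.split()[0] for row in rows if row.strip()]
    return with_default_tag(model) in names


def ensure_model(model: str) -> None:
    """Pull `model` with the ollama CLI unless it is already listed."""
    listing = subprocess.run(
        ["ollama", "list"], capture_output=True, text=True, check=True
    ).stdout
    if not is_installed(model, listing):
        subprocess.run(["ollama", "pull", with_default_tag(model)], check=True)
```

Calling ensure_model("mistral") a second time should return straight away, since the model is already in the list.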
To find other models to play with, you can go to the Ollama models page.
Let’s use the mistral model.
ollama run mistral
To check that the model is the one we want, we can use the show command.
>>> /show modelfile
# Modelfile generated by "ollama show"
# To build a new Modelfile based on this one, replace the FROM line with:
# FROM mistral:latest
FROM /Users/xavierbouclet/.ollama/models/blobs/sha256:e8a35b5937a5e6d5c35d1f2a15f161e07eefe5e5bb0a3cdd42998ee79b057730
TEMPLATE """[INST] {{ .System }} {{ .Prompt }} [/INST]"""
PARAMETER stop "[INST]"
PARAMETER stop "[/INST]"
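The TEMPLATE line describes how your message is wrapped before it reaches the model. To make that concrete, here is a minimal Python sketch that fills in the two placeholders with plain string substitution. This is only an illustration: Ollama renders templates with its own template engine, and the render_prompt function and the example system message are mine.

```python
def render_prompt(template: str, system: str, prompt: str) -> str:
    """Fill {{ .System }} and {{ .Prompt }} by simple text substitution."""
    return (
        template.replace("{{ .System }}", system)
        .replace("{{ .Prompt }}", prompt)
    )


# Template copied from the Modelfile shown above.
MISTRAL_TEMPLATE = "[INST] {{ .System }} {{ .Prompt }} [/INST]"

print(render_prompt(MISTRAL_TEMPLATE, "You are concise.", "Tell me a chuck norris fact"))
# prints: [INST] You are concise. Tell me a chuck norris fact [/INST]
```

The two PARAMETER stop lines appear to tell the model to stop generating when it produces one of these markers, so that it does not start writing a new instruction block by itself.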
Last but not least, you can ask your model a question.
>>> Tell me a chuck norris fact
Sure thing! Here's a classic Chuck Norris fact:
Chuck Norris doesn't read books. He stares them down until they speak to him.
Or how about this one:
When the Boogeyman goes to sleep every night, he checks his closet for Chuck Norris.
These facts are meant to be humorous and are not based in reality. But isn't it fun to imagine that Chuck Norris has superhuman abilities? After all, the man is a martial arts
legend and an action movie icon!
In my opinion, Ollama is a nice way to play with AI models locally.