LlamaBarn is a tiny menu bar app that lets you install and run local LLMs with just a few clicks. It automatically configures each model to run optimally on your Mac, and exposes a standard API that any app can connect to.
Install with `brew install --cask llamabarn` or download from Releases ↗
LlamaBarn runs as a tiny menu bar app on your Mac.
- Select a model to install -- only models that can run on your Mac are shown
- Select an installed model to run -- configures and starts a server at `http://localhost:2276`
- Use the model through the API or web UI -- both at `http://localhost:2276`
Under the hood, LlamaBarn is a thin wrapper around llama.cpp and the llama-server binary that comes with it. llama-server provides the API and web UI, while LlamaBarn handles model installation, configuration, and process management.
LlamaBarn builds on llama-server and supports the same API endpoints:
```shell
# check server health
curl http://localhost:2276/v1/health

# chat with the running model
curl http://localhost:2276/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Hi"}]}'
```

Find the complete reference in the llama-server docs ↗
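The same chat request can be made from any language. A minimal Python sketch using only the standard library — the URL and payload mirror the curl example above, and the response is parsed assuming the OpenAI-style `choices[0].message.content` shape that llama-server returns:

```python
# Minimal sketch: chat with the model served by LlamaBarn from Python.
# Real apps would add error handling and timeouts.
import json
import urllib.request

BASE_URL = "http://localhost:2276"  # LlamaBarn's default local server


def chat(messages):
    """POST to /v1/chat/completions and return the model's reply text."""
    req = urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps({"messages": messages}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # OpenAI-style response: first choice's message content
    return body["choices"][0]["message"]["content"]


# Requires a model running in LlamaBarn:
# print(chat([{"role": "user", "content": "Hi"}]))
```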
- Embedding models
- Completion models
- Run multiple models at once
- Parallel requests
- Vision for models that support it
- How does LlamaBarn compare to llama.cpp web UI? — LlamaBarn doesn't replace the llama.cpp web UI, it builds on top of it — when you run a model in LlamaBarn, it starts both the llama.cpp server and the llama.cpp web UI at `http://localhost:2276`.
- How do I use LlamaBarn with other apps? — LlamaBarn exposes a standard API at `http://localhost:2276`. You can connect it to any app that supports custom LLM APIs. See the API endpoints section for example requests.
- Why don't I see all models in the catalog? — LlamaBarn shows only models that can run on your Mac based on its available memory. If a model you're looking for isn't in the catalog, it requires more memory than your system can provide.
- Can I load models that aren't in the catalog? — LlamaBarn uses a curated catalog where each model is tested and configured to work optimally across different Mac hardware setups. Loading arbitrary models isn't currently supported, but if there's a specific model you'd like to see added, feel free to open a feature request.
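For apps that take custom LLM API settings, the connection details usually reduce to a base URL plus a placeholder API key where the app insists on one (the local server doesn't check it). A typical configuration, assuming the app speaks the OpenAI-style API that llama-server exposes:

```
API base URL:  http://localhost:2276/v1
API key:       (any placeholder — not required)
```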
