How it works

Go from a local model to a production API in under a minute.

01

Point at your model

Run Ollama, vLLM, llama.cpp, or MLX locally — on one machine or many. Any backend that speaks the OpenAI chat completions format works.
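"Speaks the OpenAI chat completions format" just means the backend accepts the standard request shape. A minimal sketch of that payload — the base URL and model name are illustrative placeholders, not required values:

```python
import json

# Any compatible backend (Ollama, vLLM, llama.cpp server, MLX) accepts
# this request shape at its /v1/chat/completions route.
BASE_URL = "http://localhost:11434/v1"  # e.g. Ollama's default local port

payload = {
    "model": "llama3",  # whatever model your backend is serving
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
}

# POSTing this JSON body to f"{BASE_URL}/chat/completions" is all
# a client needs to do.
print(json.dumps(payload, indent=2))
```

If your backend answers that request, it will work here.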

02

Publish with one command

Run cllmhub publish. The CLI discovers your running models and exposes them as a live API endpoint.

03

Share and monitor

Create API keys, invite your team into Hives to pool models across accounts, and track every request from the dashboard.
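Once a key exists, any OpenAI-style client can call the endpoint. A minimal standard-library sketch — the endpoint URL, key, and model name below are placeholders for illustration, not real cllmhub values; use the ones from your dashboard:

```python
import json
import urllib.request

API_KEY = "ck-example-key"  # placeholder; create a real key in the dashboard
ENDPOINT = "https://api.example.com/v1/chat/completions"  # placeholder URL

# Build a chat completions request authenticated with a bearer token,
# the same convention OpenAI-compatible clients use.
req = urllib.request.Request(
    ENDPOINT,
    data=json.dumps({
        "model": "llama3",
        "messages": [{"role": "user", "content": "Ping"}],
    }).encode(),
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
)

# urllib.request.urlopen(req) would send it; omitted here so the
# sketch stays self-contained.
print(req.get_header("Authorization"))
```

Because the wire format is unchanged, existing OpenAI SDK integrations only need a new base URL and key.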

Start Building