How it works
Go from a local model to a production API in under a minute.
01
Point at your model
Run Ollama, vLLM, llama.cpp, or MLX locally — on one machine or many. Any backend that speaks the OpenAI chat completions format works.
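A quick way to confirm your backend speaks that format is to hit it with a standard OpenAI client. This is a minimal sketch, assuming Ollama on its default port and a model you have already pulled; the port, model name, and prompt are placeholders, not cllmhub requirements:

```python
# Sanity check: confirm a local backend speaks the OpenAI chat
# completions format before publishing. Assumes Ollama's default
# port (11434); swap in your own host, port, and model.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
    api_key="unused",                      # local backends typically ignore the key
)

response = client.chat.completions.create(
    model="llama3.2",  # any model your backend serves
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(response.choices[0].message.content)
```

If this prints a reply, any of the supported backends is ready to publish.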
02
Publish with one command
Run cllmhub publish. The CLI discovers your local models and exposes them through a live API endpoint.
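Because the published endpoint speaks the same chat completions format, client code stays the same; only the base URL and key change. A sketch, with a hypothetical endpoint URL and key placeholder (cllmhub assigns the real values when you publish and when you create keys in the next step):

```python
# Same client code as the local check, now pointed at the published
# endpoint. The base URL and key below are placeholders: cllmhub
# provides the real endpoint URL, and API keys are created in step 03.
from openai import OpenAI

client = OpenAI(
    base_url="https://YOUR-ENDPOINT.example/v1",  # hypothetical published URL
    api_key="YOUR_CLLMHUB_API_KEY",               # key from your dashboard
)

response = client.chat.completions.create(
    model="llama3.2",  # the same model the CLI discovered locally
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(response.choices[0].message.content)
```

Swapping one base URL is the whole migration from local testing to the production API.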
03
Share and monitor
Create API keys, invite your team into Hives to pool models across accounts, and track every request from the dashboard.