Introduction

What is cLLMHub?

cLLMHub is a hosted gateway for self-hosted LLMs. You run an open-weight model on your own hardware — Ollama, vLLM, llama.cpp, MLX, LM Studio, or any OpenAI-compatible server — and we expose it as a stable, OpenAI-compatible API at api.cllmhub.com/v1. You subscribe to a monthly plan; your plan determines daily request quota and how many API keys and models you can register.

Three things to know

1. You bring the model. cLLMHub does not host model weights — you do. Publish your local backend with the CLI to make it callable through the hub. 2. The API is OpenAI-compatible. Any tool or library that works with OpenAI works with cLLMHub by changing the base URL and the API key. 3. Pricing is a flat monthly subscription, not per-token. Free, Pro, and Max plans each set a daily request quota — no surprise bills.

Why not just expose my own server?

You could. But then you have to: open inbound ports, run TLS, build OpenAI-compat shims if your backend does not speak it, build auth and key management, write request logging, and figure out how to revoke access. cLLMHub gives you all of that with one CLI command — the gateway opens an outbound WebSocket to your model server, so nothing on your side has to face the internet.