How it works

cLLMHub is a two-sided marketplace. Hosts publish models and set prices. Developers load credits and pay per token. Hosts keep 85%.

For hosts — turn your GPU into a paid API

01

Point at your model

Run Ollama, vLLM, llama.cpp, or MLX locally — on one machine or many. Any backend that speaks the OpenAI chat completions format works.
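Concretely, "speaks the OpenAI chat completions format" means the backend accepts a POST to /v1/chat/completions with a JSON body shaped like the sketch below. The model name here is illustrative; use whatever id your backend actually reports.

```python
import json

# Minimal OpenAI-style chat completions request body.
# The model id "llama3.1:8b" is just an example -- substitute your own.
request_body = {
    "model": "llama3.1:8b",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Say hello."},
    ],
}

# A compatible backend replies with a body shaped roughly like this;
# the usage block is what per-token billing is based on.
response_shape = {
    "choices": [{"message": {"role": "assistant", "content": "..."}}],
    "usage": {"prompt_tokens": 0, "completion_tokens": 0},
}

print(json.dumps(request_body, indent=2))
```

If your local server accepts that request and returns that shape, it will work as a cLLMHub backend.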

02

Publish and price

Run cllmhub publish. The CLI discovers your models and registers them with the catalog. From the dashboard, set your price per million input and output tokens.

03

Earn as developers use it

When a developer calls your model, you keep 85% of every dollar. Track earnings live from the dashboard and request a PayPal payout whenever you want.

For developers — pay per token, no subscription

01

Load credits

Add USD to your account via PayPal — no subscription, no minimum. Your balance is just dollars; spend it on whichever models you want.

02

Create an API key

Generate a key from the dashboard. Optionally restrict it by model, IP, daily request cap, or maximum price per million tokens.

03

Call any model

Set your OpenAI SDK's base URL to cllmhub.com/v1 and send requests as usual. You only pay for the tokens you use, at the price the host has posted.
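Under the hood there is no special client: any OpenAI-compatible SDK works because the API is plain HTTPS. A standard-library sketch of the same request (the API key and model id below are placeholders):

```python
import json
import urllib.request

BASE_URL = "https://cllmhub.com/v1"  # OpenAI-compatible base URL

def build_request(api_key: str, model: str, messages: list) -> urllib.request.Request:
    """Build an OpenAI-style chat completions request against cLLMHub."""
    body = json.dumps({"model": model, "messages": messages}).encode()
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",  # your cLLMHub API key
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Sending it (left commented out to avoid a live, billable call):
# req = build_request("sk-...", "some-hosted-model",
#                     [{"role": "user", "content": "Hi"}])
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

An OpenAI SDK does exactly this for you once you change its base URL; nothing else in your code needs to change.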

One simple fee

Hosts keep 85% of every dollar earned. cLLMHub takes a flat 15% platform fee to cover routing, billing, payouts, and infrastructure. No tiers, no hidden charges. Self-consumption — when you call your own model — is always free.
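The split is simple enough to check by hand; the helper name below is just illustrative:

```python
HOST_SHARE = 0.85  # hosts keep 85%; cLLMHub takes the remaining 15%

def split_earnings(gross_usd: float) -> tuple[float, float]:
    """Split a developer's spend into (host payout, platform fee), in USD."""
    host = round(gross_usd * HOST_SHARE, 2)
    return host, round(gross_usd - host, 2)

# A developer spends $20.00 against your model:
# the host earns $17.00 and the platform fee is $3.00.
print(split_earnings(20.0))
```

Self-consumption never enters this calculation: calls to your own model are not billed at all.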