About cLLMHub
Our Mission
cLLMHub is an open marketplace for LLM inference. We believe anyone with a GPU should be able to publish their models as a paid API — and any developer should be able to access them through a single OpenAI-compatible endpoint, without contracts, quotas, or vendor lock-in.
How It Works
For hosts: Run the cLLMHub CLI alongside any local backend (Ollama, vLLM, llama.cpp, MLX, or any OpenAI-compatible server) and publish your models with one command. Set your own prices: one per million input tokens, one per million output tokens. The hub handles discovery, routing, authentication, monitoring, billing, and PayPal payouts; you keep 85% of every dollar earned.
For developers: Load credits via PayPal, create an API key, and call any model in the catalog using any OpenAI SDK — just point the base URL at cllmhub.com/v1. You only pay for the tokens you actually use, at the price the host has posted. No subscriptions, no minimums.
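For example, here is a minimal sketch using the official OpenAI Python SDK. The base URL comes straight from the docs above; the API key value and the catalog model name are placeholders, not real entries:

```python
from openai import OpenAI

# Point the standard OpenAI client at cLLMHub instead of OpenAI.
# The key is your cLLMHub API key (placeholder shown here).
client = OpenAI(
    base_url="https://cllmhub.com/v1",
    api_key="YOUR_CLLMHUB_API_KEY",
)

# "some-host/llama-3.1-8b" is a hypothetical catalog model name;
# browse the catalog for real ones.
response = client.chat.completions.create(
    model="some-host/llama-3.1-8b",
    messages=[{"role": "user", "content": "Hello from cLLMHub!"}],
)
print(response.choices[0].message.content)
```

Any other OpenAI-compatible SDK works the same way: swap the base URL, supply your cLLMHub key, and pick a model from the catalog.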
The same account can act as both a developer and a host. Self-consumption (calling your own model) is always free.
Open Source
The cLLMHub CLI is open source under the Apache 2.0 license. We believe in transparency and community-driven development. You can inspect, audit, and contribute to the code on GitHub.
Catalog Pricing
Publishing a model is free. Hosts set their own price per million input tokens and per million output tokens. Consumers load USD credits via PayPal and pay only for the tokens they use. cLLMHub takes a flat 15% platform fee on each transaction — hosts keep the other 85%. No subscriptions, no monthly minimums, no surprise overages.
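To make the fee split concrete, here is a worked sketch of a single billed request. The prices are hypothetical (each host sets their own); the 15%/85% split is the platform's flat fee:

```python
# Hypothetical host-set prices, in USD per 1M tokens.
INPUT_PRICE_PER_M = 0.50
OUTPUT_PRICE_PER_M = 2.00

# Tokens consumed by one request.
input_tokens = 12_000
output_tokens = 3_000

# Consumer pays exactly for tokens used, at the posted prices.
cost = (input_tokens / 1_000_000) * INPUT_PRICE_PER_M \
     + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

platform_fee = cost * 0.15  # flat 15% platform fee
host_payout = cost * 0.85   # host keeps the other 85%

print(f"consumer pays: ${cost:.6f}")      # $0.012000
print(f"platform fee:  ${platform_fee:.6f}")  # $0.001800
print(f"host receives: ${host_payout:.6f}")   # $0.010200
```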
Built For
Developers who want pay-per-token access to a wide range of open models without committing to a hyperscaler. Hosts who want to monetize idle GPU time. Mac users running models on Apple Silicon via MLX. Anyone who wants OpenAI SDK compatibility without OpenAI.