An intelligent reverse proxy for routing requests to multiple Large Language Model (LLM) providers with real-time monitoring and cost optimization.
- Intelligent Routing: Automatically selects providers using a cost-optimized algorithm that prioritizes zero-cost providers and selects the lowest-cost paid options when needed.
- Real-time Monitoring: A Vue.js dashboard provides a live view of usage statistics, including requests, tokens, and cost per provider.
- Dynamic Configuration: Update provider settings, models, and limits from the UI without restarting the server.
- Extensive Provider Support: Natively supports a wide range of LLM providers through the Vercel AI SDK, including OpenAI, Anthropic, Google, Groq, Mistral, claude-code, gemini-cli, and any OpenAI-compatible API.
- Unified API: A single, consistent OpenAI-compatible endpoint for all backend providers.
- Node.js 22+
- npm (comes with Node.js)
- Clone the repository:

  ```bash
  git clone https://github.com/mcowger/costrouter.git
  cd costrouter
  ```

- Install dependencies:

  ```bash
  npm install
  ```
- Create and configure your settings:

  - Create a `config` directory:

    ```bash
    mkdir config
    ```

  - Optional: Copy the example configuration:

    ```bash
    cp config.test.jsonc config/config.jsonc
    ```

    If you don't create a config, you'll just start up with an empty config that you can edit via the UI.

  - Optional: Edit `config/config.jsonc` and add your provider API keys and settings (see the sketch after this list).
- Run the application:

  ```bash
  npm run dev
  ```

  This command starts both the backend server (on port 3000) and the frontend UI (on port 5173) concurrently.
- Access the Dashboard: Open http://localhost:5173 in your browser.
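For reference, a provider entry in `config/config.jsonc` might look roughly like the sketch below. This is only an illustration: the field names are assumptions rather than the project's actual schema, so treat `config.test.jsonc` as the authoritative example.

```jsonc
// Hypothetical sketch of a provider entry; field names are assumptions.
// Consult config.test.jsonc for the real schema.
{
  "providers": [
    {
      "name": "openai",
      "apiKey": "sk-...",
      "models": ["gpt-4"],
      // Costs per million tokens; setting both to 0 marks the provider
      // as zero-cost for the router.
      "pricing": { "inputCostPerMillion": 10, "outputCostPerMillion": 30 }
    }
  ]
}
```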
- Build the Docker image:

  ```bash
  npm run docker-build
  ```
- Create a config directory (optional):

  ```bash
  mkdir -p ./config
  cp config.test.jsonc ./config/config.jsonc  # Optional: copy example config
  ```
- Run the Docker container:

  ```bash
  docker run --rm -p 3000:3000 -v $(pwd)/config:/config --name llm-gateway-container llm-gateway
  ```
- The gateway will be accessible on http://localhost:3000.
- The UI is served from the same port at the `/` route.
- Configuration is mounted from your host machine. If you don't have an existing config file, an empty one will be created for you.
Send requests to the gateway's OpenAI-compatible endpoint:
```bash
curl -X POST http://localhost:3000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer any-string-is-valid" \
  -d '{
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```
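Because the endpoint is OpenAI-compatible, any OpenAI client can be pointed at the gateway instead of api.openai.com. A minimal sketch using the official `openai` npm package (the package is an assumption here, not a project dependency):

```typescript
import OpenAI from "openai";

// Point the standard OpenAI client at the gateway.
const client = new OpenAI({
  baseURL: "http://localhost:3000/v1",
  apiKey: "any-string-is-valid", // the gateway accepts any bearer token
});

const response = await client.chat.completions.create({
  model: "gpt-4",
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(response.choices[0].message.content);
```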
The gateway uses an intelligent, cost-optimized routing algorithm that works as follows:
- Model Matching: Identifies all providers that support the requested model (by name or mapped name)
- Rate Limit Filtering: Removes providers that have exceeded their configured rate limits
- Cost-Based Partitioning: Separates remaining providers into two groups:
- Zero-cost providers: Those with all pricing fields explicitly set to 0
- Paid providers: All others, including those with undefined/unknown pricing
- Selection Strategy:
- If zero-cost providers are available, randomly selects one to distribute load
- Otherwise, selects the lowest-cost paid provider based on:
- Primary sort: Input cost per million tokens
- Secondary sort: Output cost per million tokens
- Providers with undefined costs are deprioritized (sorted last)
This algorithm ensures cost optimization while maintaining high availability through intelligent failover.
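A condensed TypeScript sketch of the partitioning and selection steps, assuming the candidates have already been model-matched and rate-limit-filtered (the type and function names are illustrative assumptions, not the project's internals):

```typescript
interface Candidate {
  name: string;
  inputCostPerMillion?: number;  // undefined = unknown pricing
  outputCostPerMillion?: number; // undefined = unknown pricing
}

// Illustrative sketch of the selection strategy described above.
function selectProvider(candidates: Candidate[]): Candidate | undefined {
  // Zero-cost means all pricing fields are explicitly set to 0.
  const zeroCost = candidates.filter(
    (c) => c.inputCostPerMillion === 0 && c.outputCostPerMillion === 0
  );
  if (zeroCost.length > 0) {
    // Random pick distributes load across free providers.
    return zeroCost[Math.floor(Math.random() * zeroCost.length)];
  }

  // Paid providers: sort by input cost, then output cost; undefined
  // costs become Infinity so they sort last.
  const cost = (v?: number) => v ?? Infinity;
  return [...candidates].sort(
    (a, b) =>
      cost(a.inputCostPerMillion) - cost(b.inputCostPerMillion) ||
      cost(a.outputCostPerMillion) - cost(b.outputCostPerMillion)
  )[0];
}
```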
The gateway uses a pipeline pattern with singleton managers for core services:
- `ConfigManager`: Loads and validates configuration.
- `UsageManager`: Tracks and enforces rate limits in real-time.
- `Router`: Selects the optimal provider for each incoming request using the cost-optimized algorithm.
- `UnifiedExecutor`: Executes the request against the chosen provider using the Vercel AI SDK.
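Each manager is created once and shared across all requests. A minimal, self-contained illustration of that singleton pattern (the fields and methods shown are assumptions for illustration, not the real `UsageManager` API):

```typescript
// Minimal illustration of the singleton-manager pattern used for the
// core services. Fields and methods here are hypothetical.
class UsageManager {
  private static instance: UsageManager | undefined;
  private tokensByProvider = new Map<string, number>();

  private constructor() {} // prevent direct construction

  static getInstance(): UsageManager {
    if (!UsageManager.instance) {
      UsageManager.instance = new UsageManager();
    }
    return UsageManager.instance;
  }

  record(provider: string, tokens: number): void {
    this.tokensByProvider.set(
      provider,
      (this.tokensByProvider.get(provider) ?? 0) + tokens
    );
  }

  totalFor(provider: string): number {
    return this.tokensByProvider.get(provider) ?? 0;
  }
}

// Every caller shares the same instance:
UsageManager.getInstance().record("openai", 1200);
console.log(UsageManager.getInstance().totalFor("openai")); // 1200
```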
- Backend: Node.js, Express, TypeScript, Vercel AI SDK, Zod, Pino, rate-limiter-flexible
- Frontend: Vue 3, Vite, Pinia, Vue Router
- Build Tools: npm (use `npx` for binaries like `tsc`, `tsx`, `vue-tsc`, `prettier`), Docker, and the new `docker-build` npm script.