LLM Gateway

An intelligent reverse proxy for routing requests to multiple Large Language Model (LLM) providers with real-time monitoring and cost optimization.

Features

  • Intelligent Routing: Automatically selects providers using a cost-optimized algorithm that prioritizes zero-cost providers and selects the lowest-cost paid options when needed.
  • Real-time Monitoring: A Vue.js dashboard provides a live view of usage statistics, including requests, tokens, and cost per provider.
  • Dynamic Configuration: Update provider settings, models, and limits from the UI without restarting the server.
  • Extensive Provider Support: Natively supports a wide range of LLM providers through the Vercel AI SDK, including OpenAI, Anthropic, Google, Groq, Mistral, claude-code, gemini-cli, and any OpenAI-compatible API.
  • Unified API: A single, consistent OpenAI-compatible endpoint for all backend providers.

Getting Started (Development)

Prerequisites

  • Node.js 22+
  • npm (comes with Node.js)

Setup

  1. Clone the repository:

    git clone https://github.com/mcowger/costrouter.git
    cd costrouter
  2. Install dependencies:

    npm install
  3. Create and configure your settings:

    • Create a config directory: mkdir config
    • Optional: Copy the example configuration: cp config.test.jsonc config/config.jsonc. If you don't create a config, the gateway starts with an empty configuration that you can edit via the UI.
    • Optional: Edit config/config.jsonc and add your provider API keys and settings (a hypothetical sketch of the typical fields follows this list).
  4. Run the application:

    npm run dev

    This command starts both the backend server (on port 3000) and the frontend UI (on port 5173) concurrently.

  5. Access the Dashboard: Open http://localhost:5173 in your browser.
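
The configuration schema itself is not documented in this README, but based on the features described above, a provider entry generally needs an API key, the models it serves (or their mapped names), pricing per million tokens, and rate limits. The TypeScript interface below is a hypothetical sketch of those fields, not the actual format; consult config.test.jsonc for the authoritative example.

// Hypothetical illustration only -- field names are assumptions, not the
// gateway's actual schema. See config.test.jsonc for the real format.
interface ProviderConfigSketch {
  name: string;                    // e.g. "openai", "anthropic", "groq"
  apiKey: string;
  models: string[];                // models this provider serves (or mapped names)
  pricing?: {
    inputCostPerMillion: number;   // 0 marks a zero-cost provider
    outputCostPerMillion: number;
  };
  rateLimits?: {
    requestsPerMinute?: number;
    tokensPerMinute?: number;
  };
}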

Running with Docker

  1. Build the Docker image:

    npm run docker-build
  2. Create a config directory (optional):

    mkdir -p ./config
    cp config.test.jsonc ./config/config.jsonc  # Optional: copy example config
  3. Run the Docker container:

    docker run --rm -p 3000:3000 -v $(pwd)/config:/config --name llm-gateway-container llm-gateway
    • The gateway will be accessible on http://localhost:3000.
    • The UI is served from the same port at the / route.
    • Configuration is mounted from your host machine. If you don't have an existing config file, an empty one will be created for you.

Basic Usage

Send requests to the gateway's OpenAI-compatible endpoint:

curl -X POST http://localhost:3000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer any-string-is-valid" \
  -d '{
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
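
Because the endpoint is OpenAI-compatible, existing OpenAI client libraries can talk to the gateway by overriding their base URL. The TypeScript sketch below assumes the official openai npm package; since the gateway accepts any bearer token, the API key value is arbitrary.

import OpenAI from "openai";

// Point a standard OpenAI client at the gateway instead of api.openai.com.
// The gateway accepts any bearer token, so apiKey can be any string.
const client = new OpenAI({
  baseURL: "http://localhost:3000/v1",
  apiKey: "any-string-is-valid",
});

const response = await client.chat.completions.create({
  model: "gpt-4", // routed to whichever configured provider serves this model
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(response.choices[0].message.content);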

Provider Selection Algorithm

The gateway uses an intelligent, cost-optimized routing algorithm that works as follows:

  1. Model Matching: Identifies all providers that support the requested model (by name or mapped name)
  2. Rate Limit Filtering: Removes providers that have exceeded their configured rate limits
  3. Cost-Based Partitioning: Separates remaining providers into two groups:
    • Zero-cost providers: Those with all pricing fields explicitly set to 0
    • Paid providers: All others, including those with undefined/unknown pricing
  4. Selection Strategy:
    • If zero-cost providers are available, randomly selects one to distribute load
    • Otherwise, selects the lowest-cost paid provider based on:
      • Primary sort: Input cost per million tokens
      • Secondary sort: Output cost per million tokens
      • Providers with undefined costs are deprioritized (sorted last)

This algorithm ensures cost optimization while maintaining high availability through intelligent failover.
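
To make the partitioning and sort order concrete, the minimal TypeScript sketch below walks through steps 3 and 4. It assumes the candidate list has already been filtered for model support and rate limits (steps 1 and 2); the field and function names are illustrative, not the actual Router implementation.

// Illustrative sketch of steps 3-4 above. Field and function names are
// assumptions, not the gateway's actual Router code. Candidates are assumed
// to already support the requested model and be within their rate limits.
interface CandidateProvider {
  name: string;
  inputCostPerMillion?: number;  // undefined = unknown pricing
  outputCostPerMillion?: number;
}

function selectProvider(candidates: CandidateProvider[]): CandidateProvider | undefined {
  // Step 3: partition into zero-cost vs. paid (unknown pricing counts as paid).
  const zeroCost = candidates.filter(
    (p) => p.inputCostPerMillion === 0 && p.outputCostPerMillion === 0,
  );
  const paid = candidates.filter((p) => !zeroCost.includes(p));

  // Step 4a: prefer a random zero-cost provider to spread load.
  if (zeroCost.length > 0) {
    return zeroCost[Math.floor(Math.random() * zeroCost.length)];
  }

  // Step 4b: otherwise pick the cheapest paid provider -- input cost first,
  // then output cost; undefined costs sort last.
  const cost = (v?: number) => (v === undefined ? Number.POSITIVE_INFINITY : v);
  return [...paid].sort(
    (a, b) =>
      cost(a.inputCostPerMillion) - cost(b.inputCostPerMillion) ||
      cost(a.outputCostPerMillion) - cost(b.outputCostPerMillion),
  )[0];
}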

Architecture

The gateway uses a pipeline pattern with singleton managers for core services:

  • ConfigManager: Loads and validates configuration.
  • UsageManager: Tracks and enforces rate limits in real-time.
  • Router: Selects the optimal provider for each incoming request using the cost-optimized algorithm.
  • UnifiedExecutor: Executes the request against the chosen provider using the Vercel AI SDK.
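
A rough sketch of how these pieces fit together for a single request is shown below; the interfaces and method names are assumptions for illustration, not the project's actual API.

// Minimal sketch of the request pipeline. These interfaces and method names
// are assumptions for illustration, not the gateway's actual API.
interface ConfigManager { getProviders(): string[] }
interface Router { selectProvider(model: string, providers: string[]): string }
interface UnifiedExecutor { execute(provider: string, body: unknown): Promise<{ text: string; tokens: number }> }
interface UsageManager { record(provider: string, tokens: number): void }

async function handleChatCompletion(
  model: string,
  body: unknown,
  deps: { config: ConfigManager; router: Router; executor: UnifiedExecutor; usage: UsageManager },
): Promise<string> {
  const providers = deps.config.getProviders();                  // validated configuration
  const provider = deps.router.selectProvider(model, providers); // cost-optimized selection
  const result = await deps.executor.execute(provider, body);    // Vercel AI SDK call
  deps.usage.record(provider, result.tokens);                    // real-time usage tracking
  return result.text;
}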

Technology Stack

  • Backend: Node.js, Express, TypeScript, Vercel AI SDK, Zod, Pino, rate-limiter-flexible
  • Frontend: Vue 3, Vite, Pinia, Vue Router
  • Build Tools: npm (use npx for binaries like tsc, tsx, vue-tsc, prettier), Docker, and the docker-build npm script.
