This project serves the Lean REPL using FastAPI. It supports massive parallelization to verify Lean 4 proofs at scale.
📄 Technical report: Technical Report
- High-throughput Lean4 proof verification
- FastAPI-based async server with configurable concurrency
- REPL pooling and context caching for performance
Clone this repository and change directory:
git clone git@github.com:project-numina/kimina-lean-server.git
cd kimina-lean-server
You can build the Docker image with (add --build-arg LEAN_VERSION=v4.18.0 if you don't want the default v4.15.0 Lean version):
cp .env.template .env
docker compose up -d
Test it works with a request:
curl --request POST \
--url http://localhost/verify \
--header 'Content-Type: application/json' \
--data '{
"codes": [
{
"custom_id": "1234",
"proof": "#check Nat"
}
],
"infotree_type": "original"
}' | jq
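The same request can be sent from Python. The sketch below uses only the standard library. The field names (codes, custom_id, proof, infotree_type) come from the curl example above; the helper names build_verify_payload and post_verify are hypothetical, and the URL assumes the Docker setup listening on http://localhost.

```python
import json
import urllib.request


def build_verify_payload(proofs, infotree_type="original"):
    """Build the JSON body for /verify from a {custom_id: proof} mapping."""
    return {
        "codes": [
            {"custom_id": str(custom_id), "proof": proof}
            for custom_id, proof in proofs.items()
        ],
        "infotree_type": infotree_type,
    }


def post_verify(payload, url="http://localhost/verify", timeout=60):
    """POST the payload and return the decoded JSON (needs a running server)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


payload = build_verify_payload({"1234": "#check Nat"})
print(json.dumps(payload, indent=2))
# With the container running: result = post_verify(payload)
```

Keeping payload construction separate from the network call makes the request body easy to inspect before anything is sent.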
To shut down the container / view logs:
docker compose down
docker compose logs -f
First, install elan, the Lean version manager: reference.
After installing elan, make sure that elan --version works correctly. (lake --version should also work after elan is properly installed.)
Install dependencies:
pip install -e .
Set Up the Lean Environment:
bash setup.sh
This script installs Lean 4 and builds mathlib4 and repl in the current working directory.
Start the FastAPI server:
cp .env.template .env
python -m server
Once running, the server exposes a FastAPI application for LeanREPL interaction.
Note
Make sure mathlib4 and repl exist in the workspace directory before launching the server.
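A quick pre-flight check for this can be sketched in Python. The helper name missing_workspace_dirs is hypothetical and not part of the project; it only tests that the two directories exist, not that they were built successfully.

```python
from pathlib import Path


def missing_workspace_dirs(workspace, required=("mathlib4", "repl")):
    """Return the required directory names that are absent from the workspace."""
    root = Path(workspace)
    return [name for name in required if not (root / name).is_dir()]


missing = missing_workspace_dirs(Path.cwd())
if missing:
    print("Missing in workspace:", ", ".join(missing))
else:
    print("Workspace looks ready.")
```

Pass the value of LEANSERVER_WORKSPACE instead of the current directory if the workspace lives elsewhere.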
The server is up! You can test the endpoint with:
pytest
You can verify a large number of Lean proofs in parallel using the following example:
import nest_asyncio
from client import Lean4Client
# Enable nested asyncio for Jupyter notebooks
nest_asyncio.apply()
client = Lean4Client(base_url="http://127.0.0.1:12332")
mock_proof = """import Mathlib
import Aesop
set_option maxHeartbeats 0
open BigOperators Real Nat Topology Rat
theorem lean_workbook_10009 (a b c: ℝ) (ha : a ≥ 0 ∧ b ≥ 0 ∧ c ≥ 0 ∧ a + b + c = 1): a^3 + b^3 + c^3 + (15 * a * b * c)/4 ≥ 1/4 := by
nlinarith [sq_nonneg (a - b), sq_nonneg (b - c), sq_nonneg (c - a),
sq_nonneg (a + b + c)]"""
response = client.verify([
{"proof": mock_proof, "custom_id": "1"},
{"proof": mock_proof, "custom_id": "2"}
], timeout=30)
response:
{
"results":
[
{
"custom_id": "1",
"error": null,
"response": {
"messages": [
{
"severity": "error",
"pos": {"line": 8, "column": 0},
"endPos": {"line": 9, "column": 22},
"data": "linarith failed to find a contradiction\ncase a\na b c : ℝ\nha : a ≥ 0 ∧ b ≥ 0 ∧ c ≥ 0 ∧ a + b + c = 1\na✝ : 1 / 4 > a ^ 3 + b ^ 3 + c ^ 3 + 15 * a * b * c / 4\n⊢ False\nfailed"
}
],
"env": 1,
"time": 1.0048656463623047
}
},
{
"custom_id": "2",
"..."
}
]
}
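To turn a response of this shape into a pass/fail summary, something like the sketch below may help. It assumes a proof counts as valid when its top-level error is null and none of its messages has severity "error"; that criterion is an assumption, not the project's official definition, and the helper name summarize_results and the second sample entry are made up for illustration.

```python
def summarize_results(response):
    """Map each custom_id to True when no error is reported for that proof."""
    summary = {}
    for result in response.get("results", []):
        messages = (result.get("response") or {}).get("messages") or []
        has_error_message = any(m.get("severity") == "error" for m in messages)
        summary[result["custom_id"]] = (
            result.get("error") is None and not has_error_message
        )
    return summary


# Trimmed-down sample mirroring the structure of the response above.
sample = {
    "results": [
        {
            "custom_id": "1",
            "error": None,
            "response": {
                "messages": [{"severity": "error", "data": "linarith failed"}],
                "env": 1,
            },
        },
        # Hypothetical entry with no messages, for contrast.
        {"custom_id": "ok", "error": None, "response": {"messages": [], "env": 1}},
    ]
}
print(summarize_results(sample))
```

A stricter check might also reject proofs whose messages contain warnings about sorry; that is left out here.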
Variable | Default | Description |
---|---|---|
LEANSERVER_HOST | 0.0.0.0 | Host address to bind the server |
LEANSERVER_PORT | 12332 | Port number for the server |
LEANSERVER_API_KEY | None | Optional API key for authentication |
LEANSERVER_LOG_DIR | ./logs | Directory where logs are stored |
LEANSERVER_LOG_LEVEL | INFO | Logging level (DEBUG, INFO, ERROR, etc.) |
LEANSERVER_WORKSPACE | $(pwd) | Root directory containing mathlib and repl |
LEANSERVER_MAX_REPLS | CPU count | Maximum number of Lean REPL instances |
LEANSERVER_MAX_CONCURRENT_REQUESTS | CPU count | Maximum number of concurrent requests in the Lean REPL |
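As an illustration of how these variables and defaults fit together, the sketch below resolves them from an environment mapping. It is not the server's actual configuration code; the function name load_settings and the returned key names are hypothetical, and only the variable names and defaults come from the table.

```python
import os


def load_settings(env=None):
    """Resolve the documented LEANSERVER_* variables, falling back to defaults."""
    env = os.environ if env is None else env
    cpu_count = os.cpu_count() or 1
    return {
        "host": env.get("LEANSERVER_HOST", "0.0.0.0"),
        "port": int(env.get("LEANSERVER_PORT", "12332")),
        "api_key": env.get("LEANSERVER_API_KEY"),  # None when unset
        "log_dir": env.get("LEANSERVER_LOG_DIR", "./logs"),
        "log_level": env.get("LEANSERVER_LOG_LEVEL", "INFO"),
        "workspace": env.get("LEANSERVER_WORKSPACE", os.getcwd()),
        "max_repls": int(env.get("LEANSERVER_MAX_REPLS", cpu_count)),
        "max_concurrent_requests": int(
            env.get("LEANSERVER_MAX_CONCURRENT_REQUESTS", cpu_count)
        ),
    }


settings = load_settings({"LEANSERVER_PORT": "8000", "LEANSERVER_MAX_REPLS": "4"})
print(settings["port"], settings["max_repls"], settings["host"])
```

Calling load_settings() with no argument reads from the real process environment.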
You can run the benchmarks in the benchmarks directory on the Goedel-LM/Lean-workbook-proofs dataset.
If running benchmarks from an end-user computer, you may face the following error:
tenacity.before_sleep:log_it:65 - Retrying main.Lean4Client._query..query_with_retries in 10.0 seconds as it raised ClientConnectorError: Cannot connect to host 127.0.0.1:80 ssl:default [Too many open files].
You can check the maximum number of open files on your machine with ulimit -n (256 on a MacBook Pro). It may be smaller than what's needed to run the benchmark: increase it with ulimit -n 65535.
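If the benchmark is launched from a Python process, the limit can also be inspected and raised from within that process using the standard resource module (Unix only). This is a sketch rather than part of the project: the helper name raise_nofile_limit is hypothetical, and the change applies only to the current process and its children, not to the shell.

```python
import resource  # Unix-only standard library module


def raise_nofile_limit(target=65535):
    """Try to raise the soft open-file limit toward target; return the limit in effect."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return soft  # already unlimited
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)  # the soft limit cannot exceed the hard limit
    if soft >= target:
        return soft  # nothing to do
    try:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    except (ValueError, OSError):
        return soft  # the OS refused the new value; keep the old limit
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


print("Open-file soft limit:", raise_nofile_limit())
```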
Server logs may show the following failure when a REPL is acquired just before it was due to be deleted. The failure is confined to the cache and does not affect performance.
Failed to evict header 'import Mathlib\nimport Aesop with id 306512ca-8935-4cdb-b88b-7510e0c98ac3, putting it back
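The log message suggests that cached REPLs are identified by the import header of a proof. As a rough illustration of that idea only, the sketch below splits Lean source into its leading import lines and the remaining body. The function split_header is hypothetical, and the server's real header extraction may differ.

```python
def split_header(proof):
    """Split Lean source into its leading import lines and the remaining body."""
    lines = proof.splitlines()
    index = 0
    # Consume leading import lines, tolerating blank lines between them.
    while index < len(lines) and (
        lines[index].startswith("import ") or not lines[index].strip()
    ):
        index += 1
    header = "\n".join(line for line in lines[:index] if line.strip())
    body = "\n".join(lines[index:])
    return header, body


header, body = split_header(
    "import Mathlib\nimport Aesop\nset_option maxHeartbeats 0\ntheorem t : True := trivial"
)
print(repr(header))
```

Proofs sharing the same header could then reuse a REPL that has already loaded those imports, which is where the time saving from caching would come from.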
Mode | Valid Proofs (%) | Total Verification Time (s) | Average Verification Time (s) |
---|---|---|---|
Cached | 96.00 | 350.29 | 3.65 |
Non-Cached | 96.00 | 493.67 | 5.14 |
Note:
- The benchmarks were run on a machine with 10 CPUs (MacBook Pro M2).
- Script used: benchmark.py
- Dataset: First 100 samples from Goedel-LM/Lean-workbook-proofs
- Params: timeout = 60s, batch = 1, num_proc = 10 (number of CPU cores)
- Server: LEANSERVER_MAX_REPLS = 10 and LEANSERVER_MAX_CONCURRENT_REQUESTS = 10
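The figures in the table can be cross-checked with a few lines of arithmetic. The reported averages appear to match the total time divided by the 96 valid proofs rather than by all 100 samples; that reading is inferred from the numbers, not stated by the benchmark script.

```python
VALID_PROOFS = 96  # 96.00% of the 100 samples

# Total verification times in seconds, copied from the table.
totals = {"Cached": 350.29, "Non-Cached": 493.67}

averages = {mode: round(total / VALID_PROOFS, 2) for mode, total in totals.items()}
speedup = round(totals["Non-Cached"] / totals["Cached"], 2)
time_saved_pct = round(100 * (1 - totals["Cached"] / totals["Non-Cached"]), 1)

print(averages)        # average seconds per valid proof, per mode
print(speedup)         # how many times faster the cached run is
print(time_saved_pct)  # percentage of total time saved by caching
```

On this sample, caching makes the run about 1.41 times faster, which is roughly 29% less total verification time.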
Contributions are welcome! Please open an issue or submit a pull request.
This project is licensed under the MIT License. You are free to use, modify, and distribute this software with proper attribution. See the LICENSE file for full details.