PRIME-RL

P1 Public

P1: Mastering Physics Olympiads with Reinforcement Learning

SimpleVLA-RL Public

SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning

Python 946 44

Entropy-Mechanism-of-RL Public

The Entropy Mechanism of Reinforcement Learning for Large Language Model Reasoning.

Python 367 12

RL-Compositionality Public

FROM $f(x)$ AND $g(x)$ TO $f(g(x))$: LLMs Learn New Skills in RL by Composing Old Ones

Python 33 3

TTRL Public

[NeurIPS 2025] TTRL: Test-Time Reinforcement Learning

Python 885 65

PRIME Public

Scalable RL solution for advanced reasoning of language models

Python 1.8k 99
