PRIME-RL
Researching scalable reinforcement learning (RL) methods for language models.
Repositories
- RL-Compositionality (Public)
  FROM $f(x)$ AND $g(x)$ TO $f(g(x))$: LLMs Learn New Skills in RL by Composing Old Ones
- Entropy-Mechanism-of-RL (Public)
  The Entropy Mechanism of Reinforcement Learning for Large Language Model Reasoning.