Pull requests: opendilab/LightZero
Pull requests list
feature(xjy): Unizero changes MCTs to PPO for strategy optimization in the Jericho environment
    Labels: research (Research work in progress)
    #425 opened Oct 8, 2025 by xiongjyu
    
feature(tj): add monitoring for the gradient conflict metric of MoE in ScaleZero
    #421 opened Sep 27, 2025 by tAnGjIa520
    
feature(xjy): add RND configuration in unizero environment
    Labels: enhancement (New feature or request), research (Research work in progress)
    #420 opened Sep 26, 2025 by xiongjyu
    
feature(tj): add monitoring for the gradient conflict metric of MoE in ScaleZero
    Labels: config (New or improved configuration), research (Research work in progress)
    #418 opened Sep 19, 2025 by tAnGjIa520
    
feature(pu): add atari/dmc multitask and balance pipeline in ScaleZero paper
    Labels: config (New or improved configuration), enhancement (New feature or request), research (Research work in progress)
    #417 opened Sep 18, 2025 by puyuan1996
    
fix(xjy): adding the messenger environment
    Labels: environment (New or improved environment), research (Research work in progress)
    #405 opened Aug 18, 2025 by xiongjyu
    
fix(tj): add moe grad analysis toy example
    Labels: config (New or improved configuration)
    #401 opened Aug 12, 2025 by tAnGjIa520
    
fix(pu): fix longrun performance of muzero in mspacman and qbert
    Labels: bug (Something isn't working), config (New or improved configuration)
    #400 opened Aug 12, 2025 by puyuan1996
    
fix(tj): finetune spaceinvaders from atari26 pretrained ckpt in ScaleZero
    Labels: enhancement (New feature or request), research (Research work in progress)
    #399 opened Aug 12, 2025 by tAnGjIa520
    
WIP: polish(pu): add a polished version of qwen prior policy
    #397 opened Aug 11, 2025 by puyuan1996
    
WIP: feature(nyz/pu): add init version of async demo using task pipeline
    #396 opened Aug 5, 2025 by puyuan1996
    
WIP: feature(pu): add init version of async unizero using multi-threading
    #395 opened Aug 1, 2025 by puyuan1996
    
feature(xjy): add multi-task learning pipeline in jericho environment
    Labels: config (New or improved configuration), enhancement (New feature or request)
    #365 opened May 27, 2025 by xiongjyu
    
fix(pu): fix chess reset bug when use alphazero ctree
    #364 opened May 23, 2025 by puyuan1996
    
How to fix the bug of loading trained model for evaluation
    #340 opened Apr 2, 2025 by xiongjyu
    
feature(xjy): add mamba2 as a unizero backbone option
    Labels: algorithm (New algorithm)
    #338 opened Mar 31, 2025 by xiongjyu
    
WIP: feature(pu): add muzero with history encoder
    Labels: algorithm (New algorithm), enhancement (New feature or request)
    #334 opened Mar 21, 2025 by puyuan1996
    
feature(khev): add equation solver env and related configs
    Labels: enhancement (New feature or request), environment (New or improved environment)
    #331 opened Mar 17, 2025 by Khev
    
WIP: feature(whl): add decoder regularization
    Labels: enhancement (New feature or request)
    #326 opened Feb 21, 2025 by kxzxvbk
    
WIP: feature(whl): add pretrained llm for unizero
    Labels: research (Research work in progress)
    #310 opened Dec 24, 2024 by kxzxvbk
    
feature(pu): add seller env, self-judge pipeline and mcts/alphazero config
    Labels: algorithm (New algorithm), config (New or improved configuration), environment (New or improved environment)
    #276 opened Sep 19, 2024 by puyuan1996
    
Requesting Guidance on training and testing in a tetris environment. #265
    Labels: environment (New or improved environment)
    #267 opened Aug 17, 2024 by lunathanael
    
feature(wrh): add adaptive batch size for transition
    Labels: enhancement (New feature or request)
    #256 opened Jul 31, 2024 by ruiheng123