This repository contains our research on the comparative analysis of Reinforcement Learning (RL) algorithms in continuous control environments. We focus on benchmarking algorithms with discrete and continuous policies across a range of tasks in OpenAI Gym and MuJoCo.
In this project, we conduct a comparative study of discrete and continuous policies: Expected SARSA and DDQN on the discrete side, and TRPO and PPO on the continuous side. We evaluate these algorithms across tasks in OpenAI Gym and MuJoCo, using three feature-construction methods to represent the continuous state space: the Fourier basis, radial basis functions (RBF), and tile coding. Our aim is to provide a comprehensive evaluation of these algorithms using a range of metrics, including convergence speed, bias, and average return.
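As a concrete illustration of one of these state representations, here is a minimal sketch of order-n Fourier basis features over a continuous state. The function name `fourier_features` and the assumption that the state has already been rescaled to [0, 1]^d are illustrative, not taken from this repository's code:

```python
import numpy as np

def fourier_features(state, order=3):
    """Return cos(pi * c . s) for every coefficient vector c in {0..order}^d.

    Sketch only: assumes `state` is already normalized to [0, 1]^d.
    """
    state = np.asarray(state, dtype=np.float64)
    d = state.shape[0]
    # Enumerate all coefficient vectors c in {0, 1, ..., order}^d.
    coeffs = np.array(
        np.meshgrid(*[np.arange(order + 1)] * d)
    ).reshape(d, -1).T
    return np.cos(np.pi * coeffs @ state)

# Example: a 2-D state mapped to (order + 1)^2 = 16 features.
phi = fourier_features(np.array([0.25, 0.8]), order=3)
print(phi.shape)  # (16,)
```

The resulting feature vector can then be used as the input representation for a linear value-function approximator such as the one Expected SARSA updates.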
This project is a collaborative effort by Cheryl Wang, Alan Yang, and Haowei Qiu; each member contributed equally.
*Here are code references for the TRPO, PPO, and DDQN algorithms.