Project in reinforcement learning

Trainee Achievements

IGERT trainees Mansley and Weinstein collaborated on a project in reinforcement learning. The classic definition of Markov Decision Processes (MDPs) is restricted to discrete state and action spaces. Prior work has extended the model to continuous state spaces in order to reach domains, such as robotics, where continuous states are the natural fit. Their work extends the model further to include continuous action spaces. The core idea of the algorithm is to perform an optimization over a stochastic function with a good exploration/exploitation trade-off, focusing on immediate (rather than global) planning. This enables planning in domains that had previously been thought too difficult or too expensive to handle. The work was presented at the International Conference on Machine Learning (ICML) 2010 and has been accepted to the International Conference on Automated Planning and Scheduling (ICAPS) 2011, where Mansley has been selected to run the workshop.