Prakash Panangaden (McGill University and Mila)
hosted by Rupak Majumdar
"Distributional analysis of sampling-based RL algorithms"
Distributional reinforcement learning (RL) is a new approach to RL with the emphasis on the distribution of the rewards obtained rather than just the expected reward, as in traditional RL. In this work we take the distributional point of view and analyse a number of sampling-based algorithms such as value iteration, TD(0) and policy iteration. These algorithms have been shown to converge under various assumptions, but usually with completely different proofs. We have developed a new viewpoint which allows us to prove convergence using a uniform approach. The idea is based on couplings and on viewing the approximation algorithms as Markov processes in their own right. It originated from work on bisimulation metrics, on which I have been working for the last quarter century. This is joint work with Philip Amortila (U. Illinois), Marc Bellemare (Google Brain) and Doina Precup (McGill, Mila and DeepMind).
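As a point of reference for the sampling-based algorithms the abstract mentions, here is a minimal tabular TD(0) sketch on a small Markov reward process. This is an illustrative toy example only, not the construction from the talk; the chain, rewards, and step-size parameters are all hypothetical choices.

```python
import random

random.seed(0)

STATES = [0, 1, 2]
# P[s] lists (next_state, probability) pairs; R[s] is the reward received in s.
# Hypothetical ergodic 3-state chain, chosen only for illustration.
P = {0: [(1, 0.7), (2, 0.3)], 1: [(0, 0.4), (2, 0.6)], 2: [(0, 1.0)]}
R = {0: 1.0, 1: 0.0, 2: 0.0}
GAMMA = 0.9

def sample_next(s):
    """Sample a successor state according to the transition law P[s]."""
    r, acc = random.random(), 0.0
    for s2, p in P[s]:
        acc += p
        if r < acc:
            return s2
    return P[s][-1][0]

def td0(num_steps=200_000, alpha=0.01):
    """Run the TD(0) stochastic-approximation update
    V(s) += alpha * (R(s) + gamma * V(s') - V(s))
    along a single sampled trajectory."""
    V = {s: 0.0 for s in STATES}
    s = 0
    for _ in range(num_steps):
        s2 = sample_next(s)
        V[s] += alpha * (R[s] + GAMMA * V[s2] - V[s])
        s = s2
    return V

def exact_values():
    """Solve the Bellman equation V = R + gamma * P V by fixed-point
    iteration, for comparison with the sampled estimate."""
    V = {s: 0.0 for s in STATES}
    for _ in range(1000):
        V = {s: R[s] + GAMMA * sum(p * V[s2] for s2, p in P[s])
             for s in STATES}
    return V
```

The sequence of value tables produced by `td0` is itself a Markov process on the space of value functions, which is the kind of object the talk's coupling-based analysis treats directly.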
Bio: Prakash Panangaden is a Professor of Computer Science at McGill University. His research interests are primarily in the theoretical foundations of computer science, with a focus on stochastic systems, but range from black holes and curved space-time to reinforcement learning. He has received numerous awards, including the Test of Time Award at LICS. He is a Fellow of the ACM.
Time: Wednesday, 05.05.2021, 15:00