Senior Staff Research Scientist
   Google Research (Jun. 2020 - present)

Senior Research Scientist
   Facebook AI Research (FAIR) (Oct. 2018 - Jun. 2020)

Senior Staff Research Scientist
   Google DeepMind (Jun. 2017 - Oct. 2018)

Senior Analytics Researcher
   Adobe Research (Oct. 2013 - Jun. 2017)

Chargé de Recherche CR1
   INRIA Lille - Team SequeL (2010 - Oct. 2013)

Chargé de Recherche CR2
   INRIA Lille - Team SequeL (2008 - 2010)

Habilitation à Diriger des Recherches (HDR)
   Université Lille 1, France (June 2014)

Postdoctoral Fellow
   University of Alberta, Canada (2005 - 2008)

Ph.D. in Computer Science
   University of Massachusetts Amherst, USA (2001 - 2005)

Machine Learning, Artificial Intelligence, Reinforcement Learning, Online Learning, Recommendation Systems, Control

ghavamza at google dot com
mohammad dot ghavamzadeh51 at gmail dot com



  • I will serve as a senior area chair for NeurIPS-2021.

  • Our paper on “Neural Lyapunov Redesign” got accepted at the Learning for Dynamics & Control Conference (L4DC-2021).

  • Our paper on “Stochastic Bandits with Linear Constraints” got accepted at AISTATS-2021.

  • Our paper on “Control-aware Representations for Model-based Reinforcement Learning” got accepted at ICLR-2021.

  • Our paper on “Deep Bayesian Quadrature Policy Optimization” got accepted at AAAI-2021.

  • I serve as an area chair for ICML-2021.

  • I serve as an area chair for ICLR-2021.


  • Our paper on “Active Learning for Classification with Abstention” was short-listed as one of the six finalists for the Jack Keil Wolf student paper award at the IEEE International Symposium on Information Theory (ISIT-2020).

  • Our paper on “Mirror Descent Policy Optimization” was accepted for a contributed talk (8 out of about 250 submissions) at the Deep Reinforcement Learning Workshop at NeurIPS-2020.

  • Eleven conference papers published: “Improved Algorithms for Conservative Exploration in Bandits” at AAAI-2020, “Prediction, Consistency, Curvature: Representation Learning for Locally-Linear Control” at ICLR-2020, “Randomized Exploration in Generalized Linear Bandits” and “Conservative Exploration in Reinforcement Learning” at AISTATS-2020, “Active Learning for Classification with Abstention” at the IEEE International Symposium on Information Theory (ISIT-2020), “Adaptive Sampling for Estimating Probability Distributions”, “Multi-step Greedy Reinforcement Learning Algorithms”, and “Predictive Coding for Locally-Linear Control” at ICML-2020, “Active Model Estimation in Markov Decision Processes” at UAI-2020, “Safe Policy Learning for Continuous Control” at CoRL-2020, and “Online Planning with Lookahead Policies” at NeurIPS-2020.

  • I gave an invited talk on “Conservative Exploration in Bandits and Reinforcement Learning” at the ICML workshop on “Challenges in Deploying and Monitoring Machine Learning Systems”, and an invited talk at the “Reinforcement Learning Theory Session” at INFORMS-2020.

  • I co-chaired a tutorial on “Exploration-Exploitation in Reinforcement Learning” at AAAI-2020.

  • I served as a senior area chair for NeurIPS-2020, and as an area chair for ICML-2020 and AISTATS-2020.


  • Our paper on “Tight Regret Bounds for Model-based Reinforcement Learning with Greedy Policies” was accepted for a spotlight presentation at NeurIPS-2019.

  • Six conference papers published: “Tight Regret Bounds for Model-based Reinforcement Learning with Greedy Policies” at NeurIPS-2019, “Perturbed-History Exploration in Stochastic Linear Bandits” at UAI-2019, “Perturbed-History Exploration in Stochastic Multi-Armed Bandits” at IJCAI-2019, “Garbage In, Reward Out: Bootstrapping Exploration in Multi-Armed Bandits” at ICML-2019, and “Risk-sensitive Generative Adversarial Imitation Learning” and “Optimizing over a Restricted Policy Class in MDPs” at AISTATS-2019.

  • Our paper on “Lyapunov-based Policy Optimization for Continuous Control” won the best paper award at the ICML-2019 workshop on “Reinforcement Learning in Real Life”.

  • I co-chaired a workshop on “Safety and Robustness in Decision-making” at NeurIPS-2019.

  • I served as an area chair for ICML-2019 and NeurIPS-2019.


  • Two journal papers published: “Proximal Gradient Temporal Difference Learning: Stable Reinforcement Learning with Polynomial Sample Complexity” at Journal of Artificial Intelligence Research (JAIR), and “Risk-Constrained Reinforcement Learning with Percentile Risk Criteria” at Journal of Machine Learning Research (JMLR).

  • Six conference papers published: “A Lyapunov-based Approach to Safe Reinforcement Learning” and “A Block Coordinate Ascent Algorithm for Mean-Variance Optimization” at NIPS-2018, “Path Consistency Learning in Tsallis Entropy Regularized MDPs” and “More Robust Doubly Robust Off-policy Evaluation” at ICML-2018, “Robust Locally-Linear Controllable Embedding” at AISTATS-2018, and “PAC Bandits with Risk Constraints” at ISAIM-2018.

  • I gave an invited talk on “Three Approaches to Safety in Sequential Decision-making” at the ICML workshop on “Machine Learning for Causal Inference, Counterfactual Prediction, and Autonomous Action” (Causal ML).

  • I taught at the Deep Learning & Reinforcement Learning summer school organized by CIFAR and the Vector Institute at the University of Toronto in August.

  • I served as an area chair for NIPS-2018 and ICML-2018, and as a senior program committee member for IJCAI-2018 and AAAI-2018.


  • A journal paper published: “Sequential Decision-making with Coherent Risk” at IEEE Transactions on Automatic Control (TAC).

  • Eight conference papers published: “Conservative Contextual Linear Bandits” at NIPS-2017, “Active Learning for Accurate Estimation of Linear Models”, “Bottleneck Conditional Density Estimation”, “Diffusion Independent Semi-Bandit Influence Maximization”, and “Online Learning to Rank in Stochastic Click Models” at ICML-2017, “Sequential Multiple Hypothesis Testing with Type I Error Control” at AISTATS-2017, “Predictive Off-Policy Evaluation for Nonstationary Decision Problems” and “Automated Data Cleansing through Meta-Learning” at IAAI-2017.

  • Together with Marek Petrik, I gave a tutorial on “Risk-averse Decision-making and Control” at AAAI-2017.

  • I gave an invited talk at the 2nd Asian Workshop on Reinforcement Learning in Seoul, South Korea on November 15, 2017.

  • I served as an area chair for NIPS-2017 and as a senior program committee member for AAAI-2017.


  • Four journal papers published: “Analysis of Classification-based Policy Iteration Algorithms”, “Bayesian Policy Gradient and Actor-Critic Algorithms”, and “Regularized Policy Iteration for Non-Parametric Function Spaces” at Journal of Machine Learning Research (JMLR), and “Variance-constrained Actor-Critic Algorithms for Discounted and Average Reward MDPs” at Machine Learning Journal (MLJ).

  • Four conference papers published: “Safe Policy Improvement by Minimizing Robust Baseline Regret” at NIPS-2016, “Improved Learning Complexity in Combinatorial Pure Exploration Bandits” at AISTATS-2016, “Proximal Gradient Temporal Difference Learning Algorithms” at the sister conference best paper track at IJCAI-2016, and “Graphical Model Sketch” at ECML-2016.

  • I gave an invited talk at the 13th European Workshop on Reinforcement Learning (EWRL) in Barcelona on December 3-4, 2016.

  • I served as a senior program committee member for IJCAI-2016 and ECML-2016.


  • Three journal papers published: “Approximate Modified Policy Iteration and its Application to the Game of Tetris” at Journal of Machine Learning Research (JMLR), “Classification-based Approximate Policy Iteration” at IEEE Transactions on Automatic Control (TAC), and “Bayesian Reinforcement Learning: A Survey” at Foundations and Trends in Machine Learning.

  • Five conference papers published: “High Confidence Off-Policy Evaluation” at AAAI-2015, “Maximum Entropy Semi-Supervised Inverse Reinforcement Learning” at IJCAI-2015, “Building Personalized Ad Recommendation Systems for Life-Time Value Optimization with Guarantees” at IJCAI-2015, “High Confidence Policy Improvement” at ICML-2015, and “Policy Gradient for Coherent Risk Measures” at NIPS-2015.

  • Our paper entitled “Finite-Sample Analysis of Proximal Gradient TD Algorithms” won the Facebook best student paper award at UAI-2015.

  • I co-chaired two workshops: the 12th European Workshop on Reinforcement Learning (EWRL-12), held as a workshop at ICML-2015, and “Machine Learning in eCommerce” at NIPS-2015.

  • My student, Victor Gabillon, won the AFIA (French Association for Artificial Intelligence) prize for the 2nd best Ph.D. thesis (completed in 2014) on artificial intelligence in France.

  • I served as a senior program committee member for IJCAI-2015.


  • A paper published: “Algorithms for CVaR Optimization in MDPs” at NIPS-2014.

  • I co-chaired three workshops: “Sequential Decision-Making with Big Data” at AAAI-2014, “Customers Value Optimization in Digital Marketing” at ICML-2014, and “Large-scale Reinforcement Learning and Markov Decision Problems” at NIPS-2014.

  • I successfully defended my “Habilitation à Diriger des Recherches” (HDR) thesis and graduated my Ph.D. student Victor Gabillon in June 2014. Victor will be a postdoc with Prof. Peter Bartlett at UC Berkeley starting October 2014.

  • I served as an area chair for NIPS-2014.