Publications By Year

JOURNALS

JMLR = 8, IEEE Trans. & Journals = 5, MLJ = 1, JAIR = 1, Automatica = 1, Foundations & Trends = 1
JAAMAS = 1

CONFERENCES

ICML = 29, NeurIPS = 27, AISTATS = 13, IJCAI = 6, AAAI = 6, ICLR = 7, UAI = 3, AAMAS = 3
ALT = 1, ACC = 2, CDC = 1

2026

Conference

Zhuotong Chen, Fang Liu, Jennifer Zhu, Jiayu Li, Yanjun Qi, Haozhu Wang, & Mohammad Ghavamzadeh. “Preference Optimization via Contrastive Divergence: Your Policy is Secretly an NLL Estimator”. Accepted for Oral Presentation. Proceedings of the Fortieth Conference on Artificial Intelligence (AAAI-2026), 2026.

2025

Journal

Aldo Pacchiano, Mohammad Ghavamzadeh, & Peter Bartlett. “Contextual Bandits with Stage-wise Constraints”. Journal of Machine Learning Research (JMLR), 2025 pdf

Conference

Soumya Ghosal, Souradip Chakraborty, Avinash Reddy, Yifu Li, Mengdi Wang, Dinesh Manocha, Furong Huang, Mohammad Ghavamzadeh, & Amrit Singh Bedi. “Does Thinking More Always Help? Mirage of Test-Time Scaling in Reasoning Models”. Proceedings of the Thirty-Ninth Annual Conference on Advances in Neural Information Processing Systems (NeurIPS-2025), 2025. pdf
Rohan Deb, Mohammad Ghavamzadeh, & Arindam Banerjee. “Conservative Contextual Bandits: Beyond Linear Representations”. Proceedings of the Thirteenth International Conference on Learning Representations (ICLR-2025), 2025. pdf
Jia Lin Hau, Erick Delage, Esther Derman, Mohammad Ghavamzadeh, & Marek Petrik. “Q-learning for Quantile MDPs: A Decomposition, Performance, and Convergence Analysis”. Proceedings of the Twenty-Eighth International Conference on Artificial Intelligence and Statistics (AISTATS-2025), 2025. pdf
Rohan Deb, Mohammad Ghavamzadeh, & Arindam Banerjee. “Thompson Sampling for Constrained Bandits”. Proceedings of the Second Reinforcement Learning Conference (RLC-2025), 2025. pdf
Kishan Panaganti, Zaiyan Xu, Dileep Kalathil, & Mohammad Ghavamzadeh. “Bridging Distributionally Robust Learning and Offline RL: An Approach to Mitigate Distribution Shift and Partial Data Coverage”. Proceedings of the Seventh Annual Learning for Dynamics & Control Conference (L4DC-2025), 2025.

2024

Journal

Moloud Abdar, Meenakshi Kollati, Swaraja Kuraparthi, Farhad Pourpanah, Daniel McDuff, Mohammad Ghavamzadeh, Shuicheng Yan, Abduallah Mohamed, Abbas Khosravi, Erik Cambria, & Fatih Porikli. “A Review of Deep Learning for Video Captioning”. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024 (DOI: 10.1109/TPAMI.2024.3522295)
Christina Gopfert, Alex Haig, Chih-wei Hsu, Yinlam Chow, Ivan Vendrov, Tyler Lu, Deepak Ramachandran, Hubert Pham, Mohammad Ghavamzadeh, & Craig Boutilier. “Discovering Personalized Semantics for Soft Attributes in Recommender Systems using Concept Activation Vectors”. ACM Transactions on Recommender Systems, 2024 (DOI: 10.1145/3658675)

Conference

Kyuyoung Kim, Jongheon Jeong, Minyong An, Mohammad Ghavamzadeh, Krishnamurthy Dvijotham, Jinwoo Shin, & Kimin Lee. “Confidence-aware Reward Optimization for Fine-tuning Text-to-Image Models”. Proceedings of the Twelfth International Conference on Learning Representations (ICLR-2024), 2024. pdf
Amin Rakhsha, Mete Kemertas, Mohammad Ghavamzadeh, & Amir-massoud Farahmand. “Maximum Entropy Model Correction in Reinforcement Learning”. Proceedings of the Twelfth International Conference on Learning Representations (ICLR-2024), 2024. pdf
Marek Petrik, Guy Tennenholtz, & Mohammad Ghavamzadeh. “Bayesian Regret Minimization in Offline Bandits”. Proceedings of the Forty-First International Conference on Machine Learning (ICML-2024), pp. 40502-40522, 2024. pdf
Jihwan Jeong, Yinlam Chow, Guy Tennenholtz, Chih-wei Hsu, Aza Tulepbergenov, Mohammad Ghavamzadeh, & Craig Boutilier. “Factual and Tailored Recommendation Endorsements using Language Models and Reinforcement Learning”. First Conference on Language Modeling (COLM-2024), 2024. pdf
Audrey Huang, Nan Jiang, Marek Petrik, & Mohammad Ghavamzadeh. “Non-adaptive Online Fine-tuning for Offline Reinforcement Learning”. First Reinforcement Learning Conference (RLC-2024), 2024. pdf
Mohammad Javad Azizi, Thang Nhat Duong, Yasin Abbasi Yadkori, Andras Gyorgy, Claire Vernade, & Mohammad Ghavamzadeh. “Non-Stationary Bandits and Meta-Learning with a Small Set of Optimal Arms”. First Reinforcement Learning Conference (RLC-2024), 2024. pdf

2023

Conference

Jincheng Mei, Bo Dai, Alekh Agarwal, Mohammad Ghavamzadeh, Csaba Szepesvári, & Dale Schuurmans. “Ordering-based Conditions for Global Convergence of Policy Gradient Methods”. Accepted for Oral Presentation (0.54% acceptance - 67 out of 12345 submissions). Proceedings of the Thirty-Seventh Annual Conference on Advances in Neural Information Processing Systems (NeurIPS-2023), 2023. pdf
Ying Fan, Olivia Watkins, Yuqing Du, Hao Liu, Moonkyung Ryu, Craig Boutilier, Pieter Abbeel, Mohammad Ghavamzadeh, Kangwook Lee, & Kimin Lee. “DPOK: Reinforcement Learning for Fine-tuning Text-to-Image Diffusion Models”. Proceedings of the Thirty-Seventh Annual Conference on Advances in Neural Information Processing Systems (NeurIPS-2023), 2023. pdf
Jia Lin Hau, Erick Delage, Mohammad Ghavamzadeh, & Marek Petrik. “On Dynamic Programming Decompositions of Static Risk Measures in Markov Decision Processes”. Proceedings of the Thirty-Seventh Annual Conference on Advances in Neural Information Processing Systems (NeurIPS-2023), 2023. pdf
Dhawal Gupta, Yinlam Chow, Aza Tulepbergenov, Mohammad Ghavamzadeh, & Craig Boutilier. “Offline Reinforcement Learning for Mixture-of-Expert Dialogue Management”. Proceedings of the Thirty-Seventh Annual Conference on Advances in Neural Information Processing Systems (NeurIPS-2023), 2023. pdf
Kishan Panaganti, Zaiyan Xu, Dileep Kalathil, & Mohammad Ghavamzadeh. “Distributionally Robust Behavioral Cloning for Robust Imitation Learning”. Proceedings of the Sixty-Second IEEE Conference on Decision and Control (CDC-2023), 2023. pdf
Joey Hong, Branislav Kveton, Manzil Zaheer, Sumeet Katariya, & Mohammad Ghavamzadeh. “Multi-Task Off-Policy Learning from Bandit Feedback”. Proceedings of the Fortieth International Conference on Machine Learning (ICML-2023), 2023. pdf
Yinlam Chow, Aza Tulepbergenov, Ofir Nachum, MoonKyung Ryu, Mohammad Ghavamzadeh, & Craig Boutilier. “A Mixture-of-Expert Approach to RL-based Dialogue Management”. Proceedings of the Eleventh International Conference on Learning Representations (ICLR-2023), 2023. pdf
Jia Lin Hau, Marek Petrik, & Mohammad Ghavamzadeh. “Entropic Risk Optimization in Discounted MDPs”. Proceedings of the Twenty-Sixth International Conference on Artificial Intelligence and Statistics (AISTATS-2023), 2023. pdf
Christoph Dann, Mohammad Ghavamzadeh, & Teodor Marinov. “Multiple-policy High-confidence Policy Evaluation”. Proceedings of the Twenty-Sixth International Conference on Artificial Intelligence and Statistics (AISTATS-2023), 2023. pdf
Javad Azizi, Branislav Kveton, Mohammad Ghavamzadeh, & Sumeet Katariya. “Meta-Learning for Simple Regret Minimization”. Proceedings of the Thirty-Seventh Conference on Artificial Intelligence (AAAI-2023), 2023. pdf

Workshop

Audrey Huang, Mohammad Ghavamzadeh, Nan Jiang, & Marek Petrik. “Non-adaptive Online Fine-tuning for Offline Reinforcement Learning”. Seventh Workshop on “Generalization in Planning” (GenPlan-2023), Thirty-Seventh Annual Conference on Advances in Neural Information Processing Systems (NeurIPS-2023), 2023.

2022

Conference

Kishan Panaganti, Zaiyan Xu, Dileep Kalathil, & Mohammad Ghavamzadeh. “Robust Reinforcement Learning using Offline Data”. Proceedings of the Thirty-Sixth Annual Conference on Advances in Neural Information Processing Systems (NeurIPS-2022), 2022. pdf
Amin Rakhsha, Andrew Wang, Mohammad Ghavamzadeh, & Amir-massoud Farahmand. “Operator Splitting Value Iteration”. Proceedings of the Thirty-Sixth Annual Conference on Advances in Neural Information Processing Systems (NeurIPS-2022), 2022. pdf
Ido Greenberg, Yinlam Chow, Mohammad Ghavamzadeh, & Shie Mannor. “Efficient Risk-Averse Reinforcement Learning”. Proceedings of the Thirty-Sixth Annual Conference on Advances in Neural Information Processing Systems (NeurIPS-2022), 2022. pdf
Gecia Bravo-Hermsdorff, Robert Busa-Fekete, Mohammad Ghavamzadeh, Andres Munoz Medina, & Umar Syed. “Private and Communication-Efficient Algorithms for Entropy Estimation”. Proceedings of the Thirty-Sixth Annual Conference on Advances in Neural Information Processing Systems (NeurIPS-2022), 2022. pdf
Ahmadreza Moradipari, Berkay Turan, Yasin Abbasi-Yadkori, Mahnoosh Alizadeh, & Mohammad Ghavamzadeh. “Feature and Parameter Selection in Stochastic Linear Bandits”. Proceedings of the Thirty-Ninth International Conference on Machine Learning (ICML-2022), 2022. pdf
Joey Hong, Branislav Kveton, Manzil Zaheer, Sumeet Katariya, & Mohammad Ghavamzadeh. “Deep Hierarchy in Bandits”. Proceedings of the Thirty-Ninth International Conference on Machine Learning (ICML-2022), 2022. pdf
Mohammad Javad Azizi, Branislav Kveton, & Mohammad Ghavamzadeh. “Fixed-Budget Best-Arm Identification in Structured Bandits”. Selected for a long oral presentation (4% acceptance). Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence (IJCAI-2022), 2022. pdf
Manan Tomar, Lior Shani, Yonathan Efroni, & Mohammad Ghavamzadeh. “Mirror Descent Policy Optimization”. Proceedings of the Tenth International Conference on Learning Representations (ICLR-2022), 2022. pdf
Ahmadreza Moradipari, Mohammad Ghavamzadeh, Taha Rajabzadeh, Christos Thrampoulidis, & Mahnoosh Alizadeh. “Multi-Environment Meta-Learning in Stochastic Linear Bandits”. Proceedings of IEEE International Symposium on Information Theory (ISIT-2022), 2022. pdf
Ahmadreza Moradipari, Mohammad Ghavamzadeh, & Mahnoosh Alizadeh. “Collaborative Multi-agent Stochastic Linear Bandits”. Proceedings of the 2022 American Control Conference (ACC-2022), 2022. pdf
Joey Hong, Branislav Kveton, Manzil Zaheer, & Mohammad Ghavamzadeh. “Hierarchical Bayesian Bandits”. Proceedings of the Twenty-Fifth International Conference on Artificial Intelligence and Statistics (AISTATS-2022), 2022. pdf
Joey Hong, Branislav Kveton, Manzil Zaheer, Mohammad Ghavamzadeh, & Craig Boutilier. “Thompson Sampling with a Mixture Prior”. Proceedings of the Twenty-Fifth International Conference on Artificial Intelligence and Statistics (AISTATS-2022), 2022. pdf

2021

Journal

Shubhanshu Shekhar, Mohammad Ghavamzadeh, & Tara Javidi. “Active Learning for Classification with Abstention”. IEEE Journal on Selected Areas in Information Theory (JSAIT), 2(2):705-719, 2021 (DOI: 10.1109/JSAIT.2021.3081433). pdf
Moloud Abdar, Farhad Pourpanah, Sadiq Hussain, Dana Rezazadegan, Li Liu, Mohammad Ghavamzadeh, Paul Fieguth, Xiaochun Cao, Abbas Khosravi, Rajendra Acharya, Vladimir Makarenkov, & Saeid Nahavandi. “A Review on Uncertainty Quantification in Deep Learning: Techniques, Applications, and Challenges”. Elsevier Journal on Information Fusion, 76:243-297, 2021 (DOI: 10.1016/j.inffus.2021.05.008) (winner of the Information Fusion Journal 2022 best survey award). pdf

Conference

Shubhanshu Shekhar, Greg Fields, Mohammad Ghavamzadeh, & Tara Javidi. “Adaptive Sampling for Minimax Fair Classification”. Proceedings of the Thirty-Fifth Annual Conference on Advances in Neural Information Processing Systems (NeurIPS-2021), 2021. pdf
Amir-massoud Farahmand and Mohammad Ghavamzadeh. “PID Accelerated Value Iteration Algorithm”. Proceedings of the Thirty-Eighth International Conference on Machine Learning (ICML-2021), 2021. pdf
Yinlam Chow, Brandon Cui, Moonkyung Ryu, & Mohammad Ghavamzadeh. “Variational Model-based Policy Optimization”. Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI-2021), 2021. pdf
Arash Mehrjou, Mohammad Ghavamzadeh, & Bernhard Schölkopf. “Neural Lyapunov Redesign”. Proceedings of the Third Annual Learning for Dynamics & Control Conference (L4DC-2021), 2021. pdf
Aldo Pacchiano, Mohammad Ghavamzadeh, Peter Bartlett, & Heinrich Jiang. “Stochastic Bandits with Linear Constraints”. Proceedings of the Twenty-Fourth International Conference on Artificial Intelligence and Statistics (AISTATS-2021), 2021. pdf
Brandon Cui, Yinlam Chow, & Mohammad Ghavamzadeh. “Control-aware Representations for Model-based Reinforcement Learning”. Proceedings of the Ninth International Conference on Learning Representations (ICLR-2021), 2021. pdf
Ravi-Tej Akella, Kamyar Azizzadenesheli, Mohammad Ghavamzadeh, Yisong Yue, & Anima Anandkumar. “Deep Bayesian Quadrature Policy Optimization”. Proceedings of the Thirty-Fifth Conference on Artificial Intelligence (AAAI-2021), 2021. pdf

2020

Conference

Yonathan Efroni, Mohammad Ghavamzadeh, & Shie Mannor. “Online Planning with Lookahead Policies”. Proceedings of the Thirty-Fourth Annual Conference on Advances in Neural Information Processing Systems (NeurIPS-2020), 2020. pdf
Yinlam Chow, Ofir Nachum, Aleksandra Faust, Edgar Duenez-Guzman, & Mohammad Ghavamzadeh. “Safe Policy Learning for Continuous Control”. Proceedings of the Fourth Conference on Robot Learning (CoRL-2020), 2020. pdf
Shubhanshu Shekhar, Tara Javidi, & Mohammad Ghavamzadeh. “Adaptive Sampling for Estimating Probability Distributions”. Proceedings of the Thirty-Seventh International Conference on Machine Learning (ICML-2020), 2020. pdf
Manan Tomar, Yonathan Efroni, & Mohammad Ghavamzadeh. “Multi-Step Greedy Reinforcement Learning Algorithms”. Proceedings of the Thirty-Seventh International Conference on Machine Learning (ICML-2020), 2020. pdf
Rui Shu, Tung Nguyen, Yinlam Chow, Tuan Pham, Khoat Than, Mohammad Ghavamzadeh, Stefano Ermon, & Hung Bui. “Predictive Coding for Locally-Linear Control”. Proceedings of the Thirty-Seventh International Conference on Machine Learning (ICML-2020), 2020. pdf
Jean Tarbouriech, Shubhanshu Shekhar, Mohammad Ghavamzadeh, Matteo Pirotta, & Alessandro Lazaric. “Active Model Estimation in Markov Decision Processes”. Proceedings of the Thirty-Sixth Conference on Uncertainty in Artificial Intelligence (UAI-2020), 2020. pdf
Shubhanshu Shekhar, Mohammad Ghavamzadeh, & Tara Javidi. “Active Learning for Classification with Abstention”. Proceedings of IEEE International Symposium on Information Theory (ISIT-2020), 2020 (short-listed as one of the six finalists for the Jack Keil Wolf student paper award). pdf
Evrard Garcelon, Mohammad Ghavamzadeh, Alessandro Lazaric, & Matteo Pirotta. “Conservative Exploration in Reinforcement Learning”. Proceedings of the Twenty-Third International Conference on Artificial Intelligence and Statistics (AISTATS-2020), 2020. pdf
Branislav Kveton, Manzil Zaheer, Csaba Szepesvári, Lihong Li, Mohammad Ghavamzadeh, & Craig Boutilier. “Randomized Exploration in Generalized Linear Bandits”. Proceedings of the Twenty-Third International Conference on Artificial Intelligence and Statistics (AISTATS-2020), 2020. pdf
Nir Levine, Yinlam Chow, Rui Shu, Ang Li, Mohammad Ghavamzadeh, & Hung Bui. “Prediction, Consistency, Curvature: Representation Learning for Locally Linear Control”. Proceedings of the Eighth International Conference on Learning Representations (ICLR-2020), 2020. pdf
Evrard Garcelon, Mohammad Ghavamzadeh, Alessandro Lazaric, & Matteo Pirotta. “Improved Algorithms for Conservative Exploration in Bandits”. Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-2020), 2020. pdf

Workshop

Manan Tomar, Lior Shani, Yonathan Efroni, & Mohammad Ghavamzadeh. “Mirror Descent Policy Optimization”. Workshop on “Deep Reinforcement Learning”, Thirty-Fourth Annual Conference on Advances in Neural Information Processing Systems (NeurIPS-2020), 2020 (selected for a contributed talk – 8 out of over 250 submissions).
Ravi-Tej Akella, Kamyar Azizzadenesheli, Mohammad Ghavamzadeh, Yisong Yue, & Anima Anandkumar. “Deep Bayesian Quadrature Policy Gradient”. Workshops on “Deep Reinforcement Learning” and “Challenges of Real-World Reinforcement Learning”, Thirty-Fourth Annual Conference on Advances in Neural Information Processing Systems (NeurIPS-2020), 2020.

2019

Conference

Yonathan Efroni, Nadav Merlis, Mohammad Ghavamzadeh, & Shie Mannor. “Tight Regret Bounds for Model-based Reinforcement Learning with Greedy Policies”. Accepted for Spotlight Presentation. Proceedings of the Thirty-Second Annual Conference on Advances in Neural Information Processing Systems (NeurIPS-2019), pp. 12224-12234, 2019. pdf
Branislav Kveton, Csaba Szepesvári, Mohammad Ghavamzadeh, & Craig Boutilier. “Perturbed-History Exploration in Stochastic Linear Bandits”. Proceedings of the Thirty-Fifth Conference on Uncertainty in Artificial Intelligence (UAI-2019), 2019. pdf
Branislav Kveton, Csaba Szepesvári, Mohammad Ghavamzadeh, & Craig Boutilier. “Perturbed-History Exploration in Stochastic Multi-Armed Bandits”. Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-2019), pp. 2786-2793, 2019. pdf
Branislav Kveton, Csaba Szepesvári, Sharan Vaswani, Zheng Wen, Mohammad Ghavamzadeh, & Tor Lattimore. “Garbage In, Reward Out: Bootstrapping Exploration in Multi-Armed Bandits”. Proceedings of the Thirty-Sixth International Conference on Machine Learning (ICML-2019), pp. 3601-3610, 2019. pdf
Ershad Banijamali, Yasin Abbasi-Yadkori, Mohammad Ghavamzadeh, & Nikos Vlassis. “Optimizing over a Restricted Policy Class in MDPs”. Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics (AISTATS-2019), pp. 3042-3050, 2019. pdf
Jonathan Pierre Lacotte, Mohammad Ghavamzadeh, Yinlam Chow, & Marco Pavone. “Risk-sensitive Generative Adversarial Imitation Learning”. Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics (AISTATS-2019), pp. 2154-2163, 2019. pdf

Workshop

Jorge Méndez, Alborz Geramifard, Mohammad Ghavamzadeh, & Bing Liu. “Reinforcement Learning of Multi-Domain Dialog Policies via Action Embeddings”. Workshop on “Conversational AI”, Thirty-Third Annual Conference on Advances in Neural Information Processing Systems (NeurIPS-2019), 2019.
Evrard Garcelon, Mohammad Ghavamzadeh, Alessandro Lazaric, & Matteo Pirotta. “Conservative Exploration in Finite Horizon Markov Decision Processes”. Workshop on “Safety and Robustness in Decision-making”, Thirty-Third Annual Conference on Advances in Neural Information Processing Systems (NeurIPS-2019), 2019.
Evrard Garcelon, Mohammad Ghavamzadeh, Alessandro Lazaric, & Matteo Pirotta. “Improved Algorithms for Conservative Exploration in Bandits”. Workshop on “Deep Reinforcement Learning”, Thirty-Third Annual Conference on Advances in Neural Information Processing Systems (NeurIPS-2019), 2019.
Scott Fujimoto, Edoardo Conti, Mohammad Ghavamzadeh, & Joelle Pineau. “Benchmarking Batch Deep Reinforcement Learning Algorithms”. Workshop on “Conversational AI”, Thirty-Third Annual Conference on Advances in Neural Information Processing Systems (NeurIPS-2019), 2019.
Yinlam Chow, Ofir Nachum, Aleksandra Faust, & Mohammad Ghavamzadeh. “Lyapunov-based Safe Policy Optimization for Continuous Control”. Workshop on “Reinforcement Learning for Real Life”, Thirty-Sixth International Conference on Machine Learning (ICML-2019), 2019 (winner of the best paper award).

2018

Journal

Bo Liu, Ian Gemp, Mohammad Ghavamzadeh, Ji Liu, Sridhar Mahadevan, & Marek Petrik. “Proximal Gradient Temporal Difference Learning: Stable Reinforcement Learning with Polynomial Sample Complexity”. Journal of Artificial Intelligence Research (JAIR), 63:461-494, 2018. pdf
Yinlam Chow, Mohammad Ghavamzadeh, Lucas Janson, & Marco Pavone. “Risk-Constrained Reinforcement Learning with Percentile Risk Criteria”. Journal of Machine Learning Research (JMLR), 18(167):1-51, 2018. pdf

Conference

Yinlam Chow, Ofir Nachum, Mohammad Ghavamzadeh, & Edgar Duenez-Guzman. “A Lyapunov-based Approach to Safe Reinforcement Learning”. Proceedings of the Thirty-Second Annual Conference on Advances in Neural Information Processing Systems (NeurIPS-2018), 2018. pdf
Tengyang Xie, Bo Liu, Yangyang Xu, Mohammad Ghavamzadeh, Yinlam Chow, Daoming Lyu, & Daesub Yoon. “A Block Coordinate Ascent Algorithm for Mean-Variance Optimization”. Proceedings of the Thirty-Second Annual Conference on Advances in Neural Information Processing Systems (NeurIPS-2018), pp. 1073-1083, 2018. pdf
Ofir Nachum, Yinlam Chow, & Mohammad Ghavamzadeh. “Path Consistency Learning in Tsallis Entropy Regularized MDPs”. Proceedings of the Thirty-Fifth International Conference on Machine Learning (ICML-2018), pp. 979-988, Stockholm, Sweden, July 2018. pdf
Mehrdad Farajtabar, Yinlam Chow, & Mohammad Ghavamzadeh. “More Robust Doubly Robust Off-policy Evaluation”. Proceedings of the Thirty-Fifth International Conference on Machine Learning (ICML-2018), pp. 1447-1456, Stockholm, Sweden, July 2018. pdf
Ershad Banijamali, Rui Shu, Mohammad Ghavamzadeh, Hung Bui, & Ali Ghodsi. “Robust Locally-Linear Controllable Embedding”. Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics (AISTATS-2018), pp. 1751-1759, 2018. pdf
Yahel David, Balázs Szörényi, Mohammad Ghavamzadeh, Shie Mannor, & Nahum Shimkin. “PAC Bandits with Risk Constraints”. International Symposium on Artificial Intelligence and Mathematics (ISAIM-2018), Special Session on Theory of Machine Learning, 2018. pdf

Workshop

Jonathan Lacotte, Mohammad Ghavamzadeh, Yinlam Chow, & Marco Pavone. “Risk-Sensitive Generative Adversarial Imitation Learning”. Workshop on “Safety, Risk, and Uncertainty in Reinforcement Learning”, Thirty-Fourth Conference on Uncertainty in Artificial Intelligence (UAI-2018), 2018.

2017

Journal

Aviv Tamar, Yinlam Chow, Mohammad Ghavamzadeh, & Shie Mannor. “Sequential Decision-making with Coherent Risk”. IEEE Transactions on Automatic Control (TAC), 62(7):3323-3338, 2017 (DOI: 10.1109/TAC.2016.2644871). pdf

Conference

Abbas Kazerouni, Mohammad Ghavamzadeh, Yasin Abbasi-Yadkori, & Ben Van Roy. “Conservative Contextual Linear Bandits”. Proceedings of the Thirty-First Annual Conference on Neural Information Processing Systems (NIPS-2017), pp. 3913-3922, 2017. pdf
Carlos Riquelme, Mohammad Ghavamzadeh, & Alessandro Lazaric. “Active Learning for Accurate Estimation of Linear Models”. Proceedings of the Thirty-Fourth International Conference on Machine Learning (ICML-2017), pp. 2931-2939, Sydney, Australia, August 2017. pdf
Rui Shu, Hung Bui, & Mohammad Ghavamzadeh. “Bottleneck Conditional Density Estimation”. Proceedings of the Thirty-Fourth International Conference on Machine Learning (ICML-2017), pp. 3164-3172, Sydney, Australia, August 2017. pdf
Sharan Vaswani, Branislav Kveton, Zheng Wen, Mohammad Ghavamzadeh, Laks Lakshmanan, & Mark Schmidt. “Model-Independent Online Learning for Influence Maximization”. Proceedings of the Thirty-Fourth International Conference on Machine Learning (ICML-2017), pp. 3530-3539, Sydney, Australia, August 2017. pdf
Masrour Zoghi, Tomas Tunys, Mohammad Ghavamzadeh, Branislav Kveton, Csaba Szepesvári, & Zheng Wen. “Online Learning to Rank in Stochastic Click Models”. Proceedings of the Thirty-Fourth International Conference on Machine Learning (ICML-2017), pp. 4199-4208, Sydney, Australia, August 2017. pdf
Alan Malek, Yinlam Chow, Sumeet Katariya, & Mohammad Ghavamzadeh. “Sequential Multiple Hypothesis Testing with Type I Error Control”. Proceedings of the Twentieth International Conference on Artificial Intelligence and Statistics (AISTATS-2017), pp. 1468-1476, 2017. pdf
Philip Thomas, Georgios Theocharous, Mohammad Ghavamzadeh, Ishan Durugkar, & Emma Brunskill. “Predictive Off-Policy Evaluation for Nonstationary Decision Problems”. Proceedings of the Twenty-Ninth Conference on Innovative Applications of Artificial Intelligence (IAAI-2017), pp. 4740-4745, 2017. pdf
Ian Gemp, Georgios Theocharous, & Mohammad Ghavamzadeh. “Automated Data Cleansing through Meta-Learning”. Proceedings of the Twenty-Ninth Conference on Innovative Applications of Artificial Intelligence (IAAI-2017), pp. 4760-4761, 2017. pdf

Workshop

Ershad Banijamali, Ahmad Khajenezhad, Ali Ghodsi, & Mohammad Ghavamzadeh. “Disentangling Dynamics and Content for Control and Planning”. Workshop on “Learning Disentangled Representations: from Perception to Control”, Thirty-First Annual Conference on Neural Information Processing Systems (NIPS-2017), 2017.
Ershad Banijamali, Rui Shu, Mohammad Ghavamzadeh, & Hung Bui. “Robust Controllable Embedding of High-Dimensional Observations of Markov Decision Processes”. Workshop on “Implicit Models”, Thirty-Fourth International Conference on Machine Learning (ICML-2017), Sydney, Australia, August 2017.

2016

Journal

Amir massoud Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, & Shie Mannor. “Regularized Policy Iteration for Non-Parametric Function Spaces”. Journal of Machine Learning Research (JMLR), 17(139):1-66, 2016. pdf
Prashanth L. A. and Mohammad Ghavamzadeh. “Variance-constrained Actor-Critic Algorithms for Discounted and Average Reward MDPs”. Machine Learning Journal (MLJ), 105(3):367-417, 2016 (DOI: 10.1007/s10994-016-5569-5). pdf
Mohammad Ghavamzadeh, Yaakov Engel, & Michal Valko. “Bayesian Policy Gradient and Actor-Critic Algorithms”. Journal of Machine Learning Research (JMLR), 17(66):1-53, 2016. pdf
CODE IS AVAILABLE AT 1 2
Alessandro Lazaric, Mohammad Ghavamzadeh, & Rémi Munos. “Analysis of Classification-based Policy Iteration Algorithms”. Journal of Machine Learning Research (JMLR), 17(19):1-30, 2016. pdf

Conference

Marek Petrik, Mohammad Ghavamzadeh, & Yinlam Chow. “Safe Policy Improvement by Minimizing Robust Baseline Regret”. Proceedings of the Thirtieth Annual Conference on Neural Information Processing Systems (NIPS-2016), pp. 2298-2306, 2016. pdf
Branislav Kveton, Hung Bui, Mohammad Ghavamzadeh, Georgios Theocharous, S. Muthukrishnan, & Siqi Sun. “Graphical Model Sketch”. Proceedings of the European Conference on Machine Learning (ECML-2016), Riva del Garda, Italy, 2016. pdf
Bo Liu, Mohammad Ghavamzadeh, Ian Gemp, Ji Liu, & Sridhar Mahadevan. “Proximal Gradient Temporal Difference Learning Algorithms”. Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI-2016), pp. 4195-4199, New York City, NY, July 2016. pdf
Victor Gabillon, Alessandro Lazaric, Mohammad Ghavamzadeh, Ronald Ortner, & Peter Bartlett. “Improved Learning Complexity in Combinatorial Pure Exploration Bandits”. Proceedings of the Nineteenth International Conference on Artificial Intelligence and Statistics (AISTATS-2016), pp. 1004-1012, Cadiz, Spain, May 2016. pdf

Workshop

Abbas Kazerouni, Mohammad Ghavamzadeh, & Ben Van Roy. “Safety in Contextual Linear Bandits”. Workshop on “Reliable Machine Learning in the Wild”, Thirtieth Annual Conference on Advances in Neural Information Processing Systems (NIPS-2016), Barcelona, Spain, December 2016.
Rui Shu, Hung Bui, & Mohammad Ghavamzadeh. “Bottleneck Conditional Density Estimators”. Workshop on “Bayesian Deep Learning”, Thirtieth Annual Conference on Advances in Neural Information Processing Systems (NIPS-2016), Barcelona, Spain, December 2016.
Rui Shu, James Brofos, Frank Zhang, Hung Bui, Mohammad Ghavamzadeh, & Mykel Kochenderfer. “Stochastic Video Prediction with Conditional Density Estimation”. Workshop on “Action and Anticipation for Visual Learning”, Fourteenth European Conference on Computer Vision (ECCV-2016), Amsterdam, The Netherlands, October 2016.
Marek Petrik, Yinlam Chow, & Mohammad Ghavamzadeh. “Optimally Robust Policy Improvement with Baseline Guarantees”. Workshop on “Reliable Machine Learning in the Wild”, Thirty-Third International Conference on Machine Learning (ICML-2016), New York City, NY, June 2016.

2015

Journal

Mohammad Ghavamzadeh, Shie Mannor, Joelle Pineau, & Aviv Tamar. “Bayesian Reinforcement Learning: A Survey”. Foundations and Trends in Machine Learning, 8(5-6):359-483, 2015 (DOI: 10.1561/2200000049). pdf
Bruno Scherrer, Mohammad Ghavamzadeh, Victor Gabillon, Boris Lesner, & Matthieu Geist. “Approximate Modified Policy Iteration and its Application to the Game of Tetris”. Journal of Machine Learning Research (JMLR), 16:1629-1676, 2015. pdf
Amir massoud Farahmand, Doina Precup, André Barreto, & Mohammad Ghavamzadeh. “Classification-based Approximate Policy Iteration”. IEEE Transactions on Automatic Control (TAC), 60(11):2989-2993, 2015. pdf

Conference

Aviv Tamar, Yinlam Chow, Mohammad Ghavamzadeh, & Shie Mannor. “Policy Gradient for Coherent Risk Measures”. Proceedings of the Twenty-Ninth Annual Conference on Advances in Neural Information Processing Systems (NIPS-2015), pp. 1468-1476, 2015. pdf
Bo Liu, Mohammad Ghavamzadeh, Sridhar Mahadevan, & Marek Petrik. “Finite-Sample Analysis of Proximal Gradient TD Algorithms”. Proceedings of the Thirty-First Conference on Uncertainty in Artificial Intelligence (UAI-2015), pp. 504-513, Amsterdam, Netherlands, July 2015 (winner of the Facebook best student paper award). pdf
Philip Thomas, Georgios Theocharous, & Mohammad Ghavamzadeh. “High Confidence Policy Improvement”. Proceedings of the Thirty-Second International Conference on Machine Learning (ICML-2015), pp. 2380-2388, Lille, France, July 2015. pdf
Julien Audiffren, Michal Valko, Alessandro Lazaric, & Mohammad Ghavamzadeh. “Maximum Entropy Semi-Supervised Inverse Reinforcement Learning”. Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence (IJCAI-2015), pp. 3315-3321, Buenos Aires, Argentina, July 2015. pdf
Georgios Theocharous, Philip Thomas, & Mohammad Ghavamzadeh. “Building Personalized Ad Recommendation Systems for Life-Time Value Optimization with Guarantees”. Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence (IJCAI-2015), pp. 1806-1812, Buenos Aires, Argentina, July 2015. pdf
Philip Thomas, Georgios Theocharous, & Mohammad Ghavamzadeh. “High Confidence Off-Policy Evaluation”. Proceedings of the Twenty-Ninth Conference on Artificial Intelligence (AAAI-2015), pp. 3000-3006, Austin, TX, January 2015. pdf

Workshop

Sougata Chaudhuri, Georgios Theocharous, & Mohammad Ghavamzadeh. “A Ranking Approach to Address the Click Sparsity Problem in Personalized Ad Recommendation”. Workshop on “Machine Learning for eCommerce”, Twenty-Ninth Annual Conference on Advances in Neural Information Processing Systems (NIPS-2015), Montreal, Canada, December 2015.
Aviv Tamar, Yinlam Chow, Mohammad Ghavamzadeh, & Shie Mannor. “Policy Gradient for Coherent Risk Measures”. Twelfth European Workshop on Reinforcement Learning (EWRL-12) at the Thirty-Second International Conference on Machine Learning (ICML), Lille, France, July 2015.
Georgios Theocharous, Philip Thomas, & Mohammad Ghavamzadeh. “Ad Recommendation Systems for Life-Time Value Optimization”. Workshop on “Ad Targeting at Scale”, Twenty-Fourth International World Wide Web Conference (WWW-2015), Florence, Italy, May 2015.

Tech Report

Aviv Tamar, Yinlam Chow, Mohammad Ghavamzadeh, & Shie Mannor. “Policy Gradient for Coherent Risk Measures”. arXiv:1502.03919, 2015.

2014

Conference

Yinlam Chow and Mohammad Ghavamzadeh. “Algorithms for CVaR Optimization in MDPs”. Proceedings of the Twenty-Eighth Annual Conference on Advances in Neural Information Processing Systems (NIPS-2014), pp. 3509-3517, 2014. pdf

Workshop

Yinlam Chow and Mohammad Ghavamzadeh. “Constrained Stochastic Optimal Control with a Baseline Performance Guarantee”. Workshop on “From Bad Models to Good Policies”, Twenty-Eighth Annual Conference on Advances in Neural Information Processing Systems (NIPS-2014), Montreal, Canada, December 2014.
Julien Audiffren, Michal Valko, Alessandro Lazaric, & Mohammad Ghavamzadeh. “Maximum Entropy Semi-Supervised Inverse Reinforcement Learning”. Workshop on “Novel Trends and Applications in Reinforcement Learning”, Twenty-Eighth Annual Conference on Advances in Neural Information Processing Systems (NIPS-2014), Montreal, Canada, December 2014.
Philip Thomas, Georgios Theocharous, & Mohammad Ghavamzadeh. “Safe Policy Search”. Workshop on “Customer Life-Time Value Optimization in Digital Marketing”, Thirty-First International Conference on Machine Learning (ICML-2014), Beijing, China, June 2014.

Tech Report

Yinlam Chow and Mohammad Ghavamzadeh. “Constrained Stochastic Optimal Control with a Baseline Performance Guarantee”. arXiv:1410.2726, 2014.
Yinlam Chow and Mohammad Ghavamzadeh. “Algorithms for CVaR Optimization in MDPs”. arXiv:1406.3339, 2014.

Habilitation Thesis

Mohammad Ghavamzadeh. “Sample Complexity in Sequential Decision-Making”. Department of Mathematics, Université Lille 1 - Sciences et Technologies, France, June 2014. pdf

2013

Conference

Prashanth L. A. and Mohammad Ghavamzadeh. “Actor-Critic Algorithms for Risk-Sensitive MDPs”. Accepted for Oral Presentation (1.4% acceptance - 20 out of 1420 submissions). Proceedings of the Twenty-Seventh Annual Conference on Advances in Neural Information Processing Systems (NIPS-2013), pp. 252-260, 2013. pdf
Victor Gabillon, Mohammad Ghavamzadeh, & Bruno Scherrer. “Approximate Dynamic Programming Finally Performs Well in the Game of Tetris”. Proceedings of the Twenty-Seventh Annual Conference on Advances in Neural Information Processing Systems (NIPS-2013), pp. 1754-1762, 2013. pdf
Bernardo Ávila Pires, Mohammad Ghavamzadeh, & Csaba Szepesvári. “Cost-sensitive Multiclass Classification Risk Bounds”. Proceedings of the Thirtieth International Conference on Machine Learning (ICML-2013), pp. 28(3):1391-1399, Atlanta, GA, 2013. pdf
Hachem Kadri, Mohammad Ghavamzadeh, & Philippe Preux. “A Generalized Kernel Approach to Structured Output Learning”. Proceedings of the Thirtieth International Conference on Machine Learning (ICML-2013), pp. 28(1):471-479, Atlanta, GA, 2013. pdf
Amir massoud Farahmand, Doina Precup, André Barreto, & Mohammad Ghavamzadeh. “CAPI: Generalized Classification-based Approximate Policy Iteration”. The First Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM-2013), Princeton, NJ, 2013.

Tech Report

Prashanth L. A. and Mohammad Ghavamzadeh. “Actor-Critic Algorithms for Risk-Sensitive MDPs”. Technical Report inria-00794721, INRIA, 2013.

2012

Journal

Alessandro Lazaric, Mohammad Ghavamzadeh, & Rémi Munos. “Finite-Sample Analysis of Least-Squares Policy Iteration”. Journal of Machine Learning Research (JMLR), 13:3041-3074, 2012. pdf

Conference

Victor Gabillon, Mohammad Ghavamzadeh, & Alessandro Lazaric. “A Unified Approach to Fixed Budget and Fixed Confidence”. Proceedings of the Twenty-Sixth Annual Conference on Advances in Neural Information Processing Systems (NIPS-2012), pp. 3221-3229, 2012. pdf
Bruno Scherrer, Mohammad Ghavamzadeh, Victor Gabillon, & Matthieu Geist. “Approximate Modified Policy Iteration”. Proceedings of the Twenty-Ninth International Conference on Machine Learning (ICML-2012), pp. 1207-1214, Edinburgh, Scotland, 2012. pdf
Matthieu Geist, Bruno Scherrer, Alessandro Lazaric, & Mohammad Ghavamzadeh. “A Dantzig Selector Approach to Temporal Difference Learning”. Proceedings of the Twenty-Ninth International Conference on Machine Learning (ICML-2012), pp. 1399-1406, Edinburgh, Scotland, 2012. pdf
Mohammad Ghavamzadeh & Alessandro Lazaric. “Conservative and Greedy Approaches to Classification-based Policy Iteration”. Proceedings of the Twenty-Sixth Conference on Artificial Intelligence (AAAI-2012), pp. 914-920, Toronto, ON, Canada, 2012. pdf

Book Chapter

Nikos Vlassis, Mohammad Ghavamzadeh, Shie Mannor, & Pascal Poupart. “Bayesian Reinforcement Learning”. Reinforcement Learning: State of the Art, Edited by Marco Wiering and Martijn van Otterlo, Springer Verlag, 2012.
Lucian Busoniu, Alessandro Lazaric, Mohammad Ghavamzadeh, Rémi Munos, Robert Babuska, & Bert De Schutter. “Least-Squares Methods for Policy Iteration”. Reinforcement Learning: State of the Art, Edited by Marco Wiering and Martijn van Otterlo, Springer Verlag, 2012.

Workshop

Michal Valko, Mohammad Ghavamzadeh, & Alessandro Lazaric. “Semi-Supervised Inverse Reinforcement Learning”. Ninth European Workshop on Reinforcement Learning (EWRL-2012), Edinburgh, Scotland, 2012.

Tech Report

Hachem Kadri, Mohammad Ghavamzadeh, & Philippe Preux. “A Generalized Kernel Approach to Structured Output Learning”. Technical Report inria-00695631, INRIA, 2012.
Victor Gabillon, Mohammad Ghavamzadeh, & Alessandro Lazaric. “Best Arm Identification: A Unified Approach to Fixed Budget and Fixed Confidence”. Technical Report inria-00747005, INRIA, 2012.
Bruno Scherrer, Mohammad Ghavamzadeh, Victor Gabillon, & Matthieu Geist. “Approximate Modified Policy Iteration”. Technical Report inria-00697169, INRIA, 2012.

2011

Conference

Victor Gabillon, Mohammad Ghavamzadeh, Alessandro Lazaric, & Sébastien Bubeck. “Multi-Bandit Best Arm Identification”. Proceedings of the Twenty-Fifth Annual Conference on Advances in Neural Information Processing Systems (NIPS-2011), pp. 2222-2230, 2011. pdf
Mohammad Azar, Rémi Munos, Mohammad Ghavamzadeh, & Hilbert Kappen. “Speedy Q-Learning”. Proceedings of the Twenty-Fifth Annual Conference on Advances in Neural Information Processing Systems (NIPS-2011), pp. 2411-2419, 2011. pdf
Alexandra Carpentier, Alessandro Lazaric, Mohammad Ghavamzadeh, Rémi Munos, & Peter Auer. “Upper-Confidence-Bound Algorithms for Active Learning in Multi-Armed Bandits”. Selected for a special issue of the Journal of Theoretical Computer Science. Proceedings of the Twenty-Second International Conference on Algorithmic Learning Theory (ALT-2011), pp. 189-203, Espoo, Finland, October 2011. pdf
Mohammad Ghavamzadeh, Alessandro Lazaric, Rémi Munos, & Matthew Hoffman. “Finite-Sample Analysis of Lasso-TD”. Proceedings of the Twenty-Eighth International Conference on Machine Learning (ICML-2011), pp. 1177-1184, Bellevue, WA, June 2011. pdf
Victor Gabillon, Alessandro Lazaric, Mohammad Ghavamzadeh, & Bruno Scherrer. “Classification-based Policy Iteration with a Critic”. Proceedings of the Twenty-Eighth International Conference on Machine Learning (ICML-2011), pp. 1049-1056, Bellevue, WA, June 2011. pdf

Workshop

Matthew Hoffman, Alessandro Lazaric, Mohammad Ghavamzadeh, & Rémi Munos. “Regularized Least Squares Temporal Difference Learning with Nested L2 and L1 Penalization”. Ninth European Workshop on Reinforcement Learning (EWRL-2011), Athens, Greece, September 2011.

Tech Report

Mohammad Azar, Rémi Munos, Mohammad Ghavamzadeh, & Hilbert Kappen. “Reinforcement Learning with a Near Optimal Rate of Convergence”. Technical Report inria-00636615, INRIA, 2011.
Victor Gabillon, Mohammad Ghavamzadeh, Alessandro Lazaric, & Sébastien Bubeck. “Multi-Bandit Best Arm Identification”. Technical Report inria-00632523, INRIA, 2011.
Alexandra Carpentier, Alessandro Lazaric, Mohammad Ghavamzadeh, Rémi Munos, & Peter Auer. “Upper-Confidence-Bound Algorithms for Active Learning in Multi-Armed Bandits”. Technical Report inria-00594131, INRIA, 2011.
Victor Gabillon, Alessandro Lazaric, Mohammad Ghavamzadeh, & Bruno Scherrer. “Classification-based Policy Iteration with a Critic”. Technical Report inria-00590972, INRIA, 2011.

2010

Conference

Mohammad Ghavamzadeh, Alessandro Lazaric, Odalric Maillard, & Rémi Munos. “LSTD with Random Projections”. Accepted for Spotlight Presentation (6% acceptance - 73 out of 1219 submissions). Proceedings of the Twenty-Fourth Annual Conference on Advances in Neural Information Processing Systems (NIPS-2010), pp. 721-729, 2010. pdf
Odalric Maillard, Rémi Munos, Alessandro Lazaric, & Mohammad Ghavamzadeh. “Finite-Sample Analysis of Bellman Residual Minimization”. Proceedings of the Second Asian Conference on Machine Learning (ACML-2010), pp. 299-314, Tokyo, Japan, November 2010. pdf
Alessandro Lazaric & Mohammad Ghavamzadeh. “Bayesian Multi-Task Reinforcement Learning”. Proceedings of the Twenty-Seventh International Conference on Machine Learning (ICML-2010), pp. 599-606, Haifa, Israel, June 2010. pdf
Alessandro Lazaric, Mohammad Ghavamzadeh, & Rémi Munos. “Analysis of a Classification-based Policy Iteration Algorithm”. Proceedings of the Twenty-Seventh International Conference on Machine Learning (ICML-2010), pp. 607-614, Haifa, Israel, June 2010. pdf
Alessandro Lazaric, Mohammad Ghavamzadeh, & Rémi Munos. “Finite-Sample Analysis of LSTD”. Proceedings of the Twenty-Seventh International Conference on Machine Learning (ICML-2010), pp. 615-622, Haifa, Israel, June 2010. pdf

Workshop

Victor Gabillon, Alessandro Lazaric, & Mohammad Ghavamzadeh. “Rollout Allocation Strategies for Classification-based Policy Iteration”. Workshop on “Reinforcement Learning and Search in Very Large Spaces”, Twenty-Seventh International Conference on Machine Learning (ICML-2010), Haifa, Israel, June 2010.

Tech Report

Mohammad Ghavamzadeh, Alessandro Lazaric, Odalric Maillard, & Rémi Munos. “LSPI with Random Projections”. Technical Report inria-00530762, INRIA, 2010.
Alessandro Lazaric, Mohammad Ghavamzadeh, & Rémi Munos. “Finite-Sample Analysis of Least-Squares Policy Iteration”. Technical Report inria-00528596, INRIA, 2010.
Alessandro Lazaric & Mohammad Ghavamzadeh. “Bayesian Multi-Task Reinforcement Learning”. Technical Report inria-00475214, INRIA, 2010.
Alessandro Lazaric, Mohammad Ghavamzadeh, & Rémi Munos. “Analysis of a Classification-based Policy Iteration Algorithm”. Technical Report inria-00482065, INRIA, 2010.
Alessandro Lazaric, Mohammad Ghavamzadeh, & Rémi Munos. “Finite-Sample Analysis of LSTD”. Technical Report inria-00482189, INRIA, 2010.

2009

Journal

Shalabh Bhatnagar, Richard Sutton, Mohammad Ghavamzadeh, & Mark Lee. “Natural Actor-Critic Algorithms”. Automatica, 45(11):2471-2482, 2009 (DOI: 10.1016/j.automatica.2009.07.008). (the longer version is available as a UAlberta Tech-Report) pdf

Conference

Amir massoud Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, & Shie Mannor. “Regularized Fitted Q-iteration for Planning in Continuous-Space Markovian Decision Problems”. Proceedings of the 2009 American Control Conference (ACC-2009), pp. 725-730, St. Louis, MO, June 2009. pdf

Workshop

Mohammad Ghavamzadeh. “Hierarchical Hybrid Reinforcement Learning Algorithms”. Workshop on “Bridging the Gap between High-level Discrete Representations and Low-level Continuous Behaviors”, Robotics: Science and Systems Conference (RSS-2009), Seattle, WA, June 2009. pdf
Amir massoud Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, & Shie Mannor. “Robot Learning with Regularized Reinforcement Learning”. Workshop on “Regression in Robotics—Approaches and Applications”, Robotics: Science and Systems Conference (RSS-2009), Seattle, WA, June 2009. pdf
Mohammad Ghavamzadeh & Yaakov Engel. “Bayesian Actor Critic: A Bayesian Model for Value Function Approximation and Policy Learning”. Workshop on “Regression in Robotics—Approaches and Applications”, Robotics: Science and Systems Conference (RSS-2009), Seattle, WA, June 2009. pdf
Amir massoud Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, & Shie Mannor. “Regularization in Reinforcement Learning”. Multidisciplinary Symposium on Reinforcement Learning (MSRL-2009), Montreal, QC, Canada, June 2009. pdf

Tech Report

Shalabh Bhatnagar, Richard Sutton, Mohammad Ghavamzadeh, & Mark Lee. “Natural Actor-Critic Algorithms”. Technical Report TR09-10, Department of Computing Science, University of Alberta, 2009.

2008

Conference

Amir massoud Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, & Shie Mannor. “Regularized Policy Iteration”. Proceedings of the Twenty-Second Annual Conference on Advances in Neural Information Processing Systems (NIPS-2008), pp. 441-448, 2008. pdf

Workshop

Amir massoud Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, & Shie Mannor. “Regularized Fitted Q-iteration: Application to Bounded Resource Planning”. Proceedings of the Eighth European Workshop on Reinforcement Learning (EWRL-2008), volume 5323 of Lecture Notes in Artificial Intelligence, pp. 55-68, Villeneuve d’Ascq, France, July 2008. pdf
Amir massoud Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, & Shie Mannor. “Regularized Policy Iteration”. Eighth European Workshop on Reinforcement Learning (EWRL-2008), Villeneuve d’Ascq, France, July 2008.

2007

Journal

Mohammad Ghavamzadeh & Sridhar Mahadevan. “Hierarchical Average Reward Reinforcement Learning”. Journal of Machine Learning Research (JMLR), 8:2629-2669, 2007. pdf

Conference

Shalabh Bhatnagar, Richard Sutton, Mohammad Ghavamzadeh, & Mark Lee. “Incremental Natural Actor-Critic Algorithms”. Accepted for Spotlight Presentation (10% acceptance - 101 out of 975 submissions). Proceedings of the Twenty-First Annual Conference on Advances in Neural Information Processing Systems (NIPS-2007), pp. 105-112, 2007. pdf
Mohammad Ghavamzadeh & Yaakov Engel. “Bayesian Actor-Critic Algorithms”. Proceedings of the Twenty-Fourth International Conference on Machine Learning (ICML-2007), pp. 297-304, Oregon State University, Corvallis, OR, June 2007. pdf

2006

Journal

Mohammad Ghavamzadeh, Sridhar Mahadevan, & Rajbala Makar. “Hierarchical Multiagent Reinforcement Learning”. Journal of Autonomous Agents and Multi-Agent Systems (JAAMAS), 13(2):197-229, 2006 (DOI: 10.1007/s10458-006-7035-4). pdf

Conference

Mohammad Ghavamzadeh & Yaakov Engel. “Bayesian Policy Gradient Algorithms”. Accepted for Spotlight Presentation (7.5% acceptance - 63 out of 833 submissions). Proceedings of the Twentieth Annual Conference on Advances in Neural Information Processing Systems (NIPS-2006), pp. 457-464, 2006. pdf

Workshop

Mohammad Ghavamzadeh & Yaakov Engel. “Bayesian Policy Gradient”. Workshop on “Kernel Machines and Reinforcement Learning” (KRL), Twenty-Third International Conference on Machine Learning (ICML-2006), Pittsburgh, PA, June 2006. pdf
Mohammad Ghavamzadeh & Sridhar Mahadevan. “Learning to Cooperate using Hierarchical Reinforcement Learning”. Workshop on “Hierarchical Autonomous Agents and Multi-Agent Systems” (H-AAMAS), Fifth International Joint Conference on Autonomous Agents and Multi-Agent Systems (AAMAS-2006), Hakodate, Japan, May 2006. pdf

2005

PhD Thesis

Mohammad Ghavamzadeh. “Hierarchical Reinforcement Learning in Continuous State and Multi-Agent Environments”. Department of Computer Science, University of Massachusetts Amherst, May 2005.

2004

Conference

Mohammad Ghavamzadeh & Sridhar Mahadevan. “Learning to Communicate and Act using Hierarchical Reinforcement Learning”. Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS-2004), pp. 1114-1121, New York City, NY, July 2004. pdf

Book Chapter

Sridhar Mahadevan, Mohammad Ghavamzadeh, Khashayar Rohanimanesh, & Georgios Theocharous. “Hierarchical Approaches to Concurrency, Multiagency, and Partial Observability”. Learning and Approximate Dynamic Programming: Scaling up to the Real World, Edited by Jennie Si, Andrew Barto, Warren Powell and Donald Wunsch, John Wiley & Sons, New York, pp. 285-310, 2004. pdf

Tech Report

Mohammad Ghavamzadeh & Sridhar Mahadevan. “Hierarchical Multiagent Reinforcement Learning”. Technical Report UM-CS-2004-02. Department of Computer Science, University of Massachusetts Amherst, 2004.

2003

Conference

Mohammad Ghavamzadeh & Sridhar Mahadevan. “Hierarchical Policy Gradient Algorithms”. Proceedings of the Twentieth International Conference on Machine Learning (ICML-2003), pp. 226-233, Washington, D.C., August 2003. pdf

Tech Report

Mohammad Ghavamzadeh, Sridhar Mahadevan, & Rajbala Makar. “Extending Hierarchical Reinforcement Learning to Continuous-Time, Average-Reward, and Multi-Agent Models”. Technical Report UM-CS-2003-23, Department of Computer Science, University of Massachusetts Amherst, 2003.
Mohammad Ghavamzadeh & Sridhar Mahadevan. “Hierarchical Average Reward Reinforcement Learning”. Technical Report UM-CS-2003-19, Department of Computer Science, University of Massachusetts Amherst, 2003.

2002

Conference

Mohammad Ghavamzadeh & Sridhar Mahadevan. “Hierarchically Optimal Average Reward Reinforcement Learning”. Proceedings of the Nineteenth International Conference on Machine Learning (ICML-2002), pp. 195-202, Sydney, Australia, July 2002. pdf
Mohammad Ghavamzadeh & Sridhar Mahadevan. “A Multiagent Reinforcement Learning Algorithm by Dynamically Merging Markov Decision Processes”. Proceedings of the First International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS-2002), pp. 845-846, Bologna, Italy, July 2002. pdf

2001

Journal

Ali M. Eydgahi & Mohammad Ghavamzadeh. “Complementary Root Locus Revisited”. IEEE Transactions on Education, 44(2):137-143, 2001. pdf

Conference

Mohammad Ghavamzadeh & Sridhar Mahadevan. “Continuous-Time Hierarchical Reinforcement Learning”. Proceedings of the Eighteenth International Conference on Machine Learning (ICML-2001), pp. 186-193, Williams College, MA, July 2001. pdf
Rajbala Makar, Sridhar Mahadevan, & Mohammad Ghavamzadeh. “Hierarchical Multi-Agent Reinforcement Learning”. Proceedings of the Fifth International Conference on Autonomous Agents (Agents-2001), pp. 246-253, Montreal, Canada, June 2001 (winner of the best student paper award). pdf

Before 2001

Conference

Mohammad Ghavamzadeh, Caro Lucas, & Shahin Shayan Arani. “Forecasting the International Oil and Gold Prices Using Artificial Neural Networks”. Proceedings of the Conference on Computer Science and Information Technologies (CSIT-1997), Yerevan, Armenia, September 1997.
Ali M. Eydgahi & Mohammad Ghavamzadeh. (in Farsi) “Properties of Branches Passing through Infinity in Root Locus Method”. Journal of Faculty of Engineering University of Tehran, pp. 1-10, December 1996.
Ali M. Eydgahi & Mohammad Ghavamzadeh. (in Farsi) “Branches Passing through Infinity in Root Locus Method”. Journal of Faculty of Engineering University of Tehran, pp. 9-15, June 1996.
Mohammad Ghavamzadeh & Ali M. Eydgahi. (in Farsi) “An Adaptive Fuzzy Controller for Flexible Joint Robots”. Proceedings of the International Conference on Intelligent & Cognitive Systems, pp. 88-92, Tehran, Iran, September 1996.
Mohammad Ghavamzadeh, Khashayar Rohanimanesh, Ali M. Eydgahi, & Bahram Poorali. “Design of an ISDN Terminal”. Proceedings of the Twenty-First IEEE International Conference on Industrial Electronics, Control, Instrumentation and Automation (IECON-1995), pp. 1598-1601, Orlando, FL, November 1995.
Mohammad Ghavamzadeh & Ali M. Eydgahi. (in Farsi) “A New Approach to Root Locus for Positive Feedback Systems”. Proceedings of the Third Iranian Conference on Electrical Engineering, pp. 23-30, Tehran, Iran, May 1995.
Mohammad Ghavamzadeh, Khashayar Rohanimanesh, Ali M. Eydgahi, & Bahram Poorali. (in Farsi) “Design of an ISDN Telephone Terminal with 8751H Intel Micro-Controller”. Proceedings of the Third Iranian Conference on Electrical Engineering, Tehran, Iran, May 1995.
