Publications By Year
JOURNALS
JMLR = 7, IEEE Trans. & Journals = 4, MLJ = 1, JAIR = 1, Automatica = 1, Foundations & Trends = 1, JAAMAS = 1
CONFERENCES
ICML = 29, NeurIPS = 26, AISTATS = 12, IJCAI = 6, AAAI = 5, ICLR = 6, UAI = 3, AAMAS = 3, ALT = 1, ACC = 2, CDC = 1
2024
Journal
- Christina Gopfert, Alex Haig, Chih-wei Hsu, Yinlam Chow, Ivan Vendrov, Tyler Lu, Deepak Ramachandran, Hubert Pham, Mohammad Ghavamzadeh, & Craig Boutilier. “Discovering Personalized Semantics for Soft Attributes in Recommender Systems using Concept Activation Vectors”. ACM Transactions on Recommender Systems, 2024 (DOI: 10.1145/3658675)
Conference
Kyuyoung Kim, Jongheon Jeong, Minyong An, Mohammad Ghavamzadeh, Krishnamurthy Dvijotham, Jinwoo Shin, & Kimin Lee. “Confidence-aware Reward Optimization for Fine-tuning Text-to-Image Models”. Proceedings of the Twelfth International Conference on Learning Representations (ICLR-2024), 2024. pdf
Amin Rakhsha, Mete Kemertas, Mohammad Ghavamzadeh, & Amir-massoud Farahmand. “Maximum Entropy Model Correction in Reinforcement Learning”. Proceedings of the Twelfth International Conference on Learning Representations (ICLR-2024), 2024. pdf
Marek Petrik, Guy Tennenholtz, & Mohammad Ghavamzadeh. “Bayesian Regret Minimization in Offline Bandits”. Proceedings of the Forty-First International Conference on Machine Learning (ICML-2024), pp. 40502-40522, 2024. pdf
Jihwan Jeong, Yinlam Chow, Guy Tennenholtz, Chih-wei Hsu, Aza Tulepbergenov, Mohammad Ghavamzadeh, & Craig Boutilier. “Factual and Tailored Recommendation Endorsements using Language Models and Reinforcement Learning”. First Conference on Language Modeling (COLM-2024), 2024. pdf
Audrey Huang, Nan Jiang, Marek Petrik, & Mohammad Ghavamzadeh. “Non-adaptive Online Fine-tuning for Offline Reinforcement Learning”. First Reinforcement Learning Conference (RLC-2024), 2024. pdf
Mohammad Javad Azizi, Thang Nhat Duong, Yasin Abbasi-Yadkori, Andras Gyorgy, Claire Vernade, & Mohammad Ghavamzadeh. “Non-Stationary Bandits and Meta-Learning with a Small Set of Optimal Arms”. First Reinforcement Learning Conference (RLC-2024), 2024. pdf
2023
Conference
Jincheng Mei, Bo Dai, Alekh Agarwal, Mohammad Ghavamzadeh, Csaba Szepesvári, & Dale Schuurmans. “Ordering-based Conditions for Global Convergence of Policy Gradient Methods”. Accepted for Oral Presentation (0.54% acceptance - 67 out of 12345 submissions). Proceedings of the Thirty-Seventh Annual Conference on Advances in Neural Information Processing Systems (NeurIPS-2023), 2023. pdf
Ying Fan, Olivia Watkins, Yuqing Du, Hao Liu, Moonkyung Ryu, Craig Boutilier, Pieter Abbeel, Mohammad Ghavamzadeh, Kangwook Lee, & Kimin Lee. “DPOK: Reinforcement Learning for Fine-tuning Text-to-Image Diffusion Models”. Proceedings of the Thirty-Seventh Annual Conference on Advances in Neural Information Processing Systems (NeurIPS-2023), 2023. pdf
Jia Lin Hau, Erick Delage, Mohammad Ghavamzadeh, & Marek Petrik. “On Dynamic Programming Decompositions of Static Risk Measures in Markov Decision Processes”. Proceedings of the Thirty-Seventh Annual Conference on Advances in Neural Information Processing Systems (NeurIPS-2023), 2023. pdf
Dhawal Gupta, Yinlam Chow, Aza Tulepbergenov, Mohammad Ghavamzadeh, & Craig Boutilier. “Offline Reinforcement Learning for Mixture-of-Expert Dialogue Management”. Proceedings of the Thirty-Seventh Annual Conference on Advances in Neural Information Processing Systems (NeurIPS-2023), 2023. pdf
Kishan Panaganti, Zaiyan Xu, Dileep Kalathil, & Mohammad Ghavamzadeh. “Distributionally Robust Behavioral Cloning for Robust Imitation Learning”. Proceedings of the Sixty-Second IEEE Conference on Decision and Control (CDC-2023), 2023. pdf
Joey Hong, Branislav Kveton, Manzil Zaheer, Sumeet Katariya, & Mohammad Ghavamzadeh. “Multi-Task Off-Policy Learning from Bandit Feedback”. Proceedings of the Fortieth International Conference on Machine Learning (ICML-2023), 2023. pdf
Yinlam Chow, Aza Tulepbergenov, Ofir Nachum, MoonKyung Ryu, Mohammad Ghavamzadeh, & Craig Boutilier. “A Mixture-of-Expert Approach to RL-based Dialogue Management”. Proceedings of the Eleventh International Conference on Learning Representations (ICLR-2023), 2023. pdf
Jia Lin Hau, Marek Petrik, & Mohammad Ghavamzadeh. “Entropic Risk Optimization in Discounted MDPs”. Proceedings of the Twenty-Sixth International Conference on Artificial Intelligence and Statistics (AISTATS-2023), 2023. pdf
Chris Dunn, Mohammad Ghavamzadeh, & Teodor Marinov. “Multiple-policy High-confidence Policy Evaluation”. Proceedings of the Twenty-Sixth International Conference on Artificial Intelligence and Statistics (AISTATS-2023), 2023. pdf
Javad Azizi, Branislav Kveton, Mohammad Ghavamzadeh, & Sumeet Katariya. “Meta-Learning for Simple Regret Minimization”. Proceedings of the Thirty-Seventh Conference on Artificial Intelligence (AAAI-2023), 2023. pdf
Workshop
- Audrey Huang, Mohammad Ghavamzadeh, Nan Jiang, & Marek Petrik. “Non-adaptive Online Fine-tuning for Offline Reinforcement Learning”. Seventh Workshop on “Generalization in Planning” (GenPlan-2023), Thirty-Seventh Annual Conference on Advances in Neural Information Processing Systems (NeurIPS-2023), 2023.
2022
Conference
Kishan Panaganti, Zaiyan Xu, Dileep Kalathil, & Mohammad Ghavamzadeh. “Robust Reinforcement Learning using Offline Data”. Proceedings of the Thirty-Sixth Annual Conference on Advances in Neural Information Processing Systems (NeurIPS-2022), 2022. pdf
Amin Rakhsha, Andrew Wang, Mohammad Ghavamzadeh, & Amir-massoud Farahmand. “Operator Splitting Value Iteration”. Proceedings of the Thirty-Sixth Annual Conference on Advances in Neural Information Processing Systems (NeurIPS-2022), 2022. pdf
Ido Greenberg, Yinlam Chow, Mohammad Ghavamzadeh, & Shie Mannor. “Efficient Risk-Averse Reinforcement Learning”. Proceedings of the Thirty-Sixth Annual Conference on Advances in Neural Information Processing Systems (NeurIPS-2022), 2022. pdf
Gecia Bravo-Hermsdorff, Robert Busa-Fekete, Mohammad Ghavamzadeh, Andres Munoz Medina, & Umar Syed. “Private and Communication-Efficient Algorithms for Entropy Estimation”. Proceedings of the Thirty-Sixth Annual Conference on Advances in Neural Information Processing Systems (NeurIPS-2022), 2022. pdf
Ahmadreza Moradipari, Berkay Turan, Yasin Abbasi-Yadkori, Mahnoosh Alizadeh, & Mohammad Ghavamzadeh. “Feature and Parameter Selection in Stochastic Linear Bandits”. Proceedings of the Thirty-Ninth International Conference on Machine Learning (ICML-2022), 2022. pdf
Joey Hong, Branislav Kveton, Manzil Zaheer, Sumeet Katariya, & Mohammad Ghavamzadeh. “Deep Hierarchy in Bandits”. Proceedings of the Thirty-Ninth International Conference on Machine Learning (ICML-2022), 2022. pdf
Mohammad Javad Azizi, Branislav Kveton, & Mohammad Ghavamzadeh. “Fixed-Budget Best-Arm Identification in Structured Bandits”. Selected for a long oral presentation (4% acceptance). Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence (IJCAI-2022), 2022. pdf
Manan Tomar, Lior Shani, Yonathan Efroni, & Mohammad Ghavamzadeh. “Mirror Descent Policy Optimization”. Proceedings of the Tenth International Conference on Learning Representations (ICLR-2022), 2022. pdf
Ahmadreza Moradipari, Mohammad Ghavamzadeh, Taha Rajabzadeh, Christos Thrampoulidis, & Mahnoosh Alizadeh. “Multi-Environment Meta-Learning in Stochastic Linear Bandits”. Proceedings of IEEE International Symposium on Information Theory (ISIT-2022), 2022. pdf
Ahmadreza Moradipari, Mohammad Ghavamzadeh, & Mahnoosh Alizadeh. “Collaborative Multi-agent Stochastic Linear Bandits”. Proceedings of the 2022 American Control Conference (ACC-2022), 2022. pdf
Joey Hong, Branislav Kveton, Manzil Zaheer, & Mohammad Ghavamzadeh. “Hierarchical Bayesian Bandits”. Proceedings of the Twenty-Fifth International Conference on Artificial Intelligence and Statistics (AISTATS-2022), 2022. pdf
Joey Hong, Branislav Kveton, Manzil Zaheer, Mohammad Ghavamzadeh, & Craig Boutilier. “Thompson Sampling with a Mixture Prior”. Proceedings of the Twenty-Fifth International Conference on Artificial Intelligence and Statistics (AISTATS-2022), 2022. pdf
2021
Journal
Shubhanshu Shekhar, Mohammad Ghavamzadeh, & Tara Javidi. “Active Learning for Classification with Abstention”. IEEE Journal on Selected Areas in Information Theory (JSAIT), 2(2):705-719, 2021 (DOI: 10.1109/JSAIT.2021.3081433). pdf
Moloud Abdar, Farhad Pourpanah, Sadiq Hussain, Dana Rezazadegan, Li Liu, Mohammad Ghavamzadeh, Paul Fieguth, Xiaochun Cao, Abbas Khosravi, Rajendra Acharya, Vladimir Makarenkov, & Saeid Nahavandi. “A Review on Uncertainty Quantification in Deep Learning: Techniques, Applications, and Challenges”. Elsevier Journal on Information Fusion, 76:243-297, 2021 (DOI: 10.1016/j.inffus.2021.05.008) (winner of the Information Fusion Journal 2022 best survey award). pdf
Conference
Shubhanshu Shekhar, Greg Fields, Mohammad Ghavamzadeh, & Tara Javidi. “Adaptive Sampling for Minimax Fair Classification”. Proceedings of the Thirty-Fifth Annual Conference on Advances in Neural Information Processing Systems (NeurIPS-2021), 2021. pdf
Amir-massoud Farahmand and Mohammad Ghavamzadeh. “PID Accelerated Value Iteration Algorithm”. Proceedings of the Thirty-Eighth International Conference on Machine Learning (ICML-2021), 2021. pdf
Yinlam Chow, Brandon Cui, Moonkyung Ryu, & Mohammad Ghavamzadeh. “Variational Model-based Policy Optimization”. Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI-2021), 2021. pdf
Arash Mehrjou, Mohammad Ghavamzadeh, & Bernhard Schölkopf. “Neural Lyapunov Redesign”. Proceedings of the Third Annual Learning for Dynamics & Control Conference (L4DC-2021), 2021. pdf
Aldo Pacchiano, Mohammad Ghavamzadeh, Peter Bartlett, & Heinrich Jiang. “Stochastic Bandits with Linear Constraints”. Proceedings of the Twenty-Fourth International Conference on Artificial Intelligence and Statistics (AISTATS-2021), 2021. pdf
Brandon Cui, Yinlam Chow, & Mohammad Ghavamzadeh. “Control-aware Representations for Model-based Reinforcement Learning”. Proceedings of the Ninth International Conference on Learning Representations (ICLR-2021), 2021. pdf
Ravi-Tej Akella, Kamyar Azizzadenesheli, Mohammad Ghavamzadeh, Yisong Yue, & Anima Anandkumar. “Deep Bayesian Quadrature Policy Optimization”. Proceedings of the Thirty-Fifth Conference on Artificial Intelligence (AAAI-2021), 2021. pdf
2020
Conference
Yonathan Efroni, Mohammad Ghavamzadeh, & Shie Mannor. “Online Planning with Lookahead Policies”. Proceedings of the Thirty-Fourth Annual Conference on Advances in Neural Information Processing Systems (NeurIPS-2020), 2020. pdf
Yinlam Chow, Ofir Nachum, Aleksandra Faust, Edgar Duenez-Guzman, & Mohammad Ghavamzadeh. “Safe Policy Learning for Continuous Control”. Proceedings of the Fourth Conference on Robot Learning (CoRL-2020), 2020. pdf
Shubhanshu Shekhar, Tara Javidi, & Mohammad Ghavamzadeh. “Adaptive Sampling for Estimating Probability Distributions”. Proceedings of the Thirty-Seventh International Conference on Machine Learning (ICML-2020), 2020. pdf
Manan Tomar, Yonathan Efroni, & Mohammad Ghavamzadeh. “Multi-Step Greedy Reinforcement Learning Algorithms”. Proceedings of the Thirty-Seventh International Conference on Machine Learning (ICML-2020), 2020. pdf
Rui Shu, Tung Nguyen, Yinlam Chow, Tuan Pham, Khoat Than, Mohammad Ghavamzadeh, Stefano Ermon, & Hung Bui. “Predictive Coding for Locally-Linear Control”. Proceedings of the Thirty-Seventh International Conference on Machine Learning (ICML-2020), 2020. pdf
Jean Tarbouriech, Shubhanshu Shekhar, Mohammad Ghavamzadeh, Matteo Pirotta, & Alessandro Lazaric. “Active Model Estimation in Markov Decision Processes”. Proceedings of the Thirty-Sixth Conference on Uncertainty in Artificial Intelligence (UAI-2020), 2020. pdf
Shubhanshu Shekhar, Mohammad Ghavamzadeh, & Tara Javidi. “Active Learning for Classification with Abstention”. Proceedings of IEEE International Symposium on Information Theory (ISIT-2020), 2020 (short-listed as one of the six finalists for the Jack Keil Wolf student paper award). pdf
Evrard Garcelon, Mohammad Ghavamzadeh, Alessandro Lazaric, & Matteo Pirotta. “Conservative Exploration in Reinforcement Learning”. Proceedings of the Twenty-Third International Conference on Artificial Intelligence and Statistics (AISTATS-2020), 2020. pdf
Branislav Kveton, Manzil Zaheer, Csaba Szepesvári, Lihong Li, Mohammad Ghavamzadeh, & Craig Boutilier. “Randomized Exploration in Generalized Linear Bandits”. Proceedings of the Twenty-Third International Conference on Artificial Intelligence and Statistics (AISTATS-2020), 2020. pdf
Nir Levine, Yinlam Chow, Rui Shu, Ang Li, Mohammad Ghavamzadeh, & Hung Bui. “Prediction, Consistency, Curvature: Representation Learning for Locally Linear Control”. Proceedings of the Eighth International Conference on Learning Representations (ICLR-2020), 2020. pdf
Evrard Garcelon, Mohammad Ghavamzadeh, Alessandro Lazaric, & Matteo Pirotta. “Improved Algorithms for Conservative Exploration in Bandits”. Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-2020), 2020. pdf
Workshop
Manan Tomar, Lior Shani, Yonathan Efroni, & Mohammad Ghavamzadeh. “Mirror Descent Policy Optimization”. Workshop on “Deep Reinforcement Learning”, Thirty-Fourth Annual Conference on Advances in Neural Information Processing Systems (NeurIPS-2020), 2020 (selected for a contributed talk – 8 out of over 250 submissions).
Ravi-Tej Akella, Kamyar Azizzadenesheli, Mohammad Ghavamzadeh, Yisong Yue, & Anima Anandkumar. “Deep Bayesian Quadrature Policy Gradient”. Workshops on “Deep Reinforcement Learning” and “Challenges of Real-World Reinforcement Learning”, Thirty-Fourth Annual Conference on Advances in Neural Information Processing Systems (NeurIPS-2020), 2020.
2019
Conference
Yonathan Efroni, Nadav Merlis, Mohammad Ghavamzadeh, & Shie Mannor. “Tight Regret Bounds for Model-based Reinforcement Learning with Greedy Policies”. Accepted for Spotlight Presentation. Proceedings of the Thirty-Second Annual Conference on Advances in Neural Information Processing Systems (NeurIPS-2019), pp. 12224-12234, 2019. pdf
Branislav Kveton, Csaba Szepesvári, Mohammad Ghavamzadeh, & Craig Boutilier. “Perturbed-History Exploration in Stochastic Linear Bandits”. Proceedings of the Thirty-Fifth Conference on Uncertainty in Artificial Intelligence (UAI-2019), 2019. pdf
Branislav Kveton, Csaba Szepesvári, Mohammad Ghavamzadeh, & Craig Boutilier. “Perturbed-History Exploration in Stochastic Multi-Armed Bandits”. Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-2019), pp. 2786-2793, 2019. pdf
Branislav Kveton, Csaba Szepesvári, Sharan Vaswani, Zheng Wen, Mohammad Ghavamzadeh, & Tor Lattimore. “Garbage In, Reward Out: Bootstrapping Exploration in Multi-Armed Bandits”. Proceedings of the Thirty-Sixth International Conference on Machine Learning (ICML-2019), pp. 3601-3610, 2019. pdf
Ershad Banijamali, Yasin Abbasi-Yadkori, Mohammad Ghavamzadeh, & Nikos Vlassis. “Optimizing over a Restricted Policy Class in MDPs”. Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics (AISTATS-2019), pp. 3042-3050, 2019. pdf
Jonathan Pierre Lacotte, Mohammad Ghavamzadeh, Yinlam Chow, & Marco Pavone. “Risk-sensitive Generative Adversarial Imitation Learning”. Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics (AISTATS-2019), pp. 2154-2163, 2019. pdf
Workshop
Jorge Méndez, Alborz Geramifard, Mohammad Ghavamzadeh, & Bing Liu. “Reinforcement Learning of Multi-Domain Dialog Policies via Action Embeddings”. Workshop on “Conversational AI”, Thirty-Third Annual Conference on Advances in Neural Information Processing Systems (NeurIPS-2019), 2019.
Evrard Garcelon, Mohammad Ghavamzadeh, Alessandro Lazaric, & Matteo Pirotta. “Conservative Exploration in Finite Horizon Markov Decision Processes”. Workshop on “Safety and Robustness in Decision-making”, Thirty-Third Annual Conference on Advances in Neural Information Processing Systems (NeurIPS-2019), 2019.
Evrard Garcelon, Mohammad Ghavamzadeh, Alessandro Lazaric, & Matteo Pirotta. “Improved Algorithms for Conservative Exploration in Bandits”. Workshop on “Deep Reinforcement Learning”, Thirty-Third Annual Conference on Advances in Neural Information Processing Systems (NeurIPS-2019), 2019.
Scott Fujimoto, Edoardo Conti, Mohammad Ghavamzadeh, & Joelle Pineau. “Benchmarking Batch Deep Reinforcement Learning Algorithms”. Workshop on “Conversational AI”, Thirty-Third Annual Conference on Advances in Neural Information Processing Systems (NeurIPS-2019), 2019.
Yinlam Chow, Ofir Nachum, Aleksandra Faust, & Mohammad Ghavamzadeh. “Lyapunov-based Safe Policy Optimization for Continuous Control”. Workshop on “Reinforcement Learning for Real Life”, Thirty-Sixth International Conference on Machine Learning (ICML-2019), 2019 (winner of the best paper award).
2018
Journal
Bo Liu, Ian Gemp, Mohammad Ghavamzadeh, Ji Liu, Sridhar Mahadevan, & Marek Petrik. “Proximal Gradient Temporal Difference Learning: Stable Reinforcement Learning with Polynomial Sample Complexity”. Journal of Artificial Intelligence Research (JAIR), 63:461-494, 2018. pdf
Yinlam Chow, Mohammad Ghavamzadeh, Lucas Janson, & Marco Pavone. “Risk-Constrained Reinforcement Learning with Percentile Risk Criteria”. Journal of Machine Learning Research (JMLR), 18(167):1-51, 2018. pdf
Conference
Yinlam Chow, Ofir Nachum, Mohammad Ghavamzadeh, & Edgar Duenez-Guzman. “A Lyapunov-based Approach to Safe Reinforcement Learning”. Proceedings of the Thirty-Second Annual Conference on Advances in Neural Information Processing Systems (NeurIPS-2018), 2018. pdf
Tengyang Xie, Bo Liu, Yangyang Xu, Mohammad Ghavamzadeh, Yinlam Chow, Daoming Lyu, & Daesub Yoon. “A Block Coordinate Ascent Algorithm for Mean-Variance Optimization”. Proceedings of the Thirty-Second Annual Conference on Advances in Neural Information Processing Systems (NeurIPS-2018), pp. 1073-1083, 2018. pdf
Ofir Nachum, Yinlam Chow, & Mohammad Ghavamzadeh. “Path Consistency Learning in Tsallis Entropy Regularized MDPs”. Proceedings of the Thirty-Fifth International Conference on Machine Learning (ICML-2018), pp. 979-988, Stockholm, Sweden, July 2018. pdf
Mehrdad Farajtabar, Yinlam Chow, & Mohammad Ghavamzadeh. “More Robust Doubly Robust Off-policy Evaluation”. Proceedings of the Thirty-Fifth International Conference on Machine Learning (ICML-2018), pp. 1447-1456, Stockholm, Sweden, July 2018. pdf
Ershad Banijamali, Rui Shu, Mohammad Ghavamzadeh, Hung Bui & Ali Ghodsi. “Robust Locally-Linear Controllable Embedding”. Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics (AISTATS-2018), pp. 1751-1759, 2018. pdf
Yahel David, Balázs Szörényi, Mohammad Ghavamzadeh, Shie Mannor, & Nahum Shimkin. “PAC Bandits with Risk Constraints”. International Symposium on Artificial Intelligence and Mathematics (ISAIM-2018), Special Session on Theory of Machine Learning, 2018. pdf
Workshop
- Jonathan Lacotte, Mohammad Ghavamzadeh, Yinlam Chow, & Marco Pavone. “Risk-Sensitive Generative Adversarial Imitation Learning”. Workshop on “Safety, Risk, and Uncertainty in Reinforcement Learning”, Thirty-Fourth Conference on Uncertainty in Artificial Intelligence (UAI-2018), 2018.
2017
Journal
- Aviv Tamar, Yinlam Chow, Mohammad Ghavamzadeh, & Shie Mannor. “Sequential Decision-making with Coherent Risk”. IEEE Transactions on Automatic Control (TAC), 62(7):3323-3338, 2017 (DOI: 10.1109/TAC.2016.2644871). pdf
Conference
Abbas Kazerouni, Mohammad Ghavamzadeh, Yasin Abbasi-Yadkori, & Ben Van Roy. “Conservative Contextual Linear Bandits”. Proceedings of the Thirty-First Annual Conference on Neural Information Processing Systems (NIPS-2017), pp. 3913-3922, 2017. pdf
Carlos Riquelme, Mohammad Ghavamzadeh, & Alessandro Lazaric. “Active Learning for Accurate Estimation of Linear Models”. Proceedings of the Thirty-Fourth International Conference on Machine Learning (ICML-2017), pp. 2931-2939, Sydney, Australia, August 2017. pdf
Rui Shu, Hung Bui, & Mohammad Ghavamzadeh. “Bottleneck Conditional Density Estimation”. Proceedings of the Thirty-Fourth International Conference on Machine Learning (ICML-2017), pp. 3164-3172, Sydney, Australia, August 2017. pdf
Sharan Vaswani, Branislav Kveton, Zheng Wen, Mohammad Ghavamzadeh, Laks Lakshmanan, & Mark Schmidt. “Model-Independent Online Learning for Influence Maximization”. Proceedings of the Thirty-Fourth International Conference on Machine Learning (ICML-2017), pp. 3530-3539, Sydney, Australia, August 2017. pdf
Masrour Zoghi, Tomas Tunys, Mohammad Ghavamzadeh, Branislav Kveton, Csaba Szepesvári, & Zheng Wen. “Online Learning to Rank in Stochastic Click Models”. Proceedings of the Thirty-Fourth International Conference on Machine Learning (ICML-2017), pp. 4199-4208, Sydney, Australia, August 2017. pdf
Alan Malek, Yinlam Chow, Sumeet Katariya, & Mohammad Ghavamzadeh. “Sequential Multiple Hypothesis Testing with Type I Error Control”. Proceedings of the Twentieth International Conference on Artificial Intelligence and Statistics (AISTATS-2017), pp. 1468-1476, 2017. pdf
Philip Thomas, Georgios Theocharous, Mohammad Ghavamzadeh, Ishan Durugkar, & Emma Brunskill. “Predictive Off-Policy Evaluation for Nonstationary Decision Problems”. Proceedings of the Twenty-Ninth Conference on Innovative Applications of Artificial Intelligence (IAAI-2017), pp. 4740-4745, 2017. pdf
Ian Gemp, Georgios Theocharous, & Mohammad Ghavamzadeh. “Automated Data Cleansing through Meta-Learning”. Proceedings of the Twenty-Ninth Conference on Innovative Applications of Artificial Intelligence (IAAI-2017), pp. 4760-4761, 2017. pdf
Workshop
Ershad Banijamali, Ahmad Khajenezhad, Ali Ghodsi, & Mohammad Ghavamzadeh. “Disentangling Dynamics and Content for Control and Planning”. Workshop on “Learning Disentangled Representations: from Perception to Control”, Thirty-First Annual Conference on Neural Information Processing Systems (NIPS-2017), 2017.
Ershad Banijamali, Rui Shu, Mohammad Ghavamzadeh, & Hung Bui. “Robust Controllable Embedding of High-Dimensional Observations of Markov Decision Processes”. Workshop on “Implicit Models”, Thirty-Fourth International Conference on Machine Learning (ICML-2017), Sydney, Australia, August 2017.
2016
Journal
Amir-massoud Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, & Shie Mannor. “Regularized Policy Iteration for Non-Parametric Function Spaces”. Journal of Machine Learning Research (JMLR), 17(139):1-66, 2016. pdf
Prashanth L. A. and Mohammad Ghavamzadeh. “Variance-constrained Actor-Critic Algorithms for Discounted and Average Reward MDPs”. Machine Learning Journal (MLJ), 105(3):367-417, 2016 (DOI: 10.1007/s10994-016-5569-5). pdf
Mohammad Ghavamzadeh, Yaakov Engel, & Michal Valko. “Bayesian Policy Gradient and Actor-Critic Algorithms”. Journal of Machine Learning Research (JMLR), 17(66):1-53, 2016. pdf (code is available at 1 2)
Alessandro Lazaric, Mohammad Ghavamzadeh, & Rémi Munos. “Analysis of Classification-based Policy Iteration Algorithms”. Journal of Machine Learning Research (JMLR), 17(19):1-30, 2016. pdf
Conference
Marek Petrik, Mohammad Ghavamzadeh, & Yinlam Chow. “Safe Policy Improvement by Minimizing Robust Baseline Regret”. Proceedings of the Thirtieth Annual Conference on Neural Information Processing Systems (NIPS-2016), pp. 2298-2306, 2016. pdf
Branislav Kveton, Hung Bui, Mohammad Ghavamzadeh, Georgios Theocharous, S. Muthukrishnan, & Siqi Sun. “Graphical Model Sketch”. Proceedings of the European Conference on Machine Learning (ECML-2016), Riva del Garda, Italy, 2016. pdf
Bo Liu, Mohammad Ghavamzadeh, Ian Gemp, Ji Liu, & Sridhar Mahadevan. “Proximal Gradient Temporal Difference Learning Algorithms”. Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI-2016), pp. 4195-4199, New York City, NY, July 2016. pdf
Victor Gabillon, Alessandro Lazaric, Mohammad Ghavamzadeh, Ronald Ortner, & Peter Bartlett. “Improved Learning Complexity in Combinatorial Pure Exploration Bandits”. Proceedings of the Nineteenth International Conference on Artificial Intelligence and Statistics (AISTATS-2016), pp. 1004-1012, Cadiz, Spain, May 2016. pdf
Workshop
Abbas Kazerouni, Mohammad Ghavamzadeh, & Ben Van Roy. “Safety in Contextual Linear Bandits”. Workshop on “Reliable Machine Learning in the Wild”, Thirtieth Annual Conference on Advances in Neural Information Processing Systems (NIPS-2016), Barcelona, Spain, December 2016.
Rui Shu, Hung Bui, & Mohammad Ghavamzadeh. “Bottleneck Conditional Density Estimators”. Workshop on “Bayesian Deep Learning”, Thirtieth Annual Conference on Advances in Neural Information Processing Systems (NIPS-2016), Barcelona, Spain, December 2016.
Rui Shu, James Brofos, Frank Zhang, Hung Bui, Mohammad Ghavamzadeh, & Mykel Kochenderfer. “Stochastic Video Prediction with Conditional Density Estimation”. Workshop on “Action and Anticipation for Visual Learning”, Fourteenth European Conference on Computer Vision (ECCV-2016), Amsterdam, The Netherlands, October 2016.
Marek Petrik, Yinlam Chow, & Mohammad Ghavamzadeh. “Optimally Robust Policy Improvement with Baseline Guarantees”. Workshop on “Reliable Machine Learning in the Wild”, Thirty-Third International Conference on Machine Learning (ICML-2016), New York City, NY, June 2016.
2015
Journal
Mohammad Ghavamzadeh, Shie Mannor, Joelle Pineau, & Aviv Tamar. “Bayesian Reinforcement Learning: A Survey”. Foundations and Trends in Machine Learning, 8(5-6):359-483, 2015 (DOI: 10.1561/2200000049). pdf
Bruno Scherrer, Mohammad Ghavamzadeh, Victor Gabillon, Boris Lesner, & Matthieu Geist. “Approximate Modified Policy Iteration and its Application to the Game of Tetris”. Journal of Machine Learning Research (JMLR), 16:1629-1676, 2015. pdf
Amir-massoud Farahmand, Doina Precup, André Barreto, & Mohammad Ghavamzadeh. “Classification-based Approximate Policy Iteration”. IEEE Transactions on Automatic Control (TAC), 60(11):2989-2993, 2015. pdf
Conference
Aviv Tamar, Yinlam Chow, Mohammad Ghavamzadeh, & Shie Mannor. “Policy Gradient for Coherent Risk Measures”. Proceedings of the Twenty-Ninth Annual Conference on Advances in Neural Information Processing Systems (NIPS-2015), pp. 1468-1476, 2015. pdf
Bo Liu, Mohammad Ghavamzadeh, Sridhar Mahadevan, & Marek Petrik. “Finite-Sample Analysis of Proximal Gradient TD Algorithms”. Proceedings of the Thirty-First Conference on Uncertainty in Artificial Intelligence (UAI-2015), pp. 504-513, Amsterdam, Netherlands, July 2015 (winner of the Facebook best student paper award). pdf
Philip Thomas, Georgios Theocharous, & Mohammad Ghavamzadeh. “High Confidence Policy Improvement”. Proceedings of the Thirty-Second International Conference on Machine Learning (ICML-2015), pp. 2380-2388, Lille, France, July 2015. pdf
Julien Audiffren, Michal Valko, Alessandro Lazaric, & Mohammad Ghavamzadeh. “Maximum Entropy Semi-Supervised Inverse Reinforcement Learning”. Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence (IJCAI-2015), pp. 3315-3321, Buenos Aires, Argentina, July 2015. pdf
Georgios Theocharous, Philip Thomas, & Mohammad Ghavamzadeh. “Building Personalized Ad Recommendation Systems for Life-Time Value Optimization with Guarantees”. Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence (IJCAI-2015), pp. 1806-1812, Buenos Aires, Argentina, July 2015. pdf
Philip Thomas, Georgios Theocharous, & Mohammad Ghavamzadeh. “High Confidence Off-Policy Evaluation”. Proceedings of the Twenty-Ninth Conference on Artificial Intelligence (AAAI-2015), pp. 3000-3006, Austin, TX, January 2015. pdf
Workshop
Sougata Chaudhuri, Georgios Theocharous, & Mohammad Ghavamzadeh. “A Ranking Approach to Address the Click Sparsity Problem in Personalized Ad Recommendation”. Workshop on “Machine Learning for eCommerce”, Twenty-Ninth Annual Conference on Advances in Neural Information Processing Systems (NIPS-2015), Montreal, Canada, December 2015.
Aviv Tamar, Yinlam Chow, Mohammad Ghavamzadeh, & Shie Mannor. “Policy Gradient for Coherent Risk Measures”. Twelfth European Workshop on Reinforcement Learning (EWRL-12) at the Thirty-Second International Conference on Machine Learning (ICML), Lille, France, July 2015.
Georgios Theocharous, Philip Thomas, & Mohammad Ghavamzadeh. “Ad Recommendation Systems for Life-Time Value Optimization”. Workshop on “Ad Targeting at Scale”, Twenty-Fourth International World Wide Web Conference (WWW-2015), Florence, Italy, May 2015.
Tech Report
- Aviv Tamar, Yinlam Chow, Mohammad Ghavamzadeh, & Shie Mannor. “Policy Gradient for Coherent Risk Measures”. arXiv:1502.03919, 2015.
2014
Conference
- Yinlam Chow and Mohammad Ghavamzadeh. “Algorithms for CVaR Optimization in MDPs”. Proceedings of the Twenty-Eighth Annual Conference on Advances in Neural Information Processing Systems (NIPS-2014), pp. 3509-3517, 2014. pdf
Workshop
Yinlam Chow and Mohammad Ghavamzadeh. “Constrained Stochastic Optimal Control with a Baseline Performance Guarantee”. Workshop on “From Bad Models to Good Policies”, Twenty-Eighth Annual Conference on Advances in Neural Information Processing Systems (NIPS-2014), Montreal, Canada, December 2014.
Julien Audiffren, Michal Valko, Alessandro Lazaric, & Mohammad Ghavamzadeh. “Maximum Entropy Semi-Supervised Inverse Reinforcement Learning”. Workshop on “Novel Trends and Applications in Reinforcement Learning”, Twenty-Eighth Annual Conference on Advances in Neural Information Processing Systems (NIPS-2014), Montreal, Canada, December 2014.
Philip Thomas, Georgios Theocharous, & Mohammad Ghavamzadeh. “Safe Policy Search”. Workshop on “Customer Life-Time Value Optimization in Digital Marketing”, Thirty-First International Conference on Machine Learning (ICML-2014), Beijing, China, June 2014.
Tech Report
Yinlam Chow and Mohammad Ghavamzadeh. “Constrained Stochastic Optimal Control with a Baseline Performance Guarantee”. arXiv:1410.2726, 2014.
Yinlam Chow and Mohammad Ghavamzadeh. “Algorithms for CVaR Optimization in MDPs”. arXiv:1406.3339, 2014.
Habilitation Thesis
- Mohammad Ghavamzadeh. “Sample Complexity in Sequential Decision-Making”. Department of Mathematics, Université Lille 1 - Sciences et Technologies, France, June 2014. pdf
2013
Conference
Prashanth L. A. and Mohammad Ghavamzadeh. “Actor-Critic Algorithms for Risk-Sensitive MDPs”. Accepted for Oral Presentation (1.4% acceptance - 20 out of 1420 submissions). Proceedings of the Twenty-Seventh Annual Conference on Advances in Neural Information Processing Systems (NIPS-2013), pp. 252-260, 2013. pdf
Victor Gabillon, Mohammad Ghavamzadeh, & Bruno Scherrer. “Approximate Dynamic Programming Finally Performs Well in the Game of Tetris”. Proceedings of the Twenty-Seventh Annual Conference on Advances in Neural Information Processing Systems (NIPS-2013), pp. 1754-1762, 2013. pdf
Bernardo Ávila Pires, Mohammad Ghavamzadeh, & Csaba Szepesvári. “Cost-sensitive Multiclass Classification Risk Bounds”. Proceedings of the Thirtieth International Conference on Machine Learning (ICML-2013), 28(3):1391-1399, Atlanta, GA, 2013. pdf
Hachem Kadri, Mohammad Ghavamzadeh, & Philippe Preux. “A Generalized Kernel Approach to Structured Output Learning”. Proceedings of the Thirtieth International Conference on Machine Learning (ICML-2013), 28(1):471-479, Atlanta, GA, 2013. pdf
Amir-massoud Farahmand, Doina Precup, André Barreto, & Mohammad Ghavamzadeh. “CAPI: Generalized Classification-based Approximate Policy Iteration”. The First Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM-2013), Princeton, NJ, 2013.
Tech Report
- Prashanth L. A. and Mohammad Ghavamzadeh. “Actor-Critic Algorithms for Risk-Sensitive MDPs”. Technical Report inria-00794721, INRIA, 2013.
2012
Journal
- Alessandro Lazaric, Mohammad Ghavamzadeh, & Rémi Munos. “Finite-Sample Analysis of Least-Squares Policy Iteration”. Journal of Machine Learning Research (JMLR), 13:3041-3074, 2012. pdf
Conference
Victor Gabillon, Mohammad Ghavamzadeh, & Alessandro Lazaric. “A Unified Approach to Fixed Budget and Fixed Confidence”. Proceedings of the Twenty-Sixth Annual Conference on Advances in Neural Information Processing Systems (NIPS-2012), pp. 3221-3229, 2012. pdf
Bruno Scherrer, Mohammad Ghavamzadeh, Victor Gabillon, & Matthieu Geist. “Approximate Modified Policy Iteration”. Proceedings of the Twenty-Ninth International Conference on Machine Learning (ICML-2012), pp. 1207-1214, Edinburgh, Scotland, 2012. pdf
Matthieu Geist, Bruno Scherrer, Alessandro Lazaric, & Mohammad Ghavamzadeh. “A Dantzig Selector Approach to Temporal Difference Learning”. Proceedings of the Twenty-Ninth International Conference on Machine Learning (ICML-2012), pp. 1399-1406, Edinburgh, Scotland, 2012. pdf
Mohammad Ghavamzadeh & Alessandro Lazaric. “Conservative and Greedy Approaches to Classification-based Policy Iteration”. Proceedings of the Twenty-Sixth Conference on Artificial Intelligence (AAAI-2012), pp. 914-920, Toronto, ON, Canada, 2012. pdf
Book Chapter
Nikos Vlassis, Mohammad Ghavamzadeh, Shie Mannor, & Pascal Poupart. “Bayesian Reinforcement Learning”. Reinforcement Learning: State of the Art, Edited by Marco Wiering and Martijn van Otterlo, Springer Verlag, 2012.
Lucian Busoniu, Alessandro Lazaric, Mohammad Ghavamzadeh, Rémi Munos, Robert Babuska, & Bert De Schutter. “Least-Squares Methods for Policy Iteration”. Reinforcement Learning: State of the Art, Edited by Marco Wiering and Martijn van Otterlo, Springer Verlag, 2012.
Workshop
- Michal Valko, Mohammad Ghavamzadeh, & Alessandro Lazaric. “Semi-Supervised Inverse Reinforcement Learning”. Ninth European Workshop on Reinforcement Learning (EWRL-2012), Edinburgh, Scotland, 2012.
Tech Report
Hachem Kadri, Mohammad Ghavamzadeh, & Philippe Preux. “A Generalized Kernel Approach to Structured Output Learning”. Technical Report inria-00695631, INRIA, 2012.
Victor Gabillon, Mohammad Ghavamzadeh, & Alessandro Lazaric. “Best Arm Identification: A Unified Approach to Fixed Budget and Fixed Confidence”. Technical Report inria-00747005, INRIA, 2012.
Bruno Scherrer, Mohammad Ghavamzadeh, Victor Gabillon, & Matthieu Geist. “Approximate Modified Policy Iteration”. Technical Report inria-00697169, INRIA, 2012.
2011
Conference
Victor Gabillon, Mohammad Ghavamzadeh, Alessandro Lazaric, & Sébastien Bubeck. “Multi-Bandit Best Arm Identification”. Proceedings of the Twenty-Fifth Annual Conference on Advances in Neural Information Processing Systems (NIPS-2011), pp. 2222-2230, 2011. pdf
Mohammad Azar, Rémi Munos, Mohammad Ghavamzadeh, & Hilbert Kappen. “Speedy Q-Learning”. Proceedings of the Twenty-Fifth Annual Conference on Advances in Neural Information Processing Systems (NIPS-2011), pp. 2411-2419, 2011. pdf
Alexandra Carpentier, Alessandro Lazaric, Mohammad Ghavamzadeh, Rémi Munos, & Peter Auer. “Upper-Confidence-Bound Algorithms for Active Learning in Multi-Armed Bandits”. Selected for a special issue of the Journal of Theoretical Computer Science. Proceedings of the Twenty-Second International Conference on Algorithmic Learning Theory (ALT-2011), pp. 189-203, Espoo, Finland, October 2011. pdf
Mohammad Ghavamzadeh, Alessandro Lazaric, Rémi Munos, & Matthew Hoffman. “Finite-Sample Analysis of Lasso-TD”. Proceedings of the Twenty-Eighth International Conference on Machine Learning (ICML-2011), pp. 1177-1184, Bellevue, WA, June 2011. pdf
Victor Gabillon, Alessandro Lazaric, Mohammad Ghavamzadeh, & Bruno Scherrer. “Classification-based Policy Iteration with a Critic”. Proceedings of the Twenty-Eighth International Conference on Machine Learning (ICML-2011), pp. 1049-1056, Bellevue, WA, June 2011. pdf
Workshop
- Matthew Hoffman, Alessandro Lazaric, Mohammad Ghavamzadeh, & Rémi Munos. “Regularized Least Squares Temporal Difference Learning with Nested L2 and L1 Penalization”. Ninth European Workshop on Reinforcement Learning (EWRL-2011), Athens, Greece, September 2011.
Tech Report
Mohammad Azar, Rémi Munos, Mohammad Ghavamzadeh, & Hilbert Kappen. “Reinforcement Learning with a Near Optimal Rate of Convergence”. Technical Report inria-00636615, INRIA, 2011.
Victor Gabillon, Mohammad Ghavamzadeh, Alessandro Lazaric, & Sébastien Bubeck. “Multi-Bandit Best Arm Identification”. Technical Report inria-00632523, INRIA, 2011.
Alexandra Carpentier, Alessandro Lazaric, Mohammad Ghavamzadeh, Rémi Munos, & Peter Auer. “Upper-Confidence-Bound Algorithms for Active Learning in Multi-Armed Bandits”. Technical Report inria-00594131, INRIA, 2011.
Victor Gabillon, Alessandro Lazaric, Mohammad Ghavamzadeh, & Bruno Scherrer. “Classification-based Policy Iteration with a Critic”. Technical Report inria-00590972, INRIA, 2011.
2010
Conference
Mohammad Ghavamzadeh, Alessandro Lazaric, Odalric Maillard, & Rémi Munos. “LSTD with Random Projections”. Accepted for Spotlight Presentation (6% acceptance - 73 out of 1219 submissions). Proceedings of the Twenty-Fourth Annual Conference on Advances in Neural Information Processing Systems (NIPS-2010), pp. 721-729, 2010. pdf
Odalric Maillard, Rémi Munos, Alessandro Lazaric, & Mohammad Ghavamzadeh. “Finite-Sample Analysis of Bellman Residual Minimization”. Proceedings of the Second Asian Conference on Machine Learning (ACML-2010), pp. 299-314, Tokyo, Japan, November 2010. pdf
Alessandro Lazaric & Mohammad Ghavamzadeh. “Bayesian Multi-Task Reinforcement Learning”. Proceedings of the Twenty-Seventh International Conference on Machine Learning (ICML-2010), pp. 599-606, Haifa, Israel, June 2010. pdf
Alessandro Lazaric, Mohammad Ghavamzadeh, & Rémi Munos. “Analysis of a Classification-based Policy Iteration Algorithm”. Proceedings of the Twenty-Seventh International Conference on Machine Learning (ICML-2010), pp. 607-614, Haifa, Israel, June 2010. pdf
Alessandro Lazaric, Mohammad Ghavamzadeh, & Rémi Munos. “Finite-Sample Analysis of LSTD”. Proceedings of the Twenty-Seventh International Conference on Machine Learning (ICML-2010), pp. 615-622, Haifa, Israel, June 2010. pdf
Workshop
- Victor Gabillon, Alessandro Lazaric, & Mohammad Ghavamzadeh. “Rollout Allocation Strategies for Classification-based Policy Iteration”. Workshop on “Reinforcement Learning and Search in Very Large Spaces”, Twenty-Seventh International Conference on Machine Learning (ICML-2010), Haifa, Israel, June 2010.
Tech Report
Mohammad Ghavamzadeh, Alessandro Lazaric, Odalric Maillard, & Rémi Munos. “LSPI with Random Projections,” Technical Report inria-00530762, INRIA, 2010.
Alessandro Lazaric, Mohammad Ghavamzadeh, & Rémi Munos. “Finite-Sample Analysis of Least-Squares Policy Iteration,” Technical Report inria-00528596, INRIA, 2010.
Alessandro Lazaric & Mohammad Ghavamzadeh. “Bayesian Multi-Task Reinforcement Learning,” Technical Report inria-00475214, INRIA, 2010.
Alessandro Lazaric, Mohammad Ghavamzadeh, & Rémi Munos. “Analysis of a Classification-based Policy Iteration Algorithm,” Technical Report inria-00482065, INRIA, 2010.
Alessandro Lazaric, Mohammad Ghavamzadeh, & Rémi Munos. “Finite-Sample Analysis of LSTD,” Technical Report inria-00482189, INRIA, 2010.
2009
Journal
- Shalabh Bhatnagar, Richard Sutton, Mohammad Ghavamzadeh, & Mark Lee. “Natural Actor-Critic Algorithms”. Automatica, 45(11):2471-2482, 2009 (DOI: 10.1016/j.automatica.2009.07.008). (A longer version is available as a UAlberta Tech Report: pdf)
Conference
- Amir massoud Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, & Shie Mannor. “Regularized Fitted Q-iteration for Planning in Continuous-Space Markovian Decision Problems”. Proceedings of the 2009 American Control Conference (ACC-2009), pp. 725-730, St. Louis, MO, June 2009. pdf
Workshop
Mohammad Ghavamzadeh. “Hierarchical Hybrid Reinforcement Learning Algorithms”. Workshop on “Bridging the Gap between High-level Discrete Representations and Low-level Continuous Behaviors”, Robotics: Science and Systems Conference (RSS-2009), Seattle, WA, June 2009. pdf
Amir massoud Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, & Shie Mannor. “Robot Learning with Regularized Reinforcement Learning”. Workshop on “Regression in Robotics—Approaches and Applications”, Robotics: Science and Systems Conference (RSS-2009), Seattle, WA, June 2009. pdf
Mohammad Ghavamzadeh & Yaakov Engel. “Bayesian Actor Critic: A Bayesian Model for Value Function Approximation and Policy Learning”. Workshop on “Regression in Robotics—Approaches and Applications”, Robotics: Science and Systems Conference (RSS-2009), Seattle, WA, June 2009. pdf
Amir massoud Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, & Shie Mannor. “Regularization in Reinforcement Learning”. Multidisciplinary Symposium on Reinforcement Learning (MSRL-2009), Montreal, QC, Canada, June 2009. pdf
Tech Report
- Shalabh Bhatnagar, Richard Sutton, Mohammad Ghavamzadeh, & Mark Lee. “Natural Actor-Critic Algorithms,” Technical Report TR09-10, Department of Computing Science, University of Alberta, 2009.
2008
Conference
- Amir massoud Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, & Shie Mannor. “Regularized Policy Iteration”. Proceedings of the Twenty-Second Annual Conference on Advances in Neural Information Processing Systems (NIPS-2008), pp. 441-448, 2008. pdf
Workshop
Amir massoud Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, & Shie Mannor. “Regularized Fitted Q-iteration: Application to Bounded Resource Planning”. Proceedings of the Eighth European Workshop on Reinforcement Learning (EWRL-2008), volume 5323 of Lecture Notes in Artificial Intelligence, pp. 55-68, Villeneuve d’Ascq, France, July 2008. pdf
Amir massoud Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, & Shie Mannor. “Regularized Policy Iteration”. Eighth European Workshop on Reinforcement Learning (EWRL-2008), Villeneuve d’Ascq, France, July 2008.
2007
Journal
- Mohammad Ghavamzadeh & Sridhar Mahadevan. “Hierarchical Average Reward Reinforcement Learning”. Journal of Machine Learning Research (JMLR), 8:2629-2669, 2007. pdf
Conference
Shalabh Bhatnagar, Richard Sutton, Mohammad Ghavamzadeh, & Mark Lee. “Incremental Natural Actor-Critic Algorithms”. Accepted for Spotlight Presentation (10% acceptance - 101 out of 975 submissions). Proceedings of the Twenty-First Annual Conference on Advances in Neural Information Processing Systems (NIPS-2007), pp. 105-112, 2007. pdf
Mohammad Ghavamzadeh & Yaakov Engel. “Bayesian Actor-Critic Algorithms”. Proceedings of the Twenty-Fourth International Conference on Machine Learning (ICML-2007), pp. 297-304, Oregon State University, Corvallis, OR, June 2007. pdf
2006
Journal
- Mohammad Ghavamzadeh, Sridhar Mahadevan, & Rajbala Makar. “Hierarchical Multiagent Reinforcement Learning”. Journal of Autonomous Agents and Multi-Agent Systems (JAAMAS), 13(2):197-229, 2006 (DOI: 10.1007/s10458-006-7035-4). pdf
Conference
- Mohammad Ghavamzadeh & Yaakov Engel. “Bayesian Policy Gradient Algorithms”. Accepted for Spotlight Presentation (7.5% acceptance - 63 out of 833 submissions). Proceedings of the Twentieth Annual Conference on Advances in Neural Information Processing Systems (NIPS-2006), pp. 457-464, 2006. pdf
Workshop
Mohammad Ghavamzadeh & Yaakov Engel. “Bayesian Policy Gradient”. Workshop on “Kernel Machines and Reinforcement Learning” (KRL), Twenty-Third International Conference on Machine Learning (ICML-2006), Pittsburgh, PA, June 2006. pdf
Mohammad Ghavamzadeh & Sridhar Mahadevan. “Learning to Cooperate using Hierarchical Reinforcement Learning”. Workshop on “Hierarchical Autonomous Agents and Multi-Agent Systems” (H-AAMAS), Fifth International Joint Conference on Autonomous Agents and Multi-Agent Systems (AAMAS-2006), Hakodate, Japan, May 2006. pdf
2005
PhD Thesis
- Mohammad Ghavamzadeh. “Hierarchical Reinforcement Learning in Continuous State and Multi-Agent Environments”. Department of Computer Science, University of Massachusetts Amherst, May 2005.
2004
Conference
- Mohammad Ghavamzadeh & Sridhar Mahadevan. “Learning to Communicate and Act using Hierarchical Reinforcement Learning”. Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS-2004), pp. 1114-1121, New York City, NY, July 2004. pdf
Book Chapter
- Sridhar Mahadevan, Mohammad Ghavamzadeh, Khashayar Rohanimanesh, & Georgios Theocharous. “Hierarchical Approaches to Concurrency, Multiagency, and Partial Observability”. Learning and Approximate Dynamic Programming: Scaling up to the Real World, Edited by Jennie Si, Andrew Barto, Warren Powell and Donald Wunsch, John Wiley & Sons, New York, pp. 285-310, 2004. pdf
Tech Report
- Mohammad Ghavamzadeh & Sridhar Mahadevan. “Hierarchical Multiagent Reinforcement Learning”. Technical Report UM-CS-2004-02. Department of Computer Science, University of Massachusetts Amherst, 2004.
2003
Conference
- Mohammad Ghavamzadeh & Sridhar Mahadevan. “Hierarchical Policy Gradient Algorithms”. Proceedings of the Twentieth International Conference on Machine Learning (ICML-2003), pp. 226-233, Washington, D.C., August 2003. pdf
Tech Report
Mohammad Ghavamzadeh, Sridhar Mahadevan, & Rajbala Makar. “Extending Hierarchical Reinforcement Learning to Continuous-Time, Average-Reward, and Multi-Agent Models”. Technical Report UM-CS-2003-23, Department of Computer Science, University of Massachusetts Amherst, 2003.
Mohammad Ghavamzadeh & Sridhar Mahadevan. “Hierarchical Average Reward Reinforcement Learning”. Technical Report UM-CS-2003-19, Department of Computer Science, University of Massachusetts Amherst, 2003.
2002
Conference
Mohammad Ghavamzadeh & Sridhar Mahadevan. “Hierarchically Optimal Average Reward Reinforcement Learning”. Proceedings of the Nineteenth International Conference on Machine Learning (ICML-2002), pp. 195-202, Sydney, Australia, July 2002. pdf
Mohammad Ghavamzadeh & Sridhar Mahadevan. “A Multiagent Reinforcement Learning Algorithm by Dynamically Merging Markov Decision Processes”. Proceedings of the First International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS-2002), pp. 845-846, Bologna, Italy, July 2002. pdf
2001
Journal
- Ali M. Eydgahi & Mohammad Ghavamzadeh. “Complementary Root Locus Revisited”. IEEE Transactions on Education, 44(2):137-143, 2001. pdf
Conference
Mohammad Ghavamzadeh & Sridhar Mahadevan. “Continuous-Time Hierarchical Reinforcement Learning”. Proceedings of the Eighteenth International Conference on Machine Learning (ICML-2001), pp. 186-193, Williams College, MA, July 2001. pdf
Rajbala Makar, Sridhar Mahadevan, & Mohammad Ghavamzadeh. “Hierarchical Multi-Agent Reinforcement Learning”. Proceedings of the Fifth International Conference on Autonomous Agents (Agents-2001), pp. 246-253, Montreal, Canada, June 2001 (winner of the best student paper award). pdf
Before 2001
Conference
Mohammad Ghavamzadeh, Caro Lucas, & Shahin Shayan Arani. “Forecasting the International Oil and Gold Prices Using Artificial Neural Networks”. Proceedings of the Conference on Computer Science and Information Technologies (CSIT-1997), Yerevan, Armenia, September 1997.
Ali M. Eydgahi & Mohammad Ghavamzadeh. (in Farsi) “Properties of Branches Passing through Infinity in Root Locus Method”. Journal of Faculty of Engineering University of Tehran, pp. 1-10, December 1996.
Ali M. Eydgahi & Mohammad Ghavamzadeh. (in Farsi) “Branches Passing through Infinity in Root Locus Method”. Journal of Faculty of Engineering University of Tehran, pp. 9-15, June 1996.
Mohammad Ghavamzadeh & Ali M. Eydgahi. (in Farsi) “An Adaptive Fuzzy Controller for Flexible Joint Robots”. Proceedings of the International Conference on Intelligent & Cognitive Systems, pp. 88-92, Tehran, Iran, September 1996.
Mohammad Ghavamzadeh, Khashayar Rohanimanesh, Ali M. Eydgahi, & Bahram Poorali. “Design of an ISDN Terminal”. Proceedings of the Twenty-First IEEE International Conference on Industrial Electronics, Control, Instrumentation and Automation (IECON-1995), pp. 1598-1601, Orlando, FL, November 1995.
Mohammad Ghavamzadeh & Ali M. Eydgahi. (in Farsi) “A New Approach to Root Locus for Positive Feedback Systems”. Proceedings of the Third Iranian Conference on Electrical Engineering, pp. 23-30, Tehran, Iran, May 1995.
Mohammad Ghavamzadeh, Khashayar Rohanimanesh, Ali M. Eydgahi, & Bahram Poorali. (in Farsi) “Design of an ISDN Telephone Terminal with 8751H Intel Micro-Controller”. Proceedings of the Third Iranian Conference on Electrical Engineering, Tehran, Iran, May 1995.