Amir-massoud Farahmand
(SoloGen)
Postdoctoral Fellow
Reasoning and Learning Lab, School of Computer Science, McGill University, Canada
(Supervisor: Doina Precup)
PhD from the Department of Computing Science, University of Alberta, Canada
(Supervisors: Csaba Szepesvári and Martin Jägersand)
Research Goal
Developing adaptive intelligent agents has been my main research goal for the past few years. I study reinforcement learning methods that adapt to the regularities of the problem in order to reduce the sample complexity of learning in large-scale problems. Before that, I studied hierarchical behavior-based architectures and evolutionary approaches for agent design. See my publications for more information.
Applications of my research range from robotics and control engineering to operations research, finance, health sciences, and computer games.
Research Interests
Sequential Decision-Making Problems (Reinforcement Learning and Planning): regularization techniques (e.g., regularized fitted Value Iteration, LSTD, and Bellman Residual Minimization), model selection and empirical evaluation, error propagation in API/AVI, RKHS formulation, studying regularities of RL/Planning problems
Machine Learning (supervised and unsupervised learning): Nonparametric statistical methods, statistical learning theory, regularization techniques, concentration of measure inequalities, manifold learning (dimension estimation), non-i.i.d. processes
Robotics: Uncalibrated visual servoing, behavior-based architecture for robot control, multi-agent robotics
Evolutionary Computation: cooperative co-evolution, interaction of evolution and learning
Anti Memoirs (ضدخاطرات) is here!
My ML-related tumblr is here!
My Twitter account (not ML-related most of the time).
News
Many reinforcement learning practitioners have observed that an agent's performance often comes very close to optimal even though the estimated (action-)value function is still far from the optimal one. In my paper Action-Gap Phenomenon in Reinforcement Learning, which has been accepted to the NIPS 2011 conference, I explain and formalize this phenomenon by introducing the concept of action-gap regularity. I show that if the problem has a favorable action-gap regularity, the performance loss may converge much faster than the error in estimating the optimal action-value function.
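The intuition can be seen in a toy example. The sketch below is only an illustration and is not taken from the paper: the four states, the two actions, the action values, and the error size are all made-up assumptions, and the action gap is taken here to be the absolute difference between the two action values in a state. It shows that an estimate that is wrong by a fixed amount everywhere can still yield exactly the same greedy policy, as long as the error is small relative to the gap.

```python
# Hypothetical optimal action values Q*(state, action) for four states and
# two actions. The numbers are made up for illustration.
q_star = [
    (1.0, 0.2),
    (0.5, 1.5),
    (2.0, 1.0),
    (0.1, 0.9),
]


def greedy(q):
    """Return the index of the greedy (highest-valued) action in each state."""
    return [max(range(len(row)), key=lambda a: row[a]) for row in q]


# Action gap in each state: absolute difference between the two action values.
action_gap = [abs(a - b) for a, b in q_star]

# A deliberately poor estimate: the better action is under-estimated and the
# worse action over-estimated by `error`, the worst case for the greedy choice.
error = 0.3
q_hat = [
    (a - error, b + error) if a > b else (a + error, b - error)
    for a, b in q_star
]

# Largest absolute estimation error over all state-action pairs.
max_error = max(
    abs(h - s)
    for row_hat, row_star in zip(q_hat, q_star)
    for h, s in zip(row_hat, row_star)
)

# The greedy policy only changes if the errors are large enough to flip the
# ordering of the two actions in some state.
same_policy = greedy(q_hat) == greedy(q_star)

print("smallest action gap:", min(action_gap))
print("largest estimation error:", max_error)
print("greedy policies agree:", same_policy)
```

In this toy setting the value estimate is off in every entry, yet the greedy policy matches the optimal one because the smallest gap exceeds twice the largest error. Increasing `error` past half of the smallest gap would flip the greedy action in at least one state.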