Erwan Lecarpentier



Welcome to my home page

Hi, I just completed my PhD in Computer Science on the topic of “Reinforcement Learning in Non-Stationary Markov Decision Processes”. The thesis was carried out at ISAE-SUPAERO in the awesome city of Toulouse. I was honored to work under the supervision of Prof. Emmanuel Rachelson, Dr. Guillaume Infantes, and Dr. Charles Lesire.

I am interested in Artificial Intelligence and was introduced to the field via the Reinforcement Learning (RL) paradigm. Currently, I am focusing on several different questions.


2021 Erwan Lecarpentier, David Abel, Kavosh Asadi, Yuu Jinnai, Emmanuel Rachelson, Michael L. Littman. Lipschitz Lifelong Reinforcement Learning. In Proceedings of the 35th AAAI Conference on Artificial Intelligence, AAAI 2021.
PDF - arXiv

2019 Erwan Lecarpentier and Emmanuel Rachelson. Non-Stationary Markov Decision Processes, a Worst-Case Approach using Model-Based Reinforcement Learning. In Proceedings of the Thirty-third Conference on Neural Information Processing Systems, NeurIPS 2019.
NeurIPS - PDF - arXiv

2018 Erwan Lecarpentier, Guillaume Infantes, Charles Lesire, and Emmanuel Rachelson. Open loop execution of tree-search algorithms. In Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI 2018.
IJCAI - PDF - arXiv

2017 Erwan Lecarpentier, Sebastian Rapp, Marc Melo, and Emmanuel Rachelson. Empirical evaluation of a Q-Learning Algorithm for Model-free Autonomous Soaring. In JFPDA 2017.
JFPDA - PDF - arXiv

PhD Thesis: Reinforcement Learning in Non-Stationary Environments, ISAE-SUPAERO - Université de Toulouse.


Some RL environments I created:

Dyna Gym: A pip package implementing Reinforcement Learning algorithms in non-stationary environments supported by the OpenAI Gym toolkit.

Flatland environment: A C++ library for navigation tasks in a 2D environment. The settings enable the use of different policies, environments, and action spaces. The state space is also configurable, so the agent can evolve in either a discrete gridworld or a continuous-state world.

Learning2Fly: A C++ library simulating the flight of a glider UAV within a non-stationary atmosphere featuring thermal currents. The dynamics model is borrowed from Beeler et al. (2003).

Traveler: A graph-based non-stationary MDP simulating travel between waypoints. Each node of the graph represents a location and each edge a route between two locations. The travel duration along an edge depends on the departure time, which makes the environment non-stationary. The goal of the agent is to reach a unique termination node as quickly as possible.
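
To illustrate the idea behind Traveler, here is a minimal Python sketch of a graph whose edge durations depend on the departure time. It is not the actual Traveler code: the toy graph, the node names, and the functions `make_graph`, `step`, and `earliest_arrival` are all hypothetical and chosen for illustration only. The planner is a time-dependent variant of Dijkstra's algorithm, which is only valid under the additional assumption that leaving later never lets you arrive earlier (the FIFO property).

```python
import heapq
import math


def make_graph():
    """Hypothetical toy graph (illustration only, not the Traveler code).

    Each edge maps to a function returning the travel duration
    for a given departure time t.
    """
    return {
        "A": {"B": lambda t: 1.0 + 0.5 * t,  # route that gets slower over time
              "C": lambda t: 4.0},           # route with a constant duration
        "B": {"D": lambda t: 2.0},
        "C": {"D": lambda t: 1.0},
        "D": {},                             # termination node
    }


def step(graph, node, t, next_node):
    """Take the edge node -> next_node, departing at time t.

    Returns the new node, the arrival time, and a reward equal to
    the negative travel duration (an assumed reward convention).
    """
    duration = graph[node][next_node](t)
    return next_node, t + duration, -duration


def earliest_arrival(graph, start, goal, t0=0.0):
    """Time-dependent Dijkstra: earliest arrival time at goal.

    Assumes the FIFO property holds on every edge.
    """
    best = {start: t0}
    queue = [(t0, start)]
    while queue:
        t, node = heapq.heappop(queue)
        if node == goal:
            return t
        if t > best.get(node, math.inf):
            continue  # stale queue entry
        for nxt, duration in graph[node].items():
            arrival = t + duration(t)
            if arrival < best.get(nxt, math.inf):
                best[nxt] = arrival
                heapq.heappush(queue, (arrival, nxt))
    return math.inf


graph = make_graph()
print(earliest_arrival(graph, "A", "D", t0=0.0))  # 3.0, going through B
print(earliest_arrival(graph, "A", "D", t0=6.0))  # 11.0, going through C
```

In this toy example the best route from A to D goes through B when leaving at time 0 but through C when leaving at time 6, which is the kind of behavior a time-dependent travel duration induces.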