SIAM CT21

Date:

Title: Learning & Controls for Battery Management System

With the aim of minimum-time charging without damaging the cells, we propose an optimal charging procedure based on deep reinforcement learning. In particular, we focus on a policy gradient method, which copes with continuous state and action spaces. First, we assume full state measurements from the Doyle-Fuller-Newman (DFN) model and project the state onto a lower-dimensional feature space via principal component analysis. We then remove this assumption and use only output measurements as the agent's observations. Finally, we show that the proposed policy adapts to changes in the environment's parameters. The results are compared with other methodologies from the literature, such as reference governor and proportional-integral-derivative approaches.
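
To illustrate how a policy gradient method handles continuous states and actions, the following is a minimal sketch. It is not the procedure proposed in this work, and the abstract does not name a specific algorithm: the sketch assumes the plain REINFORCE estimator with a mean-return baseline and a linear Gaussian policy in place of a deep network. The names `sample_action`, `grad_log_prob`, and `reinforce_step` are hypothetical. In a charging setting, the state would be the feature vector, the action the applied current, and the return a charging-time reward with constraint penalties.

```python
import random


def sample_action(theta, state, sigma, rng):
    """Sample a continuous action a ~ N(theta . state, sigma^2).

    Assumption: linear mean and fixed standard deviation, for illustration only.
    """
    mean = sum(t * s for t, s in zip(theta, state))
    return rng.gauss(mean, sigma)


def grad_log_prob(theta, state, action, sigma):
    """Gradient of log N(action; theta . state, sigma^2) with respect to theta."""
    mean = sum(t * s for t, s in zip(theta, state))
    scale = (action - mean) / sigma ** 2
    return [scale * s for s in state]


def reinforce_step(theta, samples, sigma, lr):
    """One gradient-ascent update from a batch of (state, action, return) samples.

    The batch-mean return is subtracted as a baseline to reduce variance.
    """
    baseline = sum(ret for _, _, ret in samples) / len(samples)
    grad = [0.0] * len(theta)
    for state, action, ret in samples:
        glp = grad_log_prob(theta, state, action, sigma)
        for i, g in enumerate(glp):
            grad[i] += (ret - baseline) * g / len(samples)
    return [t + lr * g for t, g in zip(theta, grad)]


if __name__ == "__main__":
    rng = random.Random(0)
    theta = [0.0, 0.0]
    state = [1.0, 0.5]  # hypothetical two-dimensional feature vector
    batch = []
    for _ in range(32):
        a = sample_action(theta, state, sigma=0.5, rng=rng)
        ret = -(a - 1.0) ** 2  # toy return: prefer actions near 1.0
        batch.append((state, a, ret))
    print(reinforce_step(theta, batch, sigma=0.5, lr=0.05))
```

Because the policy is a probability density over actions, the update needs only sampled actions and their returns, so no discretization of the action set is required.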