RT Journal Article
T1 Adaptive Multifactorial Evolutionary Optimization for Multitask Reinforcement Learning
A1 Martinez, Aritz D.
A1 Del Ser, Javier
A1 Osaba, Eneko
A1 Herrera, Francisco
AB Evolutionary computation has largely exhibited its potential to complement conventional learning algorithms in a variety of machine learning tasks, especially those related to unsupervised (clustering) and supervised learning. Only recently has the computational efficiency of evolutionary solvers been put in perspective for training reinforcement learning models. However, most studies framed so far within this context have considered environments and tasks conceived in isolation, without any exchange of knowledge among related tasks. In this manuscript we present A-MFEA-RL, an adaptive version of the well-known MFEA algorithm whose search and inheritance operators are tailored for multitask reinforcement learning environments. Specifically, our approach includes crossover and inheritance mechanisms for refining the exchange of genetic material, which rely on the multilayered structure of modern deep-learning-based reinforcement learning models. In order to assess the performance of the proposed approach, we design an extensive experimental setup comprising multiple reinforcement learning environments of varying levels of complexity, over which the performance of A-MFEA-RL is compared to that furnished by alternative nonevolutionary multitask reinforcement learning approaches. As concluded from the discussion of the obtained results, A-MFEA-RL not only achieves competitive success rates over the simultaneously addressed tasks, but also fosters the exchange of knowledge among tasks that could be intuitively expected to keep a degree of synergistic relationship.
SN 1089-778X
YR 2022
FD 2022-04-01
LA eng
NO Martinez, A D, Del Ser, J, Osaba, E & Herrera, F 2022, 'Adaptive Multifactorial Evolutionary Optimization for Multitask Reinforcement Learning', IEEE Transactions on Evolutionary Computation, vol. 26, no. 2, pp. 233-247. https://doi.org/10.1109/TEVC.2021.3083362
NO Publisher Copyright: © 1997-2012 IEEE.
NO The work of Aritz D. Martinez and Eneko Osaba was supported by the Basque Government through the ELKARTEK Program (3KIA Project) under Grant KK-2020/00049. The work of Javier Del Ser was supported in part by the Basque Government through the ELKARTEK Program (3KIA Project) under Grant KK-2020/00049, and in part by the Consolidated Research Group MATHMODE (IT1294-19) granted by the Department of Education of the Basque Government. The work of Francisco Herrera was supported in part by the Spanish Government through (SMART-DaSCI) under Grant TIN2017-89517-P, and in part by the BBVA Foundation through the Ayudas Fundacion BBVA a Equipos de Investigacion Científica 2018 call (DeepSCOP).
DS TECNALIA Publications
RD 1 sept 2024