On the post-hoc explainability of deep echo state networks for time series forecasting, image and video classification

Since their inception, learning techniques under the reservoir computing paradigm have shown a great modeling capability for recurrent systems without the computing overheads required by other approaches, especially deep neural networks. Among them, different flavors of echo state networks have attracted considerable attention over time, mainly due to the simplicity and computational efficiency of their learning algorithm. However, these advantages do not compensate for the fact that echo state networks remain black-box models whose decisions cannot be easily explained to a general audience. This issue is even more involved for multi-layered (also referred to as deep) echo state networks, whose more complex hierarchical structure further hinders the explainability of their internals to users without expertise in machine learning or even computer science. This lack of explainability can jeopardize the widespread adoption of these models in domains where accountability and understandability of machine learning models are a must (e.g., medical diagnosis, social politics). This work addresses this issue by conducting an explainability study of echo state networks when applied to learning tasks with time series, image and video data. Among these tasks, we place particular emphasis on the last one (video classification) which, to the best of our knowledge, has never been tackled before with echo state networks in the related literature. Specifically, the study proposes three different techniques capable of eliciting understandable information about the knowledge grasped by these recurrent models, namely potential memory, temporal patterns and pixel absence effect. Potential memory addresses questions related to the effect of the reservoir size on the capability of the model to store temporal information, whereas temporal patterns unveil the recurrent relationships captured by the model over time. Finally, pixel absence effect attempts to evaluate the effect of the absence of a given pixel when the echo state network model is used for image and video classification. The benefits of the proposed suite of techniques are showcased over three different domains of applicability: time series modeling, image and, for the first time in the related literature, video classification. The obtained results reveal that the proposed techniques not only allow for an informed understanding of the way these models work, but also serve as diagnostic tools capable of detecting issues inherited from data (e.g., presence of hidden bias).
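To illustrate the simplicity of the learning algorithm that the abstract refers to, the following is a minimal single-layer echo state network sketch in Python with NumPy. It is not the authors' implementation: the function names, the reservoir size, the spectral radius, the ridge penalty and the toy sine-wave task are all assumptions made for illustration. The point it shows is that the input and recurrent weights stay fixed and random, and only a linear readout is fitted, here by ridge regression.

```python
import numpy as np


def make_reservoir(n_inputs, n_units, spectral_radius=0.9, seed=0):
    """Draw fixed random input and recurrent weights (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    w_in = rng.uniform(-0.5, 0.5, size=(n_units, n_inputs))
    w = rng.uniform(-0.5, 0.5, size=(n_units, n_units))
    # Rescale the recurrent matrix so its largest eigenvalue modulus
    # equals the requested spectral radius.
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    return w_in, w


def run_reservoir(w_in, w, inputs):
    """Collect reservoir states x(t) = tanh(W_in u(t) + W x(t-1))."""
    states = np.zeros((len(inputs), w.shape[0]))
    x = np.zeros(w.shape[0])
    for t, u in enumerate(inputs):
        x = np.tanh(w_in @ u + w @ x)
        states[t] = x
    return states


def fit_readout(states, targets, ridge=1e-6):
    """Fit the linear readout by ridge regression (the only trained part)."""
    gram = states.T @ states + ridge * np.eye(states.shape[1])
    return np.linalg.solve(gram, states.T @ targets)


# Toy task (an assumption, not from the paper): one-step-ahead
# prediction of a sine wave.
series = np.sin(0.2 * np.arange(501))
inputs, targets = series[:-1, None], series[1:, None]

w_in, w = make_reservoir(n_inputs=1, n_units=50)
states = run_reservoir(w_in, w, inputs)

washout, split = 50, 400  # discard the initial transient, hold out the tail
w_out = fit_readout(states[washout:split], targets[washout:split])
test_mse = float(np.mean((states[split:] @ w_out - targets[split:]) ** 2))
```

In this sketch, `n_units` plays the role of the reservoir size whose influence on stored temporal information the potential memory technique examines. A deep echo state network would stack several such reservoirs, each one fed with the states of the previous layer.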
Barredo Arrieta, Alejandro, Sergio Gil-Lopez, Ibai Laña, Miren Nekane Bilbao, and Javier Del Ser. “On the Post-Hoc Explainability of Deep Echo State Networks for Time Series Forecasting, Image and Video Classification.” Neural Computing and Applications (August 6, 2021). doi:10.1007/s00521-021-06359-y.