TECNALIA Publications Repository :: Browsing by Keyword "Imbalanced data"

Browsing by Keyword "Imbalanced data"


Adaptive Dendritic Cell-Deep Learning Approach for Industrial Prognosis under Changing Conditions
(2021-11) Diez-Olivan, Alberto; Ortego, Patxi; Del Ser, Javier; Landa-Torres, Itziar; Galar, Diego; Camacho, David; Sierra, Basilio; Tecnalia Research & Innovation; IA
Industrial prognosis refers to the prediction of failures of an industrial asset based on data collected by Internet of Things sensors. Prognostic models can suffer the undesired effects of concept drift, namely, the presence of nonstationary phenomena that affect the data collected over time. Consequently, fault patterns learned from data become obsolete. To overcome this issue, contextual and operational changes must be detected and managed, triggering rapid model adaptation mechanisms. This article presents an adaptive learning approach based on a dendritic cell algorithm for drift detection and a deep neural network model that dynamically adapts to new operational conditions. A kernel density estimator with drift-based bandwidth is used to generate synthetic data for faster adaptation, focusing on fine-tuning the lowest neural layers. Experimental results on a real-world industrial problem show that the proposed approach outperforms other drift detectors and classification models.
Data Augmentation for Industrial Prognosis Using Generative Adversarial Networks
(Springer, 2020-10-27) Ortego, Patxi; Diez-Olivan, Alberto; Del Ser, Javier; Sierra, Basilio; Analide, Cesar; Novais, Paulo; Camacho, David; Yin, Hujun; Tecnalia Research & Innovation; IA
The Industry 4.0 revolution allows the monitoring and intelligent processing of large amounts of data. When monitoring certain assets, very little data is available for operation under faulty conditions, because the cost of not operating properly is unacceptable and preventive strategies are therefore put into practice. Because machine learning algorithms are data-hungry, synthetic data can be created for these cases, and deep learning techniques have been shown to work very well for this purpose. Generative Adversarial Networks (GANs) have been deployed in numerous applications with data augmentation objectives, but much less often for balancing one-dimensional series with few data. In this paper, a GAN is applied to augment data for assets operating under faulty conditions. The proposed method is validated on a real industrial case, yielding promising results compared with the case in which no strategy for class imbalance is applied.
A Probabilistic Sample Matchmaking Strategy for Imbalanced Data Streams with Concept Drift
(2017) Lobo, Jesus L.; Del Ser, Javier; Bilbao, Miren Nekane; Laña, Ibai; Salcedo-Sanz, Sancho; IA
In the last decade, interest in adaptive models for non-stationary environments has gained momentum within the research community due to an increasing number of application scenarios generating non-stationary data streams. In this context, the literature has been especially rich in ensemble techniques, most of which have focused on taking advantage of past information in the form of already trained predictive models and similar alternatives. This manuscript elaborates on a rather different approach, which hinges on extracting the essential predictive information of past trained models and determining from it the best candidates (intelligent sample matchmaking) for training the predictive model of the current data batch. This novel perspective is of inherent utility for data streams characterized by short, unbalanced data batches, a situation in which the so-called trade-off between plasticity and stability must be carefully balanced. The approach is evaluated on a synthetic data set that simulates a non-stationary environment with recurrently changing concept drift. The proposed approach is shown to perform competitively with the state of the art when adapting to a sudden and recurrent change, but without storing all the past trained models and with lower computational complexity in terms of model evaluations. These promising results motivate future research aimed at validating the proposed strategy on other scenarios under concept drift, such as those characterized by semi-supervised data streams.
