Browsing by Keyword "machine learning"
Now showing 1 - 8 of 8
Discrimination of Methanol from Ethanol Using Graphene-based Smart Gas Sensors (Institute of Electrical and Electronics Engineers Inc., 2024)
Huang, Shirong; Ibarlucea, Bergoi; Cuniberti, Gianaurelio; Biomateriales
Methanol and ethanol are physicochemically similar volatile organic compounds (VOCs) and are widely used in industry. Compared with ethanol, methanol is extremely toxic to human health by ingestion or inhalation. Therefore, it is of great importance to develop effective techniques to discriminate methanol from ethanol. The gold standard approaches for methanol and ethanol detection are gas chromatography-mass spectrometry (GC-MS) and nuclear magnetic resonance (NMR), which are rather expensive and sophisticated. Alternatively, chemiresistive gas sensors show promising applications in the detection of volatile organic compounds. Here, we present the development of graphene-based smart gas sensors for discriminating methanol from ethanol. By using multiple transient-state features as the fingerprint information of the gas, the selectivity of the developed gas sensors is enhanced. This strategy gives the graphene-based gas sensors excellent discrimination performance (98.9% accuracy) when combined with supervised machine learning algorithms. This work paves the way for designing a low-cost, low-power, facile, highly sensitive, and highly selective smart gas sensor to discriminate methanol from ethanol, which could also be extended to the discrimination of other similar VOCs.

Exploring Data Augmentation and Active Learning Benefits in Imbalanced Datasets (2024-06-19)
Moles, Luis; Andres, Alain; Echegaray, Goretti; Boto, Fernando; IA
Despite the increasing availability of vast amounts of data, the challenge of acquiring labeled data persists. This issue is particularly serious in supervised learning scenarios, where labeled data are essential for model training.
In addition, the rapid growth in data required by cutting-edge technologies such as deep learning makes the task of labeling large datasets impractical. Active learning methods offer a powerful solution by iteratively selecting the most informative unlabeled instances, thereby reducing the amount of labeled data required. However, active learning faces some limitations with imbalanced datasets, where majority class over-representation can bias sample selection. To address this, combining active learning with data augmentation techniques emerges as a promising strategy. Nonetheless, the best way to combine these techniques is not yet clear. Our research addresses this question by analyzing the effectiveness of combining both active learning and data augmentation techniques under different scenarios. Moreover, we focus on improving the generalization capabilities for minority classes, which tend to be overshadowed by the improvement seen in majority classes. For this purpose, we generate synthetic data using multiple data augmentation methods and evaluate the results considering two active learning strategies across three imbalanced datasets. Our study shows that data augmentation enhances prediction accuracy for minority classes, with approaches based on CTGANs obtaining improvements of nearly 50% in some cases. 
Moreover, we show that combining data augmentation techniques with active learning can reduce the amount of real data required.

Melanoma Clinical Decision Support System: An Artificial Intelligence-Based Tool to Diagnose and Predict Disease Outcome in Early-Stage Melanoma Patients (2023-04)
Diaz-Ramón, Jose Luis; Gardeazabal, Jesus; Izu, Rosa Maria; Garrote, Estibaliz; Rasero, Javier; Apraiz, Aintzane; Penas, Cristina; Seijo, Sandra; Lopez-Saratxaga, Cristina; De la Peña, Pedro Maria; Sanchez-Diaz, Ana; Cancho-Galan, Goikoane; Velasco, Veronica; Sevilla, Arrate; Fernandez, David; Cuenca, Iciar; Cortes, Jesus María; Alonso, Santos; Asumendi, Aintzane; Boyano, María Dolores; Quantum
This study set out to assess the performance of an artificial intelligence (AI) algorithm based on clinical data and dermatoscopic imaging for the early diagnosis of melanoma, and its capacity to define the metastatic progression of melanoma through serological and histopathological biomarkers, enabling dermatologists to make more informed decisions about patient management. Demographic data, images of the skin lesions, and serum and histopathological markers were analyzed jointly in a group of 196 patients with melanoma. The interleukins (ILs) IL-4, IL-6, IL-10, and IL-17A as well as IFNγ (interferon), GM-CSF (granulocyte and macrophage colony-stimulating factor), TGFβ (transforming growth factor), and the protein DCD (dermcidin) were quantified in the serum of melanoma patients at the time of diagnosis, and the expression of the RKIP, PIRIN, BCL2, BCL3, MITF, and ANXA5 proteins was detected by immunohistochemistry (IHC) in melanoma biopsies. An AI algorithm was used to improve the early diagnosis of melanoma and to predict the risk of metastasis and of disease-free survival.
Two models were obtained to predict metastasis (including “all patients” or only patients “at early stages of melanoma”), and a series of attributes were seen to predict the progression of metastasis: Breslow thickness, infiltrating BCL-2 expressing lymphocytes, and IL-4 and IL-6 serum levels. Importantly, a decrease in serum GM-CSF seems to be a marker of poor prognosis in patients with early-stage melanomas.

Online Pentane Concentration Prediction System Based on Machine Learning Techniques (2023)
Manjarrés, Diana; Maqueda, Erik; Landa-Torres, Itziar; IA; DIGITAL ENERGY
Industry 4.0 has emerged together with relevant technological tools that have enabled the rise of this new industrial paradigm. Among the main tools employed are Machine Learning techniques, which allow us to extract knowledge from raw data and, therefore, devise intelligent strategies or systems to improve actual industrial processes. In this regard, this paper focuses on the development of a prediction system based on Random Forest (RF) to estimate Pentane concentration in advance. The proposed system is validated offline with more than a year of data and is also tested online in an Energy plant of the Basque Country. Validation results show acceptable outcomes for supporting the operator’s decision-making with a tool that infers Pentane concentration in Butane 400 min in advance and, therefore, the quality of the obtained product.

Prediction of Metabolic Syndrome Based on Machine Learning Techniques with Emphasis on Feature Relevances and Explainability Analysis (Institute of Electrical and Electronics Engineers Inc., 2023)
Ispizua, Begoña; Manjarrés, Diana; Niño-Adan, Iratxe; Jiang, Xingpeng; Wang, Haiying; Alhajj, Reda; Hu, Xiaohua; Engel, Felix; Mahmud, Mufti; Pisanti, Nadia; Cui, Xuefeng; Song, Hong; IA
Metabolic syndrome (MetS) is considered to be a major public health problem worldwide, leading to a high risk of diabetes and cardiovascular diseases.
In this paper, data collected by the Precision Medicine Initiative of the Basque Country, named the AKRIBEA project, is employed to infer via Machine Learning (ML) techniques the features that have the most influence on predicting MetS in the general case and also separately by gender. Different Feature Normalization (FN) and Feature Weighting (FW) methods are applied and an exhaustive analysis of explainability by means of Shapley Additive Explanations (SHAP) and feature relevance methods is performed. Validation results show that Extreme Gradient Boosting (XGB) with Min-Max FN and Mutual Information FW achieves the best trade-off between the precision and recall performance metrics.

A review of deep learning-based approaches for deepfake content detection (2024)
Passos, Leandro A.; Jodas, Danilo; Costa, Kelton A.P.; Souza Júnior, Luis A.; Rodrigues, Douglas; Del Ser, Javier; Camacho, David; Papa, João Paulo; IA
Recent advancements in deep learning generative models have raised concerns as they can create highly convincing counterfeit images and videos. This poses a threat to people's integrity and can lead to social instability. To address this issue, there is a pressing need to develop new computational models that can efficiently detect forged content and alert users to potential image and video manipulations. This paper presents a comprehensive review of recent studies for deepfake content detection using deep learning-based approaches. We aim to broaden the state-of-the-art research by systematically reviewing the different categories of fake content detection.
Furthermore, we report the advantages and drawbacks of the examined works, and prescribe several future directions towards the issues and shortcomings still unsolved in deepfake detection.

SECURE MULTIPARTY COMPUTATION FOR PREDICTIVE MAINTENANCE: VALIDATION OF SCALE-MAMBA IN TERMS OF ACCURACY AND EFFICIENCY (2022-11)
Gamiz-Ugarte, Idoia; Lage-Serrano, Oscar; Legarreta-Solaguren, Leire; Regueiro-Senderos, Cristina; Jacob-Taquet, Eduardo; Seco-Aguirre, Iñaki; CIBERSEC&DLT
Privacy is a booming sector and there is an increasing number of limitations that hinder the centralization of data coming from different sources. Nowadays, having data provides value and an advantage over the rest, since it allows the performance of a wider and more generalizable analysis. Secure Multiparty Computation (SMPC) is a cryptographic technique that allows performing computations with data from different parties while maintaining the privacy of the data and avoiding centralization. This work focuses on the SCALE-MAMBA framework for conducting SMPC and the main objective is its validation in terms of types of operations, the accuracy of the results and execution times. A use case that is directly related to the industry is used, consisting of a manufacturer who wants to implement predictive maintenance on a machine whose data is collected by different users. Two types of scenarios are presented in order to analyze the results, obtaining different conclusions for each of them. On the one hand, the first scenario collects the use cases in which the aim is to compute statistics or simple calculations with data in common. On the other hand, the second scenario focuses on the training of Machine Learning (ML) algorithms.
The original contribution of this work includes the implementation of these codes within the Mamba language, their application to concrete data, and the comparison of the results with those that would be obtained by performing the same computations in an insecure way, centralizing the data and using R or Python. The major limitations encountered are around execution times, which might be acceptable for many use cases in the first scenario, but are prohibitive for many of the techniques used in real ML training.

Understanding daily mobility patterns in urban road networks using traffic flow analytics (Institute of Electrical and Electronics Engineers Inc., 2016-06-30)
Laña, Ibai; Del Ser, Javier; Olabarrieta, Ignacio Iñaki; Badonnel, Sema Oktug; Ulema, Mehmet; Cavdar, Cicek; Granville, Lisandro Zambenedetti; dos Santos, Carlos Raniery P.; IA
The MoveUs project, funded by the European Commission, aims to foster sustainable eco-friendly mobility habits in cities. In this context, predicting the traffic flow is useful for managers to optimize the configuration of the road network towards reducing congestion and, ultimately, pollution. With the explosion of the so-called Big Data concept and its application to traffic data, a wide range of traffic flow prediction methods has been reported in the related literature. However, most of the efforts in this field have been hitherto focused on short-term prediction models. This paper analyzes how to properly characterize traffic flow in urban road scenarios with an emphasis on the long term. To this end, a clustering stage is utilized to discover typicalities or patterns within the traffic flow data registered by each road sensor, which permits building prediction models for each of such discovered patterns.
These individual prediction models are intended to become part of the MoveUs platform, which will provide the technical means 1) for traffic managers to analyze in depth the status of the road network, and 2) for road users to better plan their trips.