TECNALIA Publications Repository :: Browsing by Author "Del Ser, Javier"

Browsing by Author "Del Ser, Javier"

Now showing 1 - 20 of 53

An active adaptation strategy for streaming time series classification based on elastic similarity measures
(2022-08) Oregi, Izaskun; Pérez, Aritz; Del Ser, Javier; Lozano, Jose A.; Quantum; IA
In streaming time series classification problems, the goal is to predict the label associated with the most recently received observations over the stream according to a set of categorized reference patterns. In on-line scenarios, data arise from non-stationary processes, which results in a succession of different patterns or events. This work presents an active adaptation strategy that allows time series classifiers to adapt to the dynamics of streamed time series data. Specifically, our approach consists of a classifier that detects changes between events over streaming time series. For this purpose, the classifier uses features of the dynamic time warping measure computed between the streamed data and a set of reference patterns. When classifying a streaming series, the proposed pattern end detector analyzes such features to predict changes and adapt off-line time series classifiers to newly arriving events. To evaluate the performance of the proposed scheme, we employ the pattern end detection model along with dynamic time warping-based nearest neighbor classifiers over a benchmark of ten time series classification problems. The obtained results present exciting insights into the detection accuracy and latency performance of the proposed strategy.
AI-based medical e-diagnosis for fast and automatic ventricular volume measurement in patients with normal pressure hydrocephalus
(2022-02-24) Zhou, Xi; Ye, Qinghao; Yang, Xiaolin; Chen, Jiakun; Ma, Haiqin; Xia, Jun; Del Ser, Javier; Yang, Guang; IA
Based on CT and MRI images acquired from normal pressure hydrocephalus (NPH) patients, using machine learning methods, we aim to establish a multimodal and high-performance automatic ventricle segmentation method to achieve an efficient and accurate automatic measurement of the ventricular volume. First, we extract the brain CT and MRI images of 143 definite NPH patients. Second, we manually label the ventricular volume (VV) and intracranial volume (ICV). Then, we use the machine learning method to extract features and establish an automatic ventricle segmentation model. Finally, we verify the reliability of the model and achieve automatic measurement of VV and ICV. In CT images, the Dice similarity coefficient (DSC), intraclass correlation coefficient (ICC), Pearson correlation, and Bland–Altman analysis of the automatic and manual segmentation result of the VV were 0.95, 0.99, 0.99, and 4.2 ± 2.6, respectively. The results of ICV were 0.96, 0.99, 0.99, and 6.0 ± 3.8, respectively. The whole process took 3.4 ± 0.3 s. In MRI images, the DSC, ICC, Pearson correlation, and Bland–Altman analysis of the automatic and manual segmentation result of the VV were 0.94, 0.99, 0.99, and 2.0 ± 0.6, respectively. The results of ICV were 0.93, 0.99, 0.99, and 7.9 ± 3.8, respectively. The whole process took 1.9 ± 0.1 s. We have established a multimodal and high-performance automatic ventricle segmentation method to achieve efficient and accurate automatic measurement of the ventricular volume of NPH patients. This can help clinicians quickly and accurately understand the condition of NPH patients' ventricles.
Big Data for transportation and mobility: recent advances, trends and challenges
(2018-10-04) Torre-Bastida, Ana I.; Del Ser, Javier; Laña, Ibai; Ilardia, Maitena; Bilbao, Miren Nekane; Campos-Cordobes, Sergio; Tecnalia Research & Innovation; HPA; IA; LABORATORIO DE TRANSFORMACIÓN URBANA; SMART_TRANSPORT
Big Data is an emerging paradigm and has currently become a strong attractor of global interest, especially within the transportation industry. The combination of disruptive technologies and new concepts such as the Smart City upgrades the transport data life cycle. In this context, Big Data is considered as a new pledge for the transportation industry to effectively manage all the data this sector requires for providing safer, cleaner and more efficient transport means, as well as for users to personalize their transport experience. However, Big Data comes along with its own set of technological challenges, stemming from the multiple and heterogeneous transportation/mobility application scenarios. In this survey we analyze the latest research efforts revolving around Big Data for the transportation and mobility industry, its applications, baseline scenarios, fields and use cases such as routing, planning, infrastructure monitoring and network design, among others. This analysis will be done strictly from the Big Data perspective, focusing on those contributions gravitating around techniques, tools and methods for modeling, processing, analyzing and visualizing transport and mobility Big Data. From the literature review a set of trends and challenges is extracted so as to provide researchers with an insightful outlook on the field of transport and mobility.
Bio-inspired computation for big data fusion, storage, processing, learning and visualization: state of the art and future directions
(2021-08-03) Torre-Bastida, Ana I.; Díaz-de-Arcaya, Josu; Osaba, Eneko; Muhammad, Khan; Camacho, David; Del Ser, Javier; HPA; Quantum
This overview gravitates on research achievements that have recently emerged from the confluence between Big Data technologies and bio-inspired computation. A manifold of reasons can be identified for the profitable synergy between these two paradigms, all rooted on the adaptability, intelligence and robustness that biologically inspired principles can provide to technologies aimed to manage, retrieve, fuse and process Big Data efficiently. We delve into this research field by first analyzing in depth the existing literature, with a focus on advances reported in the last few years. This prior literature analysis is complemented by an identification of the new trends and open challenges in Big Data that remain unsolved to date, and that can be effectively addressed by bio-inspired algorithms. As a second contribution, this work elaborates on how bio-inspired algorithms need to be adapted for their use in a Big Data context, in which data fusion becomes crucial as a previous step to allow processing and mining several and potentially heterogeneous data sources. This analysis allows exploring and comparing the scope and efficiency of existing approaches across different problems and domains, with the purpose of identifying new potential applications and research niches. Finally, this survey highlights open issues that remain unsolved to date in this research avenue, alongside a prescription of recommendations for future research.
COEBA: A Coevolutionary Bat Algorithm for Discrete Evolutionary Multitasking
(Springer Nature, 2020) Osaba, Eneko; Del Ser, Javier; Yang, Xin-She; Iglesias, Andres; Galvez, Akemi; Krzhizhanovskaya, Valeria V.; Závodszky, Gábor; Lees, Michael H.; Sloot, Peter M.A.; Dongarra, Jack J.; Brissos, Sérgio; Teixeira, João; Quantum; IA
Multitasking optimization is an emerging research field which has attracted a lot of attention in the scientific community. The main purpose of this paradigm is to solve multiple optimization problems or tasks simultaneously by conducting a single search process. The main catalyst for reaching this objective is to exploit possible synergies and complementarities among the tasks to be optimized, helping each other by virtue of the transfer of knowledge among them (thereby being referred to as Transfer Optimization). In this context, Evolutionary Multitasking addresses Transfer Optimization problems by resorting to concepts from Evolutionary Computation for simultaneously solving the tasks at hand. This work contributes to this trend by proposing a novel algorithmic scheme for dealing with multitasking environments. The proposed approach, coined as Coevolutionary Bat Algorithm, finds its inspiration in concepts from both co-evolutionary strategies and the metaheuristic Bat Algorithm. We compare the performance of our proposed method with that of its Multifactorial Evolutionary Algorithm counterpart over 15 different multitasking setups, composed of eight reference instances of the discrete Traveling Salesman Problem. The experimentation and results stemming therefrom support the main hypothesis of this study: the proposed Coevolutionary Bat Algorithm is a promising meta-heuristic for solving Evolutionary Multitasking scenarios.
A Coral Reefs Optimization algorithm with Harmony Search operators for accurate wind speed prediction
(2015-03-01) Salcedo-Sanz, Sancho; Pastor-Sanchez, Alvaro; Del Ser, Javier; Prieto, Luis; Geem, Zong-Woo; IA
This paper introduces a new hybrid bio-inspired solver which combines elements from the recently proposed Coral Reefs Optimization (CRO) algorithm with operators from the Harmony Search (HS) approach, which gives rise to the coined CRO-HS optimization technique. Specifically, this novel bio-inspired optimizer is utilized in the context of short-term wind speed prediction as a means to obtain the best set of meteorological variables to be input to a neural Extreme Learning Machine (ELM) network. The paper elaborates on the main characteristics of the proposed scheme and discusses its performance when predicting the wind speed based on the measures of two meteorological towers located in USA and Spain. The good results obtained in these experiments when compared to naïve versions of the CRO and HS algorithms are promising and pave the way towards the utilization of the derived hybrid solver in other optimization problems arising from diverse disciplines.
A Critical Review of Robustness in Power Grids using Complex Networks Concepts
(2015) Cuadra, Lucas; Salcedo-Sanz, Sancho; Del Ser, Javier; Jimenez-Fernandez, Silvia; Geem, Zong-Woo; IA
This paper reviews the most relevant works that have investigated robustness in power grids using Complex Networks (CN) concepts. In this broad field there are two different approaches. The first one is based solely on topological concepts, and uses metrics such as mean path length, clustering coefficient, efficiency and betweenness centrality, among many others. The second, hybrid approach consists of introducing (into the CN framework) some concepts from Electrical Engineering (EE) in the effort of enhancing the topological approach, and uses novel, more efficient electrical metrics such as electrical betweenness, net-ability, and others. There is however a controversy about whether these approaches are able to provide insights into all aspects of real power grids. The CN community argues that the topological approach does not aim to focus on the detailed operation, but to discover the unexpected emergence of collective behavior, while part of the EE community asserts that this leads to an excessive simplification. Beyond this open debate, there seems to be no predominant structure (scale-free, small-world) in high-voltage transmission power grids, which account for the vast majority of power grids studied so far. Most of them have in common that they are vulnerable to targeted attacks on the most connected nodes and robust to random failure. In this respect there are only a few works that propose strategies to improve robustness such as intentional islanding, restricted link addition, microgrids and smart grids, for which novel studies suggest that small-world networks seem to be the best topology.
CURIE: a cellular automaton for concept drift detection
(2021-11) Lobo, Jesus L.; Del Ser, Javier; Osaba, Eneko; Bifet, Albert; Herrera, Francisco; IA; Quantum
Data stream mining extracts information from large quantities of data flowing fast and continuously (data streams). Such streams are usually affected by changes in the data distribution, giving rise to a phenomenon referred to as concept drift. Thus, learning models must detect and adapt to such changes, so as to exhibit a good predictive performance after a drift has occurred. In this regard, the development of effective drift detection algorithms becomes a key factor in data stream mining. In this work we propose CURIE, a drift detector relying on cellular automata. Specifically, in CURIE the distribution of the data stream is represented in the grid of a cellular automaton, whose neighborhood rule can then be utilized to detect possible distribution changes over the stream. Computer simulations are presented and discussed to show that CURIE, when hybridized with other base learners, renders a competitive behavior in terms of detection metrics and classification accuracy. CURIE is compared with well-established drift detectors over synthetic datasets with varying drift characteristics.
Dandelion-encoded harmony search heuristics for opportunistic traffic offloading in synthetically modeled mobile networks
(Springer Verlag, 2016) Perfecto, Cristina; Bilbao, Miren Nekane; Del Ser, Javier; Ferro, Armando; Salcedo-Sanz, Sancho; Geem, Zong Woo; Kim, Joong Hoon; IA
The high data volumes being managed by and transferred through mobile networks in the last few years are the main rationale for the upsurge of research aimed at finding efficient technical means to offload excess traffic to alternative communication infrastructures with higher transmission bandwidths. This idea is solidly buttressed by the proliferation of short-range wireless communication technologies (e.g., mobile devices with multiple radio interfaces), which can be conceived as available opportunistic hotspots to which the operator can reroute excess network traffic depending on the contractual clauses of the owner at hand. Furthermore, by offloading to such hotspots a higher effective coverage can be attained by those operators providing both mobile and fixed telecommunication services. In this context, the operator must decide if data generated by its users will be sent over conventional 4G+/4G/3G communication links, or if they will instead be offloaded to nearby opportunistic networks assuming a contractual cost penalty. Mathematically speaking, this problem can be formulated as a spanning tree optimization subject to cost-performance criteria and coverage constraints. This paper will elaborate on the efficient solving of this optimization paradigm by means of the Harmony Search meta-heuristic algorithm and the so-called Dandelion solution encoding, the latter allowing for the use of conventional meta-heuristic operators maximally preserving the locality of tree representations. The manuscript will discuss the obtained simulation results over different synthetically modeled setups of the underlying communication scenario and contractual clauses of the users.
Data Augmentation for Industrial Prognosis Using Generative Adversarial Networks
(Springer, 2020-10-27) Ortego, Patxi; Diez-Olivan, Alberto; Del Ser, Javier; Sierra, Basilio; Analide, Cesar; Novais, Paulo; Camacho, David; Yin, Hujun; Tecnalia Research & Innovation; IA
The Industry 4.0 revolution allows monitoring and intelligent processing of large amounts of data. When monitoring certain assets, very little data is available for operation under faulty conditions, because the cost of not operating properly is unacceptable and thus preventive strategies are put into practice. Because machine learning algorithms are data-intensive, synthetic data can be created for these cases. Deep learning techniques have been proven to work very well for these cases. Generative Adversarial Networks (GANs) have been deployed in numerous applications with data augmentation objectives, but not so much for balancing unidimensional series with few data. In this paper, a GAN is applied in order to augment data for assets operating under faulty conditions. The proposed method is validated on a real industrial case, yielding promising results with respect to the case with no strategy for class imbalance whatsoever.
Design and implementation of an extended corporate crm database system with big data analytical functionalities
(2015-07-25) Torre-Bastida, Ana I.; Villar-Rodriguez, Esther; Gil-Lopez, Sergio; Del Ser, Javier; HPA; Quantum; IA
The amount of open information available on-line from heterogeneous sources and domains is growing at an extremely fast pace, and constitutes an important knowledge base for the consideration of industries and companies. In this context, two relevant data providers can be highlighted: the “Linked Open Data” (LOD) and “Social Media” (SM) paradigms. The fusion of these data sources – structured the former, and raw data the latter –, along with the information contained in structured corporate databases within the organizations themselves, may unveil significant business opportunities and competitive advantage to those who are able to understand and leverage their value. In this paper, we present two complementary use cases, illustrating the potential of using the open data in the business domain. The first represents the creation of an existing and potential customer knowledge base, exploiting social and linked open data based on which any given organization might infer valuable information as a support for decision making. The second focuses on the classification of organizations and enterprises aiming at detecting potential competitors and/or allies via the analysis of the conceptual similarity between their participated projects. To this end, a solution based on the synergy of Big Data and semantic technologies will be designed and developed. The first will be used to implement the tasks of collection, data fusion and classification supported by natural language processing (NLP) techniques, whereas the latter will deal with semantic aggregation, persistence, reasoning and information retrieval, as well as with the triggering of alerts based on the semantized information.
A Discrete and Improved Bat Algorithm for solving a medical goods distribution problem with pharmacological waste collection
(2019-02) Osaba, Eneko; Yang, Xin-She; Fister, Iztok; Del Ser, Javier; Lopez-Garcia, Pedro; Vazquez-Pardavila, Alejo J.; Tecnalia Research & Innovation; Quantum; IA
The work presented in this paper is focused on the resolution of a real-world drugs distribution problem with pharmacological waste collection. With the aim of properly meeting all the real-world restrictions that comprise this complex problem, we have modeled it as a multi-attribute or rich vehicle routing problem (RVRP). The problem has been modeled as a Clustered Vehicle Routing Problem with Pickups and Deliveries, Asymmetric Variable Costs, Forbidden Roads and Cost Constraints. To the best of the authors' knowledge, this is the first time that such an RVRP is tackled in the literature. For this reason, a benchmark composed of 24 datasets, from 60 to 1000 customers, has also been designed. For the development of this benchmark, we have used real geographical positions located in Bizkaia, Spain. Furthermore, to properly deal with the proposed RVRP, we have developed a Discrete and Improved Bat Algorithm (DaIBA). The main feature of this adaptation is the use of the well-known Hamming Distance to calculate the differences between the bats. An effective improvement has also been contemplated for the proposed DaIBA, which consists in the existence of two different neighborhood structures, which are explored depending on the bat's distance to the best individual of the swarm. For the experimentation, we have compared the performance of our presented DaIBA with three additional approaches: an evolutionary algorithm, an evolutionary simulated annealing and a firefly algorithm. Additionally, with the intention of obtaining rigorous conclusions, two different statistical tests have been conducted: Friedman's non-parametric test and Holm's post-hoc test. Furthermore, an additional experimentation has been performed in terms of convergence. Finally, the obtained outcomes conclude that the proposed DaIBA is a promising technique for addressing the designed problem.
A discrete water cycle algorithm for solving the symmetric and asymmetric traveling salesman problem
(2018-10) Osaba, Eneko; Del Ser, Javier; Sadollah, Ali; Bilbao, Miren Nekane; Camacho, David; Quantum; IA
The water cycle algorithm (WCA) is a nature-inspired meta-heuristic contributed to the community in 2012, which finds its motivation in the natural surface runoff phase of the water cycle process and in how streams and rivers flow into the sea. This method has been so far successfully applied to many engineering applications, spread over a wide variety of application fields. In this paper an enhanced discrete version of the WCA (coined as DWCA) is proposed for solving the Symmetric and Asymmetric Traveling Salesman Problem. Aimed at proving that the developed approach is a promising approximation method for solving this family of optimization problems, the designed solver has been tested over 33 problem datasets, comparing the obtained outcomes with those obtained by six different algorithmic counterparts from the related literature: genetic algorithm, island-based genetic algorithm, evolutionary simulated annealing, bat algorithm, firefly algorithm and imperialist competitive algorithm. Furthermore, the statistical significance of the performance gaps found in this benchmark is validated based on the results from non-parametric tests, not only in terms of optimality but also with regard to convergence speed. We conclude that the proposed DWCA approach outperforms – with statistical significance – any other optimization technique in the benchmark in terms of both computation metrics.
Distributed Coordination of Heterogeneous Robotic Swarms Using Stochastic Diffusion Search
(Springer, 2020-10-27) Osaba, Eneko; Del Ser, Javier; Jubeto, Xabier; Iglesias, Andrés; Fister, Iztok; Gálvez, Akemi; Analide, Cesar; Novais, Paulo; Camacho, David; Yin, Hujun; Quantum; IA
The term Swarm Robotics collectively refers to a population of robotic devices that efficiently undertakes diverse tasks in a collaborative way by virtue of computational intelligence techniques. This paradigm has given rise to a profitable stream of contributions in recent years, all sharing a clear consensus on the performance benefits derived from the increased exploration capabilities offered by Swarm Robotics. This manuscript falls within this topic: specifically, it gravitates around a heterogeneous Swarm Robotics system that relies on Stochastic Diffusion Search (SDS) as the coordination heuristic for the exploration, location and delimitation of areas scattered over the region in which robots are deployed. The swarm is composed of agents of diverse kinds, which can be ground robots or flying devices. These agents communicate with each other and cooperate towards the accomplishment of the exploration tasks comprising the mission of the overall swarm. Furthermore, maps contain several obstacles and dangers, implying that in order to enter a specific area, robots should meet certain conditions. Experiments are conducted over three different maps and three implemented solving approaches. Conclusions are drawn from the obtained results, confirming that i) SDS allows for a lightweight, heuristic mechanism for the coordination of the robots; and ii) the most efficient swarming approach is the one comprising a heterogeneity of ground and aerial robots.
Edge-enhanced dual discriminator generative adversarial network for fast MRI with parallel imaging using multi-view information
(2022-01-28) Huang, Jiahao; Ding, Weiping; Lv, Jun; Yang, Jingwen; Dong, Hao; Del Ser, Javier; Xia, Jun; Ren, Tiaojuan; Wong, Stephen T.; Yang, Guang; IA
In clinical medicine, magnetic resonance imaging (MRI) is one of the most important tools for diagnosis, triage, prognosis, and treatment planning. However, MRI suffers from an inherently slow data acquisition process because data is collected sequentially in k-space. In recent years, most MRI reconstruction methods proposed in the literature focus on holistic image reconstruction rather than enhancing the edge information. This work steps aside from this general trend by elaborating on the enhancement of edge information. Specifically, we introduce a novel parallel imaging coupled dual discriminator generative adversarial network (PIDD-GAN) for fast multi-channel MRI reconstruction by incorporating multi-view information. The dual discriminator design aims to improve the edge information in MRI reconstruction. One discriminator is used for holistic image reconstruction, whereas the other one is responsible for enhancing edge information. An improved U-Net with local and global residual learning is proposed for the generator. Frequency channel attention blocks (FCA Blocks) are embedded in the generator for incorporating attention mechanisms. Content loss is introduced to train the generator for better reconstruction quality. We performed comprehensive experiments on the Calgary-Campinas public brain MR dataset and compared our method with state-of-the-art MRI reconstruction methods. Ablation studies of residual learning were conducted on the MICCAI13 dataset to validate the proposed modules. Results show that our PIDD-GAN provides high-quality reconstructed MR images, with well-preserved edge information. The time of single-image reconstruction is below 5 ms, which meets the demand of faster processing.
Energy-Aware Multi-Objective Job Shop Scheduling Optimization with Metaheuristics in Manufacturing Industries: A Critical Survey, Results, and Perspectives
(2022-01-29) Para, Jesus; Del Ser, Javier; Nebro, Antonio J.; IA
In recent years, the application of artificial intelligence has been revolutionizing the manufacturing industry, becoming one of the key pillars of what has been called Industry 4.0. In this context, we focus on the job shop scheduling problem (JSP), which aims at scheduling the production orders to be carried out, while considering the reduction of energy consumption as a key objective to fulfill. Finding the best combination of machines and jobs to be performed is not a trivial problem and becomes even more involved when several objectives are taken into account. Among them, the improvement of energy savings may conflict with other objectives, such as the minimization of the makespan. In this paper, we provide an in-depth review of the existing literature on multi-objective job shop scheduling optimization with metaheuristics, in which one of the objectives is the minimization of energy consumption. We systematically reviewed and critically analyzed the most relevant features of both problem formulations and algorithms to solve them effectively. The manuscript also supports the main findings of our bibliographic critique with empirical results, namely a performance comparison among representative multi-objective evolutionary solvers applied to a diversity of synthetic test instances. The ultimate goal of this article is to carry out a critical analysis, finding good practices and opportunities for further improvement that stem from current knowledge in this vibrant research area.
Evolutionary Multitask Optimization: a Methodological Overview, Challenges, and Future Research Directions
(2022-04-12) Osaba, Eneko; Del Ser, Javier; Martinez, Aritz D.; Hussain, Amir; Quantum; IA
In this work, we consider multitasking in the context of solving multiple optimization problems simultaneously by conducting a single search process. The principal goal when dealing with this scenario is to dynamically exploit the existing complementarities among the problems (tasks) being optimized, helping each other through the exchange of valuable knowledge. Additionally, the emerging paradigm of evolutionary multitasking tackles multitask optimization scenarios by using biologically inspired concepts drawn from swarm intelligence and evolutionary computation. The main purpose of this survey is to collect, organize, and critically examine the abundant literature published so far in evolutionary multitasking, with an emphasis on the methodological patterns followed when designing new algorithmic proposals in this area (namely, multifactorial optimization and multipopulation-based multitasking). We complement our critical analysis with an identification of challenges that remain open to date, along with promising research directions that can leverage the potential of biologically inspired algorithms for multitask optimization. The discussions held throughout this manuscript are offered to the audience as a reference of the general trajectory followed by the community working in this field in recent times, as well as a self-contained entry point for newcomers and researchers interested in joining this exciting research avenue.
Evolving Spiking Neural Networks for online learning over drifting data streams
(2018-12) Lobo, Jesus L.; Laña, Ibai; Del Ser, Javier; Bilbao, Miren Nekane; Kasabov, Nikola; IA
Nowadays huge volumes of data are produced in the form of fast streams, which are further affected by non-stationary phenomena. The resulting lack of stationarity in the distribution of the produced data calls for efficient and scalable algorithms for online analysis capable of adapting to such changes (concept drift). The online learning field has lately turned its focus on this challenging scenario, by designing incremental learning algorithms that avoid becoming obsolete after a concept drift occurs. Despite the noted activity in the literature, a need for new efficient and scalable algorithms that adapt to the drift still prevails as a research topic deserving further effort. Surprisingly, Spiking Neural Networks, one of the major exponents of the third generation of artificial neural networks, have not been thoroughly studied as an online learning approach, even though they are naturally suited to easily and quickly adapting to changing environments. This work covers this research gap by adapting Spiking Neural Networks to meet the processing requirements that online learning scenarios impose. In particular the work focuses on limiting the size of the neuron repository and making the most of this limited size by resorting to data reduction techniques. Experiments with synthetic and real data sets are discussed, leading to the empirically validated assertion that, by virtue of a tailored exploitation of the neuron repository, Spiking Neural Networks adapt better to drifts, obtaining higher accuracy scores than naive versions of Spiking Neural Networks for online learning environments.
A feature selection method for author identification in interactive communications based on supervised learning and language typicality
(2016-11-01) Villar-Rodriguez, Esther; Del Ser, Javier; Bilbao, Miren Nekane; Salcedo-Sanz, Sancho; Tecnalia Research & Innovation; Quantum; IA
Authorship attribution, conceived as the identification of the origin of a text between different authors, has been a very active area of research in the scientific community mainly supported by advances in Natural Language Processing (NLP), machine learning and Computational Intelligence. This paradigm has been mostly addressed from a literary perspective, aiming at identifying the stylometric features and writeprints which unequivocally typify the writer patterns and allow their unique identification. On the other hand, the upsurge of social networking platforms and interactive messaging have undoubtedly made the anonymous expression of feelings, the sharing of experiences and social relationships much easier than in other traditional communication media. Unfortunately, the popularity of such communities and the virtual identification of their users deploy a rich substrate for cybercrimes against unsuspecting victims and other forms of illegal uses of social networks that call for the activity tracing of accounts. In the context of one-to-one communications this manuscript postulates the identification of the sender of a message as a useful approach to detect impersonation attacks in interactive communication scenarios. In particular this work proposes to select linguistic features extracted from messages via NLP techniques by means of a novel feature selection algorithm based on the dissociation between essential traits of the sender and receiver influences. The performance and computational efficiency of different supervised learning models when incorporating the proposed feature selection method is shown to be promising with real SMS data in terms of identification accuracy, and paves the way towards future research lines focused on applying the concept of language typicality in the discourse analysis field.
From Data to Actions in Intelligent Transportation Systems: A Prescription of Functional Requirements for Model Actionability
(Multidisciplinary Digital Publishing Institute (MDPI), 2021-02-05) Laña, Ibai; Sanchez-Medina, Javier J.; Vlahogianni, Eleni I.; Del Ser, Javier
Advances in Data Science permeate every field of Transportation Science and Engineering, resulting in developments in the transportation sector that are data-driven. Nowadays, Intelligent Transportation Systems (ITS) could be arguably approached as a “story” intensively producing and consuming large amounts of data. A diversity of sensing devices densely spread over the infrastructure, vehicles or the travelers’ personal devices act as sources of data flows that are eventually fed into software running on automatic devices, actuators or control systems producing, in turn, complex information flows among users, traffic managers, data analysts, traffic modeling scientists, etc. These information flows provide enormous opportunities to improve model development and decision-making. This work aims to describe how data, coming from diverse ITS sources, can be used to learn and adapt data-driven models for efficiently operating ITS assets, systems and processes; in other words, for data-based models to fully become actionable. Grounded in this described data modeling pipeline for ITS, we define the characteristics, engineering requisites and challenges intrinsic to its three compounding stages, namely, data fusion, adaptive learning and model evaluation. We deliberately generalize model learning to be adaptive, since at the core of our paper is the firm conviction that most learners will have to adapt to the ever-changing phenomenon scenario underlying the majority of ITS applications. Finally, we provide a prospect of current research lines within Data Science that can bring notable advances to data-based ITS modeling, which will eventually bridge the gap towards the practicality and actionability of such models.
