Browsing by Author "Laña, Ibai"
Item Big Data for transportation and mobility: recent advances, trends and challenges (2018-10-04)
Torre-Bastida, Ana I.; Del Ser, Javier; Laña, Ibai; Ilardia, Maitena; Bilbao, Miren Nekane; Campos-Cordobes, Sergio; Tecnalia Research & Innovation; HPA; IA; LABORATORIO DE TRANSFORMACIÓN URBANA; SMART_TRANSPORT
Big Data is an emerging paradigm and has currently become a strong attractor of global interest, especially within the transportation industry. The combination of disruptive technologies and new concepts such as the Smart City upgrades the transport data life cycle. In this context, Big Data is considered a new pledge for the transportation industry to effectively manage all the data this sector requires for providing safer, cleaner and more efficient transport means, as well as for users to personalize their transport experience. However, Big Data comes along with its own set of technological challenges, stemming from the multiple and heterogeneous transportation/mobility application scenarios. In this survey we analyze the latest research efforts revolving around Big Data for the transportation and mobility industry, its applications, baseline scenarios, fields and use cases such as routing, planning, infrastructure monitoring and network design, among others. This analysis will be done strictly from the Big Data perspective, focusing on those contributions gravitating around techniques, tools and methods for modeling, processing, analyzing and visualizing transport and mobility Big Data.
From the literature review a set of trends and challenges is extracted so as to provide researchers with an insightful outlook on the field of transport and mobility.

Item Evolving Spiking Neural Networks for online learning over drifting data streams (2018-12)
Lobo, Jesus L.; Laña, Ibai; Del Ser, Javier; Bilbao, Miren Nekane; Kasabov, Nikola; IA
Nowadays huge volumes of data are produced in the form of fast streams, which are further affected by non-stationary phenomena. The resulting lack of stationarity in the distribution of the produced data calls for efficient and scalable algorithms for online analysis capable of adapting to such changes (concept drift). The online learning field has lately turned its focus to this challenging scenario, by designing incremental learning algorithms that avoid becoming obsolete after a concept drift occurs. Despite the noted activity in the literature, a need for new efficient and scalable algorithms that adapt to the drift still prevails as a research topic deserving further effort. Surprisingly, Spiking Neural Networks, one of the major exponents of the third generation of artificial neural networks, have not been thoroughly studied as an online learning approach, even though they are naturally suited to easily and quickly adapting to changing environments. This work covers this research gap by adapting Spiking Neural Networks to meet the processing requirements that online learning scenarios impose. In particular, the work focuses on limiting the size of the neuron repository and making the most of this limited size by resorting to data reduction techniques.
Experiments with synthetic and real data sets are discussed, leading to the empirically validated assertion that, by virtue of a tailored exploitation of the neuron repository, Spiking Neural Networks adapt better to drifts, obtaining higher accuracy scores than naive versions of Spiking Neural Networks for online learning environments.

Item From Data to Actions in Intelligent Transportation Systems: A Prescription of Functional Requirements for Model Actionability (Multidisciplinary Digital Publishing Institute (MDPI), 2021-02-05)
Laña, Ibai; Sanchez-Medina, Javier J.; Vlahogianni, Eleni I.; Del Ser, Javier
Advances in Data Science permeate every field of Transportation Science and Engineering, resulting in developments in the transportation sector that are data-driven. Nowadays, Intelligent Transportation Systems (ITS) could arguably be approached as a “story” intensively producing and consuming large amounts of data. A diversity of sensing devices densely spread over the infrastructure, vehicles or the travelers’ personal devices act as sources of data flows that are eventually fed into software running on automatic devices, actuators or control systems producing, in turn, complex information flows among users, traffic managers, data analysts, traffic modeling scientists, etc. These information flows provide enormous opportunities to improve model development and decision-making. This work aims to describe how data, coming from diverse ITS sources, can be used to learn and adapt data-driven models for efficiently operating ITS assets, systems and processes; in other words, for data-based models to fully become actionable. Grounded in this described data modeling pipeline for ITS, we define the characteristics, engineering requisites and challenges intrinsic to its three compounding stages, namely, data fusion, adaptive learning and model evaluation.
We deliberately generalize model learning to be adaptive, since at the core of our paper is the firm conviction that most learners will have to adapt to the ever-changing phenomenon scenario underlying the majority of ITS applications. Finally, we provide a prospect of current research lines within Data Science that can bring notable advances to data-based ITS modeling, which will eventually bridge the gap towards the practicality and actionability of such models.

Item On the imputation of missing data for road traffic forecasting: New insights and novel techniques (2018-05)
Laña, Ibai; Olabarrieta, Ignacio (Iñaki); Vélez, Manuel; Del Ser, Javier; IA
Vehicle flow forecasting is of crucial importance for the management of road traffic in complex urban networks, as well as a useful input for route planning algorithms. In general, traffic predictive models rely on data gathered by different types of sensors placed on roads, which occasionally produce faulty readings due to several causes, such as malfunctioning hardware or transmission errors. Filling in those gaps is relevant for constructing accurate forecasting models, a task addressed by diverse strategies, from a simple null value imputation to complex spatio-temporal context imputation models. This work elaborates on two machine learning approaches to update missing data with no gap length restrictions: a spatial context sensing model based on the information provided by surrounding sensors, and an automated clustering analysis tool that seeks optimal pattern clusters in order to impute values. Their performance is assessed and compared to other common techniques and different missing data generation models over real data captured from the city of Madrid (Spain).
The newly presented methods are found to be fairly superior when portions of missing data are large or very abundant, as occurs in most practical cases.

Item On the post-hoc explainability of deep echo state networks for time series forecasting, image and video classification (2022-07)
Barredo Arrieta, Alejandro; Gil-Lopez, Sergio; Laña, Ibai; Bilbao, Miren Nekane; Del Ser, Javier; Tecnalia Research & Innovation; IA
Since their inception, learning techniques under the reservoir computing paradigm have shown a great modeling capability for recurrent systems without the computing overheads required for other approaches, especially deep neural networks. Among them, different flavors of echo state networks have attracted much attention over time, mainly due to the simplicity and computational efficiency of their learning algorithm. However, these advantages do not compensate for the fact that echo state networks remain black-box models whose decisions cannot be easily explained to the general audience. This issue is even more involved for multi-layered (also referred to as deep) echo state networks, whose more complex hierarchical structure hinders even further the explainability of their internals to users without expertise in machine learning or even computer science. This lack of explainability can jeopardize the widespread adoption of these models in certain domains where accountability and understandability of machine learning models is a must (e.g., medical diagnosis, social politics). This work addresses this issue by conducting an explainability study of echo state networks when applied to learning tasks with time series, image and video data. Among these tasks, we focus on the last one (video classification) which, to the best of our knowledge, has never been tackled before with echo state networks in the related literature.
Specifically, the study proposes three different techniques capable of eliciting understandable information about the knowledge grasped by these recurrent models, namely potential memory, temporal patterns and pixel absence effect. Potential memory addresses questions related to the effect of the reservoir size on the capability of the model to store temporal information, whereas temporal patterns unveil the recurrent relationships captured by the model over time. Finally, pixel absence effect attempts to evaluate the effect of the absence of a given pixel when the echo state network model is used for image and video classification. The benefits of the proposed suite of techniques are showcased over three different domains of applicability: time series modeling, image and, for the first time in the related literature, video classification. The obtained results reveal that the proposed techniques not only allow for an informed understanding of the way these models work, but also serve as diagnostic tools capable of detecting issues inherited from data (e.g., presence of hidden bias).

Item A Probabilistic Sample Matchmaking Strategy for Imbalanced Data Streams with Concept Drift (2017)
L. Lobo, Jesus; Del Ser, Javier; Bilbao, Miren Nekane; Laña, Ibai; Salcedo-Sanz, Sancho; IA
In the last decade the interest in adaptive models for non-stationary environments has gained momentum within the research community due to an increasing number of application scenarios generating non-stationary data streams. In this context the literature has been especially rich in terms of ensemble techniques, which in their majority have focused on taking advantage of past information in the form of already trained predictive models and other alternatives alike.
This manuscript elaborates on a rather different approach, which hinges on extracting the essential predictive information of past trained models and determining therefrom the best candidates (intelligent sample matchmaking) for training the predictive model of the current data batch. This novel perspective is of inherent utility for data streams characterized by short-length unbalanced data batches, a situation where the so-called trade-off between plasticity and stability must be carefully met. The approach is evaluated on a synthetic data set that simulates a non-stationary environment with recurrently changing concept drift. The proposed approach is shown to perform competitively when adapting to a sudden and recurrent change with respect to the state of the art, but without storing all the past trained models and by lessening its computational complexity in terms of model evaluations. These promising results motivate future research aimed at validating the proposed strategy on other scenarios under concept drift, such as those characterized by semi-supervised data streams.

Item The role of local urban traffic and meteorological conditions in air pollution: A data-based case study in Madrid, Spain (2016-11-01)
Laña, Ibai; Del Ser, Javier; Padró, Ales; Vélez, Manuel; Casanova-Mateo, Carlos; IA; CALIDAD Y CONFORT AMBIENTAL
Urban air pollution is a matter of growing concern for both public administrations and citizens. Road traffic is one of the main sources of air pollutants, though topography characteristics and meteorological conditions can make pollution levels increase or diminish dramatically. In this context an upsurge of research has been conducted towards functionally linking variables of such domains to measured pollution data, with studies dealing with up to one-hour resolution meteorological data.
However, the majority of such reported contributions do not deal with traffic data or, at most, simulate traffic conditions jointly with the consideration of different topographical features. The aim of this study is to further explore this relationship by using high-resolution real traffic data. This paper describes a methodology based on the construction of regression models to predict levels of different pollutants (i.e. CO, NO, NO2, O3 and PM10) based on traffic data and meteorological conditions, from which an estimation of the predictive relevance (importance) of each utilized feature can be obtained by virtue of their particular training procedure. The study was made with one-hour resolution meteorological, traffic and pollution historic data in roadside and background locations of the city of Madrid (Spain) captured over 2015. The obtained results reveal that the impact of vehicular emissions on the pollution levels is overshadowed by the effects of stable meteorological conditions of this city.

Item Transfer Learning and Online Learning for Traffic Forecasting under Different Data Availability Conditions: Alternatives and Pitfalls (Institute of Electrical and Electronics Engineers Inc., 2020-09-20)
Manibardo, Eric L.; Laña, Ibai; Del Ser, Javier; IA
This work aims at unveiling the potential of Transfer Learning (TL) for developing a traffic flow forecasting model in scenarios of absent data. Knowledge transfer from high-quality predictive models becomes feasible under the TL paradigm, enabling the generation of new proper models with few data. In order to explore this capability, we identify three different levels of data-absent scenarios, where TL techniques are applied among Deep Learning (DL) methods for traffic forecasting. Then, traditional batch learning is compared against TL-based models using real traffic flow data, collected by deployed loops managed by the City Council of Madrid (Spain).
In addition, we apply Online Learning (OL) techniques, where the model receives an update after each prediction, in order to adapt to traffic flow trend changes and incrementally learn from new incoming traffic data. The obtained experimental results shed light on the advantages of transfer and online learning for traffic flow forecasting, and draw practical insights on their interplay with the amount of available training data at the location of interest.