Browsing by Author "Torre-Bastida, Ana I."
Item
Big Data for transportation and mobility: recent advances, trends and challenges (2018-10-04)
Torre-Bastida, Ana I.; Del Ser, Javier; Laña, Ibai; Ilardia, Maitena; Bilbao, Miren Nekane; Campos-Cordobes, Sergio; Tecnalia Research & Innovation; HPA; IA; LABORATORIO DE TRANSFORMACIÓN URBANA; SMART_TRANSPORT
Big Data is an emerging paradigm that has become a strong attractor of global interest, especially within the transportation industry. The combination of disruptive technologies and new concepts such as the Smart City upgrades the transport data life cycle. In this context, Big Data is regarded as a new pledge for the transportation industry to effectively manage all the data this sector requires to provide safer, cleaner and more efficient means of transport, and for users to personalize their transport experience. However, Big Data comes with its own set of technological challenges, stemming from the multiple and heterogeneous transportation/mobility application scenarios. In this survey, we analyze the latest research efforts revolving around Big Data for the transportation and mobility industry, including its applications, baseline scenarios, fields and use cases such as routing, planning, infrastructure monitoring and network design, among others. The analysis is conducted strictly from the Big Data perspective, focusing on contributions centered on techniques, tools and methods for modeling, processing, analyzing and visualizing transport and mobility Big Data.
From the literature review, a set of trends and challenges is extracted to provide researchers with an insightful outlook on the field of transport and mobility.

Item
Bio-inspired computation for big data fusion, storage, processing, learning and visualization: state of the art and future directions (2021-08-03)
Torre-Bastida, Ana I.; Díaz-de-Arcaya, Josu; Osaba, Eneko; Muhammad, Khan; Camacho, David; Del Ser, Javier; HPA; Quantum
This overview centers on research achievements that have recently emerged from the confluence of Big Data technologies and bio-inspired computation. A manifold of reasons can be identified for the profitable synergy between these two paradigms, all rooted in the adaptability, intelligence and robustness that biologically inspired principles can provide to technologies aimed at managing, retrieving, fusing and processing Big Data efficiently. We delve into this research field by first analyzing the existing literature in depth, with a focus on advances reported in the last few years. This prior literature analysis is complemented by an identification of the new trends and open challenges in Big Data that remain unsolved to date, and that can be effectively addressed by bio-inspired algorithms. As a second contribution, this work elaborates on how bio-inspired algorithms need to be adapted for use in a Big Data context, in which data fusion becomes crucial as a preliminary step that allows processing and mining several, potentially heterogeneous data sources. This analysis allows exploring and comparing the scope and efficiency of existing approaches across different problems and domains, with the purpose of identifying new potential applications and research niches.
Finally, this survey highlights open issues that remain unsolved to date in this research avenue, alongside a set of recommendations for future research.

Item
Design and implementation of an extended corporate CRM database system with big data analytical functionalities (2015-07-25)
Torre-Bastida, Ana I.; Villar-Rodriguez, Esther; Gil-Lopez, Sergio; Del Ser, Javier; HPA; Quantum; IA
The amount of open information available on-line from heterogeneous sources and domains is growing at an extremely fast pace, and constitutes an important knowledge base for the consideration of industries and companies. In this context, two relevant data providers can be highlighted: the “Linked Open Data” (LOD) and “Social Media” (SM) paradigms. The fusion of these data sources (the former structured, the latter raw), along with the information contained in structured corporate databases within the organizations themselves, may unveil significant business opportunities and competitive advantages for those who are able to understand and leverage their value. In this paper, we present two complementary use cases illustrating the potential of open data in the business domain. The first represents the creation of a knowledge base of existing and potential customers, exploiting social and linked open data, from which any given organization might infer valuable information to support decision making. The second focuses on the classification of organizations and enterprises, aiming to detect potential competitors and/or allies by analyzing the conceptual similarity between the projects in which they participate. To this end, a solution based on the synergy of Big Data and semantic technologies will be designed and developed.
The former will be used to implement the tasks of collection, data fusion and classification supported by natural language processing (NLP) techniques, whereas the latter will deal with semantic aggregation, persistence, reasoning and information retrieval, as well as with the triggering of alerts based on the semantized information.

Item
PADL: A Modeling and Deployment Language for Advanced Analytical Services (2020-11-24)
Díaz-de-Arcaya, Josu; Miñón, Raúl; Torre-Bastida, Ana I.; Del Ser, Javier; Almeida, Aitor; HPA; IA
In the smart city context, Big Data analytics plays an important role in processing the data collected through IoT devices. The analysis of the information gathered by sensors favors the generation of specific services and systems that not only improve the quality of life of citizens, but also optimize city resources. However, the difficulties of implementing this entire process in real scenarios are manifold, including the huge number and heterogeneity of the devices, their geographical distribution, and the complexity of the necessary IT infrastructures. For this reason, the main contribution of this paper is the PADL description language, which has been specifically tailored to assist in the definition and operationalization phases of the machine learning life cycle. It provides annotations that serve as an abstraction layer over the underlying infrastructure and technologies, hence facilitating the work of data scientists and engineers. Due to its proficiency in the operationalization of distributed pipelines over edge, fog, and cloud layers, it is particularly useful in the complex and heterogeneous environments of smart cities. For this purpose, PADL contains functionalities for the specification of monitoring, notification, and actuation capabilities. In addition, we provide tools that facilitate its adoption in production environments.
Finally, we showcase the usefulness of the language by defining PADL-compliant analytical pipelines for two use cases in a smart city context (flood control and waste management), demonstrating that its adoption is simple and beneficial for the definition of information and process flows in such environments.

Item
Pangea: An MLOps Tool for Automatically Generating Infrastructure and Deploying Analytic Pipelines in Edge, Fog and Cloud Layers (2022-06-11)
Miñón, Raúl; Diaz-de-Arcaya, Josu; Torre-Bastida, Ana I.; Hartlieb, Philipp; HPA
Development and operations (DevOps), artificial intelligence (AI), big data and edge–fog–cloud are disruptive technologies that may produce a radical transformation of the industry. Nevertheless, there are still major challenges to applying them efficiently in order to optimise productivity. Some of them are addressed in this article, specifically those concerning the adequate management of information technology (IT) infrastructures for automated analysis processes in critical fields such as the mining industry. In this area, this paper presents a tool called Pangea, aimed at automatically generating suitable execution environments for deploying analytic pipelines. These pipelines are decomposed into various steps so that each one is executed in the most suitable environment (edge, fog, cloud or on-premise), minimising latency and optimising the use of both hardware and software resources. Pangea is focused on three distinct objectives: (1) generating the required infrastructure if it does not previously exist; (2) provisioning it with the necessary requirements to run the pipelines (i.e., configuring each host operating system and software, installing dependencies and downloading the code to execute); and (3) deploying the pipelines.
To facilitate the use of the architecture, a representational state transfer application programming interface (REST API) is defined to interact with it. In turn, a web client is also proposed. Finally, it is worth noting that in addition to the production mode, a local development environment can be generated for testing and benchmarking purposes.

Item
Semantic Information Fusion of Linked Open Data and Social Big Data for the Creation of an Extended Corporate CRM Database (2015-01-01)
Torre-Bastida, Ana I.; Villar-Rodriguez, Esther; Del Ser, Javier; Gil-Lopez, Sergio; Tecnalia Research & Innovation; HPA; Quantum; IA
The amount of open information available on-line from heterogeneous sources and domains is growing at an extremely fast pace, and constitutes an important knowledge base for the consideration of industries and companies. In this context, two relevant data providers can be highlighted: the “Linked Open Data” and “Social Media” paradigms. The fusion of these data sources (the former structured, the latter raw), along with the information contained in structured corporate databases within the organizations themselves, may unveil significant business opportunities and competitive advantages for those who are able to understand and leverage their value. In this paper, we present a use case that represents the creation of a knowledge base of existing and potential customers, exploiting social and linked open data, from which any given organization might infer valuable information to support decision making. To achieve this, a solution based on the synergy of big data and semantic technologies will be designed and developed.
The former will be used to implement the tasks of collection and initial data fusion based on natural language processing techniques, whereas the latter will perform semantic aggregation, persistence, reasoning and retrieval of information, as well as the triggering of alerts over the semantized information.