Browsing by Author "Villar-Rodriguez, Esther"
Now showing 1 - 7 of 7
Item Advanced Machine Learning Techniques and Meta-Heuristic Optimization for the Detection of Masquerading Attacks in Social Networks (Universidad de Alcalá, 2015-12-11) Villar-Rodriguez, Esther; Del Ser, Javier; Salcedo-Sanz, Sancho
According to the report published by the online protection firm Iovation in 2012, cyber fraud ranged from 1 percent of Internet transactions in North America to 7 percent in Africa, most of it involving credit card fraud, identity theft, and account takeover or hacking attempts. This kind of crime is still growing due to the advantages offered by a non-face-to-face channel where an increasing number of unsuspecting victims divulge sensitive information. Interpol classifies these illegal activities into 3 types:
• Attacks against computer hardware and software.
• Financial crimes and corruption.
• Abuse, in the form of grooming or “sexploitation”.
Most research efforts have focused on the target of the crime, developing different strategies depending on the case at hand. Thus, for the well-known phishing, stored blacklists or crime signals in the text are employed, eventually yielding ad hoc detectors that hardly transfer to other scenarios even if the background is widely shared. Identity theft or masquerading can be described as a criminal activity oriented towards the misuse of stolen credentials to obtain goods or services by deception. On March 4, 2005, a million pieces of personal and sensitive information, such as credit card and social security numbers, were collected by White Hat hackers at Seattle University who just surfed the Web for less than 60 minutes by means of the Google search engine. They thereby demonstrated the vulnerability and lack of protection using a mere group of sophisticated search terms typed into the engine, whose large data warehouse still exposed temporarily cached data from company and government websites.
As mentioned above, platforms that connect distant people and in which the interaction is undirected offer an entry point for unauthorized third parties who impersonate the legitimate user in an attempt to go unnoticed while pursuing malicious, not necessarily economic, interests. In fact, the last point in the list above, abuse, has become a major and terrible risk along with bullying, both carried out by means of threats, harassment or even self-incrimination and likely to drive someone to suicide, depression or helplessness. California Penal Code Section 528.5 states: “Notwithstanding any other provision of law, any person who knowingly and without consent credibly impersonates another actual person through or on an Internet Web site or by other electronic means for purposes of harming, intimidating, threatening, or defrauding another person is guilty of a public offense punishable pursuant to subdivision [...]”. Therefore, impersonation consists of any criminal activity in which someone assumes a false identity and acts as his or her assumed character with intent to get a pecuniary benefit or cause some harm. User profiling, in turn, is the process of harvesting user information in order to construct a rich template with all the advantageous attributes in the field at hand and with specific purposes. User profiling is often employed as a mechanism for recommending items or useful information that the client has not yet considered. Nevertheless, deriving user tendencies or preferences can also be exploited to define the inherent behavior and address the problem of impersonation by detecting outliers or strange deviations that may signal a potential attack.
This dissertation is meant to elaborate on impersonation attacks from a profiling perspective, eventually developing a 2-stage environment which consequently embraces 2 levels of privacy intrusion, thus providing the following contributions:
• The inference of behavioral patterns from the connection time traces, aiming at avoiding the usurpation of more confidential information. When compared to previous approaches, this procedure abstains from impinging on user privacy, since it relies only on time statistics of the user sessions rather than on the content of the messages.
• The application and subsequent discussion of two selected algorithms to solve the previous point:
  – A commonly employed supervised algorithm executed as a binary classifier, which has forced us to devise a method to deal with the absence of labeled instances representing an identity theft.
  – A meta-heuristic algorithm that searches for the most convenient parameters to arrange the instances within a high-dimensional space into properly delimited clusters, so as to finally apply an unsupervised clustering algorithm.
• The analysis of message content, which encroaches on more private information but eases user identification by mining discriminative features through Natural Language Processing (NLP) techniques. As a consequence, the development of a new feature extraction algorithm based on linguistic theories, motivated by the massive quantity of features often gathered when it comes to texts.
In summary, this dissertation means to go beyond typical, ad hoc approaches adopted by previous identity theft and authorship attribution research. Specifically, it proposes tailored solutions to this particular and extensively studied paradigm with the aim of introducing a generic approach from a profiling view, not tightly bound to a unique application field.
In addition, technical contributions have been made in the course of the solution formulation, intended to optimize familiar methods for better versatility towards the problem at hand. In summary, this Thesis establishes an encouraging research basis towards unveiling subtle impersonation attacks in Social Networks by means of intelligent learning techniques.
Item Design and implementation of an extended corporate crm database system with big data analytical functionalities (2015-07-25) Torre-Bastida, Ana I.; Villar-Rodriguez, Esther; Gil-Lopez, Sergio; Del Ser, Javier; HPA; Quantum; IA
The amount of open information available on-line from heterogeneous sources and domains is growing at an extremely fast pace, and constitutes an important knowledge base for the consideration of industries and companies. In this context, two relevant data providers can be highlighted: the “Linked Open Data” (LOD) and “Social Media” (SM) paradigms. The fusion of these data sources (structured the former, and raw data the latter), along with the information contained in structured corporate databases within the organizations themselves, may unveil significant business opportunities and competitive advantage to those who are able to understand and leverage their value. In this paper, we present two complementary use cases illustrating the potential of using open data in the business domain. The first represents the creation of an existing and potential customer knowledge base, exploiting social and linked open data, from which any given organization might infer valuable information to support decision making. The second focuses on the classification of organizations and enterprises, aiming at detecting potential competitors and/or allies via the analysis of the conceptual similarity between the projects in which they participate. To this end, a solution based on the synergy of Big Data and semantic technologies will be designed and developed.
The former will be used to implement the tasks of collection, data fusion and classification supported by natural language processing (NLP) techniques, whereas the latter will deal with semantic aggregation, persistence, reasoning and information retrieval, as well as with the triggering of alerts based on the semantized information.
Item Digital Quantum Simulation and Circuit Learning for the Generation of Coherent States (2022-10-25) Liu, Ruilin; V. Romero, Sebastián; Oregi, Izaskun; Osaba, Eneko; Villar-Rodriguez, Esther; Ban, Yue; Tecnalia Research & Innovation; Quantum
Coherent states, known as displaced vacuum states, play an important role in quantum information processing, quantum machine learning, and quantum optics. In this article, two ways to digitally prepare coherent states in quantum circuits are introduced. First, we construct the displacement operator by decomposing it into Pauli matrices via ladder operators, i.e., creation and annihilation operators. The high fidelity of the digitally generated coherent states is verified by comparison with the Poissonian distribution in Fock space. Second, by using Variational Quantum Algorithms, we choose different ansatzes to generate coherent states. The quantum resources (such as numbers of quantum gates, layers and iterations) are analyzed for quantum circuit learning.
The simulation results show that quantum circuit learning can achieve high fidelity in learning coherent states when appropriate ansatzes are chosen.
Item A feature selection method for author identification in interactive communications based on supervised learning and language typicality (2016-11-01) Villar-Rodriguez, Esther; Del Ser, Javier; Bilbao, Miren Nekane; Salcedo-Sanz, Sancho; Tecnalia Research & Innovation; Quantum; IA
Authorship attribution, conceived as the identification of the origin of a text among different authors, has been a very active area of research in the scientific community, mainly supported by advances in Natural Language Processing (NLP), machine learning and Computational Intelligence. This paradigm has been mostly addressed from a literary perspective, aiming at identifying the stylometric features and writeprints which unequivocally typify the writer patterns and allow their unique identification. On the other hand, the upsurge of social networking platforms and interactive messaging has undoubtedly made the anonymous expression of feelings, the sharing of experiences and social relationships much easier than in other traditional communication media. Unfortunately, the popularity of such communities and the virtual identification of their users provide a rich substrate for cybercrimes against unsuspecting victims and other forms of illegal uses of social networks that call for the activity tracing of accounts. In the context of one-to-one communications, this manuscript postulates the identification of the sender of a message as a useful approach to detect impersonation attacks in interactive communication scenarios. In particular, this work proposes to select linguistic features extracted from messages via NLP techniques by means of a novel feature selection algorithm based on the dissociation between essential traits of the sender and receiver influences.
The performance and computational efficiency of different supervised learning models when incorporating the proposed feature selection method are shown to be promising with real SMS data in terms of identification accuracy, and pave the way towards future research lines focused on applying the concept of language typicality in the discourse analysis field.
Item On a Machine Learning Approach for the Detection of Impersonation Attacks in Social Networks (2015) Villar-Rodriguez, Esther; Del Ser, Javier; Salcedo-Sanz, Sancho; Tecnalia Research & Innovation; Quantum; IA
Lately the proliferation of social networks has given rise to a myriad of fraudulent strategies aimed at getting some sort of benefit from the attacked individual. Despite most of them being exclusively driven by economic interests, the so-called impersonation, masquerading attack or identity fraud hinges on stealing the credentials of the victim and assuming his/her identity to get access to resources (e.g. relationships or confidential information), credit and other benefits in that person’s name. While this problem is getting particularly frequent within the teenage community, the reality is that very few technological approaches have been proposed in the literature to address this issue which, if not detected in time, may trigger other fatal consequences for the impersonated person such as bullying and intimidation. In this context, this paper delves into a machine learning approach that makes it possible to efficiently detect this kind of attack by relying solely on connection time information of the potential victim.
The manuscript will demonstrate how these learning algorithms (in particular, support vector classifiers) can be of great help to understand and detect impersonation attacks without compromising the privacy of social network users.
Item Semantic Information Fusion of Linked Open Data and Social Big Data for the Creation of an Extended Corporate CRM Database (2015-01-01) Torre-Bastida, Ana I.; Villar-Rodriguez, Esther; Del Ser, Javier; Gil-Lopez, Sergio; Tecnalia Research & Innovation; HPA; Quantum; IA
The amount of open information available on-line from heterogeneous sources and domains is growing at an extremely fast pace, and constitutes an important knowledge base for the consideration of industries and companies. In this context, two relevant data providers can be highlighted: the “Linked Open Data” and “Social Media” paradigms. The fusion of these data sources (structured the former, and raw data the latter), along with the information contained in structured corporate databases within the organizations themselves, may unveil significant business opportunities and competitive advantage to those who are able to understand and leverage their value. In this paper, we present a use case that represents the creation of an existing and potential customer knowledge base, exploiting social and linked open data, from which any given organization might infer valuable information to support decision making. In order to achieve this, a solution based on the synergy of big data and semantic technologies will be designed and developed.
The former will be used to implement the tasks of collection and initial data fusion based on natural language processing techniques, whereas the latter will perform semantic aggregation, persistence, reasoning and retrieval of information, as well as the triggering of alerts over the semantized information.
Item A Systematic Literature Review of Quantum Computing for Routing Problems (2022-05) Osaba, Eneko; Villar-Rodriguez, Esther; Oregi, Izaskun; Tecnalia Research & Innovation; Quantum
Quantum Computing is drawing significant attention from the current scientific community. The potential advantages offered by this revolutionary paradigm have led to an upsurge of scientific production in different fields such as economics, industry, or logistics. The main purpose of this paper is to collect, organize and systematically examine the literature published so far on the application of Quantum Computing to routing problems. To do this, we embrace the well-established procedure known as Systematic Literature Review. Specifically, we provide a unified, self-contained, and end-to-end review of 18 years of research (from 2004 to 2021) at the intersection of Quantum Computing and routing problems through the analysis of 53 different papers. Several interesting conclusions have been drawn from this analysis, which has been formulated to give a comprehensive summary of the current state of the art by providing answers related to the most recurrent type of study (practical or theoretical), preferred solving approaches (dedicated or hybrid), detected open challenges or most used Quantum Computing device, among others.