Browsing by Keyword "Natural language processing"
Item
On the design and tuning of machine learning models for language toxicity classification in online platforms (Springer Verlag, 2018)
Rybinski, Maciej; Miller, William; Del Ser, Javier; Bilbao, Miren Nekane; Aldana-Montes, José F.; IA
One of the most concerning drawbacks derived from the lack of supervision in online platforms is their exploitation by misbehaving users to deliver offensive (toxic) messages while remaining anonymous. Given the huge volumes of data handled by these platforms, the detection of toxicity in exchanged comments and messages has naturally called for the adoption of machine learning models to automate this task. In the last few years, Deep Learning models and related techniques have played a major role in this regard due to their superior modeling capabilities, which have made them the prevailing choice in the related literature. By addressing a toxicity classification problem over a real dataset, this work aims to shed light on two aspects of this noted dominance of Deep Learning models: (1) an empirical assessment of their predictive gains with respect to traditional Shallow Learning models; and (2) the impact of using different text embedding methods and data augmentation techniques in this classification task. Our findings reveal that, in our case study, the application of non-optimized Shallow and Deep Learning models attains very competitive accuracy scores, thus leaving a narrow improvement margin for the fine-grained refinement of the models or the addition of data augmentation techniques.

Item
A survey on extremism analysis using natural language processing: definitions, literature review, trends and challenges (2022-12-12)
Torregrosa, Javier; Bello-Orgaz, Gema; Martínez-Cámara, Eugenio; Ser, Javier Del; Camacho, David; IA
Extremism has grown as a global problem for society in recent years, especially after the emergence of movements such as jihadism.
This and other extremist groups have taken advantage of different approaches, such as the use of Social Media, to spread their ideology, promote their acts and recruit followers. The extremist discourse, therefore, is reflected in the language used by these groups. Natural language processing (NLP) provides a way of detecting this type of content, and several authors make use of it to describe and discriminate the discourse held by these groups, with the final objective of detecting and preventing its spread. Following this approach, this survey aims to review the contributions of NLP to the field of extremism research, providing the reader with a comprehensive picture of the state of the art of this research area. The content includes a first conceptualization of the term extremism, the elements that compose an extremist discourse and the differences with other terms. After that, a review, description and comparison of the frequently used NLP techniques is presented, including how they were applied, the insights they provided, the most frequently used NLP software tools, descriptive and classification applications, and the availability of datasets and data sources for research. Finally, research questions are approached and answered with highlights from the review, while future trends, challenges and directions derived from these highlights are suggested to stimulate further research in this area.

Item
Uso combinado de tecnologías semánticas y análisis visual para la anotación automática de imágenes y su recuperación [Combined use of semantic technologies and visual analysis for automatic image annotation and retrieval] (2012-01-01)
Rodríguez-Vaamonde, Sergio; Ruiz-Ibáñez, Pilar; González-Rodríguez, Marta; Tecnalia Research & Innovation; SG; SG
The present stage of development of semantic systems used in the indexing and retrieval of non-text information on the internet is described.
The most effective approaches to retrieving non-text information combine computer vision with content analysis of the text associated with the images. These techniques can lead to the best results in the retrieval of relevant information. The Web's immediate future lies in automatic contextualisation of images to establish similarities between them and to be able to effectively retrieve non-text content.