Browsing by Author "Baik, Sung Wook"
Now showing 1 - 7 of 7
Item: Activity Recognition Using Temporal Optical Flow Convolutional Features and Multilayer LSTM (2019-12)
Ullah, Amin; Muhammad, Khan; Del Ser, Javier; Baik, Sung Wook; De Albuquerque, Victor Hugo C.

Nowadays, digital surveillance systems are universally installed, continuously collecting enormous amounts of data and thereby requiring human monitoring to identify different activities and events. Smarter surveillance is needed, in which normal and abnormal activities are identified automatically using artificial intelligence and computer vision technology. In this paper, we propose a framework for activity recognition in surveillance videos captured over industrial systems. The continuous surveillance video stream is first divided into important shots, where shots are selected using the proposed convolutional neural network (CNN) based human saliency features. Next, temporal features of an activity in the sequence of frames are extracted by utilizing the convolutional layers of a FlowNet2 CNN model. Finally, a multilayer long short-term memory (LSTM) network is presented for learning long-term sequences in the temporal optical flow features for activity recognition. Experiments (code: https://github.com/Aminullah6264/Activity-Rec-ML-LSTM) are conducted using different benchmark action and activity recognition datasets, and the results reveal the effectiveness of the proposed method for activity recognition in industrial settings compared with state-of-the-art methods.

Item: Artificial Intelligence of Things-assisted two-stream neural network for anomaly detection in surveillance Big Video Data (2022-04)
Ullah, Waseem; Ullah, Amin; Hussain, Tanveer; Muhammad, Khan; Heidari, Ali Asghar; Del Ser, Javier; Baik, Sung Wook; De Albuquerque, Victor Hugo C.

In the last few years, visual sensors have been deployed almost everywhere, generating a massive amount of surveillance video data in smart cities that can be inspected intelligently to recognize anomalous events. In this work, we present an efficient and robust framework to recognize anomalies from surveillance Big Video Data (BVD) using the Artificial Intelligence of Things (AIoT). Smart surveillance is an important application of AIoT, and we propose a two-stream neural network in this direction. The first stream performs instant anomaly detection and is functional over resource-constrained IoT devices, whereas the second is a two-stream deep neural network allowing for detailed anomaly analysis, suited for deployment as a cloud computing service. First, a self-pruned, fine-tuned, lightweight convolutional neural network (CNN) classifies ongoing events as normal or anomalous in an AIoT environment. Upon anomaly detection, the edge device alerts the concerned departments and transmits the anomalous frames to the cloud analysis center for detailed evaluation in the second phase. The cloud analysis center resorts to the proposed two-stream network, modeled from the integration of spatiotemporal and optical flow features through the sequential frames. The fused features flow through a bi-directional long short-term memory (BD-LSTM) layer, which classifies them into their respective anomaly classes, e.g., assault and abuse. We perform extensive experiments over benchmarks built on top of the UCF-Crime and RWF-2000 datasets to test the effectiveness of our framework, and report a 9.88% and 4.01% increase in accuracy compared to state-of-the-art methods evaluated over those datasets.
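As a concrete illustration of the first item above, the sketch below shows its final stage in PyTorch: a multilayer LSTM classifying an activity from a sequence of per-frame convolutional features. This is not the authors' code (their repository is linked above); the feature dimension, hidden size, and class count are illustrative assumptions.

```python
import torch
import torch.nn as nn

class MultiLayerLSTMClassifier(nn.Module):
    """Classify an activity from a sequence of per-frame CNN features."""
    def __init__(self, feat_dim=1024, hidden=512, layers=2, num_classes=12):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, num_layers=layers, batch_first=True)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, x):              # x: (batch, time, feat_dim)
        out, _ = self.lstm(x)          # hidden states for every time step
        return self.head(out[:, -1])   # classify from the last time step

# e.g., 30 frames of FlowNet2-style features (all dimensions assumed)
logits = MultiLayerLSTMClassifier()(torch.randn(4, 30, 1024))  # -> (4, 12)
```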
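Likewise, the two-phase edge/cloud flow of the AIoT anomaly-detection item can be sketched as follows. The tiny edge model is only a stand-in for the self-pruned lightweight CNN, and send_to_cloud is a hypothetical transport stub, not part of the paper.

```python
import torch
import torch.nn as nn

edge_net = nn.Sequential(              # stand-in for the pruned lightweight CNN
    nn.Conv2d(3, 8, 3, stride=2), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 2))

def send_to_cloud(frame):              # hypothetical transport stub
    print("anomalous frame forwarded to the cloud analysis center")

def edge_stage(frame):                 # frame: (1, 3, H, W)
    """Phase one: flag the frame on-device; forward anomalies to the cloud."""
    with torch.no_grad():
        is_anomalous = edge_net(frame).argmax(dim=1).item() == 1
    if is_anomalous:
        send_to_cloud(frame)  # phase two would run the BD-LSTM anomaly typing
    return is_anomalous

edge_stage(torch.randn(1, 3, 224, 224))
```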
Item: Intelligent Embedded Vision for Summarization of Multiview Videos in IIoT (2020-04)
Hussain, Tanveer; Muhammad, Khan; Ser, Javier Del; Baik, Sung Wook; De Albuquerque, Victor Hugo C.

Nowadays, video sensors are used on a large scale for various applications, including security monitoring and smart transportation. However, limited communication bandwidth and storage constraints make it challenging to process such heterogeneous Big Data in real time. Multiview video summarization (MVS) enables us to suppress redundant data in distributed video sensor settings. Existing MVS approaches process video data offline by transmitting them to a local or cloud server for analysis; this requires extra streaming and huge bandwidth, and is not suitable for integration with the industrial Internet of Things (IIoT). This article presents a lightweight convolutional neural network (CNN) and IIoT-based computationally intelligent (CI) MVS framework. Our method uses an IIoT network containing smart devices, namely Raspberry Pi (RPi) clients and a master, with embedded cameras to capture multiview video data. Each client RPi detects targets in frames via a lightweight CNN model, analyzes these targets for traffic and crowd density, and searches for suspicious objects to generate alerts in the IIoT network. The frames of each client RPi are encoded and transmitted to the master RPi for final MVS at an approximately 17.02% smaller size per frame. Empirical analysis shows that our proposed framework can be used in industrial environments for various applications, such as security and smart transportation, and can help save resources. (Code: https://github.com/tanveer-hussain/Embedded-Vision-for-MVS)

Item: Modelling Electricity Consumption During the COVID19 Pandemic: Datasets, Models, Results and a Research Agenda (2023-09-01)
Khan, Zulfiqar Ahmad; Hussain, Tanveer; Ullah, Amin; Ullah, Waseem; Del Ser, Javier; Muhammad, Khan; Sajjad, Muhammad; Baik, Sung Wook

The COVID19 pandemic has impacted the global economy, social activities, and Electricity Consumption (EC), affecting the performance of historical-data-based Electricity Load Forecasting (ELF) algorithms. This study thoroughly analyses the pandemic's impact on these models and develops a hybrid model with better prediction accuracy using COVID19 data. Existing datasets are reviewed, and their limited generalization potential for the COVID19 period is highlighted. A dataset of 96 residential customers, comprising 36 months before and six months after the onset of the pandemic, is collected, posing significant challenges for current models. The proposed model employs convolutional layers for feature extraction, gated recurrent nets for temporal feature learning, and a self-attention module for feature selection, leading to better generalization when predicting EC patterns. Our proposed model outperforms existing models, as demonstrated by a detailed ablation study on our dataset. For instance, it achieves an average reduction of 0.56% and 3.46% in MSE, 1.5% and 5.07% in RMSE, and 11.81% and 13.19% in MAPE over the pre- and post-pandemic data, respectively. However, further research is required to address the varied nature of the data. These findings have significant implications for improving ELF algorithms during pandemics and other significant events that disrupt historical data patterns.
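A minimal sketch of one client-RPi step from the IIoT summarization item above: detect targets, raise an alert on suspicious objects, and JPEG-compress the frame for the master device. The detector is a stub and the suspicious-object labels are assumptions, not the paper's implementation.

```python
import cv2
import numpy as np

def detect_targets(frame):
    """Stand-in for the lightweight CNN detector running on a client RPi."""
    return [{"label": "person", "box": (10, 10, 50, 80)}]

def client_step(frame, suspicious=("knife", "gun")):
    targets = detect_targets(frame)
    if any(t["label"] in suspicious for t in targets):
        print("ALERT: suspicious object, notifying the IIoT network")
    # JPEG-encode the frame before sending it to the master RPi
    ok, buf = cv2.imencode(".jpg", frame, [int(cv2.IMWRITE_JPEG_QUALITY), 80])
    return buf.tobytes() if ok else None

payload = client_step(np.zeros((240, 320, 3), dtype=np.uint8))
```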
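The electricity-forecasting item's convolution, gated-recurrent, and self-attention pipeline might look roughly like this in PyTorch; all layer sizes and the hourly input length are assumptions, not values from the paper.

```python
import torch
import torch.nn as nn

class ECForecaster(nn.Module):
    """Conv feature extraction -> GRU temporal learning -> self-attention."""
    def __init__(self, hidden=64):
        super().__init__()
        self.conv = nn.Conv1d(1, 32, kernel_size=3, padding=1)
        self.gru = nn.GRU(32, hidden, batch_first=True)
        self.attn = nn.MultiheadAttention(hidden, num_heads=4, batch_first=True)
        self.out = nn.Linear(hidden, 1)

    def forward(self, x):                          # x: (batch, time, 1)
        h = torch.relu(self.conv(x.transpose(1, 2))).transpose(1, 2)
        h, _ = self.gru(h)                         # temporal features
        h, _ = self.attn(h, h, h)                  # attend over time steps
        return self.out(h[:, -1])                  # next-step consumption

pred = ECForecaster()(torch.randn(8, 168, 1))      # a week of hourly readings
```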
Item: Multiview Summarization and Activity Recognition Meet Edge Computing in IoT Environments (2021-06-15)
Hussain, Tanveer; Muhammad, Khan; Ullah, Amin; Ser, Javier Del; Gandomi, Amir H.; Sajjad, Muhammad; Baik, Sung Wook; De Albuquerque, Victor Hugo C.

Multiview video summarization (MVS) has not received much attention from the research community, owing to inter-view correlations, overlap between views, and similar challenges. The majority of previous MVS works are offline, rely only on the generated summary, and require additional communication bandwidth and transmission time, with no focus on foggy environments. We propose an edge-intelligence-based MVS and activity recognition framework that combines artificial intelligence with Internet of Things (IoT) devices. In our framework, resource-constrained devices with cameras use a lightweight CNN-based object detection model to segment multiview videos into shots, followed by mutual information computation that helps in summary generation. Our system does not rely solely on the summary: it encodes and transmits it to a master device equipped with a neural computing stick for inter-view correlation computation and efficient activity recognition, an approach which saves computational resources, communication bandwidth, and transmission time. Experiments show an increase of 0.4 units in F-measure on an MVS Office dataset and 0.2% and 2% improved accuracy for the UCF-50 and YouTube 11 datasets, respectively, with lower storage and transmission times. The processing time is reduced from 1.23 s to 0.45 s for a single frame, and MVS is up to 0.75 s faster. A new dataset is constructed by synthetically adding fog to an MVS dataset to show the adaptability of our system to both certain and uncertain IoT surveillance environments.

Item: QuickLook: Movie summarization using scene-based leading characters with psychological cues fusion (2021-12)
Haq, Ijaz Ul; Muhammad, Khan; Hussain, Tanveer; Ser, Javier Del; Sajjad, Muhammad; Baik, Sung Wook

Due to recent advances in the film industry, the production of movies has grown exponentially, which has led to challenges in what is referred to as discoverability: given the overwhelming number of choices, choosing which film to watch has become a tedious task for audiences. Movie summarization (MS) could help, as it presents the central theme of a movie in a compact format and makes browsing more efficient for the audience. In this paper, we present an automatic MS framework coined 'QuickLook', which identifies the leading characters and fuses multiple cues extracted from a movie. First, the movie data is preprocessed and divided into scenes, followed by shot segmentation. Second, the leading characters in each segmented scene are determined. Next, four visual cues that capture the film's scenic beauty, memorability, informativeness, and emotional resonance are extracted from shots containing the leading characters. These extracted features are then intelligently fused based on the assignment of different weights; shots with a fusion score above a certain threshold are selected for the final summary. The proposed MS framework is assessed by comparison with official trailers from ten Hollywood movies, providing a novel baseline for future fair comparison in the MS literature. The proposed framework is shown to outperform other state-of-the-art MS methods in terms of enjoyability and informativeness.
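The mutual-information computation that feeds summary generation in the edge-computing item above can be illustrated with the standard histogram-based formulation; this is a generic version, not necessarily the paper's exact computation.

```python
import numpy as np

def mutual_information(a, b, bins=32):
    """Histogram-based mutual information between two frames' intensities."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    nz = pxy > 0                                   # avoid log(0)
    return float((pxy[nz] * np.log(pxy[nz] / (px[:, None] * py[None, :])[nz])).sum())

f1 = np.random.randint(0, 256, (240, 320))
f2 = np.random.randint(0, 256, (240, 320))
print(mutual_information(f1, f2))  # low MI across views suggests novel content
```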
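QuickLook's weighted cue fusion and thresholding can be sketched in a few lines; the weights and threshold below are illustrative assumptions, not values from the paper.

```python
def fuse_cues(shots, weights=(0.3, 0.2, 0.25, 0.25), threshold=0.6):
    """Select shots whose weighted fusion of cue scores clears a threshold."""
    summary = []
    for shot_id, cues in shots:
        # cues: (scenic beauty, memorability, informativeness, emotion) in [0, 1]
        score = sum(w * c for w, c in zip(weights, cues))
        if score >= threshold:
            summary.append(shot_id)
    return summary

shots = [(0, (0.9, 0.7, 0.8, 0.6)), (1, (0.2, 0.3, 0.1, 0.4))]
print(fuse_cues(shots))  # -> [0]
```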
Item: Visual Appearance and Soft Biometrics Fusion for Person Re-Identification Using Deep Learning (2023-05-01)
Khan, Samee Ullah; Khan, Noman; Hussain, Tanveer; Muhammad, Khan; Hijji, Mohammad; Del Ser, Javier; Baik, Sung Wook

Learning descriptions of individual pedestrians is a common goal of both person re-identification (P-ReID) and attribute recognition methods, which are typically differentiated only in terms of their granularity. However, existing P-ReID methods consider only identification labels for individual pedestrians. In this article, we present a multi-scale pyramid attention (MSPA) model for P-ReID that jointly exploits the complementarity between semantic attributes and visual appearance to address this limitation. The proposed MSPA method comprises three main steps. Initially, a backbone model followed by appearance and attribute networks is individually trained to perform the P-ReID and pedestrian attribute classification tasks. The attribute network primarily focuses on suppressed image areas associated with soft biometric data while retaining the semantic context among attributes using a convolutional long short-term memory architecture. Additionally, the identification network extracts rich contextual features from an image at varying scales using a residual pyramid module. In the second step, the dual-network features are fused, and MSPA is re-trained for the P-ReID task to further improve its complementary capabilities. Finally, we experimentally evaluated the proposed model on the two benchmark datasets Market-1501 and DukeMTMC-reID; the results show that our approach achieves state-of-the-art performance.
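Finally, the fusion step of the MSPA model, combining appearance- and attribute-branch embeddings before the re-identification head, might be sketched as below. Layer sizes are assumptions (751 matches Market-1501's training identity count), and this is an illustration rather than the authors' architecture.

```python
import torch
import torch.nn as nn

class FusionReID(nn.Module):
    """Fuse appearance and attribute embeddings for identification."""
    def __init__(self, app_dim=2048, attr_dim=512, num_ids=751):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Linear(app_dim + attr_dim, 1024),
            nn.BatchNorm1d(1024), nn.ReLU())
        self.id_head = nn.Linear(1024, num_ids)    # identity logits

    def forward(self, app_feat, attr_feat):
        fused = self.fuse(torch.cat([app_feat, attr_feat], dim=1))
        return self.id_head(fused), fused          # logits + embedding for matching

logits, emb = FusionReID()(torch.randn(4, 2048), torch.randn(4, 512))
```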