Browsing by Author "Hussain, Tanveer"
Now showing 1 - 11 of 11
Item Artificial Intelligence of Things-assisted two-stream neural network for anomaly detection in surveillance Big Video Data (2022-04) Ullah, Waseem; Ullah, Amin; Hussain, Tanveer; Muhammad, Khan; Heidari, Ali Asghar; Del Ser, Javier; Baik, Sung Wook; De Albuquerque, Victor Hugo C.
In the last few years, visual sensors have been deployed almost everywhere, generating a massive amount of surveillance video data in smart cities that can be inspected intelligently to recognize anomalous events. In this work, we present an efficient and robust framework to recognize anomalies in surveillance Big Video Data (BVD) using the Artificial Intelligence of Things (AIoT). Smart surveillance is an important application of AIoT, and we propose a two-stream neural network in this direction. The first stream performs instant anomaly detection and is functional on resource-constrained IoT devices, whereas the second is a two-stream deep neural network allowing detailed anomaly analysis, suited for deployment as a cloud computing service. First, a self-pruned, fine-tuned, lightweight convolutional neural network (CNN) classifies ongoing events as normal or anomalous in the AIoT environment. Upon anomaly detection, the edge device alerts the concerned departments and transmits the anomalous frames to a cloud analysis center for detailed evaluation in the second phase. The cloud analysis center resorts to the proposed two-stream network, which integrates spatiotemporal and optical-flow features across sequential frames. The fused features flow through a bi-directional long short-term memory (BD-LSTM) layer, which classifies them into their respective anomaly classes, e.g., assault and abuse. We perform extensive experiments over benchmarks built on top of the UCF-Crime and RWF-2000 datasets to test the effectiveness of our framework.
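The two-stage edge/cloud flow described above can be sketched as follows. The classifiers here are hypothetical stubs standing in for the self-pruned lightweight CNN (edge) and the BD-LSTM two-stream network (cloud), and the `motion_score` field is an invented placeholder for real frame features.

```python
# Hypothetical sketch of the two-stage AIoT anomaly pipeline; both
# classifiers are stubs, not the paper's trained models.

ANOMALY_CLASSES = ["assault", "abuse"]  # example classes from the abstract

def edge_classify(frame):
    """Stage 1 (edge): binary normal/anomalous decision on a single frame."""
    # Stub: treat frames with a high motion score as anomalous.
    return "anomalous" if frame["motion_score"] > 0.5 else "normal"

def cloud_classify(frames):
    """Stage 2 (cloud): assign an anomaly class to the transmitted frames."""
    # Stub: pick a class from the frames' average score.
    avg = sum(f["motion_score"] for f in frames) / len(frames)
    return ANOMALY_CLASSES[0] if avg > 0.8 else ANOMALY_CLASSES[1]

def surveillance_pipeline(stream):
    """Run edge screening; only anomalous frames reach the cloud stage."""
    suspicious = [f for f in stream if edge_classify(f) == "anomalous"]
    if not suspicious:
        return {"alert": False, "label": None}
    # The edge device would alert the concerned department here,
    # then transmit the suspicious frames for detailed analysis.
    return {"alert": True, "label": cloud_classify(suspicious)}

result = surveillance_pipeline([{"motion_score": 0.2}, {"motion_score": 0.9}])
print(result)  # {'alert': True, 'label': 'assault'}
```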
We report a 9.88% and 4.01% increase in accuracy compared to state-of-the-art methods evaluated on the aforementioned datasets.

Item Communication Technologies for Edge Learning and Inference: A Novel Framework, Open Issues, and Perspectives (2023-03-01) Muhammad, Khan; Ser, Javier Del; Magaia, Naercio; Fonseca, Ramon; Hussain, Tanveer; Gandomi, Amir H.; Daneshmand, Mahmoud; De Albuquerque, Victor Hugo C.
With the continuous advancement of smart devices and their demand for data, the complex computation that was previously exclusive to the cloud server is now moving toward the edge of the network. For numerous reasons (e.g., applications demanding low latency and data privacy), data-based computation has been brought closer to its originating source, forging the edge computing paradigm. Together with machine learning, edge computing has become a powerful local decision-making tool, fostering the advent of edge learning. However, the latter has become delay-sensitive and resource-thirsty in terms of hardware and networking. New methods have been developed to solve or minimize these issues, as proposed in this study. We first investigate representative communication methods for edge learning and inference (ELI), focusing on data compression, latency, and resource management. Next, we propose an ELI-based video data prioritization framework that only considers data with events and hence significantly reduces transmission and storage resources when implemented in surveillance networks. Furthermore, we critically examine various communication aspects related to edge learning by analyzing their issues and highlighting their advantages and disadvantages.
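The event-based prioritization idea above, keeping only frames that contain events so that transmission and storage shrink, can be sketched minimally; the event detector and threshold are illustrative assumptions, not the paper's components.

```python
# Illustrative sketch (not the paper's implementation) of event-based
# video data prioritization at the edge: only frames flagged as
# containing an event are kept for transmission.

def has_event(frame_score, threshold=0.6):
    # Stub for an edge-side event detector; `frame_score` is assumed
    # to be an event likelihood in [0, 1] from a lightweight model.
    return frame_score >= threshold

def prioritize(scores):
    """Return the frame indices worth transmitting and the saving ratio."""
    keep = [i for i, s in enumerate(scores) if has_event(s)]
    saving = 1.0 - len(keep) / len(scores)
    return keep, saving

keep, saving = prioritize([0.1, 0.2, 0.9, 0.05, 0.7, 0.3])
print(keep)  # [2, 4] -- only a third of the frames are transmitted
```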
Finally, we discuss the challenges and the issues that remain open.

Item DeepReS: A Deep Learning-Based Video Summarization Strategy for Resource-Constrained Industrial Surveillance Scenarios (2020-09) Muhammad, Khan; Hussain, Tanveer; Del Ser, Javier; Palade, Vasile; De Albuquerque, Victor Hugo C.
The exponential growth in the production of video content in different industries creates an urgent need for effective video summarization (VS) techniques that ensure optimal storage and preservation of the key information in a video. Compared to other domains, industrial videos are more challenging to process, as they usually contain diverse and complex events, which makes their online processing a difficult task. In this article, we introduce an online system for intelligent video capturing, coarse and fine redundancy removal, and summary generation. First, we capture video data through resource-constrained devices equipped with vision sensors in an industrial Internet of Things network and apply coarse redundancy removal through the comparison of low-level features. Second, we transmit the resulting frames to the cloud for detailed analysis, where sequential features are extracted for the selection of candidate keyframes. Finally, we refine the candidate keyframes to discriminate those with maximum information as part of the summary. The key contributions of this article include the coarse and fine refining of video data implemented over resource-restricted devices and the presentation of the important data in the form of a summary. Experiments over publicly available datasets (code available online: https://github.com/tanveer-hussain/DeepRes-Video-Summarization) evince a 0.3-unit increase in the F1 score compared to the state of the art, with reduced time complexity.
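A minimal sketch of coarse redundancy removal through low-level feature comparison, as described above: consecutive frames whose normalized-histogram difference stays below a threshold are treated as redundant and dropped. The histogram representation and threshold are illustrative assumptions, not DeepReS internals.

```python
# Coarse redundancy removal sketch: drop frames that are too similar
# (in low-level feature space) to the last kept frame.

def l1_distance(h1, h2):
    """L1 distance between two normalized feature histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

def coarse_filter(histograms, threshold=0.25):
    """Keep a frame only if it differs enough from the last kept frame."""
    kept = [0]  # always keep the first frame
    for i in range(1, len(histograms)):
        if l1_distance(histograms[i], histograms[kept[-1]]) > threshold:
            kept.append(i)
    return kept

# Four toy two-bin histograms: frames 0/1 are near-duplicates, as are 2/3.
frames = [[0.5, 0.5], [0.52, 0.48], [0.9, 0.1], [0.88, 0.12]]
print(coarse_filter(frames))  # [0, 2]
```

The frames surviving this coarse pass would then be transmitted to the cloud for the fine, feature-based keyframe selection.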
Furthermore, we provide convincing results on our newly created dataset, recorded in an industrial environment, which is made publicly available to the research community along with its labeled ground truth.

Item DeepSmoke: Deep learning model for smoke detection and segmentation in outdoor environments (2021-11-15) Khan, Salman; Muhammad, Khan; Hussain, Tanveer; Ser, Javier Del; Cuzzolin, Fabio; Bhattacharyya, Siddhartha; Akhtar, Zahid; de Albuquerque, Victor Hugo C.
Fire disasters throughout the globe cause social, environmental, and economic damage, making early detection and instant reporting essential for saving human lives and property. Smoke detection plays a key role in early fire detection, but the majority of existing methods are limited to either indoor or outdoor surveillance environments, with poor performance in hazy scenarios. In this paper, we present a Convolutional Neural Network (CNN)-based smoke detection and segmentation framework for both clear and hazy environments. Unlike existing methods, we employ an efficient CNN architecture, termed EfficientNet, for smoke detection with better accuracy. We also segment the smoke regions using DeepLabv3+, which is supported by effective encoders and decoders along with a pixel-wise classifier for optimum localization. Our smoke detection results evince a noticeable gain of up to 3% in accuracy and a decrease of 0.46% in False Alarm Rate (FAR), while segmentation reports significant increases of 2% and 1% in global accuracy and mean Intersection over Union (IoU) scores, respectively. This makes our method well suited for smoke detection and segmentation in real-world surveillance settings.

Item Fuzzy Logic in Surveillance Big Video Data Analysis (2021-06) Muhammad, Khan; Obaidat, Mohammad S.; Hussain, Tanveer; Ser, Javier Del; Kumar, Neeraj; Tanveer, Mohammad; Doctor, Faiyaz
CCTV cameras installed for continuous surveillance generate enormous amounts of data daily, forging the term Big Video Data (BVD).
The active practice of BVD includes intelligent surveillance and activity recognition, among other challenging tasks. To efficiently address these tasks, the computer vision research community has provided monitoring systems, activity recognition methods, and many other computationally complex solutions for the purposeful usage of BVD. Unfortunately, the limited capabilities of these methods, their higher computational complexity, and their stringent installation requirements hinder their practical implementation in real-world scenarios, which still demand human operators sitting in front of cameras to monitor activities or make actionable decisions based on BVD. Human-like reasoning, known as fuzzy logic, has been increasingly employed in emerging data science applications such as control systems, image processing, decision making, routing, and advanced safety-critical systems. This is due to its ability to handle various sources of real-world domain and data uncertainty, generating easily adaptable and explainable data-based models. Fuzzy logic can be effectively used for surveillance as a complement to huge artificial intelligence models and their tiresome training procedures. In this article, we draw researchers' attention toward the usage of fuzzy logic for surveillance in the context of BVD. We carry out a comprehensive literature survey of methods for vision sensory data analytics that resort to fuzzy logic concepts. Our overview highlights the advantages, downsides, and challenges of existing fuzzy logic-based video analysis methods for surveillance applications.
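As a toy illustration of the fuzzy-logic idea discussed above, a crisp crowd-density value can be mapped to graded memberships in overlapping sets instead of a hard threshold; the set names and breakpoints below are invented for the example, not taken from any surveyed method.

```python
# Toy fuzzy memberships for a surveillance "crowd density" variable.
# Breakpoints are illustrative assumptions.

def triangular(x, a, b, c):
    """Triangular membership function peaking at b, zero outside (a, c)."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def crowd_memberships(density):
    """Map a crisp density in [0, 1] to graded set memberships."""
    return {
        "sparse": triangular(density, -0.5, 0.0, 0.5),
        "moderate": triangular(density, 0.2, 0.5, 0.8),
        "crowded": triangular(density, 0.5, 1.0, 1.5),
    }

# A density of 0.6 is partly "moderate" and partly "crowded" -- a graded,
# explainable description rather than a single hard label.
print(crowd_memberships(0.6))
```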
We enumerate and discuss the datasets used by these methods, and finally provide an outlook toward future research directions derived from our critical assessment of the efforts invested so far in this exciting field.

Item Intelligent Embedded Vision for Summarization of Multiview Videos in IIoT (2020-04) Hussain, Tanveer; Muhammad, Khan; Ser, Javier Del; Baik, Sung Wook; De Albuquerque, Victor Hugo C.
Nowadays, video sensors are used on a large scale for various applications, including security monitoring and smart transportation. However, limited communication bandwidth and storage constraints make it challenging to process such heterogeneous Big Data in real time. Multiview video summarization (MVS) enables us to suppress redundant data in distributed video sensor settings. The existing MVS approaches process video data in an offline manner, transmitting it to a local or cloud server for analysis; they require extra streaming to conduct summarization, demand huge bandwidth, and are not applicable to integration with the industrial Internet of Things (IIoT). This article presents a lightweight convolutional neural network (CNN) and IIoT-based computationally intelligent (CI) MVS framework. Our method uses an IIoT network containing smart devices, namely Raspberry Pis (RPis, as clients and a master) with embedded cameras, to capture multiview video data. Each client RPi detects targets in frames via a lightweight CNN model, analyzes these targets for traffic and crowd density, and searches for suspicious objects to generate alerts in the IIoT network. The frames of each client RPi are encoded, reducing the size of each frame by approximately 17.02%, and transmitted to the master RPi for final MVS. Empirical analysis shows that our proposed framework can be used in industrial environments for various applications such as security and smart transportation, and can prove beneficial for saving resources (code available online: https://github.com/tanveer-hussain/Embedded-Vision-for-MVS).

Item Modelling Electricity Consumption During the COVID19 Pandemic: Datasets, Models, Results and a Research Agenda (2023-09-01) Khan, Zulfiqar Ahmad; Hussain, Tanveer; Ullah, Amin; Ullah, Waseem; Del Ser, Javier; Muhammad, Khan; Sajjad, Muhammad; Baik, Sung Wook
The COVID19 pandemic has impacted the global economy, social activities, and Electricity Consumption (EC), affecting the performance of historical data-based Electricity Load Forecasting (ELF) algorithms. This study thoroughly analyses the pandemic's impact on these models and develops a hybrid model with better prediction accuracy using COVID19 data. Existing datasets are reviewed, and their limited generalization potential for the COVID19 period is highlighted. A dataset of 96 residential customers, comprising 36 months before and six months after the start of the pandemic, is collected, posing significant challenges for current models. The proposed model employs convolutional layers for feature extraction, gated recurrent nets for temporal feature learning, and a self-attention module for feature selection, leading to better generalization when predicting EC patterns. Our proposed model outperforms existing models, as demonstrated by a detailed ablation study on our dataset. For instance, it achieves an average reduction of 0.56% and 3.46% in MSE, 1.5% and 5.07% in RMSE, and 11.81% and 13.19% in MAPE over the pre- and post-pandemic data, respectively. However, further research is required to address the varied nature of the data.
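The RMSE and MAPE metrics quoted above are defined as follows; the consumption series is made up purely to exercise the formulas.

```python
# Standard definitions of the RMSE and MAPE forecasting metrics.
import math

def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    """Mean absolute percentage error (assumes non-zero actual values)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

actual = [10.0, 12.0, 9.0, 11.0]     # invented hourly consumption, e.g. in kWh
predicted = [9.5, 12.5, 9.2, 10.4]   # invented model forecasts

print(round(rmse(actual, predicted), 3), round(mape(actual, predicted), 2))
```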
These findings have significant implications for improving ELF algorithms during pandemics and other significant events that disrupt historical data patterns.

Item Multiview Summarization and Activity Recognition Meet Edge Computing in IoT Environments (2021-06-15) Hussain, Tanveer; Muhammad, Khan; Ullah, Amin; Ser, Javier Del; Gandomi, Amir H.; Sajjad, Muhammad; Baik, Sung Wook; De Albuquerque, Victor Hugo C.
Multiview video summarization (MVS) has not received much attention from the research community, due to inter-view correlations, view overlapping, and similar challenges. The majority of previous MVS works are offline, rely on the summary alone, and require additional communication bandwidth and transmission time, with no focus on foggy environments. We propose an edge intelligence-based MVS and activity recognition framework that combines artificial intelligence with Internet of Things (IoT) devices. In our framework, resource-constrained devices with cameras use a lightweight CNN-based object detection model to segment multiview videos into shots, followed by mutual information computation that helps in summary generation. Our system does not rely solely on the summary, but encodes and transmits it to a master device using a neural computing stick for inter-view correlation computation and efficient activity recognition, an approach which saves computation resources, communication bandwidth, and transmission time. Experiments show an increase of 0.4 unit in F-measure on an MVS Office dataset and 0.2% and 2% improved accuracy on the UCF-50 and YouTube 11 datasets, respectively, with lower storage and transmission times. The processing time is reduced from 1.23 to 0.45 s for a single frame, and the overall MVS is up to 0.75 s faster.
A new dataset is constructed by synthetically adding fog to an MVS dataset to show the adaptability of our system to both certain and uncertain IoT surveillance environments.

Item QuickLook: Movie summarization using scene-based leading characters with psychological cues fusion (2021-12) Haq, Ijaz Ul; Muhammad, Khan; Hussain, Tanveer; Ser, Javier Del; Sajjad, Muhammad; Baik, Sung Wook
Due to recent advances in the film industry, the production of movies has grown exponentially, which has led to challenges in what is referred to as discoverability: given the overwhelming number of choices, choosing which film to watch has become a tedious task for audiences. Movie summarization (MS) could help, as it presents the central theme of the movie in a compact format and makes browsing more efficient for the audience. In this paper, we present an automatic MS framework coined 'QuickLook', which identifies the leading characters and fuses multiple cues extracted from a movie. First, the movie data is preprocessed and divided into scenes, followed by shot segmentation. Second, the leading characters in each segmented scene are determined. Next, four visual cues that capture the film's scenic beauty, memorability, informativeness, and emotional resonance are extracted from the shots containing the leading characters. These extracted features are then intelligently fused based on the assignment of different weights; shots with a fusion score above a certain threshold are selected for the final summary. The proposed MS framework is assessed by comparison with the official trailers of ten Hollywood movies, providing a novel baseline for future fair comparison in the MS literature.
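The weight-based cue fusion and threshold selection described above can be sketched minimally; the cue weights and threshold below are illustrative assumptions, not the values used by QuickLook.

```python
# Sketch of weighted cue fusion for summary shot selection: each shot's
# four cue scores are combined with fixed weights, and shots whose fused
# score exceeds a threshold enter the summary.

CUES = ("scenic_beauty", "memorability", "informativeness", "emotion")
WEIGHTS = {"scenic_beauty": 0.2, "memorability": 0.3,
           "informativeness": 0.3, "emotion": 0.2}  # assumed weights

def fusion_score(shot):
    """Weighted sum of the four cue scores for one shot."""
    return sum(WEIGHTS[c] * shot[c] for c in CUES)

def select_shots(shots, threshold=0.5):
    """Indices of shots whose fused score clears the summary threshold."""
    return [i for i, s in enumerate(shots) if fusion_score(s) > threshold]

shots = [
    {"scenic_beauty": 0.9, "memorability": 0.8, "informativeness": 0.7, "emotion": 0.6},
    {"scenic_beauty": 0.2, "memorability": 0.1, "informativeness": 0.3, "emotion": 0.2},
]
print(select_shots(shots))  # [0]
```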
The proposed framework is shown to outperform other state-of-the-art MS methods in terms of enjoyability and informativeness.

Item Vision-Based Semantic Segmentation in Scene Understanding for Autonomous Driving: Recent Achievements, Challenges, and Outlooks (2022-12-01) Muhammad, Khan; Hussain, Tanveer; Ullah, Hayat; Ser, Javier Del; Rezaei, Mahdi; Kumar, Neeraj; Hijji, Mohammad; Bellavista, Paolo; De Albuquerque, Victor Hugo C.
Scene understanding plays a crucial role in autonomous driving by utilizing sensory data for contextual information extraction and decision making. Beyond modeling advances, the enabler for vehicles to become aware of their surroundings is the availability of visual sensory data, which expands vehicular perception and realizes contextual awareness in real-world environments. Research directions in scene understanding pursued by related studies include person/vehicle detection and segmentation, transition analysis, and lane change and turn detection, among many others. Unfortunately, these tasks seem insufficient to completely develop fully autonomous vehicles, i.e., to achieve level-5 autonomy, travelling just like human-controlled cars. This latter statement is among the conclusions drawn from this review paper: scene understanding for autonomous driving cars using vision sensors still requires significant improvements. With this motivation, this survey defines, analyzes, and reviews the current achievements of the scene understanding research area, which mostly relies on computationally complex deep learning models. Furthermore, it covers the generic scene understanding pipeline, investigates the performance reported by the state of the art, informs about the time complexity of avant-garde modeling choices, and highlights the major triumphs and noted limitations of current research efforts.
The survey also includes a comprehensive discussion of the available datasets and of the challenges that, even if lately confronted by researchers, still remain open to date. Finally, our work outlines future research directions to welcome researchers and practitioners to this exciting domain.

Item Visual Appearance and Soft Biometrics Fusion for Person Re-Identification Using Deep Learning (2023-05-01) Khan, Samee Ullah; Khan, Noman; Hussain, Tanveer; Muhammad, Khan; Hijji, Mohammad; Del Ser, Javier; Baik, Sung Wook
Learning descriptions of individual pedestrians is a common goal of both person re-identification (P-ReID) and attribute recognition methods, which typically differ only in their granularity. However, existing P-ReID methods only consider identification labels for individual pedestrians. In this article, we present a multi-scale pyramid attention (MSPA) model for P-ReID that jointly exploits the complementarity between semantic attributes and visual appearance to address this limitation. The proposed MSPA method mainly comprises three steps. Initially, a backbone model followed by appearance and attribute networks is trained to perform the P-ReID and pedestrian attribute classification tasks individually. The attribute network primarily focuses on the image areas associated with soft biometric data while retaining the semantic context among attributes using a convolutional long short-term memory architecture. Additionally, the identification network extracts rich contextual features from an image at varying scales using a residual pyramid module. In the second step, the features of the two networks are fused, and MSPA is re-trained for the P-ReID task to further improve its complementary capabilities. Finally, we experimentally evaluate the proposed model on the two benchmark datasets Market-1501 and DukeMTMC-reID; the results show that our approach achieves state-of-the-art performance.
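The dual-network feature-fusion step described above can be sketched as a simple concatenation of normalized appearance and attribute embeddings; the vectors, dimensions, and normalization choice are invented for illustration and do not reproduce the MSPA architecture.

```python
# Sketch of fusing two per-pedestrian embeddings into one descriptor:
# each embedding is L2-normalized, then the two are concatenated.
import math

def l2_normalize(v):
    """Scale a vector to unit L2 norm (no-op for the zero vector)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def fuse(appearance_feat, attribute_feat):
    """Concatenate the normalized appearance and attribute embeddings."""
    return l2_normalize(appearance_feat) + l2_normalize(attribute_feat)

# Invented 2-D appearance and 3-D attribute embeddings.
desc = fuse([3.0, 4.0], [1.0, 0.0, 0.0])
print(len(desc))  # 5
```

The fused descriptor would then be fed to the re-trained re-identification head, letting the two feature sources complement each other.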