ResNet50-deep affinity network for object detection and tracking in videos

Journal: International Journal of Advanced Technology and Engineering Exploration (IJATEE) (Vol.11, No. 111)

Publication Date: 2024-02-29

Authors : Nandeeswar Sampigehalli Basavaraju; Pallavi Hallappanavar Basavaraja

Page : 190-204

Keywords : Deep tracking; Multiple object tracking; Object detection; Online tracking; Tracking challenge; Video surveillance


Abstract

Multiple-object tracking (MOT) plays a crucial role in addressing many fundamental challenges in computer vision and video analysis. Most MOT methods rely on two primary processes: object detection and data association. Each video frame is first analyzed to detect objects; the detections are then associated across frames to form tracks. However, data association for tracking often relies on manually defined criteria such as motion, appearance, grouping, and spatial proximity. This study introduces the ResNet50-deep affinity network (ResNet50-DAN), designed to detect and track objects in videos, including objects that appear and disappear between frames. The proposed method was evaluated on the widely used MOT17 dataset. During preprocessing, photometric distortion correction, frame expansion, and cropping were performed. The ResNet50 model was used to extract features, and the DAN was used to model object appearances in video frames and to compute their cross-frame affinities (CFA). To validate its efficiency, the approach was compared with existing methods: DAN, ByteTrack, the graph neural network for simultaneous detection and tracking (GSDT), the reptile search optimization algorithm with deep learning-based multiple object detection and tracking (RSOADL-MODT), the center graph network (CGTracker), the hybrid motion model, FlowNet2-deep learning (FlowNet2-DL), and the super chained tracker (SCT). The ResNet50-DAN method achieved a multiple-object tracking accuracy (MOTA) of 84.2%, an F1 score for identification metrics (IDF1) of 80.3%, 10,352 false positives (FP), and 1,284 identity switches (ID-Sw). Its MOTA was higher than that of all the compared approaches.
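
For readers unfamiliar with the reported metrics, the following is a minimal sketch of how MOTA and IDF1 are conventionally computed from error counts. It follows the standard CLEAR MOT and identity-metric definitions and is not code from the paper. The function names and the example counts are hypothetical; the abstract reports FP and ID-Sw but not the false-negative or ground-truth totals, so the 84.2% figure cannot be reproduced from the abstract alone.

```python
def mota(false_negatives: int, false_positives: int,
         id_switches: int, num_gt_objects: int) -> float:
    """MOTA = 1 - (FN + FP + ID-Sw) / GT, with counts summed over all frames.

    The value can be negative when the error count exceeds the number
    of ground-truth objects.
    """
    if num_gt_objects <= 0:
        raise ValueError("num_gt_objects must be positive")
    errors = false_negatives + false_positives + id_switches
    return 1.0 - errors / num_gt_objects


def idf1(idtp: int, idfp: int, idfn: int) -> float:
    """IDF1 = 2*IDTP / (2*IDTP + IDFP + IDFN).

    IDTP, IDFP, and IDFN are the identity-level true positives, false
    positives, and false negatives under the best track-to-ground-truth
    identity mapping.
    """
    denominator = 2 * idtp + idfp + idfn
    if denominator == 0:
        raise ValueError("at least one count must be non-zero")
    return 2 * idtp / denominator


# Hypothetical counts chosen only to illustrate the arithmetic.
example_mota = mota(false_negatives=10, false_positives=5,
                    id_switches=5, num_gt_objects=100)
example_idf1 = idf1(idtp=80, idfp=10, idfn=30)
print(f"MOTA = {example_mota:.3f}, IDF1 = {example_idf1:.3f}")
```

Because FP and ID-Sw both enter the MOTA numerator, lower values of either count raise MOTA when the other terms are held fixed.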
