TY - GEN
T1 - VIDI
T2 - 14th IEEE Image, Video, and Multidimensional Signal Processing Workshop, IVMSP 2022
AU - Sesver, Duygu
AU - Gencoglu, Alp Eren
AU - Yildiz, Cagri Emre
AU - Gunindi, Zehra
AU - Habibi, Faeze
AU - Yazici, Ziya Ata
AU - Ekenel, Hazim Kemal
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
AB - Automatic detection of natural disasters and incidents has become more important as a tool for fast response. Many studies detect incidents using still images and text. However, few approaches exploit temporal information. One of the main reasons is that no diverse video dataset covering various incident types exists. To address this need, in this paper we present the Video Dataset of Incidents (VIDI), which contains 4,534 video clips corresponding to 43 incident categories. Each incident class has around 100 videos with an average duration of ten seconds. To increase diversity, the videos were searched for in several languages. To assess the performance of the recent state-of-the-art approaches, Vision Transformer and TimeSformer, and to explore the contribution of video-based information to incident classification, we performed benchmark experiments on VIDI and the Incidents Dataset. We have shown that the recent methods improve incident classification accuracy, and that employing video data is very beneficial for the task: using video data raises the top-1 accuracy to 76.56%, from the 67.37% obtained using a single frame. VIDI will be made publicly available. Additional materials can be found at the following link: https://github.com/vididataset/VIDI
KW - incident classification
KW - video incidents dataset
KW - Video processing
UR - http://www.scopus.com/inward/record.url?scp=85135183866&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85135183866&partnerID=8YFLogxK
U2 - 10.1109/IVMSP54334.2022.9816319
DO - 10.1109/IVMSP54334.2022.9816319
M3 - Conference contribution
AN - SCOPUS:85135183866
T3 - IVMSP 2022 - 2022 IEEE 14th Image, Video, and Multidimensional Signal Processing Workshop
BT - IVMSP 2022 - 2022 IEEE 14th Image, Video, and Multidimensional Signal Processing Workshop
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 26 June 2022 through 29 June 2022
ER -