Image and video data continue to grow in popularity as hardware costs fall, camera-equipped mobile phones become more widespread, and social media use expands. This growth has driven a wide range of computer vision applications, such as object recognition, security camera alerting, assistance for the blind, robotics, autonomous driving, drones, and many more. To track and measure progress in the field, the TRECVID evaluation project at NIST has been running since 2001. It promotes content-based video retrieval by providing research groups in academia and industry with the necessary infrastructure, such as real-world datasets, evaluation metrics, and practical use-case scenarios for the research problems. In this talk I will give a high-level overview of a typical video retrieval system and take a closer look at the prominent TRECVID tracks, their datasets, and how evaluation takes place.
Keywords: Computer Vision, Video Retrieval, Multimedia Understanding
George Awad is a computer scientist in the Information Retrieval Group of the Information Access Division at the National Institute of Standards and Technology (NIST). He holds a PhD in Computer Science (2007) from Dublin City University. He has co-organized tutorials and workshops at international conferences, including multimedia grand challenges. He has more than 20 publications in international conferences and journals and was jointly awarded the 2018 IEEE Computer Society PAMI Mark Everingham Prize. He now leads the TRECVID project at NIST, which has been evaluating multiple video retrieval and understanding tasks since 2003.