TRECVID is hosted by the National Institute of Standards and Technology (NIST). The workshop's goal is to encourage research in content-based video retrieval by providing a large test collection, uniform scoring procedures, and a forum for organizations interested in comparing their results.
This year, TRECVID tracks included: Ad-hoc Video Search, Video-to-Text, Deep Video Understanding, Medical Video Question Answering, and Activity Detection in Extended Videos.
The 2023 TREC Video Retrieval Evaluation (TRECVID) workshop will be hybrid, running November 13-15. Workshop participants include researchers who took part in the TRECVID 2023 evaluation benchmark, as well as external researchers from academia and industry who are interested in joining. The OpenMFC (Open Media Forensics Challenge) workshop will also be held within the TRECVID workshop. For information about OpenMFC, please contact haiying.guan [at] nist.gov (Haiying Guan). More information about the TRECVID program can be found on the TRECVID project page.
Who should attend the workshop?
All TRECVID participants are strongly encouraged to attend. External researchers and stakeholders are also welcome and should contact the workshop's Technical Contact, george.awad [at] nist.gov (George Awad), before registering.
Day | Start Time | End Time | Talk | Speaker | Affiliation |
Nov. 13 | 9:00 | 9:15 | Introduction and welcome | Ian Soboroff | NIST |
Nov. 13 | 9:15 | 9:45 | Ad-hoc Video Search - Task Overview | Georges Quenot | Laboratoire d'Informatique de Grenoble |
Nov. 13 | 9:45 | 10:05 | Waseda_Meisei_SoftBank at TRECVID 2023: Ad-hoc video search | Kazuya Ueki | Meisei University |
Nov. 13 | 10:05 | 10:25 | Harnessing Large Multimodal Models and Datasets for Ad-hoc Video Search | Fan Hu | Renmin University of China |
Nov. 13 | 10:25 | 10:45 | WHU-NERCMS@TRECVID 2023: Ad-hoc Search Task | Jiangshan He | National Engineering Research Center for Multimedia Software, School of Computer Science, Wuhan University |
Nov. 13 | 10:45 | 11:00 | Break | | |
Nov. 13 | 11:00 | 11:20 | Understanding AVS query by generating images and asking questions | Jiaxin Wu | City University of Hong Kong |
Nov. 13 | 11:20 | 11:40 | NII_UIT at TRECVID 2023: Ad-hoc Video Search Task | Tien V Do | University of Information Technology, VNU-HCMC, Vietnam |
Nov. 13 | 11:40 | 12:00 | AVS Task Discussion | Coordinators/Teams | |
Nov. 13 | 12:00 | 13:00 | Lunch | | |
Nov. 13 | 13:00 | 13:30 | Deep Video Understanding - Task Overview | George Awad | NIST |
Nov. 13 | 13:30 | 13:50 | WHU-NERCMS @ TRECVID 2023: Deep Video Understanding Task | Ruizhe Li | National Engineering Research Center for Multimedia Software, School of Computer Science, Wuhan University |
Nov. 13 | 13:50 | 14:10 | NII_UIT at TRECVID 2023: Deep Video Understanding Task | An NT Pham | University of Information Technology, VNU-HCMC, Vietnam |
Nov. 13 | 14:10 | 14:30 | Deep Video Understanding with Video-Language Model | Runze Liu | Nanjing University |
Nov. 14 | 9:00 | 9:30 | Video-to-Text - Task Overview | Asad Butt | NIST |
Nov. 14 | 9:30 | 9:50 | RUC_AIM3 at TRECVID 2023: Video to Text Description | Kaiwen Wei | Renmin University of China |
Nov. 14 | 9:50 | 10:10 | BUPT_MCPRL at TRECVID 2023: Video to Text Description | Zeliang Ma, Shuai Jiang | Beijing University of Posts and Telecommunications |
Nov. 14 | 10:10 | 10:30 | Nagaoka University of Technology at TRECVID 2023: Video to Text | Mutsuki Ishii | Nagaoka University of Technology |
Nov. 14 | 10:30 | 10:45 | Break | | |
Nov. 14 | 10:45 | 11:05 | Waseda_Meisei_SoftBank at TRECVID 2023 | Hiroki Takushima | SoftBank Corp. |
Nov. 14 | 11:05 | 11:20 | VTT Task Discussion | Coordinators/Teams | |
Nov. 14 | 11:20 | 11:50 | Activities in Extended Videos - Task Overview | Jonathan Fiscus | NIST |
Nov. 14 | 11:50 | 12:10 | An Effective Framework for Activity Detection in Untrimmed | Yang Song, HongPu Zhang | Beijing University of Posts and Telecommunications |
Nov. 14 | 12:10 | 12:25 | ActEv Task Discussion | Coordinators/Teams | |
Nov. 14 | 12:25 | 13:30 | Lunch | | |
Nov. 14 | 13:30 | 13:40 | OpenMFC Open remark | Jim H. | NIST |
Nov. 14 | 13:40 | 14:20 | Combatting with DeepFakes | Siwei Lyu | Univ. at Buffalo |
Nov. 14 | 14:20 | 14:30 | Break | | |
Nov. 14 | 14:30 | 15:10 | Stego | Jennifer Newman | Iowa State University |
Nov. 14 | 15:10 | 15:50 | Anti-forensics | Matthew Stamm | Drexel University |
Nov. 15 | 9:00 | 9:30 | Medical Video Question Answering - Task Overview | Deepak Gupta | National Library of Medicine, National Institutes of Health |
Nov. 15 | 9:30 | 9:50 | Medical Question Generation: Leveraging Vision-Language Summarisation Models and Keyword Extraction with Flan-T5 | Zihao Chen | Doshisha University |
Nov. 15 | 9:50 | 10:10 | T5 Model for Medical Video Temporal Segment Prediction | Owen Deen | University of North Carolina Wilmington |
Nov. 15 | 10:10 | 10:30 | Attention-based Multimodal Deep Learning Models for Medical Instructional Question Generation | Shaswati Saha, Sanjay Purushotham | University of Maryland, Baltimore County |
Nov. 15 | 10:30 | 10:45 | Break | | |
Nov. 15 | 10:45 | 11:05 | Medical visual question answering via cross-modal representation | Weizhi Nie | School of Electrical and Information Engineering, Tianjin University |
Nov. 15 | 11:05 | 11:20 | Medical Video Question Answering - Task Discussion | Deepak Gupta | National Library of Medicine, National Institutes of Health |
Nov. 15 | 11:20 | 11:35 | TRECVID 2024 - Planning and Final Remarks | George Awad | NIST |
Nov. 15 | 11:35 | 12:30 | Lunch | | |
Nov. 15 | 12:30 | 13:10 | Deepfake (2) | Jun-cheng Chen | Research Center for Information Technology Innovation, Academia Sinica |
Nov. 15 | 13:10 | 13:50 | Manipulation & Standard | Wendy Dinova-Wimmer | Adobe |
Nov. 15 | 13:50 | 14:00 | Break | | |
Nov. 15 | 14:00 | 14:40 | Deepfake Activities in Deepmedia | Rijul Gupta | Deepmedia |
Nov. 15 | 14:40 | 15:15 | OpenMFC Overview | Haiying Guan | NIST |