MFC Datasets: Large-Scale Benchmark Datasets for Media Forensic Challenge Evaluation



Haiying Guan, Mark Kozak, Eric Robertson, Yooyoung Lee, Amy Yates, Andrew Delgado, Daniel F. Zhou, Timothée N. Kheyrkhah, Jeff Smith, Jonathan G. Fiscus


We provide a benchmark for digital media forensic challenge evaluations. A series of datasets is used to assess progress and to analyze in depth the performance of diverse systems on different media forensic tasks over the last two years. The benchmark data contains four major parts: (1) 35 million images and 300,000 video clips of world data downloaded from the internet, with their characteristics and labels; (2) up to 176,000 pristine high provenance (HP) images and 11,000 HP videos; (3) approximately 100,000 manipulated images and 4,000 manipulated videos from approximately 5,000 image manipulation journals and over 500 video manipulation journals, with manipulation history graphs and annotation details; and (4) a series of evaluation datasets with reference ground truth to support 6 challenge tasks in media forensic challenge evaluations. In this paper, we first introduce the objectives, challenges, and approaches to building media forensic evaluation datasets. We then discuss our approaches to forensic dataset collection, annotation, and manipulation, and present the design and infrastructure used to build the evaluation datasets effectively and efficiently in support of different evaluation tasks. We also build an infrastructure that, given a specified query, dynamically generates the evaluation comparison subsets for the specified evaluation analysis report. Finally, we present the results of past evaluations.
Proceedings Title
IEEE Winter Conference on Applications of Computer Vision (WACV 2019)
Conference Dates
January 8-11, 2019
Conference Location
Waikoloa, HI
Conference Title
WACV 2019


Guan, H., Kozak, M., Robertson, E., Lee, Y., Yates, A., Delgado, A., Zhou, D., Kheyrkhah, T., Smith, J. and Fiscus, J. (2019), MFC Datasets: Large-Scale Benchmark Datasets for Media Forensic Challenge Evaluation, IEEE Winter Conference on Applications of Computer Vision (WACV 2019), Waikoloa, HI, [online], (Accessed May 30, 2024)


Created January 10, 2019, Updated December 31, 2019