Ground Truth and Benchmarks for Performance Evaluation



A Takeuchi, Michael O. Shneier, Tsai H. Hong, Christopher J. Scrapper Jr, Geraldine Cheok


Progress in algorithm development and in the transfer of results to practical applications such as military robotics requires standard tasks and standard qualitative and quantitative measurements for performance evaluation and validation. Although the evaluation and validation of algorithms have been discussed for over a decade, the research community still lacks a well-defined, standardized methodology. The fundamental problems include a lack of quantifiable measures of performance, a lack of data from state-of-the-art sensors in calibrated real-world environments, and a lack of facilities for conducting realistic experiments. In this research, we propose three methods for creating ground truth databases and benchmarks using multiple sensors. The databases and benchmarks will provide researchers with high-quality data from suites of sensors operating in complex environments that represent real problems of great relevance to the development of autonomous driving systems. At the National Institute of Standards and Technology (NIST), we have prototyped a High Mobility Multi-purpose Wheeled Vehicle (HMMWV) system with a suite of sensors including a Riegl ladar, a General Dynamics Robotics Systems (GDRS) ladar, stereo Charge Coupled Device (CCD), several color cameras, Global Positioning System (GPS), Inertial Navigation System (INS), pan/tilt encoders, and odometry*. All sensors are calibrated with respect to each other in space and time. This allows a database of features and terrain elevation to be built. Ground truth for each sensor can then be extracted from the database. The main goal of this research is to provide ground truth databases that researchers and engineers can use to evaluate algorithms for effectiveness, efficiency, reliability, and robustness, thus advancing the development of algorithms.
Proceedings Title
Proceedings of the SPIE Aerosense Conference
Conference Dates
April 21-25, 2003
Conference Location
Orlando, FL, USA
Conference Title
SPIE Aerosense Conference


ground truth, ladar, mobile robots, Performance evaluation, Robotics & Intelligent Systems, sensory processing, Unmanned Systems


Takeuchi, A., Shneier, M., Hong, T., Scrapper Jr, C. and Cheok, G. (2003), Ground Truth and Benchmarks for Performance Evaluation, Proceedings of the SPIE Aerosense Conference, Orlando, FL, USA, [online], (Accessed February 21, 2024)
Created April 24, 2003, Updated October 12, 2021