Ground Truth and Benchmarks for Performance Evaluation

A Takeuchi; Michael O. Shneier; Tsai H. Hong; Christopher J. Scrapper Jr; Geraldine Cheok

Published

April 25, 2003

Author(s)

A Takeuchi, Michael O. Shneier, Tsai H. Hong, Christopher J. Scrapper Jr, Geraldine Cheok

Abstract

Progress in algorithm development and the transfer of results to practical applications such as military robotics require standard tasks and standard qualitative and quantitative measurements for performance evaluation and validation. Although the evaluation and validation of algorithms have been discussed for over a decade, the research community still lacks a well-defined, standardized methodology. The fundamental problems include a lack of quantifiable measures of performance, a lack of data from state-of-the-art sensors in calibrated real-world environments, and a lack of facilities for conducting realistic experiments. In this research, we propose three methods for creating ground truth databases and benchmarks using multiple sensors. The databases and benchmarks will provide researchers with high-quality data from suites of sensors operating in complex environments representing real problems of great relevance to the development of autonomous driving systems. At the National Institute of Standards and Technology (NIST), we have prototyped a High Mobility Multi-purpose Wheeled Vehicle (HMMWV) system with a suite of sensors including a Riegl ladar, a General Dynamics Robotics Systems (GDRS) ladar, stereo Charge Coupled Device (CCD), several color cameras, Global Positioning System (GPS), Inertial Navigation System (INS), pan/tilt encoders, and odometry. All sensors are calibrated with respect to each other in space and time. This allows a database of features and terrain elevation to be built. Ground truth for each sensor can then be extracted from the database. The main goal of this research is to provide ground truth databases for researchers and engineers to evaluate algorithms for effectiveness, efficiency, reliability, and robustness, thus advancing the development of algorithms.
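
To illustrate what calibrating sensors "in space and time" makes possible, the following is a minimal Python sketch, not the authors' method. It assumes time-sorted vehicle poses (as might come from GPS/INS), ladar returns stamped on the same clock, a fixed sensor mounting offset, and a yaw-only rotation. All function names, the data layout, and the 0.5 m cell size are hypothetical choices made for this example.

```python
import bisect
import math


def interpolate_pose(poses, t):
    """Linearly interpolate (x, y, z, yaw) at time t.

    poses: list of (time, x, y, z, yaw) tuples with strictly increasing time.
    Times outside the recorded range are clamped to the nearest pose.
    """
    times = [p[0] for p in poses]
    i = bisect.bisect_left(times, t)
    if i == 0:
        return tuple(poses[0][1:])
    if i == len(poses):
        return tuple(poses[-1][1:])
    t0, *a = poses[i - 1]
    t1, *b = poses[i]
    w = (t - t0) / (t1 - t0)
    # Interpolate yaw along the shortest arc to avoid wrap-around jumps.
    dyaw = math.atan2(math.sin(b[3] - a[3]), math.cos(b[3] - a[3]))
    return (
        a[0] + w * (b[0] - a[0]),
        a[1] + w * (b[1] - a[1]),
        a[2] + w * (b[2] - a[2]),
        a[3] + w * dyaw,
    )


def sensor_to_world(point, pose, mount_offset):
    """Map a sensor-frame point into the world frame.

    Assumption: sensor axes are aligned with the vehicle axes, so the
    spatial calibration reduces to a translation (mount_offset), and the
    vehicle attitude is yaw-only. A real system would use a full 6-DOF
    transform.
    """
    px, py, pz = (point[k] + mount_offset[k] for k in range(3))
    x, y, z, yaw = pose
    c, s = math.cos(yaw), math.sin(yaw)
    return (x + c * px - s * py, y + s * px + c * py, z + pz)


def build_elevation_map(poses, returns, mount_offset, cell=0.5):
    """Accumulate the highest observed z per grid cell.

    returns: iterable of (time, (x, y, z)) ladar points in the sensor frame.
    The result maps (col, row) cell indices to elevation.
    """
    grid = {}
    for t, point in returns:
        pose = interpolate_pose(poses, t)
        wx, wy, wz = sensor_to_world(point, pose, mount_offset)
        key = (math.floor(wx / cell), math.floor(wy / cell))
        grid[key] = max(wz, grid.get(key, -math.inf))
    return grid


# Example: the vehicle drives 2 m along x in 1 s; one return arrives at t = 0.5 s.
poses = [(0.0, 0.0, 0.0, 0.0, 0.0), (1.0, 2.0, 0.0, 0.0, 0.0)]
returns = [(0.5, (3.0, 0.0, -1.0))]
grid = build_elevation_map(poses, returns, mount_offset=(0.0, 0.0, 2.0))
print(grid)  # {(8, 0): 1.0}
```

In this sketch the temporal step places each return at the vehicle pose for its timestamp, and the spatial step moves it into a common frame. A shared elevation grid of this kind is one way a reference value for an individual sensor could then be looked up by cell.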

Proceedings Title

Proceedings of the SPIE Aerosense Conference

Conference Dates

April 21-25, 2003

Conference Location

Orlando, FL, USA

Conference Title

SPIE Aerosense Conference

Pub Type

Conferences

Keywords

ground truth, ladar, mobile robots, Performance evaluation, Robotics & Intelligent Systems, sensory processing, Unmanned Systems

Citation

Takeuchi, A., Shneier, M., Hong, T., Scrapper Jr, C. and Cheok, G. (2003), Ground Truth and Benchmarks for Performance Evaluation, Proceedings of the SPIE Aerosense Conference, Orlando, FL, USA, [online], https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=822594 (Accessed July 2, 2025)

Created April 24, 2003, Updated October 12, 2021
