Computing confidence intervals for common IR measures
Published
Author(s)
Ian M. Soboroff
Abstract
Confidence intervals quantify the uncertainty in an average and offer a robust alternative to hypothesis testing. We measure the performance of standard and bootstrapped confidence intervals on a number of common IR measures using several TREC and NTCIR collections. The performance of an interval is its empirical coverage of the estimated statistic. We find that both standard and bootstrapped intervals give excellent coverage for all measures except in situations of abysmal retrieval performance. We recommend using standard confidence intervals when statistical software is handy, and bootstrap percentile intervals as equivalent when no statistical libraries are available.
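The two kinds of interval named in the abstract can be illustrated with a minimal Python sketch that uses only the standard library. It is not the paper's code: the function names, the default of 10,000 resamples, the fixed seed, and the sample scores are all assumptions made for illustration, and the abstract does not state the exact resampling procedure the paper used. The standard interval here takes the Student t critical value as an argument, since computing that quantile is the step that normally requires statistical software.

```python
import math
import random
import statistics


def standard_interval(scores, t_crit):
    """Standard t-based interval for a mean: mean +/- t_crit * s / sqrt(n).

    t_crit is the two-sided Student t critical value for n - 1 degrees of
    freedom; the caller supplies it (from a table or a statistics package).
    """
    n = len(scores)
    mean = statistics.fmean(scores)
    half_width = t_crit * statistics.stdev(scores) / math.sqrt(n)
    return mean - half_width, mean + half_width


def bootstrap_percentile_interval(scores, level=0.95, resamples=10000, seed=0):
    """Bootstrap percentile interval for a mean.

    Resamples the per-topic scores with replacement, recomputes the mean for
    each resample, and returns the empirical alpha/2 and 1 - alpha/2
    percentiles of those resampled means.
    """
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    n = len(scores)
    means = sorted(
        statistics.fmean(rng.choices(scores, k=n)) for _ in range(resamples)
    )
    alpha = 1.0 - level
    lower_index = math.floor((alpha / 2) * resamples)
    upper_index = min(resamples - 1, math.ceil((1 - alpha / 2) * resamples) - 1)
    return means[lower_index], means[upper_index]


if __name__ == "__main__":
    # Hypothetical per-topic average precision scores, not data from the paper.
    ap_scores = [0.12, 0.35, 0.41, 0.08, 0.56, 0.29, 0.33, 0.47, 0.22, 0.38]
    # 2.262 is the two-sided 95% t critical value for 9 degrees of freedom.
    print("standard :", standard_interval(ap_scores, t_crit=2.262))
    print("bootstrap:", bootstrap_percentile_interval(ap_scores))
```

With per-topic scores from a real run, the two intervals can be compared side by side; the abstract reports that they give equivalent coverage except when retrieval performance is very poor.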
Proceedings Title
Proceedings of the Workshop on Evaluation for Information Access (EVIA 2014)
Soboroff, I. (2014), Computing confidence intervals for common IR measures, Proceedings of the Workshop on Evaluation for Information Access (EVIA 2014), Tokyo, [online], https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=917303
(Accessed October 10, 2025)