Classical and Bayesian Interpretation of the Birge Test of Consistency and Its Generalized Version in Interlaboratory Evaluations
Published
Author(s)
Raghu N. Kacker, Alistair Forbes, Ruediger Kessel, K D. Sommer
Abstract
The results from an interlaboratory evaluation are said to be consistent if their dispersion is not more than what can reasonably be attributed to their stated variances. A well-known test of consistency in interlaboratory evaluations is the Birge test, named after its developer, the physicist Raymond T. Birge. We show that the Birge test may be interpreted as a classical test of the null hypothesis that the variances of the results are less than or equal to their stated values against the alternative hypothesis that the variances of the results are more than their stated values. A modern protocol for hypothesis testing is to calculate the classical p-value, that is, the probability under the null hypothesis of realizing a value of the test statistic equal to or larger than the observed (realized) value, and to reject the null hypothesis when the p-value is too small. We show that, interestingly, the classical p-value of the Birge test statistic is equal to the Bayesian posterior probability of the null hypothesis based on commonly used non-informative prior distributions for the unknown statistical parameters. Thus the Birge test may also be interpreted as a Bayesian test of the hypothesis of consistency. The Birge test of consistency was developed for those interlaboratory evaluations where the results are uncorrelated. We present a general test of consistency for both correlated and uncorrelated results, of which the Birge test is a special case. We then show that the classical p-value of the general test statistic, the probability under the null hypothesis of realizing a value equal to or larger than the observed (realized) value, is equal to the Bayesian posterior probability of the null hypothesis based on non-informative prior distributions. The general test makes it possible to check the consistency of correlated results from interlaboratory evaluations.
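The abstract does not give the formulas, so the following is only a minimal sketch of the Birge test for uncorrelated results as it is commonly stated, not the authors' own implementation. It assumes normally distributed results, so that the weighted sum of squared deviations from the weighted mean follows a chi-square distribution with n - 1 degrees of freedom under the null hypothesis. The function names `chi2_sf` and `birge_test` and the sample numbers are hypothetical, chosen for illustration.

```python
import math


def chi2_sf(x, dof):
    """Upper-tail probability P(X >= x) of a chi-square variable.

    Uses the exact closed forms for a positive integer number of
    degrees of freedom, so only the standard library is needed.
    """
    if x <= 0:
        return 1.0
    h = x / 2.0
    if dof % 2 == 0:
        # Even dof: exp(-h) * sum_{j=0}^{dof/2 - 1} h^j / j!
        term = math.exp(-h)
        total = term
        for j in range(1, dof // 2):
            term *= h / j
            total += term
        return total
    # Odd dof: erfc(sqrt(h)) + exp(-h) * sum_j h^(j - 1/2) / Gamma(j + 1/2)
    total = math.erfc(math.sqrt(h))
    term = math.exp(-h) * math.sqrt(h) / (math.sqrt(math.pi) / 2.0)
    for j in range(1, (dof - 1) // 2 + 1):
        total += term
        term *= h / (j + 0.5)
    return total


def birge_test(values, std_uncertainties):
    """Return (weighted mean, Birge ratio, p-value) for uncorrelated results.

    Assumption: each result is normally distributed with its stated
    standard uncertainty; a small p-value suggests inconsistency.
    """
    n = len(values)
    if n < 2 or n != len(std_uncertainties):
        raise ValueError("need at least two results with matching uncertainties")
    weights = [1.0 / u ** 2 for u in std_uncertainties]
    weighted_mean = sum(w * x for w, x in zip(weights, values)) / sum(weights)
    # Observed chi-square statistic: weighted squared deviations from the mean.
    chi2_obs = sum(w * (x - weighted_mean) ** 2 for w, x in zip(weights, values))
    dof = n - 1
    birge_ratio = math.sqrt(chi2_obs / dof)
    p_value = chi2_sf(chi2_obs, dof)
    return weighted_mean, birge_ratio, p_value


if __name__ == "__main__":
    # Hypothetical results from three laboratories (illustrative numbers only).
    mean, ratio, p = birge_test([10.0, 10.2, 9.9], [0.1, 0.1, 0.1])
    print(f"weighted mean = {mean:.4f}, Birge ratio = {ratio:.4f}, p-value = {p:.4f}")
```

A Birge ratio well above 1, or equivalently a p-value below a chosen threshold, indicates more dispersion than the stated variances explain. The correlated case described in the abstract would need the full covariance matrix of the results and is not covered by this sketch.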
Kacker, R., Forbes, A., Kessel, R. and Sommer, K. (2008), Classical and Bayesian Interpretation of the Birge Test of Consistency and Its Generalized Version in Interlaboratory Evaluations, Metrologia, [online], https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=51220 (Accessed October 1, 2025)