Uncertainty Estimation in Interlaboratory Studies

Summary:

Statistical modeling and analysis of interlaboratory comparisons raise several fundamental questions about determining the consensus value and its associated uncertainty. Choosing an appropriate statistical model can be difficult, especially when measurements are made across a range of values of a physical characteristic, i.e., when the reference value is a curve or a multivariate vector. Under a random effects model, we investigated the behavior of the maximum likelihood estimator and of simplified moment-type estimators. Conservative confidence intervals and ellipsoids were obtained that are applicable in many settings.

Description:

Data sets consisting of multivariate samples occur increasingly often in applications, in particular in interlaboratory studies known as Key Comparisons, in which the key comparison reference value (KCRV) is to be determined. The vector samples, being independent realizations of an underlying stochastic process, share common features that a researcher wants to investigate. In many applications, these features have irregular shapes that cannot be adequately captured by traditional statistical models. We derive statistical procedures for evaluating a vector KCRV (a discretized version of a possibly irregular underlying curve), along with estimates of its uncertainty, using a model taken from meta-analysis methodology under the assumptions of Gaussian distributions and equivalent qualification of all participating laboratories. We obtain a class of matrix-weighted vector means for the KCRV and a method of assessing the uncertainty of the resulting KCRV estimates. In particular, we analyzed the problem of estimating the covariance matrix of the KCRV estimators and constructed approximate confidence ellipsoids based on them.
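As an illustration of the kind of model described above, one common way to write a Gaussian random effects model for vector lab means, and the matrix-weighted mean it leads to, is sketched below. The notation (p laboratories, q-dimensional means x_i, between-lab covariance \Xi, within-lab covariance S_i of each lab mean, weight matrices W_i) is our assumption for this sketch and is not taken from the project text.

```latex
% Assumed notation: x_i is the q-vector of means reported by lab i, i = 1, ..., p.
\[
  x_i = \mu + b_i + e_i, \qquad
  b_i \sim N_q(0, \Xi), \qquad
  e_i \sim N_q(0, S_i), \qquad i = 1, \dots, p,
\]
% "Equivalent qualification" is read here as a common between-lab covariance \Xi.
% A matrix-weighted mean with fixed positive definite weight matrices W_i:
\[
  \hat{\mu} = \Bigl(\sum_{i=1}^{p} W_i\Bigr)^{-1} \sum_{i=1}^{p} W_i x_i,
\]
% and its covariance when the W_i are treated as non-random:
\[
  \operatorname{Cov}(\hat{\mu})
  = \Bigl(\sum_{i} W_i\Bigr)^{-1}
    \Bigl[\sum_{i} W_i (\Xi + S_i) W_i\Bigr]
    \Bigl(\sum_{i} W_i\Bigr)^{-1}.
\]
% With W_i = (\Xi + S_i)^{-1} this reduces to (\sum_i W_i)^{-1}.
```

In practice \Xi is unknown and the S_i are themselves estimates, so the weights are random and the simple plug-in covariance is only an approximation.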

One of the motivating examples for this study was the Key Comparison of accelerometers (CCAUV.V-K1, von Martens et al., 2002), which was organized to compare measurements of sinusoidal linear accelerometers over the range of frequencies from 40 Hz to 5 kHz. (Each accelerometer measured charge sensitivity at the specified frequencies and at different acceleration amplitudes.) Two types of accelerometers (single-ended and back-to-back designs) were employed at each of twelve national metrology institutes (NMIs), including NIST, with the Physikalisch-Technische Bundesanstalt (PTB), Germany, serving as the coordinating laboratory. Each participating NMI reported its own laboratory means and the within-lab sample covariance matrices (Type A uncertainties). Our method determines the KCRV for charge sensitivity as a function of frequency, whereas in the original study the KCRV was found separately for each type of accelerometer and for each specified frequency.

Another example is related to the study of Pyroceram 9606, a glass-ceramic material especially suited for high-temperature applications. This material is used for performance evaluation of instruments measuring thermal properties such as thermal conductivity, thermal diffusivity, and specific heat (heat capacity). All these characteristics are temperature dependent, so the reference value must be a function of temperature. Twenty-eight thermal conductivity experiments in different countries have been performed on this material, and a consensus value was needed. The data from different laboratories were of widely differing quality.

Under a random effects model, the maximum likelihood estimator maximizes a likelihood function that can have many local extrema, and iterative algorithms may converge to any one of them. Methods of computational algebra can be used to find all (complex) roots of the likelihood equations. Moreover, for a moderate number of participants, the variance estimator obtained from the inverse of the Fisher information, as well as some other commonly used statistics, underestimates the true variance.
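To make the local-extrema issue concrete, the following is a minimal sketch for the scalar case only. It scans the profile log-likelihood over a grid of between-lab variances instead of trusting a single iterative run, and it also returns the plug-in variance from the inverse Fisher information. The grid scan, the function names, and the data are our assumptions for illustration; this is not the computational-algebra approach mentioned above.

```python
import math


def weighted_mean(x, s2, tau2):
    """Weighted mean with weights 1 / (tau2 + s2[i])."""
    w = [1.0 / (tau2 + v) for v in s2]
    return sum(wi * xi for wi, xi in zip(w, x)) / sum(w)


def profile_loglik(tau2, x, s2):
    """Gaussian log-likelihood (up to a constant), profiled over the mean."""
    mu = weighted_mean(x, s2, tau2)
    return -0.5 * sum(
        math.log(tau2 + v) + (xi - mu) ** 2 / (tau2 + v)
        for xi, v in zip(x, s2)
    )


def ml_by_grid(x, s2, tau2_max, n_grid=2001):
    """Scan tau2 over [0, tau2_max]; report the best grid point and
    every interior local maximum found on the grid."""
    grid = [tau2_max * k / (n_grid - 1) for k in range(n_grid)]
    values = [profile_loglik(t, x, s2) for t in grid]
    best = max(range(n_grid), key=lambda k: values[k])
    local_maxima = [
        grid[k]
        for k in range(1, n_grid - 1)
        if values[k] > values[k - 1] and values[k] >= values[k + 1]
    ]
    tau2_hat = grid[best]
    mu_hat = weighted_mean(x, s2, tau2_hat)
    # Plug-in variance of the mean from the inverse Fisher information;
    # the text above notes that this tends to be too small.
    var_plugin = 1.0 / sum(1.0 / (tau2_hat + v) for v in s2)
    return mu_hat, tau2_hat, var_plugin, local_maxima


# Hypothetical data: four lab means, each with variance 0.1.
x = [1.0, 2.0, 3.0, 4.0]
s2 = [0.1, 0.1, 0.1, 0.1]
mu_hat, tau2_hat, var_plugin, local_maxima = ml_by_grid(
    x, s2, tau2_max=5.0, n_grid=5001
)
```

With equal within-lab variances v the maximizer has the closed form max(0, mean squared deviation minus v), which is 5/4 - 0.1 = 1.15 for these data, so the grid result can be checked by hand. With unequal variances, more than one entry in `local_maxima` would signal the multimodality described above.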

For this reason, alternative, simpler procedures are desirable. They include an analogue of the traditional estimator of the common mean suggested by Graybill and Deal, the sample mean, an extension of the procedure introduced by DerSimonian and Laird (1986), and the algorithm proposed by Mandel and Paule (1970). We demonstrate the relationship of these estimators to the restricted maximum likelihood estimator and to the maximum likelihood estimator. We also put forward an estimator of the covariance matrix of these statistics, similar to the one suggested in the more general setting of linear models by Horn, Horn and Duncan (1975). This method is now commonly used in Standard Reference Materials studies by the Statistical Engineering Division at the National Institute of Standards and Technology.
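The procedures named above can be sketched in their familiar scalar forms. The code below is an illustration under our own assumptions (one measurand, lab means `x`, variances of those means `s2`, hypothetical function names and data); the project itself concerns vector and matrix extensions, which are not reproduced here. The last function is a residual-based variance estimate in the spirit of Horn, Horn and Duncan, written in the form we assume it takes for a weighted mean.

```python
def weighted_mean(x, s2, tau2=0.0):
    """Weights 1/(tau2 + s2[i]); tau2 = 0 gives the Graybill-Deal form."""
    w = [1.0 / (tau2 + v) for v in s2]
    return sum(wi * xi for wi, xi in zip(w, x)) / sum(w)


def dersimonian_laird_tau2(x, s2):
    """Moment estimate of the between-lab variance (needs >= 2 labs)."""
    if len(x) < 2:
        raise ValueError("need at least two laboratories")
    w = [1.0 / v for v in s2]
    sw = sum(w)
    mu_gd = sum(wi * xi for wi, xi in zip(w, x)) / sw
    q = sum(wi * (xi - mu_gd) ** 2 for wi, xi in zip(w, x))
    denom = sw - sum(wi * wi for wi in w) / sw
    return max(0.0, (q - (len(x) - 1)) / denom)


def mandel_paule_tau2(x, s2, rel_tol=1e-12, max_iter=200):
    """Solve sum (x_i - mu)^2 / (tau2 + s2_i) = p - 1 for tau2 >= 0
    by bisection (the left side decreases as tau2 grows)."""
    p = len(x)

    def excess(tau2):
        mu = weighted_mean(x, s2, tau2)
        return sum(
            (xi - mu) ** 2 / (tau2 + v) for xi, v in zip(x, s2)
        ) - (p - 1)

    if excess(0.0) <= 0.0:
        return 0.0  # no evidence of between-lab variability
    lo, hi = 0.0, 1.0
    while excess(hi) > 0.0:  # expand until the root is bracketed
        hi *= 2.0
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo <= rel_tol * max(1.0, hi):
            break
    return 0.5 * (lo + hi)


def residual_variance_of_mean(x, s2, tau2):
    """Assumed Horn-Horn-Duncan-type estimate for a weighted mean:
    sum of omega_i^2 * residual_i^2 / (1 - omega_i), normalized weights."""
    w = [1.0 / (tau2 + v) for v in s2]
    sw = sum(w)
    omega = [wi / sw for wi in w]
    mu = sum(oi * xi for oi, xi in zip(omega, x))
    return sum(
        oi ** 2 * (xi - mu) ** 2 / (1.0 - oi) for oi, xi in zip(omega, x)
    )


# Hypothetical data: four lab means, each with variance 0.1.
x = [1.0, 2.0, 3.0, 4.0]
s2 = [0.1, 0.1, 0.1, 0.1]
tau2_dl = dersimonian_laird_tau2(x, s2)
tau2_mp = mandel_paule_tau2(x, s2)
mu_mp = weighted_mean(x, s2, tau2_mp)
var_mp = residual_variance_of_mean(x, s2, tau2_mp)
```

When all within-lab variances are equal, as in this example, the two between-lab variance estimates coincide (the sample variance of the lab means minus the common within-lab variance) and the residual-based variance reduces to the usual unbiased variance of a sample mean. The methods differ once the reported uncertainties are unequal.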

The results of a Monte Carlo simulation study confirmed that the pivotal ratio is well approximated by an F-distribution. To illustrate the techniques, we applied them to the accelerometer key comparison study (CCAUV.V-K1) mentioned above and to an interlaboratory study of thermal diffusivity and conductivity. As in the simulation study, the Mandel-Paule procedure and the DerSimonian-Laird method give the same answer, which practically coincides with the PTB solution given by von Martens et al. (2002).

The suggested (matrix-weighted) vector means are useful for vector KCRV estimation. The method of assessing the uncertainty of these estimates provides joint confidence ellipsoids for the parameters involved.
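For orientation, a joint confidence ellipsoid of this kind generally has the shape sketched below. The symbols are our assumptions: \hat{\mu} is a matrix-weighted vector mean, \hat{V} is an estimate of its covariance matrix, and c_\alpha is a critical value. The scaling constant and degrees of freedom of the F-approximation are not stated in this description, so they are left unspecified.

```latex
% Generic form of a joint confidence ellipsoid for the vector KCRV \mu.
% c_\alpha would be taken from the approximating F-distribution; its exact
% scaling and degrees of freedom are not given in the text above.
\[
  \mathcal{E}_\alpha
  = \bigl\{ \mu :
      (\hat{\mu} - \mu)^{\top} \hat{V}^{-1} (\hat{\mu} - \mu) \le c_\alpha
    \bigr\}.
\]
```

A conservative ellipsoid is one whose actual coverage probability is at least the nominal level 1 - \alpha.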

Major Accomplishments:

A. L. Rukhin, Estimating Common Vector Mean in Interlaboratory Studies, Journal of Multivariate Analysis, 98, 2007, 435-454.

A. L. Rukhin and N. Sedransk, Statistics in Metrology: International Key Comparisons and Interlaboratory Studies, Journal of Data Science, 7, 2007, 393-412.

A. L. Rukhin, Conservative Confidence Intervals Based on Weighted Means Statistics, Statistics & Probability Letters, 77, 2007, 853-861.

D. Evans, A. Hornikova, S. Leigh, A. L. Rukhin and W. Strawderman, Report on Acceleration Comparison, SIM.AUV.V-K1, Metrologia, 46, 2009, Technical Supplement 09002.

A. L. Rukhin, Weighted Means Statistics in Interlaboratory Studies, Metrologia, 46, 2009, 323-331.

A. L. Rukhin, Conservative Confidence Ellipsoids for Linear Model Parameters, Mathematical Methods of Statistics, 18, 2009, 375-396.

A. L. Rukhin, Confidence Regions for Parameters in Linear Models, Statistica Sinica, 20, 2010, 787-806.

A. L. Rukhin, Maximum Likelihood and Restricted Maximum Likelihood Solutions in Multiple-Method Studies, Journal of Research of the National Institute of Standards and Technology, to appear 2011.

Lead Organizational Unit:

ITL

Staff:

Contact

Andrew Rukhin
301-975-2951
andrew.rukhin@nist.gov
100 Bureau Drive, M/S 8980
Gaithersburg, MD 20899-8980