Measurement systems such as those based on microarrays routinely produce thousands of parallel data sets. This project seeks to develop an approach to experimental characterization of the technical variation associated with such measurement systems. An approach is proposed that is based on observation of the technical variation through analysis of corresponding parallel data sets. Empirical Bayes methods are applied to summarization of these parallel data sets.
An archetype among biological experiments is gene-by-gene determination of the expression difference between two groups of biological units. For each gene, the experiment yields a data set consisting of one or more expression values for each unit. Gene to gene, these data sets can be called parallel although the expression differences may be unrelated. A gene-expression microarray gives expression values for thousands of genes, in other words, a response with thousands of dimensions. Thus, measurement systems based on gene-expression microarrays can give thousands of parallel data sets, one for each dimension. Other measurement systems can give even more parallel data sets. Such systems are the topic of this project.
Measurement systems themselves, rather than the subject matter to which they are applied, can be the object of experimentation. Experiments are performed on measurement systems to assess the technical variation, that is, the variation associated with repeated measurements on the same material. Steps to reduce the technical variation may then follow. This project presents a class of such experiments suitable for large-scale parallel measurement systems.
Measurement systems produce measurements according to the levels of the measurand in the materials measured. It follows that test materials define a measurement system experiment. The general experimental concept is interpretation of test material measurements in terms of what is known a priori about the measurand levels in the test materials. The class of experiments considered here involves biological units with differences that reflect only the unit-to-unit biological variation that is unavoidable despite the similar handling of the units. From each biological unit, two test materials with differences in measurand levels are extracted. These differences are intended to be in a useful range for a large proportion of the system dimensions. From each pair of test materials, additional test materials are obtained by mixing. What is known a priori is that test materials from a particular biological unit have measurands that are related because of the test material mixing, and that analogous test materials from different units have similar measurands because of the similar unit handling.
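The relation among measurands that mixing imposes can be illustrated with a minimal sketch. It assumes that, for one dimension, the measurand level in a mixture is the fraction-weighted average of the levels in the two extracted test materials. The function name, the levels, and the mixing fractions are hypothetical and chosen only for illustration.

```python
def mixture_levels(level_a, level_b, fractions):
    """Measurand level in each mixture, assuming linear mixing.

    fractions[i] is the (hypothetical) proportion of material A in
    mixture i; the remainder is material B.
    """
    return [p * level_a + (1.0 - p) * level_b for p in fractions]


# Hypothetical levels of one measurand in the two materials from one unit.
fractions = [1.0, 0.75, 0.25, 0.0]
levels = mixture_levels(100.0, 20.0, fractions)
print(levels)
```

Under this assumption, the four levels for a unit are determined by only two unknowns, which is what makes the measurements on the mixtures interpretable without knowing the levels themselves.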
Technical variation, which is the target of a measurement system experiment, has structure that is linked to changes in measurement conditions. These conditions can be held constant (repeatability conditions), or certain conditions can be allowed to change (reproducibility conditions). Measurements made under constant conditions constitute a batch. A set of measurements may be made in several batches that are delineated by changes in specific measurement conditions. Technical variation observed within a batch is usually considered random variation. Variations observed among batches beyond what is observed under constant conditions are batch effects. The most common instances of batch effects are laboratory effects and operator effects. Characterization of a batch effect through a measurement system experiment requires that the batch delineation be suitably imposed on the measurement of the test materials.
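The distinction between within-batch variation and batch effects can be made concrete with a small sketch. It assumes a balanced one-way layout for a single dimension and a single test material, with rows as batches and columns as repeated measurements, and it uses the usual method-of-moments estimates from one-way analysis of variance. The function name and the data are hypothetical.

```python
import numpy as np


def one_way_components(y):
    """Within-batch variance and batch variance component (balanced layout).

    y has shape (n_batches, n_repeats). Returns (ms_within, var_batch),
    where var_batch is the method-of-moments estimate truncated at zero.
    """
    y = np.asarray(y, dtype=float)
    n_batches, n_repeats = y.shape
    batch_means = y.mean(axis=1)
    ms_between = n_repeats * ((batch_means - y.mean()) ** 2).sum() / (n_batches - 1)
    ms_within = ((y - batch_means[:, None]) ** 2).sum() / (n_batches * (n_repeats - 1))
    var_batch = max(0.0, (ms_between - ms_within) / n_repeats)
    return ms_within, var_batch


# Hypothetical measurements: two batches, three repeats each.
ms_within, var_batch = one_way_components([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
print(ms_within, var_batch)
```

Here `ms_within` estimates the variance under repeatability conditions, and `var_batch` estimates the additional variance that appears when the batch-defining condition changes.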
For each dimension, the measurement system experiments considered here produce a data set consisting of univariate measurements made on the test materials and made under different measurement conditions. Modeling of such a data set leads to characteristics of the technical variation for a particular dimension. The modeling includes testing the hypothesis that a particular change in measurement conditions produces no batch effect. The modeling also includes comparison of the sizes of the batch effects with the size of the variation in the biological units.
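The hypothesis test of no batch effect, carried out in parallel over dimensions, can be sketched as a comparison of nested linear models that share one design matrix. This is a simplified illustration, not the modeling used in the project: the batch effect is treated as a single fixed shift, the layout (four test materials, two batches, three repeats) is hypothetical, and the data are simulated.

```python
import numpy as np


def residual_ss(X, Y):
    """Residual sum of squares per column of Y, and the rank of X."""
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ coef
    return (resid ** 2).sum(axis=0), np.linalg.matrix_rank(X)


def parallel_f(X_full, X_reduced, Y):
    """F statistic for the extra terms in X_full, one per column of Y."""
    rss_full, p_full = residual_ss(X_full, Y)
    rss_red, p_red = residual_ss(X_reduced, Y)
    n_obs = Y.shape[0]
    return ((rss_red - rss_full) / (p_full - p_red)) / (rss_full / (n_obs - p_full))


# Hypothetical layout: 4 materials x 2 batches x 3 repeats = 24 measurements.
material = np.repeat(np.arange(4), 6)
batch = np.tile(np.repeat(np.arange(2), 3), 4)
M = (material[:, None] == np.arange(4)).astype(float)  # one mean per material
b = (batch == 1).astype(float)[:, None]                # shift for batch 1
X_reduced, X_full = M, np.hstack([M, b])

# Simulated data for 50 dimensions; only dimension 0 has a batch effect.
rng = np.random.default_rng(0)
Y = rng.normal(size=(24, 50))
Y[:, 0] += 10.0 * b[:, 0]
F = parallel_f(X_full, X_reduced, Y)
print(F[:3])
```

Because the regressors are common to all dimensions, the least-squares fit is a single matrix operation and yields one F statistic per dimension.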
Characterization of the technical variation for large-scale measurement systems requires summarization over the dimensions, that is, summarization of the parallel modeling of the parallel data sets. Bradley Efron's book Large-Scale Inference presents an empirical Bayes approach to such summarization. This approach is applied to large-scale measurement system experiments in this paper. Efron's book focuses on the comparison of two groups of biological units. The parallel modeling in measurement system experiments requires some adaptations.
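The flavor of the empirical Bayes summarization can be conveyed by a rough sketch of a local false discovery rate computed with an empirical null. The sketch makes several simplifying assumptions that are not taken from Efron's book or from this project: the null mean and standard deviation are estimated from the median and interquartile range of the z-values, the mixture density is estimated with a Gaussian kernel, and the null proportion `pi0` is set to its upper bound of 1. All names and the simulated z-values are hypothetical.

```python
import numpy as np


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) evaluated at x."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


def local_fdr(z, pi0=1.0):
    """Rough local fdr: pi0 * f0(z) / f(z), truncated at 1."""
    z = np.asarray(z, dtype=float)
    q25, q50, q75 = np.percentile(z, [25, 50, 75])
    delta0 = q50                     # empirical null center (assumed: median)
    sigma0 = (q75 - q25) / 1.349     # empirical null spread (assumed: IQR-based)
    # Kernel estimate of the mixture density f at each observed z.
    h = 0.9 * min(z.std(ddof=1), sigma0) * z.size ** (-0.2)
    f = normal_pdf(z[:, None], z[None, :], h).mean(axis=1)
    f0 = normal_pdf(z, delta0, sigma0)
    return np.minimum(1.0, pi0 * f0 / f), delta0, sigma0


# Simulated z-values: 1900 null cases wider than N(0, 1), 100 non-null cases.
rng = np.random.default_rng(1)
is_nonnull = np.arange(2000) >= 1900
z = np.where(is_nonnull, rng.normal(5.0, 1.0, 2000), rng.normal(0.0, 1.2, 2000))
fdr, delta0, sigma0 = local_fdr(z)
print(delta0, sigma0, fdr[is_nonnull].mean())
```

The point of the empirical null is visible in `sigma0`: the bulk of the z-values is wider than the theoretical N(0, 1), and the null density is fitted to that bulk rather than assumed.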
Submitted the following paper for publication and presented it at the 2010 Joint Statistical Meetings:
Massively-parallel linear modeling of gene expression data
Experience with actual gene-expression measurements such as the measurements examined in this paper provides support for future choices of data analysis approach. For gene expression measurements, a data analysis approach involves choices of methods for preprocessing and for constructing gene lists. The effect of preprocessing choice on the gene list obtained is the particular focus of this paper. We model the measurements gene by gene using a set of regressors common to all the genes. Use of common regressors, which requires proper choice of preprocessing, leads to parallel F tests. This paper proposes alternative preprocessing methods and compares them through application of false discovery rate methods to the parallel F tests. The data analysis methods incorporate preprocessing closely related to RMA (robust multi-array analysis) and local false discovery rate calculations involving an empirical null distribution. For each of six animals, the data consist of triplicate measurements on four mixtures of liver and kidney mRNA. Known mixtures allow F tests for lack of response linearity with mRNA concentration. Although the alternative preprocessing algorithms differ, the differences in results are not sizeable. The performance of the false discovery rate method is analogous to its performance in the two-group case.
Status: Awaiting decision of journal
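The F tests for lack of response linearity mentioned in the abstract above can be sketched as a classical lack-of-fit test applied gene by gene. The sketch assumes triplicate measurements at four known mixing fractions for one animal; the fractions, the simulated responses, and the function name are hypothetical, and the preprocessing that the paper studies is not represented.

```python
import numpy as np


def lack_of_fit_f(x, Y):
    """Lack-of-fit F statistic for a straight line in x, per column of Y.

    x holds the known mixing fraction of each measurement (with repeats);
    Y has one row per measurement and one column per gene.
    """
    x = np.asarray(x, dtype=float)
    levels, idx = np.unique(x, return_inverse=True)
    G = (idx[:, None] == np.arange(levels.size)).astype(float)  # level indicators
    # Pure-error sum of squares: deviations from the mean at each fraction.
    means = (G.T @ Y) / G.sum(axis=0)[:, None]
    ss_pe = ((Y - G @ means) ** 2).sum(axis=0)
    # Residual sum of squares from the straight-line fit.
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    ss_lin = ((Y - X @ coef) ** 2).sum(axis=0)
    df_lof = levels.size - 2
    df_pe = x.size - levels.size
    return ((ss_lin - ss_pe) / df_lof) / (ss_pe / df_pe)


# Hypothetical design: four fractions, three repeats each.
x = np.repeat([0.0, 0.25, 0.75, 1.0], 3)
rng = np.random.default_rng(2)
noise = rng.normal(scale=0.1, size=(12, 2))
linear = 1.0 + 2.0 * x
Y = np.column_stack([linear, linear + 8.0 * x * (1.0 - x)]) + noise
F_lof = lack_of_fit_f(x, Y)
print(F_lof)
```

In this simulated example, the first gene responds linearly and the second has added curvature, so the second F statistic should be much the larger of the two.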
Submitted the following paper for publication:
Technical vis-à-vis biological variation in gene expression measurements
Walter Liggett and other members of the MAQC (Microarray Quality Control) Titration Working Group
Technical variation, especially technical variation with batch effects, that is, with correlation structure associated with measurement in batches, can result in erroneous conclusions. Experimental characterization of the technical variation must include assessment of batch effects and, moreover, should suggest approaches to study design and to removal of sources of technical variation. However, approaches to reducing technical variation are only worthwhile relative to other limitations on study sensitivity such as biological variation. This paper demonstrates an experimental approach that leads to effective assessment of technical variation in gene expression measurements. The demonstration involves measurement of liver RNA, kidney RNA, and mixtures thereof from six animals. In the demonstration, batch effects are the result of changes in the scanner and in the fluidics machine used in the hybridization. The approach also applies to changes in operator or laboratory and to other causes of lack of reproducibility. The assessment includes a comparison with biological variation.
Status: Rejected by three journals.
The first journal said, "A large part of the study data rests on analyzing data from mixtures of two RNAs. This is a highly unusual, even perhaps bizarre, study design that is poorly justified."
The second journal said, "My main objection is that the authors' take-home message is that every laboratory should undertake a similar approach to assess the relative contributions of batch effects and biological signal in the data from their machines. Such a process would cost tens of thousands of dollars to create the data and then require repetition of the analysis described in the manuscript. While the analysis is described well, there is no way for others to execute it – no software is described or made available. I don't agree with the assertion that this manuscript is of general interest because it describes a extremely expensive, time-consuming and statistics-intensive pre-study method to evaluate laboratory batch effects."
The third journal said, "Your paper appears to be the application of existing, standard statistical methodologies to a relevant practical problem, rather than the development of new statistical methods."
These comments suggest that this paper be rewritten for a statistics audience with emphasis on what is new in the design and analysis.