An international standardization programme towards the application of gene expression profiling in routine leukaemia diagnostics: the Microarray Innovations in LEukemia study prephase

Gene expression profiling has the potential to enhance current methods for the diagnosis of haematological malignancies. Here, we present data on 204 analyses from an international standardization programme that was conducted in 11 laboratories as a prephase to the Microarray Innovations in LEukemia (MILE) study. Each laboratory prepared two cell line samples, together with three replicate leukaemia patient lysates in two distinct stages: (i) a 5-d course of protocol training, and (ii) independent proficiency testing. Unsupervised, supervised, and r² correlation analyses demonstrated that microarray analysis can be performed with remarkably high intra-laboratory reproducibility and with comparable quality and reliability.

Several microarray studies have already identified differentially expressed genes associated with distinct clinically and therapeutically relevant classes of leukaemias (Golub et al, 1999; Armstrong et al, 2002; Schoch et al, 2002; Yeoh et al, 2002). Because microarray assays analyse the expression of multiple genes in parallel, they appear to be a robust test method for diagnostic use (Kohlmann et al, 2003; Haferlach et al, 2005). To date, however, studies aimed at subclassifying leukaemias through gene expression profiling have mainly been monocentric, have included only a limited number of patients, or have relied mostly on RNA specimens analysed retrospectively from archived samples.
Here we report data from an international study group formed around the European Leukemia Network (ELN, http://www.leukemia-net.org) and comprising 11 laboratories: seven from the ELN, three in the United States, and one in Singapore. The Microarray Innovations in LEukemia (MILE) study programme will prospectively assess the clinical accuracy of gene expression profiles of 16 acute and chronic leukaemia subclasses, of myelodysplastic syndromes (MDS), and of a "none of the target classes" control group, as compared to current routine diagnostic workup in over 3000 patients. As a first step, representing a major effort to standardize the microarray analysis workflow in the participating centres, a prephase of the MILE study was performed. This report presents the results of that prephase, i.e., a standardization programme of the microarray procedure in the participating laboratories, designed to ensure robust gene expression profiling test performance before patient samples were analysed.

Materials and methods
There were two stages in the MILE prephase study: protocol training and proficiency testing. As part of the initial protocol training, each participating laboratory was provided with identical equipment, including reagent kits, enzymes, spectrophotometer, and heat block instruments, and eight microarray experiments were performed at each centre under the supervision of an on-site trainer. The eight samples analysed during the training course comprised MCF-7 (breast adenocarcinoma) and HepG2 (liver carcinoma) cell line total RNA (Ambion, Austin, TX, USA), each with 1.0 µg and 5.0 µg input of total RNA, and four leukaemia patient sample lysates prepared from mononuclear cells obtained after Ficoll density purification. Patient lysates comprised cells of one chronic myeloid leukaemia (CML), one chronic lymphocytic leukaemia (CLL), and two replicate lysates of an AML patient sample (containing a translocation t(8;21), French-American-British (FAB) type M2). The total RNA from the patient lysates was extracted at each centre as part of the training programme (RNeasy kit, Qiagen, Hilden, Germany), making these samples a test of the entire microarray workflow after sample acquisition. Subsequently, after the training phase and for operator proficiency testing, each laboratory independently performed four microarray experiments each for the MCF-7 and HepG2 cell lines, with inputs of 1.5 µg, 3.0 µg, 5.0 µg, and 8.0 µg total RNA. In total, 204 microarray profiles were included in the analysis (for details see Appendix SI and SII). The three anonymous replicate patient lysates were provided by the Laboratory for Leukaemia Diagnostics in Munich, Germany. All patients gave their informed consent for participation after having been advised of the purpose and investigational nature of the study. 
The study design adhered to the tenets of the Declaration of Helsinki and was approved by the ethics committees of the participating institutions before its initiation. Details on the microarray analysis workflow, image analysis, quality reports, as well as statistical methods are given in Appendix SI.

Intra-laboratory reproducibility of gene expression analyses
As shown in an unsupervised Principal Component Analysis (PCA), the individual gene expression profiles grouped closely together with their corresponding biological sample types based on the underlying similarity, and not according to the centre where the microarray experiments were performed (Fig 1). The arrows in Fig 1 indicate that the four leukaemia sample preparations from Centre 9 (N17-20), as well as one HepG2 preparation from Centre 3 (N18), were outliers in the PCA. Large differences in gene expression profiles were also observed between manufacturing batches of MCF-7 total RNA, but overall, a high level of reproducibility between laboratories was seen when a standardized protocol for microarray analysis was followed by trained operators. According to the unsupervised PCA plots, replicated gene expression profiles of the HepG2 cell line were more biologically homogeneous and less influenced by manufacturing batch than those of the MCF-7 cell line. Therefore, replicated profiles of the HepG2 cell line were chosen to further investigate the intra- and inter-laboratory correlations. All centres generated highly reproducible gene expression profiles for this cell line, as shown in the box plot analysis of r² values from all pairwise comparisons within each centre for the sample type HepG2 (Fig 2A), where mean r² values range from 0.973 to 0.988. The slightly higher variability at Centre 11 might be explained by a higher number of operators and replicate analyses than in other centres. Figure 2B shows the intra-site repeatability of microarray data based on quantitative signal values and qualitative detection calls. The number of generally detected genes for each sample type at each centre varied from 24 627 to 27 075 for HepG2 and from 25 841 to 28 953 for MCF-7. 
The coefficient of variation (CV) of the quantitative signal values between the intra-site replicates was calculated using the generally detected subset of genes for each sample type at each centre (Fig 2B).
Fig 2. Analysis of intra- and inter-laboratory reproducibility. (A) Box-and-whisker plots display, for each laboratory, the intra-laboratory squared correlation coefficients (r²) of all probe sets represented on the HG-U133 Plus 2.0 microarray for the HepG2 cell line sample. The signal used is DS. Each laboratory analysed six HepG2 samples using various amounts of starting total RNA: 1.0 µg, 1.5 µg, 3.0 µg, 5.0 µg (duplicate), and 8.0 µg. All possible different pairwise comparisons were performed (Count). Mean r² values (black arrow) and standard deviation (SD) values are given for each series of comparisons for each laboratory. Outliers are represented as red boxes. Note: more comparisons were performed in Centres 9 and 11 because multiple operators contributed microarray data (Appendix SII).
(B) Repeatability of expression signal within laboratories. The CV of the expression signal values between centre replicates of the same sample type was calculated for all generally detected genes (left y-axis). The distributions of replicate CVs are presented in a series of eleven box-and-whisker plots for each of the two sample types, HepG2 (left) and MCF-7 (right), one plot per laboratory. The median (line), the interquartile range, and the 10th and 90th percentile values are indicated in each plot. Only genes that were generally detected were included in the box plots and CV calculations. A gene was defined as generally detected if it was called present in at least one third of the samples, e.g., at least two out of the six replicates per centre. The number of such genes varied by sample and laboratory and is shown as the line plot with the y-axis on the right.
(C) Box-and-whisker plots display the inter-laboratory squared correlation coefficients (r²) of all probe sets represented on the HG-U133 Plus 2.0 microarray for the HepG2 cell line sample. The signal used is DS. Each centre analysed six HepG2 samples using various amounts of starting total RNA: 1.0 µg, 1.5 µg, 3.0 µg, 5.0 µg (duplicate), and 8.0 µg. Here, microarray data from Centre 3 are compared with those from all other laboratories. Each inter-laboratory analysis with different pairwise comparisons is represented by a single box plot (Count). Mean r² values (black arrow) and standard deviation (SD) values are given for each series of comparisons. Outliers are represented as red boxes. Note: more comparisons were performed in Centres 9 and 11 because multiple operators contributed microarray data (Appendix SII).
(D) Scatter plot analysis of inter-laboratory reproducibility. The graph shows 10 distinct scatter plot analyses, each displaying a comparison between Centre 3 and one of the other laboratories for the 5.0 µg HepG2 sample run at the stage of proficiency testing. The r² value calculation is based on DS intensity signals from all probe sets on the HG-U133 Plus 2.0 microarray.
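The replicate CV calculation described for Fig 2B can be illustrated with a minimal sketch. It is not the study's analysis code: the gene names, signal values, and detection calls are invented toy data, the "P" label for a present call is an assumed encoding, and the use of the sample standard deviation and a percentage scale are assumptions, since the text does not specify them. Only the "present in at least one third of the replicates" rule is taken from the figure legend.

```python
from statistics import mean, stdev


def is_generally_detected(calls):
    """Return True if at least one third of the replicates are called present.

    Integer arithmetic avoids floating-point rounding at the threshold
    (e.g. exactly two present calls out of six replicates).
    The "P" label for a present call is an assumed encoding.
    """
    present = sum(1 for call in calls if call == "P")
    return 3 * present >= len(calls)


def replicate_cv(signals):
    """Coefficient of variation in percent: sample SD divided by the mean.

    Using the sample SD (n - 1 denominator) is an assumption of this sketch.
    """
    return 100.0 * stdev(signals) / mean(signals)


def cv_of_detected_genes(signal_table, call_table):
    """Map each generally detected gene to the CV of its replicate signals."""
    return {
        gene: replicate_cv(signals)
        for gene, signals in signal_table.items()
        if is_generally_detected(call_table[gene])
    }


# Hypothetical toy data: one centre, one sample type, six replicates per gene.
signals = {
    "geneA": [100, 102, 98, 101, 99, 100],
    "geneB": [10, 50, 5, 80, 20, 15],
    "geneC": [200, 210, 190, 205, 195, 200],
}
calls = {
    "geneA": ["P", "P", "P", "P", "P", "P"],
    "geneB": ["A", "A", "A", "P", "A", "A"],  # 1 of 6 present: excluded
    "geneC": ["P", "P", "A", "A", "A", "A"],  # 2 of 6 present: included
}

cvs = cv_of_detected_genes(signals, calls)
for gene, cv in sorted(cvs.items()):
    print(f"{gene}: CV = {cv:.2f}%")
```

In this toy example, geneB is excluded because it is called present in only one of six replicates, so its noisy signal does not inflate the CV distribution.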

Short Report © 2008 The Authors
Inter-laboratory reproducibility of gene expression analyses
As an example of inter-laboratory reproducibility of gene expression analyses, correlations between Centre 3 and the other ten laboratories are given (Fig 2C and D). The degree of correlation differed only slightly from the intra-laboratory reproducibility (Fig 2C). The minimum and maximum mean r² values were 0.959 and 0.985, respectively. This again demonstrated a high inter-laboratory correlation of HepG2 gene expression profiles and confirms the outstanding performance of microarray analysis in the 11 centres. This high inter-laboratory consistency can also be shown in pairwise scatter plot analyses. The 5.0 µg HepG2 replicate analysis between Centre 3 and the other laboratories is shown as an example (Fig 2D). A very tight distribution of gene expression data can be observed along the diagonal line for every paired HepG2 sample. Additional analyses of inter-site correlations for HepG2 subsets across all laboratories, along with hierarchical cluster and principal component analyses, are given in Appendix SI. Furthermore, the online section also contains an analysis of the relative contribution of different sources of both technical and biological variability in gene expression measurements.
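The squared correlation coefficients used for the intra- and inter-laboratory comparisons can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the profiles are synthetic, r² is taken as the squared Pearson correlation on the signal values as given (the text does not say whether signals were transformed), and the inter-laboratory case is assumed to pair every replicate of the reference centre with every replicate of the other centre. The function names are hypothetical.

```python
from itertools import combinations, product
from statistics import mean, stdev


def r_squared(x, y):
    """Squared Pearson correlation between two equally long signal vectors."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


def intra_lab_r2(profiles):
    """r² for every distinct pair of replicate profiles within one centre."""
    return [r_squared(a, b) for a, b in combinations(profiles, 2)]


def inter_lab_r2(reference_profiles, other_profiles):
    """r² for every reference/other pair (assumed pairing scheme)."""
    return [r_squared(a, b) for a, b in product(reference_profiles, other_profiles)]


def toy_profiles(offset, n_replicates=6, n_probe_sets=50):
    """Synthetic, deterministic replicate profiles; values are not real data."""
    return [
        [float((7 * g + 3 * k + offset) % 11 + g) for g in range(n_probe_sets)]
        for k in range(n_replicates)
    ]


reference_centre = toy_profiles(offset=0)  # stands in for Centre 3
other_centre = toy_profiles(offset=5)      # stands in for another laboratory

intra = intra_lab_r2(reference_centre)
inter = inter_lab_r2(reference_centre, other_centre)

print(f"intra: count={len(intra)} mean={mean(intra):.3f} SD={stdev(intra):.3f}")
print(f"inter: count={len(inter)} mean={mean(inter):.3f} SD={stdev(inter):.3f}")
```

With six replicate profiles per centre, the intra-laboratory case yields 15 distinct pairs; centres that contributed more replicates yield correspondingly more comparisons, which is consistent with the note on Centres 9 and 11 in the figure legend.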

Discussion
Taken together, this study demonstrated that standardizing experimental protocols for microarray analysis and providing thorough operator training resulted in excellent comparability of data sets generated both within a participating laboratory and across 11 different laboratories on three continents. This extends the observations of a recent cross-platform comparison study from the Toxicogenomics Research Consortium (Bammler et al, 2005). In particular, as also noted by Bammler et al (2005), the standardization of RNA labelling protocols using common procedures was recognized as an important contributor to signal intensity correlations across different laboratories. Our results are further consistent with the intra-platform precision demonstrated by three different centres in the recent MicroArray Quality Control (MAQC) Consortium data (Shi et al, 2006).
In conclusion, this standardization effort was the prerequisite foundation for the first phase of the MILE study, in which 1889 patients have, thus far, been analysed by whole genome expression microarrays (Haferlach et al, 2006). The protocol devised for sample preparation takes only one working day from cDNA synthesis to cocktail hybridization and is easily applicable in a daily routine setting. Standardizing gene expression profiling in this way has the potential to offer identical, objective diagnostic results in any trained laboratory throughout the world. Thus, gene expression profiling by microarrays is moving substantially closer to routine application for the diagnosis of leukaemias in clinical practice.

Funding
This study is part of the MILE Study (Microarray Innovations In LEukemia) programme, an ongoing collaborative effort headed by the European Leukaemia Network (ELN) and sponsored by Roche Molecular Systems, Inc., addressing gene expression signatures in acute and chronic leukaemias. This work is further partly supported by AIRC (Associazione Italiana per la Ricerca sul Cancro), Milan, Ministero dell'Università e della Ricerca, Fondo per gli Investimenti della Ricerca di Base (FIRB) and COFIN, Rome, Italy.