See this page online at: http://www.bioscienceworld.ca/MAQCsamplesinfluencinganewgenerationofmicroarraytechnology
Agilent Technologies has developed a new platform that provides expression measurements that more accurately reflect the range of gene activities in biological systems. By reducing feature size and spacing, the footprint of the microarray is reduced while feature count and content remain constant. The reduction in area allows for a more concentrated target sample without increasing RNA input. Higher concentration samples result in higher signals, allowing for accurate detection of transcripts previously below the detection limit. An extended dynamic range scanner prevents saturation of bright features. This system, which also includes new tools for automated scanning and data extraction, increases the range of gene expression measurements to greater than five logs. To assess the resulting performance and data quality, RNA samples from the MicroArray Quality Control (MAQC) project were used to compare the performance of Agilent's new four pack high-density microarrays with that of the previous generation Agilent microarrays. The comparison demonstrated enhanced sensitivity, increased dynamic range and improved reproducibility, and confirmed the utility of the MAQC samples for ongoing assessment of data quality in microarray experiments.
Introduction
Recently the MicroArray Quality Control (MAQC) consortium published the most comprehensive study to date assessing the performance and cross platform comparability of microarray data (MAQC Consortium. Nat. Biotechnol. 24, 1151-1161 (2006)). This study laid a foundation and framework for assessment of microarray performance, whether for proficiency testing, or for assessing changes to microarray platforms.
The commercially available samples and the metrics used in the study, as well as the large reference dataset generated by the study, allow for the relatively straightforward assessment of microarray performance.
As demonstrated in the MAQC study, microarray performance is generally a balance among sensitivity, accuracy and reproducibility. Platform design choices that emphasize one aspect tend to do so at the expense of the others. Agilent Technologies has designed a microarray platform to achieve an appropriate balance among these different performance attributes, with a strong emphasis on accuracy and sensitivity of detecting differential expression across a wide dynamic range. The results of these design choices can be seen in the data presented in the MAQC study.
Through technology enhancements that allow four individual whole genome microarrays to be printed on a single glass slide, both the sensitivity and the reproducibility of the platform have been improved without sacrificing the accuracy of differential expression calls. Because the surface area of each array is reduced, an increased concentration of sample can be hybridized without increasing sample input requirements. Signals that might otherwise have been lost due to scanner saturation are captured by scanning at high and low gain settings using the new eXtended Dynamic Range (XDR) feature of the Agilent microarray scanner. The feature intensity data from the two scans are combined automatically by the latest version of the Agilent Feature Extraction software (v9.1).
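The idea of merging a high-gain and a low-gain scan can be sketched roughly as follows. This is an illustration only and not the algorithm used by Agilent Feature Extraction: the function name `combine_xdr`, the saturation threshold and the tenfold gain ratio are all assumptions chosen for the example.

```python
def combine_xdr(high_gain, low_gain, saturation=65000.0, gain_ratio=10.0):
    """Merge two scans of the same features into one extended-range signal.

    high_gain, low_gain: per-feature intensities from the two scans.
    saturation: intensity at or above which the high-gain reading is
        treated as saturated (assumed value, not from the article).
    gain_ratio: assumed factor relating the two gain settings.
    """
    if len(high_gain) != len(low_gain):
        raise ValueError("both scans must cover the same features")
    combined = []
    for hi, lo in zip(high_gain, low_gain):
        if hi >= saturation:
            # High-gain reading is clipped: rescale the low-gain reading.
            combined.append(lo * gain_ratio)
        else:
            # High-gain reading is usable and has better signal to noise.
            combined.append(hi)
    return combined


# Hypothetical intensities for three features; the last one saturates
# in the high-gain scan.
high_scan = [120.0, 4800.0, 65535.0]
low_scan = [12.0, 480.0, 30000.0]
merged = combine_xdr(high_scan, low_scan)
```

In this sketch dim and mid-range features keep their high-gain values, while the saturated feature is replaced by its rescaled low-gain value, which is how a two-scan approach can extend the upper end of the measurable range.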
Commercially available MAQC samples were used to evaluate the performance of Agilent's new high density multi-pack whole human genome arrays (four pack). Using some of the analysis methods and metrics put forth in that study, we compare the performance of the new generation microarrays to the "legacy" whole genome products (44K).
Sensitivity and Dynamic Range
RNA Spike-In control results for two different format microarrays hybridized with sample B are shown in Figure 2. These Spike-In controls are included in the labeling reaction and span six orders of magnitude in concentration. The graphs shown in Figure 2 demonstrate the four pack arrays (shown in red) have a greatly expanded dynamic range as compared to the 44K arrays. Scanner saturation of the probe representing the highest concentration transcript is eliminated by use of the XDR scanning capabilities and lower concentration transcripts are now detected due to increased sample concentration as well as improvements in hybridization conditions.
The response of biological probes is also improved using the new four pack format. Figure 3 illustrates the detected signals for the same sample hybridized on the two formats. As can be seen in the figure, overall signal correlation is good between the formats, with higher probe intensities for the four pack array. In addition, 5600 probes that were either below the detection limit (three times the measured background noise) or saturated on the 44K array are now detectable on the four pack array. The arrays represented here are the same as those with spike-in data shown in Figure 2, and the four pack array is from the slide shown in Figure 1.
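Counting probes that move into the detectable range can be sketched as below. The detection limit of three times the background noise comes from the text; the function names, the saturation ceilings and the sample intensities are hypothetical and only serve the illustration.

```python
def classify_probe(signal, background_noise, saturation):
    """Label a probe as saturated, below the detection limit, or detected."""
    if signal >= saturation:
        return "saturated"
    if signal < 3.0 * background_noise:
        # Detection limit: three times the measured background noise.
        return "below_detection"
    return "detected"


def count_recovered(old_signals, new_signals, old_noise, new_noise,
                    old_saturation, new_saturation):
    """Count probes not measurable on the old format but detected on the new."""
    recovered = 0
    for old, new in zip(old_signals, new_signals):
        old_state = classify_probe(old, old_noise, old_saturation)
        new_state = classify_probe(new, new_noise, new_saturation)
        if old_state != "detected" and new_state == "detected":
            recovered += 1
    return recovered


# Hypothetical signals for four probes on each format.
signals_44k = [20.0, 500.0, 65535.0, 25.0]
signals_4pack = [90.0, 1500.0, 250000.0, 20.0]
recovered = count_recovered(
    signals_44k, signals_4pack,
    old_noise=10.0, new_noise=10.0,
    old_saturation=65000.0,   # assumed single-scan ceiling
    new_saturation=650000.0,  # assumed ceiling after combining two scans
)
```

Here the first probe rises above the detection limit and the third is no longer saturated, so two probes count as recovered; the fourth stays below the limit on both formats.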
Differential Expression Detection
The primary goal of gene expression microarray experiments is to detect changes in gene expression levels between samples. In this analysis, we calculated the number of genes differentially expressed between two MAQC samples (A, Stratagene Universal Human Reference, and B, Ambion Human Brain Reference) and examined the agreement of the resulting gene lists between the two formats. To facilitate comparison with data presented in the MAQC study, the analysis was restricted to the set of 12,091 commonly mapped genes. For the four pack and 44K experiments performed here, differential expression was determined using the same method as described in the MAQC study.
The number of differentially expressed genes for each user and array format is shown on the left of Figure 4. On the right are the numbers reported for the different test sites and platforms in the MAQC study. The values plotted are provided in supplemental table S7 from the main MAQC paper. The data on the right highlighted in blue represent the values for the Agilent platform in the MAQC study. Figure 5 illustrates the gene list agreement between the four pack and 44K formats and among the users. The numbers shown are the percentage of genes detected by the user and format shown in the row that were also detected by the user and format shown in the column. In all cases, >95% of genes detected by any format and user were also detected by each user with the four pack format.
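The row-versus-column percentages described for Figure 5 can be computed along the following lines. This is a minimal sketch: the function name `agreement_matrix`, the list labels and the gene identifiers are hypothetical, and the real analysis used the lists of differentially expressed genes from each user and format.

```python
def agreement_matrix(gene_lists):
    """Return {(row, col): percent of row's genes that are also in col}.

    gene_lists maps a label (user and format) to a set of gene IDs.
    The matrix is not symmetric because each row is normalized by the
    size of its own list.
    """
    matrix = {}
    for row_name, row_genes in gene_lists.items():
        for col_name, col_genes in gene_lists.items():
            if not row_genes:
                matrix[(row_name, col_name)] = 0.0
                continue
            shared = len(row_genes & col_genes)
            matrix[(row_name, col_name)] = 100.0 * shared / len(row_genes)
    return matrix


# Hypothetical gene lists for one user on each format.
lists = {
    "four_pack_user1": {"g1", "g2", "g3", "g4"},
    "44K_user1": {"g1", "g2", "g3"},
}
agreement = agreement_matrix(lists)
```

With these toy lists, every gene on the 44K list is also on the four pack list (100%), while only three of the four genes on the four pack list appear on the 44K list (75%), which shows why the direction of the comparison matters when reading such a table.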
Comparison To TaqMan(R)
Relative accuracy of microarray platforms can be assessed by comparison to gene expression measurements collected by alternative platforms. Figure 6 presents scatter plots of the four pack and 44K log2(B/A) ratio data with the TaqMan(R) data presented in the MAQC study. Shown are data where at least three replicates were detected in both samples by both the microarray and TaqMan(R). The number of data points included for each user as well as the correlation and slope of the individual orthogonal fits are shown in the inset. Dotted lines represent the 45 degree line, while solid lines represent the orthogonal fit of all the data. Figure 7 presents the average slopes (solid bars) and correlations (hatched bars) for these data (left) as well as the data presented in the MAQC study (right, values from supplemental tables S12 and S13). The data on the right highlighted in blue represent the values for the Agilent platform in the MAQC study.
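The slope and correlation reported for each scatter plot can be obtained with an orthogonal (total least squares) fit, sketched below. The closed-form slope is the standard one for a two-variable orthogonal regression; the function name, the choice of TaqMan(R) on the x axis and the sample ratios are assumptions made for illustration, not values from the study.

```python
import math


def orthogonal_fit(x, y):
    """Return (slope, intercept, correlation) of an orthogonal fit of y on x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0.0 or syy == 0.0 or sxy == 0.0:
        raise ValueError("fit is undefined for constant or uncorrelated data")
    # Slope that minimizes the perpendicular distances to the line.
    slope = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4.0 * sxy ** 2)) / (2.0 * sxy)
    intercept = mean_y - slope * mean_x
    correlation = sxy / math.sqrt(sxx * syy)  # Pearson correlation
    return slope, intercept, correlation


# Hypothetical log2(B/A) ratios for five genes.
taqman_log_ratio = [-2.0, -1.0, 0.0, 1.0, 2.0]
array_log_ratio = [-1.8, -0.9, 0.0, 0.9, 1.8]
slope, intercept, correlation = orthogonal_fit(taqman_log_ratio, array_log_ratio)
```

A slope close to 1 means the microarray log ratios track the reference measurements without compression, while a slope below 1, as in this toy data, indicates that the array reports smaller fold changes than the reference.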
Reproducibility
Reproducibility is represented as the coefficient of variation (CV) within or among users for normalized expression signals from those probes that were generally detected. Figure 8 shows the distributions of the within-user CV as box and whisker plots for each sample, user and format. Shown in green are the numbers of genes detected in at least three replicates for each sample per user; these genes were used to calculate the CVs. Figure 9 shows the median CVs both within user (blue) and across users (red). Shown in green is the number of genes detected in at least three replicates for all three users; these genes were used to calculate the CVs.
Both formats demonstrate good reproducibility with median within user CVs lower than 10%. The four pack microarrays show generally improved reproducibility as compared to the 44K microarrays, due in part to the increased concentration of hybridization.
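The median within-user CV discussed above can be computed roughly as follows. This sketch assumes the CV is the sample standard deviation divided by the mean, expressed as a percentage, and that undetected replicates are recorded as `None`; the function names and the signal values are hypothetical.

```python
import statistics


def percent_cv(values):
    """Coefficient of variation in percent (sample standard deviation / mean)."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined for a mean of zero")
    return 100.0 * statistics.stdev(values) / mean


def median_cv(replicates_by_gene, min_detected=3):
    """Return (median CV, number of genes used).

    Only genes detected in at least min_detected replicates contribute,
    matching the filter described in the text.
    """
    cvs = []
    for signals in replicates_by_gene.values():
        detected = [s for s in signals if s is not None]
        if len(detected) >= min_detected:
            cvs.append(percent_cv(detected))
    if not cvs:
        raise ValueError("no gene met the detection requirement")
    return statistics.median(cvs), len(cvs)


# Hypothetical normalized signals for four genes across three replicates.
replicates = {
    "g1": [100.0, 100.0, 100.0],
    "g2": [90.0, 100.0, 110.0],
    "g3": [50.0, None, None],  # detected only once, so excluded
    "g4": [80.0, 100.0, 120.0],
}
median_within_user_cv, genes_used = median_cv(replicates)
```

The same function could be applied to replicates pooled across users to obtain an across-user median, provided the gene filter is applied to the pooled data in the same way.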
Conclusions
The MAQC samples have been used to evaluate the performance of a new generation microarray technology, the Agilent four pack microarray, and to compare it with that of the previous generation.