PANP - a New Method of Gene Detection on Oligonucleotide Expression Arrays

Bibliographic Details
Main Authors: Warren, P., Taylor, D., Martini, P.G.V., Jackson, J., Bienkowska, J.
Format: Conference Proceeding
Language: English
Description
Summary: The most widely used method for probeset detection calls on Affymetrix GeneChip® Human Genome Arrays is provided as part of the MAS5 software. The MAS method uses Wilcoxon statistics to determine presence-absence (MAS-P/A) calls. However, MAS-P/A is only usable with MAS5 processing, which requires both perfect match (PM) and mismatch (MM) probe data in order to call the resulting probeset present or absent. A considerable amount of recent research has convincingly shown that using MM data in gene expression analysis may be problematic. The RMA method, which uses PM data only, is one method that has been developed in response to this. However, there is no publicly available method that works with PM-only expression data to establish presence or absence of genes from the probesets in microarray data. It seems desirable to decouple the method used to generate gene expression values from the method used to make gene detection calls. We have therefore developed a statistical method in R, called presence-absence calls with negative probesets (PANP), which uses sets of Affymetrix-reported probes with no known hybridization partners on two chip sets: HG-U133A and HG-U133 Plus 2.0. PANP allows the use of any Affymetrix microarray data preprocessing method to generate expression values, including PM-only methods as well as PM and MM methods. We present our results on PANP and its performance using the set of 28 HG-U133A chips from a published Affymetrix Latin squares spike-in dataset as well as an internal TaqMan-validated human tissue dataset on the HG-U133 Plus 2.0 chipsets. We find that, on these datasets, PANP out-performs the MAS-P/A method on several metrics of accuracy and precision using a variety of preprocessing methods: RMA, GCRMA, and even MAS5 itself. PANP out-performs MAS-P/A in probeset detection across a full range of concentrations, especially for low-concentration transcripts.
An R software package implementing PANP has been prepared and is available as part of the Bioconductor package release at http://www.bioconductor.org.
DOI: 10.1109/BIBE.2007.4375552
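
The summary describes making detection calls by comparing each probeset's expression value against a set of negative probesets that have no known hybridization partners. The following is a minimal Python sketch of that general idea, not the PANP package's actual implementation (which is written in R). It assumes an empirical null distribution built directly from the negative probeset values, a one-sided empirical p-value, and two illustrative cutoffs (`tight` and `loose`) that split calls into present (P), marginal (M), and absent (A). The function names, the three-level call scheme, and the cutoff values are all assumptions made for illustration.

```python
from bisect import bisect_left


def empirical_p_values(values, negative_values):
    """Return, for each value, the fraction of negative-control values >= it.

    A small fraction means the value sits in the upper tail of the
    negative-control (null) distribution. Hypothetical helper for illustration.
    """
    null = sorted(negative_values)
    n = len(null)
    if n == 0:
        raise ValueError("need at least one negative-control value")
    # bisect_left counts the null values strictly below v,
    # so n minus that count is the number of null values >= v.
    return [(n - bisect_left(null, v)) / n for v in values]


def detection_calls(values, negative_values, tight=0.01, loose=0.02):
    """Assign 'P', 'M', or 'A' to each value.

    The cutoffs are illustrative assumptions, not values taken from the summary.
    """
    calls = []
    for p in empirical_p_values(values, negative_values):
        if p < tight:
            calls.append("P")  # above essentially all negative controls
        elif p <= loose:
            calls.append("M")  # borderline
        else:
            calls.append("A")  # indistinguishable from negative controls
    return calls


if __name__ == "__main__":
    # Toy data: 100 negative-control expression values, 1.0 through 100.0.
    negatives = [float(i) for i in range(1, 101)]
    probesets = [150.0, 99.5, 50.0]
    print(detection_calls(probesets, negatives))
```

Because the calls depend only on the expression values, a sketch like this can be applied to the output of any preprocessing method, which is the decoupling the summary argues for.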