Data Quality Control

This web page describes the lines of evidence suggesting that this genome-wide protein-DNA interaction dataset is of high quality.

Analysis of False Call Rates in ChIP microarray experiments

Estimating false positive and false negative rates is challenging because the estimates depend either on perfect knowledge of a ground truth or on confirmation by other experimental techniques, each of which has its own bias. For the array platform used here, our experience with yeast provides an estimate of the error inherent in the platform. We selected a set of positives and negatives for the binding of Gcn4, a well-studied yeast transcription factor. The 84 positive genes were selected using three criteria: previous high-confidence binding data (P = 0.001; Harbison et al., 2004), the presence of a perfect or near-perfect Gcn4 consensus binding site (TGASTCA) in the promoter region (-400 bp to +50 bp), and a greater than 2-fold change in steady-state mRNA levels dependent on Gcn4 upon a shift to amino acid starvation medium (Natarajan et al., 2001). The 222 negative genes were selected by weak binding (P = 0.1), absence of a motif near the presumed start site, and a less than 20% change in steady-state mRNA levels in response to the shift to amino acid starvation.
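The three selection criteria can be sketched in code. The following Python example is only an illustration, not the procedure actually used: the function names are hypothetical, the cutoffs are read as P <= 0.001 for positives and P >= 0.1 for negatives, the expression change is assumed to be given as a starved/unstarved ratio, and only perfect matches to the consensus are detected (the "near-perfect" case is not modeled). In the consensus TGASTCA, S is the IUPAC code for C or G.

```python
import math
import re
from typing import Optional

# IUPAC nucleotide codes needed here; S means C or G.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "S": "[CG]", "N": "[ACGT]"}


def motif_regex(motif: str) -> "re.Pattern[str]":
    """Translate an IUPAC motif such as TGASTCA into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in motif))


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def has_motif(promoter: str, motif: str = "TGASTCA") -> bool:
    """True if the promoter holds a perfect motif match on either strand.

    TGASTCA is its own reverse complement, so the second search is
    redundant for this motif; it is kept so other motifs also work.
    """
    pattern = motif_regex(motif)
    seq = promoter.upper()
    return bool(pattern.search(seq) or pattern.search(reverse_complement(seq)))


def classify(p_value: float, promoter: str, expr_ratio: float) -> Optional[str]:
    """Label a gene 'positive', 'negative', or None (in neither set).

    ASSUMED cutoffs: P <= 0.001 (positive) and P >= 0.1 (negative);
    expr_ratio is mRNA level after the shift divided by the level before.
    """
    motif = has_motif(promoter)
    if p_value <= 0.001 and motif and abs(math.log2(expr_ratio)) > 1.0:
        return "positive"  # strong binding, motif, > 2-fold change
    if p_value >= 0.1 and not motif and abs(expr_ratio - 1.0) < 0.2:
        return "negative"  # weak binding, no motif, < 20% change
    return None
```

Genes that satisfy neither set of criteria are left out of both lists, which is consistent with the positive and negative sets together covering only a subset of yeast genes.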

Using these positive and negative sets, we performed ROC curve analysis (Statistics-ROC package for Perl) to evaluate false positive and false negative rates over a range of IP/WCE ratio thresholds. In essence, we varied the threshold used to call a gene "bound" and counted the false positives and false negatives detected at each threshold. Each gene was scored by the maximum median-normalized IP/WCE ratio found in the region -250 bp to +50 bp from the UAS. At the optimal cutoff for minimizing false positives (a 3.5-fold ratio), the data suggest a false positive rate of less than 0.5% and a false negative rate of ~20%. Thus, at this cutoff the oligo array platform yields very few false calls of binding, at the cost of missing roughly one in five true binding events.
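The threshold sweep behind this kind of analysis can be illustrated with a short Python sketch. This is not the Statistics-ROC code used for the analysis: the function names are hypothetical, the per-gene scores are assumed to be precomputed maximum median-normalized IP/WCE ratios, and the example scores are made up purely for illustration.

```python
from typing import Iterable, List, Sequence, Tuple


def error_rates(
    pos_scores: Iterable[float],
    neg_scores: Iterable[float],
    threshold: float,
) -> Tuple[float, float]:
    """Return (false_positive_rate, false_negative_rate) at one threshold.

    A gene is called "bound" when its score is >= threshold.
    """
    pos = list(pos_scores)
    neg = list(neg_scores)
    false_pos = sum(1 for s in neg if s >= threshold)  # negatives called bound
    false_neg = sum(1 for s in pos if s < threshold)   # positives called unbound
    return false_pos / len(neg), false_neg / len(pos)


def sweep(
    pos_scores: Sequence[float],
    neg_scores: Sequence[float],
    thresholds: Iterable[float],
) -> List[Tuple[float, float, float]]:
    """Evaluate both error rates at every candidate threshold."""
    return [(t, *error_rates(pos_scores, neg_scores, t)) for t in thresholds]


# Made-up IP/WCE scores, for illustration only (not data from this study).
positives = [5.2, 4.1, 3.6, 2.9, 6.0]
negatives = [1.1, 0.9, 1.4, 2.2, 1.0, 3.8, 1.2, 0.8]

for t, fpr, fnr in sweep(positives, negatives, [1.5, 2.5, 3.5, 4.5]):
    print(f"threshold={t:.1f}  FPR={fpr:.3f}  FNR={fnr:.3f}")
```

Raising the threshold lowers the false positive rate and raises the false negative rate, so picking a cutoff that minimizes false positives, as done here, accepts a higher false negative rate in exchange.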

YOUNG LAB
Whitehead Institute
9 Cambridge Center
Cambridge, MA 02142
[T] 617.258.5218
[F] 617.258.0376