Error Estimates
We previously estimated a false positive rate of 4% and a false negative rate
of ~24% for genome-wide binding data that meets a P ≤ 0.001 threshold
(Harbison et al.). To estimate false positive and false negative rates for the current technology,
we compared maximum IP/WCE ratios for the Gcn4 data to high likelihood positive
and high likelihood negative lists of binding targets. A positive
list of 84 genes was selected on the basis of previous high confidence binding
data (P ≤ 0.001; Harbison et al.), the presence of a perfect or near-perfect Gcn4 consensus binding
site (TGASTCA) in the region of -400bp to +50bp, and a greater than 2-fold change
in steady state mRNA levels dependent on Gcn4 when shifted to amino acid starvation
medium (Natarajan et al.). The negative list of 945 genes was selected by weak binding
(P ≥ 0.1), absence of a motif near the presumed start site, and less than
a 60% change in steady state mRNA levels in response to shift to amino acid
starvation. Each gene was scored based on the maximum median-normalized IP/WCE
ratio found in the region -250 to +50bp from the UAS. Parameters were optimized
by maximizing the absolute difference in identified genes between the positive
and negative lists, using the Statistics-ROC package for Perl. Based on
these results, we estimate a false positive
rate of <1% and a false negative rate of ~25%.
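
The threshold selection described above can be illustrated with a minimal sketch. This is not the Statistics-ROC package or its API. It is a plain Python approximation that assumes the optimization criterion is the absolute difference between the fractions of positive-list and negative-list genes called bound at a given IP/WCE ratio cutoff. The function name `best_threshold` and the example ratios are hypothetical and are not data from this study.

```python
from typing import List, Tuple


def best_threshold(pos_scores: List[float],
                   neg_scores: List[float]) -> Tuple[float, float, float]:
    """Return (threshold, false_positive_rate, false_negative_rate).

    A gene is called bound when its score is >= threshold. The chosen
    threshold maximizes |TPR - FPR|, which is an assumed reading of the
    optimization criterion, computed on fractions rather than raw counts.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("both lists must be non-empty")

    best = None
    # Every observed score is a candidate cutoff.
    for t in sorted(set(pos_scores) | set(neg_scores)):
        tpr = sum(s >= t for s in pos_scores) / len(pos_scores)
        fpr = sum(s >= t for s in neg_scores) / len(neg_scores)
        gap = abs(tpr - fpr)
        # Strict '>' keeps the lowest cutoff when several tie.
        if best is None or gap > best[0]:
            best = (gap, t, fpr, 1.0 - tpr)

    _, threshold, fpr, fnr = best
    return threshold, fpr, fnr


# Made-up maximum IP/WCE ratios, for illustration only.
positives = [3.1, 2.4, 1.9, 1.1, 4.0]
negatives = [0.9, 1.0, 1.2, 0.8, 1.05, 0.95]
threshold, fpr, fnr = best_threshold(positives, negatives)
print(f"threshold={threshold}, FPR={fpr:.3f}, FNR={fnr:.3f}")
```

Under these assumptions, the false positive rate is the fraction of negative-list genes at or above the chosen cutoff, and the false negative rate is the fraction of positive-list genes below it.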