Young Lab

Mapping Genome Occupancy in Embryonic Stem Cells


Data Normalization and Analysis

We used GenePix software (Axon) to obtain background-subtracted intensity values for each fluorophore for every feature on the array. To obtain set-normalized intensities, we first calculated, for each slide, the median intensity in each channel for the set of 1,500 control probes included on each array. For multiple-slide sets (whole-genome and promoter arrays), we then calculated the average of these median intensities across all slides. Intensities were then normalized such that the median intensity of each channel for an individual slide equaled the average of the median intensities of that channel across all slides.
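The set normalization described above can be sketched roughly as follows for a single channel. This is a minimal illustration, not the lab's actual code: the function name `set_normalize`, the list-of-lists layout, and the use of a simple multiplicative scale factor per slide are all assumptions.

```python
from statistics import mean, median


def set_normalize(slides, control_idx):
    """Scale one channel of every slide in a set to a common control median.

    slides      -- hypothetical layout: one list of intensities per slide
    control_idx -- positions of the control probes within each slide
    Assumption: normalization is a single multiplicative factor per slide.
    """
    # Median of the control probes on each slide.
    control_medians = [median(s[i] for i in control_idx) for s in slides]
    # Target value: average of those medians across the slide set.
    target = mean(control_medians)
    normalized = []
    for intensities, m in zip(slides, control_medians):
        scale = target / m
        normalized.append([x * scale for x in intensities])
    return normalized


slides = [[100.0, 200.0, 300.0, 400.0], [50.0, 100.0, 150.0, 200.0]]
control_idx = [0, 1, 2]
normalized = set_normalize(slides, control_idx)
```

After scaling, the control-probe median of each slide equals the set-wide average of the control medians, which is the property the text describes.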

Among the Agilent controls is a set of negative control spots that contain 60-mer sequences that do not cross-hybridize to human genomic DNA. We calculated the median intensity of these negative control spots in each channel and then subtracted this number from the set-normalized intensities of all other features.
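A minimal sketch of this background step for one channel of one slide is shown below. The function name `subtract_negative_controls` and the index-based representation of the negative control spots are hypothetical.

```python
from statistics import median


def subtract_negative_controls(intensities, negative_idx):
    """Subtract the negative-control median from all other features.

    intensities  -- set-normalized intensities for one channel of one slide
    negative_idx -- positions of the negative control spots (hypothetical layout)
    """
    negative = set(negative_idx)
    background = median(intensities[i] for i in negative)
    # Negative control spots themselves are left unchanged; every other
    # feature has the background estimate subtracted.
    return [
        x if i in negative else x - background
        for i, x in enumerate(intensities)
    ]


corrected = subtract_negative_controls([10.0, 12.0, 14.0, 100.0, 200.0], [0, 1, 2])
```

Note that features dimmer than the negative-control median become negative after this step, so any later ratio or log calculation has to decide how to treat them.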

To correct for different amounts of genomic and immunoprecipitated DNA hybridized to the chip, the set-normalized, negative control-subtracted median intensity value of the IP-enriched DNA channel was then divided by the median of the genomic DNA channel. This yielded a normalization factor that was applied to each intensity in the genomic DNA channel.
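As an illustration, the channel-balancing step might look like the sketch below. It assumes the normalization factor is applied by multiplication, which the text implies but does not state; the function name `balance_channels` is hypothetical.

```python
from statistics import median


def balance_channels(ip, genomic):
    """Rescale the genomic DNA channel so its median matches the IP channel.

    ip, genomic -- set-normalized, negative control-subtracted intensities
    Assumption: the factor is applied multiplicatively to the genomic channel.
    """
    factor = median(ip) / median(genomic)
    return factor, [g * factor for g in genomic]


factor, genomic_scaled = balance_channels([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
```

With this scaling the two channels share the same median, so a typical unenriched probe has an IP:genomic ratio near 1.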

Next, we calculated the log of the ratio of intensity in the IP-enriched channel to intensity in the genomic DNA channel for each probe and used a whole-chip error model (Hughes et al., 2000) to calculate confidence values for each spot on each array (single-probe p-value). This error model converts the intensity information in both channels to an X score that depends on both the absolute intensities and the background noise in each channel. When available, replicate data were combined, using the X scores and ratios of individual replicates to weight each replicate's contribution to a combined X score and ratio. The X scores for the combined replicate are assumed to be normally distributed, which allows calculation of a p-value for the enrichment ratio seen at each feature. P-values were also calculated based on a second model, which assumes that, for any range of signal intensities, IP:control ratios below 1 represent noise (as the immunoprecipitation should only result in enrichment of specific signals) and that the distribution of noise among ratios above 1 is the reflection of the distribution of noise among ratios below 1.
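The sketch below illustrates three pieces of this paragraph: the per-probe log ratio, the conversion of a normally distributed X score to a one-sided p-value, and the second, reflection-based noise model. It does not reproduce the X score calculation of Hughes et al. (2000). The log base (base 10), the empirical tail-count p-value with a +1 correction, and the use of a single intensity range instead of separate ranges are all assumptions made for illustration, as are the function names.

```python
import math


def log_ratios(ip, genomic):
    """log10(IP / genomic) per probe; assumes all intensities are positive."""
    return [math.log10(i / g) for i, g in zip(ip, genomic)]


def upper_tail_p_from_x(x):
    """One-sided p-value for an X score assumed to be standard normal."""
    return 0.5 * math.erfc(x / math.sqrt(2))


def reflected_null_p_values(log_r):
    """Empirical p-values under a mirrored noise model.

    Log ratios below 0 (ratios below 1) are treated as noise and mirrored
    about 0 to stand in for the noise among ratios above 1.
    Assumption: one intensity range, and a +1 correction so no p-value is 0.
    """
    below = [v for v in log_r if v < 0]
    null = below + [-v for v in below]
    n = len(null)
    return [(sum(1 for z in null if z >= v) + 1) / (n + 1) for v in log_r]


ratios = log_ratios([10.0, 100.0], [10.0, 10.0])
p_values = reflected_null_p_values([-0.1, -0.2, 0.05, 1.0])
```

In this toy input, the probe with a log ratio of 1.0 lies beyond every mirrored noise value and therefore receives the smallest p-value the four-point null can give.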

