Challenges for Future Drafts of the Regulatory Code
We have used extensive in vivo binding data, conserved sequence information and prior knowledge of regulator-DNA interactions to construct a first draft of the transcriptional regulatory code of a eukaryotic cell. This draft provides a foundation for further improvement. A more complete understanding of the code should emerge from 1) collecting additional experimental data, 2) testing models that emerge from the data, and 3) developing improved computational algorithms to integrate various data types.
Collecting additional experimental data
We have identified 203 transcriptional regulators in yeast and have profiled
their genomic binding under a limited number of growth conditions (rich media
and conditions in which specific regulators were known to be important for growth).
It may be valuable to collect genome-wide binding data for these DNA-binding
regulators in cells grown in additional environments. It is also possible that
there are additional regulators that have not yet been identified, and the binding
sites for these regulators will need to be determined. Since cell-type-specific
regulators are known to exist, it might also be useful to determine whether binding
events depend on cell type (a or alpha) or genome copy number (ploidy).
Testing models
Experimental tests of models of regulator function predicted from environment-dependent
binding behaviour will provide new insights into the regulatory mechanisms that
control global gene expression programs. Knowledge of the environment-dependent
changes in the abundance, modification state and intracellular compartmentalization
of transcriptional regulators will be valuable. It may also be useful to explore
the effects of mutations in transcriptional regulators and cis elements on a
large scale.
Improved computational algorithms
Computational algorithms already play a key role in binding site sequence prediction
and integration of various data types to reveal how the transcriptional regulatory
code controls gene expression programs under diverse conditions. New algorithms
may be needed to integrate sequence conservation, condition-dependent regulator
binding and dynamic gene expression data in order to assemble robust transcriptional
regulatory networks.