Challenges for Future Drafts of the Regulatory Code

We have used extensive in vivo binding data, conserved sequence information and prior knowledge of regulator-DNA interactions to construct a first draft of the transcriptional regulatory code of a eukaryotic cell. This draft provides a foundation for further improvement. A more complete understanding of the code should emerge from 1) collecting additional experimental data, 2) testing models that emerge from the data, and 3) developing improved computational algorithms to integrate various data types.


Collecting additional experimental data
We have identified 203 transcriptional regulators in yeast and have profiled their genomic binding under a limited number of growth conditions (rich media and conditions in which specific regulators were known to be important for growth). It may be valuable to collect genome-wide binding data for these DNA-binding regulators in cells grown in additional environments. It is also possible that additional regulators remain to be identified, in which case their binding sites will need to be determined. Because cell-type-specific regulators are known to exist, it might also be useful to determine whether binding events depend on cell type (a or alpha) or on genome copy number.


Testing models
Experimental tests of models for regulator functions predicted by their environment-dependent binding behaviour will provide new insights into the regulatory mechanisms involved in control of global gene expression programs. Knowledge of the environment-dependent changes in the abundance, modification state and intracellular compartmentalization of transcriptional regulators will be valuable. It may also be useful to explore the effects of mutations in transcriptional regulators and cis elements on a large scale.

Improved computational algorithms
Computational algorithms already play a key role in predicting binding site sequences and in integrating various data types to reveal how the transcriptional regulatory code controls gene expression programs under diverse conditions. New algorithms may be needed to combine sequence conservation, condition-dependent regulator binding and dynamic gene expression data in order to assemble robust transcriptional regulatory networks.
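
As a rough illustration of what integrating these data types could look like, the following minimal sketch keeps a regulator-gene interaction only when a candidate site is both bound in a given condition and conserved across related species. It is not the method used to build the draft code. The record layout (CandidateSite), the function name (confident_edges) and both default thresholds are assumptions introduced here for illustration, and the example records are placeholders rather than real measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateSite:
    """One candidate binding site (hypothetical record layout)."""
    regulator: str
    gene: str
    condition: str      # growth condition in which binding was profiled
    binding_p: float    # p-value of the binding measurement
    conserved_in: int   # number of related species that retain the site


def confident_edges(sites, max_p=0.001, min_conserved=2):
    """Group regulator-gene edges by condition.

    An edge is kept only if the site passes both filters. The default
    thresholds are illustrative assumptions, not recommended values.
    """
    edges = {}
    for site in sites:
        bound = site.binding_p <= max_p
        conserved = site.conserved_in >= min_conserved
        if bound and conserved:
            edges.setdefault(site.condition, set()).add(
                (site.regulator, site.gene)
            )
    return edges


# Placeholder records; names and numbers are invented for the example.
example_sites = [
    CandidateSite("REG1", "GENE_A", "rich_medium", 0.0005, 3),
    CandidateSite("REG1", "GENE_B", "rich_medium", 0.0500, 3),  # weak binding
    CandidateSite("REG2", "GENE_A", "stress", 0.0001, 1),       # not conserved
    CandidateSite("REG2", "GENE_C", "stress", 0.0002, 2),
]

network = confident_edges(example_sites)
for condition in sorted(network):
    print(condition, sorted(network[condition]))
```

Grouping the surviving edges by condition keeps condition-dependent binding visible in the result. Dynamic expression data could be added as a third filter in the same way, although a hard threshold on each data type is only the simplest possible way to combine the evidence.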