Dissecting the expression patterns of transcription factors across conditions using an integrated network-based approach

Sarath Chandra Janga^1,* and Bruno Contreras-Moreira^2,3,4,*

¹MRC Laboratory of Molecular Biology, Hills Road, Cambridge, CB2 0QH, United Kingdom.

²Estación Experimental de Aula Dei/CSIC, Universidad de Zaragoza, Av. Montañana 1.005, 50059 Zaragoza, Spain.

³Fundación ARAID, Paseo María Agustín 36, Zaragoza, Spain.

⁴Institute of Biocomputation and Physics of Complex Systems (BIFI), Universidad de Zaragoza, Spain

*Corresponding authors:

Sarath Chandra Janga: sarath@mrc-lmb.cam.ac.uk
Bruno Contreras-Moreira: bcontreras@eead.csic.es

Abstract

In prokaryotes, gene expression is predominantly controlled at the level of transcription. Transcription in turn is mediated by a set of DNA-binding proteins called Transcription Factors (TFs). In this study, we map the complete repertoire of ~300 TFs of the bacterial model Escherichia coli onto gene expression data for a number of non-redundant experimental conditions and show that TFs are generally expressed at a lower level than genes of other functional classes. We also demonstrate that different conditions harbor varying numbers of active TFs, with an average of about 15% of the total repertoire being significantly expressed across conditions, and with certain stress and drug-induced conditions showing significant expression of as much as one-third of the total collection of TFs. Our results also show that activators are more abundant than repressors in the set of significantly expressed TFs across conditions, indicating that activation of promoters might be a more common phenomenon than repression in bacteria. Finally, to understand the association of TFs with different conditions and to elucidate their dynamic interplay with other TFs, we develop a network-based framework to identify TFs which act as markers (those which are responsible for condition-specific transcriptional rewiring), starting from a literature-curated static set of TF-TF regulatory interactions. This analysis allowed us to pinpoint several marker TFs as being central in various specialized conditions, such as drug induction or growth condition variations, which we discuss in light of previously reported experimental findings. Further analysis showed that a majority of the identified markers effectively control the expression of their regulons. We also found that closeness is a key centrality measure which can aid in the successful identification of marker TFs in regulatory networks.
Our results suggest that the network-based approaches developed in this study can also be applied to understand other interactomic datasets.

I. Datasets used in the study

An EXCEL file (supplementary_data.xls) containing:

A RegulonDB-annotated list of transcription factors in E. coli K12.
A classification of TFs in terms of sensing class (for complete details please refer to these papers/PMIDs: 16311037, 17321548).
A list of TFs classified in terms of network connectivity.
A list of regulatory interactions between TFs and their target genes (the original and up-to-date version is available from RegulonDB).
A two-column file describing the static TF-TF regulatory network used in this work.
A list of non-redundant microarray conditions (<0.95 Pearson correlation, as described in Materials and Methods). Expression data were obtained from the M3D database using Build 4 of the E. coli expression compendium.
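
The <0.95 Pearson correlation criterion for non-redundant conditions can be illustrated with a minimal sketch. It assumes a greedy filter that keeps a condition only if its expression profile correlates below the cutoff with every condition already kept; the exact procedure is the one described in Materials and Methods and may differ. The function names and the toy profiles are hypothetical, not data from M3D.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / (sx * sy)

def non_redundant(profiles, cutoff=0.95):
    """Greedy filter (an assumption): keep a condition only if it
    correlates below `cutoff` with every condition kept so far."""
    kept = []
    for name, values in profiles.items():
        if all(pearson(values, profiles[k]) < cutoff for k in kept):
            kept.append(name)
    return kept

# Toy profiles (hypothetical values, one number per gene).
profiles = {
    "cond_A": [1.0, 2.0, 3.0, 4.0],
    "cond_B": [1.1, 2.1, 3.0, 4.2],  # near-duplicate of cond_A
    "cond_C": [4.0, 1.0, 3.5, 0.5],
}
print(non_redundant(profiles))
```

With a greedy filter the result depends on the order in which conditions are visited, so a real analysis would need to fix that order explicitly.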

II. Supplementary figures and tables

An EXCEL file (supplementary_figures_results.xls) containing all the results used throughout the manuscript, plus a few extra figures and tables:

Expression data linked to COG and Riley classifications used to produce Figure 1, which presents a comparison of the expression levels of TFs with other functional classes.
Degree vs associations data used to plot Figure 6A in the manuscript.
Figure S1, showing the proportion of expressed TFs in each connectivity class (H, M, L) across conditions.
Table S1, listing all TFs found to be associated with any microarray condition.
Table S2, listing all TFs (markers) associated with each of the 62 non-redundant conditions, as measured in terms of network centrality measures. This table contains the data used to generate Figure 6B (betweenness correlation) in the manuscript and also lists those conditions for which no markers were found.
Table S3, detailing the expression change measured in the positively and negatively regulated regulons of marker TFs.

III. mRNA expression patterns of transcription factors across conditions. Identifying active TFs in each condition.

To identify active TFs in each condition, we first plotted the expression level of each TF across conditions. These transcriptional patterns can be browsed and downloaded here. Since different TFs are expressed at different levels, it is not possible to identify active TFs based on a single expression level threshold. Therefore we defined a Significant Expression Threshold (SET) for each TF, as described in Materials and Methods, enabling us to detect active TFs. Note that threshold values are marked with arrows in the diagrams.
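
The idea of a per-TF threshold, as opposed to a single global cutoff, can be sketched as follows. The actual SET definition is given in Materials and Methods and is not reproduced here; the rule used below (mean plus one standard deviation of a TF's own expression across conditions) is only a hypothetical placeholder, and all names and values are illustrative.

```python
from statistics import mean, pstdev

def placeholder_set(values):
    """HYPOTHETICAL stand-in for the SET: mean + 1 SD of one TF's
    expression across conditions. Not the definition used in the paper."""
    return mean(values) + pstdev(values)

def active_tfs_by_condition(expression, conditions, threshold_fn=placeholder_set):
    """Return {condition: [TFs expressed above their own threshold]}."""
    thresholds = {tf: threshold_fn(vals) for tf, vals in expression.items()}
    active = {c: [] for c in conditions}
    for tf, vals in expression.items():
        for cond, level in zip(conditions, vals):
            if level > thresholds[tf]:
                active[cond].append(tf)
    return active

# Toy data (hypothetical): tfA is lowly expressed overall, tfB highly.
conditions = ["c1", "c2", "c3", "c4"]
expression = {
    "tfA": [5.0, 5.0, 5.0, 9.0],
    "tfB": [12.0, 8.0, 8.0, 8.0],
}
active = active_tfs_by_condition(expression, conditions)
print(active)
```

In this toy case a single global cutoff of 10 would never flag tfA, even though its level in c4 is clearly elevated relative to its own baseline; a per-TF threshold captures both TFs.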

IV. Dynamic TF-TF networks for 62 non-redundant conditions

Condition-specific TF-TF subnetworks can be browsed from this link. These networks are the basis for the identification of markers for each condition as described in the manuscript. An animated GIF which shows the observed network re-wiring is available here.
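
As a rough illustration of how a condition-specific subnetwork and a centrality score could be derived from the static TF-TF network, the sketch below makes two assumptions that are not taken from the manuscript: an interaction is kept when both TFs are active in the condition, and closeness is computed on the undirected version of the subnetwork. The TF names and edges are hypothetical.

```python
from collections import deque

def condition_subnetwork(static_edges, active_tfs):
    """Keep static TF-TF edges whose two TFs are both active (assumption)."""
    active = set(active_tfs)
    return [(a, b) for a, b in static_edges if a in active and b in active]

def closeness(edges):
    """Closeness centrality on the undirected graph: (number of reachable
    nodes) / (sum of shortest-path distances to them), found by BFS."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    scores = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in adj[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total else 0.0
    return scores

# Hypothetical static network and active set for one condition.
static_edges = [("tf1", "tf2"), ("tf1", "tf3"), ("tf1", "tf4"), ("tf4", "tf5")]
active = ["tf1", "tf2", "tf3", "tf4"]

sub = condition_subnetwork(static_edges, active)
scores = closeness(sub)
print(sorted(scores.items(), key=lambda kv: -kv[1]))
```

Here tf1 sits at the centre of the active subnetwork and receives the highest closeness, which is the kind of signal a marker search could rank on; tf5 is inactive and drops out together with its edge.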