Towards resolving ambiguity in promoter to gene assignment for omics data integration


Author(s): Fiona Ross, Charlotte Soneson, Michael B Stadler

Affiliation(s): FMI, University of Basel



The analysis of transcriptional regulation often involves linking transcriptomic data to measurements of the chromatin state at the corresponding promoters, obtained for example from ChIP-seq or ATAC-seq experiments. Such analyses are typically performed at the gene level, which raises the question of which promoter region to choose for genes with multiple transcription start sites. Widely used approaches select either the most upstream promoter or the promoter with the highest intensity of a signal of interest, and thereby neglect alternative promoters. Here, we compare several approaches to this problem and their impact on downstream analyses. Specifically, we evaluate performing chromatin analyses at the level of individual promoters, with overlapping promoters of the same gene merged into larger regions called promoter groups. Similarly, expression analyses are performed at the level of transcripts, and the expression levels of transcripts whose promoters fall in the same promoter group are summed. With this approach, we aim to improve the detection of associations between chromatin state and transcriptional activity and to facilitate the study of alternative promoter usage between samples.
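The grouping step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the promoter window size (`UPSTREAM`, `DOWNSTREAM`), the record layout, and the function names are all assumptions made for illustration. Each transcript gets a promoter window around its transcription start site, overlapping windows of the same gene are merged into a promoter group, and the expression values of the transcripts in each group are summed.

```python
from collections import defaultdict

# Hypothetical promoter window around each TSS (an assumption, not from the abstract).
UPSTREAM = 500
DOWNSTREAM = 500


def promoter_interval(tss, strand):
    """Return (start, end) of the promoter window around a TSS, strand-aware."""
    if strand == "+":
        return tss - UPSTREAM, tss + DOWNSTREAM
    return tss - DOWNSTREAM, tss + UPSTREAM


def build_promoter_groups(transcripts):
    """Merge overlapping promoters of the same gene and sum transcript expression.

    transcripts: list of dicts with keys id, gene, chrom, strand, tss, expr.
    Returns a list of promoter groups (dicts with region, members, summed expr).
    """
    by_gene = defaultdict(list)
    for tx in transcripts:
        start, end = promoter_interval(tx["tss"], tx["strand"])
        by_gene[(tx["gene"], tx["chrom"], tx["strand"])].append((start, end, tx))

    groups = []
    for (gene, chrom, strand), items in by_gene.items():
        items.sort(key=lambda item: item[0])  # sweep promoters left to right
        current = None
        for start, end, tx in items:
            if current is not None and start <= current["end"]:
                # Overlaps the open group: extend it and add the expression.
                current["end"] = max(current["end"], end)
                current["transcripts"].append(tx["id"])
                current["expr"] += tx["expr"]
            else:
                # No overlap: start a new promoter group for this gene.
                current = {
                    "gene": gene, "chrom": chrom, "strand": strand,
                    "start": start, "end": end,
                    "transcripts": [tx["id"]], "expr": tx["expr"],
                }
                groups.append(current)
    return groups


# Toy data (invented for illustration only).
example = [
    {"id": "tx1", "gene": "A", "chrom": "chr1", "strand": "+", "tss": 1000, "expr": 5.0},
    {"id": "tx2", "gene": "A", "chrom": "chr1", "strand": "+", "tss": 1200, "expr": 3.0},
    {"id": "tx3", "gene": "A", "chrom": "chr1", "strand": "+", "tss": 5000, "expr": 2.0},
    {"id": "tx4", "gene": "B", "chrom": "chr1", "strand": "-", "tss": 2000, "expr": 1.0},
]
groups = build_promoter_groups(example)
for g in groups:
    print(g["gene"], g["start"], g["end"], g["transcripts"], g["expr"])
```

In this toy example, the promoters of `tx1` and `tx2` overlap and form one promoter group for gene A, while the distant promoter of `tx3` stays a separate group of the same gene. Chromatin signal would then be quantified once per group region and compared with the summed expression of that group.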