Transcription initiation mapping in 31 bovine tissues reveals complex promoter activity, pervasive transcription, and tissue-specific promoter usage
ABSTRACT
Characterizing transcription start sites is essential for understanding the regulatory mechanisms that control gene expression. Recently, a new bovine genome assembly (ARS-UCD1.2) with high continuity, accuracy, and completeness was released; however, the functional annotation of the bovine genome lacks precise transcription start sites and includes a low number of transcripts in comparison to human and mouse. Using the RAMPAGE approach, this study identified transcription start sites at high resolution in a large collection of bovine tissues. We found several known and novel transcription start sites attributed to promoters of protein coding and lncRNA genes that were validated through experimental and in silico evidence. With these findings, the annotation of transcription start sites in cattle reached a level comparable to the mouse and human genome annotations. In addition, we identified and characterized transcription start sites for antisense transcripts derived from bidirectional promoters, potential lncRNAs, mRNAs, and pre-miRNAs. We also analyzed the quantitative aspects of RAMPAGE data for producing a promoter activity atlas, reaching highly reproducible results comparable to traditional RNA-Seq. Lastly, gene co-expression networks revealed an impressive use of tissue-specific promoters, especially between brain and testicle, which expressed several genes in common from alternate transcription start sites. Regions surrounding co-expressed modules were enriched in binding factor motifs representative of their tissues. This annotation will be highly useful for future studies on expression control in cattle and other species. Furthermore, these data provide significant insight into transcriptional activity for a comprehensive set of tissues.