Human cells contain myriad excised linear intron RNAs with potential functions in gene regulation and as disease biomarkers
Abstract
We used thermostable group II intron reverse transcriptase sequencing (TGIRT-seq), which gives full-length end-to-end sequence reads of structured RNAs, to identify >8,500 short full-length excised linear intron (FLEXI) RNAs originating from >3,500 different genes in human cells and tissues. Most FLEXI RNAs have stable predicted secondary structures, making them difficult to detect by other methods. Some FLEXI RNAs corresponded to annotated mirtron pre-miRNAs (introns that are processed by DICER into functional miRNAs) or agotrons (introns that bind AGO2 and function in a miRNA-like manner) and a few encode snoRNAs. However, the vast majority had not been characterized previously. FLEXI RNA profiles were cell-type specific, reflecting differences in host gene transcription, alternative splicing, and intron RNA turnover, and comparisons of matched tumor and healthy tissues from breast cancer patients and cell lines revealed hundreds of differences in FLEXI RNA expression. About half of the FLEXI RNAs contained an experimentally identified binding site for one or more proteins in published CLIP-seq datasets. In addition to proteins that have RNA splicing- or miRNA-related functions, proteins that bind ≥30 different FLEXI RNAs included transcription factors, chromatin remodeling proteins, and proteins involved in cellular stress responses and growth regulation, potentially linking FLEXI RNA binding to these processes. Our findings suggest previously unsuspected connections between intron RNAs and cellular regulatory pathways and identify a large new class of RNAs that may serve as broadly applicable biomarkers for human diseases.