transcript structure
Recently Published Documents



2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Heon Seok Kim ◽  
Susan M. Grimes ◽  
Anna C. Hooker ◽  
Billy T. Lau ◽  
Hanlee P. Ji

Abstract: We developed a single-cell approach to detect CRISPR-modified mRNA transcript structures. This method assesses how genetic variants at splicing sites and splicing factors contribute to alternative mRNA isoforms. By editing target exon-intron segments or splicing factors with CRISPR-Cas9, we determine how alternative splicing is regulated and how these edits affect the transcriptome profile. Our method combines long-read sequencing, which characterizes transcript structure, with short-read sequencing, which matches the single-cell gene expression profiles and gRNA sequences. It therefore detects targeted genomic edits and transcript isoform structure at single-cell resolution.


2021 ◽  
Author(s):  
Heon Seok Kim ◽  
Susan M Grimes ◽  
Anna C Hooker ◽  
Billy T Lau ◽  
Hanlee P Ji

Transcript isoforms are mRNAs that arise from alternative splicing events. During RNA processing, different combinations of a gene's exons lead to a diverse set of isoforms. Polymorphisms or mutations at splice junctions can generate alternative splicing events. Various splicing factors also affect the representation of a gene's transcript isoforms. To assess how these two features contribute to alternative splicing, we developed a single-cell approach to introduce CRISPR edits that modify mRNA transcript structure. Our method combines (1) long-read sequencing to characterize the expressed transcripts and identify the edit at single-cell resolution and (2) short-read sequencing to match the single-cell gene expression profiles of the cells with the altered isoform. First, we modify target exon-intron segments with CRISPR-Cas9. Second, using cDNAs with cell barcodes, we use long-read sequencing to directly identify the changes in transcript isoforms that result from the targeted CRISPR edits. As a variation on this approach, we also determined how modifying specific splicing factors influences isoform expression and structure. Overall, we demonstrate how the integration of single-cell long-read analysis and CRISPR engineering can be used to directly confirm transcript isoforms and targeted genomic edits at single-cell resolution. This approach will improve our understanding of the role of alternative splicing in transcriptional regulation.


2021 ◽  
Author(s):  
Dafni A Glinos ◽  
Garrett Garborcauskas ◽  
Paul Hoffman ◽  
Nava Ehsan ◽  
Lihua Jiang ◽  
...  

Summary: Regulation of transcript structure generates transcript diversity and plays an important role in human disease. The advent of long-read sequencing technologies offers the opportunity to study the role of genetic variation in transcript structure. In this paper, we present a large human long-read RNA-seq dataset using the Oxford Nanopore Technologies platform from 88 samples from GTEx tissues and cell lines, complementing the GTEx resource. We identified just under 100,000 new transcripts for annotated genes, and validated the protein expression of a similar proportion of novel and annotated transcripts. We developed a new computational package, LORALS, to analyze genetic effects of rare and common variants on the transcriptome via allele-specific analysis of long reads. We called allele-specific expression and transcript structure events, providing novel insights into the specific transcript alterations caused by common and rare genetic variants and highlighting the resolution gained from long-read data. We were able to perturb transcript structure upon knockdown of PTBP1, an RNA binding protein that mediates splicing, thereby finding genetic regulatory effects that are modified by the cellular environment. Finally, we use this dataset to enhance variant interpretation and study rare variants leading to aberrant splicing patterns.
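As a rough illustration of what calling allele-specific expression involves, the sketch below tests whether long reads at a heterozygous site depart from an even split between the two alleles. It is a generic exact binomial test, not the LORALS implementation; the function name `allelic_imbalance_p` and the 50/50 null are assumptions made for this example.

```python
from math import comb


def allelic_imbalance_p(ref_reads, alt_reads, null_ref_fraction=0.5):
    """Two-sided exact binomial p-value for allelic imbalance.

    Hypothetical helper: sums the probability of every outcome that is
    no more likely than the observed reference-read count under the null.
    """
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("at least one read is required")
    p = null_ref_fraction
    # Probability of seeing k reference reads out of `total` under the null.
    pmf = [comb(total, k) * p**k * (1 - p) ** (total - k) for k in range(total + 1)]
    observed = pmf[ref_reads]
    # Small tolerance so ties with the observed probability are included.
    p_value = sum(x for x in pmf if x <= observed * (1 + 1e-9))
    return min(1.0, p_value)


if __name__ == "__main__":
    print(allelic_imbalance_p(18, 4))
```

For very deep sites the direct powers of `p` can underflow, so a log-space or library implementation would be the safer choice in practice.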


Author(s):  
Xu Shi ◽  
Andrew F Neuwald ◽  
Xiao Wang ◽  
Tian-Li Wang ◽  
Leena Hilakivi-Clarke ◽  
...  

Abstract

Motivation: High-throughput RNA sequencing has revolutionized the scope and depth of transcriptome analysis. Accurate reconstruction of a phenotype-specific transcriptome is challenging due to the noise and variability of RNA-seq data. This requires computational identification of transcripts from multiple samples of the same phenotype, given the underlying consensus transcript structure.

Results: We present a Bayesian method, integrated assembly of phenotype-specific transcripts (IntAPT), that identifies phenotype-specific isoforms from multiple RNA-seq profiles. IntAPT features a novel two-layer Bayesian model to capture the presence of isoforms at the group layer and to quantify the abundance of isoforms at the sample layer. A spike-and-slab prior is used to model the isoform expression and to enforce the sparsity of expressed isoforms. Dependencies between the existence of isoforms and their expression are modeled explicitly to facilitate parameter estimation. Model parameters are estimated iteratively using Gibbs sampling to infer the joint posterior distribution, from which the presence and abundance of isoforms can reliably be determined. Studies using both simulations and real datasets show that IntAPT consistently outperforms existing methods. Experimental results demonstrate that, despite sequencing errors, IntAPT performs robustly across multiple samples, resulting in notably improved identification of expressed isoforms of low abundance.

Availability and implementation: The IntAPT package is available at http://github.com/henryxushi/IntAPT.

Supplementary information: Supplementary data are available at Bioinformatics online.
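To make the spike-and-slab idea concrete, the sketch below draws isoform expression values from such a prior: each isoform is exactly zero (the spike) unless an inclusion indicator switches it on, in which case its value comes from a continuous distribution (the slab). This is a generic illustration rather than IntAPT's model; the exponential slab, the parameter names, and the function `sample_spike_and_slab` are assumptions.

```python
import random


def sample_spike_and_slab(n_isoforms, inclusion_prob, slab_mean=1.0, seed=0):
    """Draw expression values from a spike-and-slab prior (illustrative only).

    With probability `inclusion_prob` an isoform is present and its
    expression is drawn from an exponential slab with mean `slab_mean`;
    otherwise its expression is exactly zero.
    """
    rng = random.Random(seed)
    values = []
    for _ in range(n_isoforms):
        present = rng.random() < inclusion_prob
        # expovariate takes a rate, which is the reciprocal of the mean.
        values.append(rng.expovariate(1.0 / slab_mean) if present else 0.0)
    return values


if __name__ == "__main__":
    draws = sample_spike_and_slab(n_isoforms=10, inclusion_prob=0.3)
    print(sum(1 for v in draws if v > 0), "of", len(draws), "isoforms expressed")
```

A low inclusion probability yields mostly exact zeros, which is how a prior of this kind encodes the expectation that only a few candidate isoforms are truly expressed.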


Genetics ◽  
2020 ◽  
Vol 216 (4) ◽  
pp. 1039-1049
Author(s):  
Weijia Su ◽  
Tao Zuo ◽  
Thomas Peterson

Transposable elements (TEs) are DNA sequences that can mobilize and proliferate throughout eukaryotic genomes. Previous studies have shown that in plant genomes, TEs can influence gene expression in various ways, such as inserting in introns or exons to alter transcript structure and content, and providing novel promoters and regulatory elements to generate new regulatory patterns. Furthermore, TEs can also regulate gene expression at the epigenetic level by modifying chromatin structure, changing DNA methylation status, and generating small RNAs. In this study, we demonstrated that Ac/fractured Ac (fAc) TEs are able to induce ectopic gene expression by duplicating and shuffling enhancer elements. Ac/fAc elements belong to the hAT family of class II TEs. They can undergo standard transposition events, which involve the two termini of a single transposon, or alternative transposition events that involve the termini of two different nearby elements. Our previous studies have shown that alternative transposition can generate various genome rearrangements such as deletions, duplications, inversions, translocations, and composite insertions (CIs). We identified >50 independent cases of CIs generated by Ac/fAc alternative transposition and analyzed 10 of them in detail. We show that these CIs induced ectopic expression of the maize pericarp color 2 (p2) gene, which encodes a Myb-related protein. All the CIs analyzed contain sequences including a transcriptional enhancer derived from the nearby p1 gene, suggesting that the CI-induced activation of p2 is affected by mobilization of the p1 enhancer. This is further supported by analysis of a mutant in which the CI is excised and p2 expression is lost. These results show that alternative transposition events are not only able to induce genome rearrangements, but also generate CIs that can control gene expression.


2020 ◽  
Vol 10 (10) ◽  
pp. 3505-3514
Author(s):  
Hongmei Zhuang ◽  
Qiang Wang ◽  
Hongwei Han ◽  
Huifang Liu ◽  
Hao Wang

The aim of this study was to generate the full-length transcriptome of Xinjiang green and purple turnips, Brassica rapa var. Rapa, using single-molecule real-time (SMRT) sequencing. Samples of the two varieties of Brassica rapa var. Rapa at five developmental stages were collected and combined for SMRT sequencing. In parallel, next-generation sequencing was performed to correct the SMRT sequencing data. A series of analyses were performed to investigate the transcript structure. Finally, the obtained transcripts were mapped to the genome of Brassica rapa ssp. pekinensis Chiifu to identify potential novel transcripts. For green turnip (F01), a total of 19.54 Gb of clean data were obtained from 8 cells. The numbers of reads of insert (ROIs) and full-length non-chimeric (FLNC) reads were 510,137 and 267,666, respectively. In addition, 82,640 consensus isoforms were obtained by isoform sequence clustering, of which 69,480 were high-quality, and the 13,160 low-quality sequences were corrected using Illumina RNA-seq data. For purple turnip (F02), there were 20.41 Gb of clean data, 552,829 ROIs, and 274,915 FLNC sequences. A total of 93,775 consensus isoforms were obtained, of which 78,798 were high-quality, and the 14,977 low-quality sequences were corrected. Following the removal of redundant sequences, there were 46,516 and 49,429 non-redundant transcripts for F01 and F02, respectively; 7,774 and 9,385 alternative splicing events were predicted for F01 and F02, respectively; and 63,890 simple sequence repeats, 59,460 complete coding sequences, and 535 long non-coding RNAs were predicted. Moreover, 5,194 and 5,369 novel transcripts were identified by mapping to Brassica rapa ssp. pekinensis Chiifu. The obtained transcriptome data may improve turnip genome annotation and facilitate further study of the Brassica rapa var. Rapa genome and transcriptome.


Author(s):  
Yi Liao ◽  
Xinwen Zhang ◽  
Mahul Chakraborty ◽  
J.J. Emerson

Topologically associating domains (TADs) were recently identified as fundamental units of three-dimensional eukaryotic genomic organization, though our knowledge of the influence of TADs on genome evolution remains preliminary. To study the molecular evolution of TADs in Drosophila species, we constructed a new reference-grade genome assembly and accompanying high-resolution TAD map for D. pseudoobscura. Comparison of D. pseudoobscura and D. melanogaster, which are separated by ~49 million years of divergence, showed that ~30-40% of their genomes retain conserved TADs. Comparative genomic analysis of 17 Drosophila species revealed that chromosomal rearrangement breakpoints are enriched at TAD boundaries but depleted within TADs. Additionally, genes within conserved TADs exhibit lower expression divergence than those located in nonconserved TADs. Furthermore, we found that a substantial proportion of long genes (>50 kbp) in D. melanogaster (42%) and D. pseudoobscura (26%) constitute their own TADs, implying transcript structure may be one of the deterministic factors for TAD formation. Using structural variants (SVs) identified from 14 D. melanogaster strains, its 3 closest sibling species from the D. simulans species complex, and two obscura clade species, we uncovered evidence of selection acting on SVs at TAD boundaries, but with the nature of selection differing between SV types. Deletions are depleted at TAD boundaries in both divergent and polymorphic SVs, suggesting purifying selection, whereas divergent tandem duplications are enriched at TAD boundaries relative to polymorphism, suggesting they are adaptive. Our findings highlight how important TADs are in shaping the acquisition and retention of structural mutations that fundamentally alter genome organization.


2020 ◽  
Author(s):  
Yu Du ◽  
Jie Wang ◽  
Huazhi Chen ◽  
Xiaoxue Fan ◽  
Zhiwei Zhu ◽  
...  

Abstract: Ascosphaera apis is an entomopathogenic fungus that exclusively infects honeybee larvae, resulting in chalkbrood disease, a widespread fungal disease that damages the beekeeping industry worldwide. In this article, purified mycelia (Aam) and spores (Aas) from a pure culture of A. apis grown under laboratory conditions were sequenced on the PacBio Sequel platform. In total, 13,302,489 and 9,911,345 subreads were obtained from Aam and Aas, respectively; 394,142 and 274,928 circular consensus sequence (CCS) reads were identified as full-length non-chimeric (FLNC) reads, with mean lengths of 2820 bp and 2602 bp, respectively. Furthermore, 174,095 and 103,845 corrected isoforms were identified, with N50 lengths of 3543 bp and 3262 bp, respectively. The reported full-length transcriptome data of A. apis mycelium and spore will provide a valuable resource for improving genome and transcriptome annotations and for better understanding transcript structure, such as alternative splicing and polyadenylation.

Value of the data
The current dataset offers a set of high-quality full-length transcripts of A. apis.
The data can facilitate the improvement of A. apis genome and transcriptome annotations.
This dataset benefits further exploration of alternative splicing and polyadenylation of A. apis mRNAs.
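For readers unfamiliar with the N50 statistic quoted above, the sketch below shows one common way to compute it from a list of sequence lengths: sort the lengths from longest to shortest and report the length at which the running total first reaches half of all bases. The function name `n50` and the example lengths are illustrative and are not taken from the dataset.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    The N50 is the length L such that sequences of length >= L together
    contain at least half of the total number of bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Toy isoform lengths in bp, chosen only for illustration.
    print(n50([1200, 3500, 2800, 900, 4100]))
```

Because long sequences dominate the running total, the N50 is usually larger than the mean length, which is consistent with the pattern in the figures reported above.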


Cell Reports ◽  
2019 ◽  
Vol 27 (13) ◽  
pp. 3988-4002.e5 ◽  
Author(s):  
Tina O’Grady ◽  
April Feswick ◽  
Brett A. Hoffman ◽  
Yiping Wang ◽  
Eva M. Medina ◽  
...  
