Single molecule, full-length transcript sequencing provides insight into the extreme metabolism of ruby-throated hummingbird Archilochus colubris

2017 ◽  
Author(s):  
Rachael E. Workman ◽  
Alexander M. Myrka ◽  
Elizabeth Tseng ◽  
G. William Wong ◽  
Kenneth C. Welch ◽  
...  

Abstract Hummingbirds can support their high metabolic rates exclusively by oxidizing ingested sugars, which is unsurprising given their sugar-rich nectar diet and use of energetically expensive hovering flight. However, they cannot rely on dietary sugars as a fuel during fasting periods, such as during the night, at first light, or when undertaking long-distance migratory flights, and must instead rely exclusively on onboard lipids. This metabolic flexibility is remarkable both in that the birds can switch between exclusive use of each fuel type within minutes and in that de novo lipogenesis from dietary sugar precursors is the principal way in which fat stores are built, sometimes at exceptionally high rates, such as during the few days prior to a migratory flight. The hummingbird hepatopancreas is the principal location of de novo lipogenesis and likely plays a key role in fuel selection, fuel switching, and glucose homeostasis. Yet understanding how this tissue, and the whole organism, achieves and moderates high rates of energy turnover is hampered by a fundamental lack of information regarding how genes coding for relevant enzymes differ in their sequence, expression, and regulation in these unique animals. To address this knowledge gap, we generated a de novo transcriptome of the hummingbird liver using PacBio full-length cDNA sequencing (Iso-Seq), yielding a total of 8.6 Gb of sequencing data, or 2.6M reads from 4 different size fractions. We analyzed data using the SMRTAnalysis v3.1 Iso-Seq pipeline, including classification of reads and clustering of isoforms (ICE) followed by error-correction (Arrow). With COGENT, we clustered different isoforms into gene families to generate de novo gene contigs. We performed orthology analysis to identify closely related sequences between our transcriptome and other avian and human gene sets. We also aligned our transcriptome against the Calypte anna genome where possible. 
Finally, we closely examined homology of critical lipid metabolic genes between our transcriptome data and avian and human genomes. We confirmed high levels of sequence divergence within hummingbird lipogenic enzymes, suggesting a high probability of adaptive divergent function in the hepatic lipogenic pathways. Our results have leveraged cutting-edge technology and a novel bioinformatics pipeline to provide a compelling first direct look at the transcriptome of this incredible organism.

2021 ◽  
Vol 12 ◽  
Author(s):  
Yupeng Cui ◽  
Xinqiang Gao ◽  
Jianshe Wang ◽  
Zengzhen Shang ◽  
Zhibin Zhang ◽  
...  

Artemisia argyi is an important medicinal plant widely utilized for moxibustion heat therapy in China. The terpenoid biosynthesis process in A. argyi is speculated to play a key role in conferring its medicinal value. However, the molecular mechanism underlying terpenoid biosynthesis remains unclear, in part because the reference genome of A. argyi is unavailable. Moreover, the full-length transcriptome of A. argyi has not yet been sequenced. Therefore, in this study, de novo transcriptome sequencing of A. argyi root, stem, and leaf tissues was performed to identify candidate genes related to terpenoid biosynthesis, by combining the PacBio single-molecule real-time (SMRT) and Illumina next-generation sequencing (NGS) platforms. More than 55.4 Gb of sequencing data and 108,846 full-length non-chimeric reads were generated by the Illumina and PacBio platforms, respectively. Then, 53,043 consensus isoforms were clustered and used to represent 36,820 non-redundant transcripts, of which 34,839 (94.62%) were annotated in public databases. In the comparisons of leaves vs roots and leaves vs stems, 13,850 (7,566 up-regulated, 6,284 down-regulated) and 9,502 (5,284 up-regulated, 4,218 down-regulated) differentially expressed transcripts (DETs) were obtained, respectively. Specifically, the expression profile and KEGG functional enrichment analysis of these DETs indicated that they were significantly enriched in the biosynthesis of amino acids, carotenoids, diterpenoids and flavonoids, as well as in the metabolism of glycine, serine and threonine. Moreover, multiple genes encoding significant enzymes or transcription factors related to diterpenoid biosynthesis were highly expressed in the A. argyi leaves. Additionally, several transcription factor families, such as RLK-Pelle_LRR-L-1 and RLK-Pelle_DLSV, were also identified. 
In conclusion, this study offers a valuable resource for transcriptome information, and provides a functional genomic foundation for further research on molecular mechanisms underlying the medicinal use of A. argyi leaves.


2021 ◽  
Vol 22 (2) ◽  
pp. 787
Author(s):  
Ziqing He ◽  
Yingjuan Su ◽  
Ting Wang

Cephalotaxus oliveri is a tertiary relict conifer endemic to China, where it is listed as a national second-level protected plant. This species has experienced severe changes in temperature and precipitation over the past millions of years, adapting well to harsh environments. In view of global climate change and the species' endangered status, studying how it responds to changes in temperature and precipitation is crucial for its conservation. In this study, single-molecule real-time (SMRT) sequencing and Illumina RNA sequencing were combined to generate the complete transcriptome of C. oliveri. Using the RNA-seq data to correct the SMRT sequencing data, the four tissues yielded 63,831 (root), 58,108 (stem), 33,013 (leaf) and 62,436 (male cone) full-length unigenes, with N50 lengths of 2523, 3480, 3181, and 3267 bp, respectively. Additionally, 35,887, 11,306, 36,422, and 25,439 SSRs were detected for the male cone, leaf, root, and stem, respectively. The number of long non-coding RNAs predicted was largest in the root (11,113), followed by 3408 (stem), 3193 (leaf), and 3107 (male cone). Functional annotation and enrichment analysis of tissue-specific expressed genes revealed their special roles in response to environmental stress and in adaptability across the four tissues. We also characterized the gene families and pathways related to abiotic factors. This work provides a comprehensive transcriptome resource for C. oliveri that will facilitate further studies on the functional genomics and adaptive evolution of the species.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Hai-Feng Tian ◽  
Qiao-Mu Hu ◽  
Zhong Li

Abstract The swamp eel (Monopterus albus) is an economically important fish in China and Southeast Asia and a good model species for studying sex inversion. There are different genetic lineages and multiple local strains of swamp eel in China, and one local strain of M. albus with deep yellow coloration and big spots has been chosen for consecutive selective breeding owing to its superior growth rate and fecundity. A high-quality reference genome of the swamp eel would be a very useful resource for future selective breeding programs. In the present study, we applied the PacBio single-molecule real-time (SMRT) sequencing technique and high-throughput chromosome conformation capture (Hi-C) technology to assemble the M. albus genome. A 799 Mb genome was obtained with a contig N50 length of 2.4 Mb and a scaffold N50 length of 67.24 Mb, indicating 110-fold and ∼31.87-fold improvements compared to the earlier released assembly (∼22.24 Kb and 2.11 Mb, respectively). Aided by Hi-C data, a total of 750 contigs were reliably assembled into 12 chromosomes. Using the 22,373 protein-coding genes annotated here, the phylogenetic relationships of the swamp eel with other teleosts showed that the swamp eel separated from the common ancestor of the zig-zag eel ∼49.9 million years ago, and 769 gene families were found to be expanded, which are mainly enriched in the immune system, sensory system, and transport and catabolism. The highly accurate, chromosome-level reference genome of M. albus obtained in this work will be used for the development of genome-scale selective breeding.
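For readers unfamiliar with the N50 statistic quoted in the abstract above, the following is a minimal illustrative sketch, not the authors' pipeline. It assumes the common definition of N50 (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length). The function name `n50` and the toy length list are hypothetical; only the four N50 figures are taken from the abstract.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Assumed definition: sort lengths from longest to shortest and return
    the length at which the running total first reaches half of the sum.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("n50() needs at least one sequence length")


# Toy example (hypothetical lengths): total = 30, half = 15.
# Longest first: 10, then 10 + 6 = 16 >= 15, so the N50 is 6.
toy_n50 = n50([2, 3, 4, 5, 6, 10])

# Fold improvements recomputed from the rounded values in the abstract.
contig_fold = 2.4e6 / 22.24e3   # contig N50: 2.4 Mb vs ~22.24 Kb
scaffold_fold = 67.24 / 2.11    # scaffold N50: 67.24 Mb vs 2.11 Mb

print(toy_n50, round(contig_fold, 1), round(scaffold_fold, 2))
```

With the rounded values given in the abstract, the contig ratio comes out at roughly 108, close to the reported 110-fold; the small gap presumably reflects rounding of the published N50 figures. The scaffold ratio reproduces the reported ∼31.87-fold.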


2018 ◽  
Vol 35 (15) ◽  
pp. 2654-2656 ◽  
Author(s):  
Guoli Ji ◽  
Wenbin Ye ◽  
Yaru Su ◽  
Moliang Chen ◽  
Guangzao Huang ◽  
...  

Abstract Summary: Alternative splicing (AS) is a well-established mechanism for increasing transcriptome and proteome diversity; however, detecting AS events and distinguishing among AS types in organisms without available reference genomes remains challenging. We developed a de novo approach called AStrap for AS analysis without using a reference genome. AStrap identifies AS events by extensive pair-wise alignments of transcript sequences and predicts AS types by a machine-learning model integrating more than 500 assembled features. We evaluated AStrap using collected AS events from reference genomes of rice and human as well as single-molecule real-time sequencing data from Amborella trichopoda. Results show that AStrap can identify many more AS events with comparable or higher accuracy than the competing method. AStrap also possesses the unique feature of predicting AS types, achieving an overall accuracy of ∼0.87 across different species. Extensive evaluation of AStrap using different parameters, sample sizes and machine-learning models on different species also demonstrates the robustness and flexibility of AStrap. AStrap could be a valuable addition to the community for the study of AS in non-model organisms with limited genetic resources. Availability and implementation: AStrap is available for download at https://github.com/BMILAB/AStrap. Supplementary information: Supplementary data are available at Bioinformatics online.


BMC Genomics ◽  
2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Pingping Liang ◽  
Hafiz Sohaib Ahmed Saqib ◽  
Xiaomin Ni ◽  
Yingjia Shen

Abstract Background: Marine medaka (Oryzias melastigma) is considered an important ecotoxicological indicator for studying the biochemical, physiological and molecular responses of marine organisms to increasing amounts of pollutants in marine and estuarine waters. Results: In this study, we report a high-quality and accurate de novo genome assembly of marine medaka through the integration of single-molecule sequencing, Illumina paired-end sequencing, and 10X Genomics linked-reads. The 844.17 Mb assembly is estimated to cover more than 98% of the genome and is more continuous, with fewer gaps and errors, than the previous genome assembly. Comparison of O. melastigma with closely related species showed significant expansion of gene families associated with DNA repair and ATP-binding cassette (ABC) transporter pathways. We identified 274 genes that appear to be under significant positive selection and are involved in DNA repair, cellular transportation processes, and conservation and stability of the genome. The positive selection of genes and the considerable expansion in gene numbers, especially those related to stimulus responses, provide strong support for adaptations of O. melastigma to varying environmental stresses. Conclusions: The highly contiguous marine medaka genome and comparative genomic analyses will increase our understanding of the mechanisms underlying its extraordinary adaptation capability, accelerating ongoing and future investigations in marine ecotoxicology.


Forests ◽  
2020 ◽  
Vol 11 (8) ◽  
pp. 866
Author(s):  
Lei Kan ◽  
Qicong Liao ◽  
Zhiyao Su ◽  
Yushan Tan ◽  
Shuyu Wang ◽  
...  

Madhuca pasquieri (Dubard) Lam. is a tree on the International Union for Conservation of Nature Red List and a national key protected wild plant (II) of China, known for its seed oil and timber. However, the lack of genomic and transcriptome data for this species hampers study of its reproduction, utilization, and conservation. Here, single-molecule long-read sequencing (PacBio) and next-generation sequencing (Illumina) were combined to obtain the transcriptome from five developmental stages of M. pasquieri. Overall, 25,339 transcript isoforms were detected by PacBio, including 24,492 coding sequences (CDSs), 9440 simple sequence repeats (SSRs), 149 long non-coding RNAs (lncRNAs), and 182 alternative splicing (AS) events, the majority of which were retained introns (RI). A further 1058 transcripts were identified as transcription factors (TFs) from 51 TF families. PacBio recovered more full-length transcript isoforms, with longer lengths and higher expression levels, whereas a larger number of transcripts (124,405) was captured de novo from Illumina data. Using the Nr, Swissprot, KOG, and KEGG databases, 24,405 of the PacBio transcripts (96.31%) were annotated. Functional annotation revealed a role for the auxin, abscisic acid, gibberellin, and cytokinin metabolic pathways in seed germination and post-germination. These findings support further studies on the seed germination mechanism and genome of M. pasquieri, and better protection of this endangered species.


2019 ◽  
Vol 37 (4) ◽  
pp. 1193-1201 ◽  
Author(s):  
Mathieu Genete ◽  
Vincent Castric ◽  
Xavier Vekemans

Abstract Plant self-incompatibility (SI) is a genetic system that prevents selfing and enforces outcrossing. Because of strong balancing selection, the genes encoding SI are predicted to maintain extraordinarily high levels of polymorphism, both in terms of the number of functionally distinct S-alleles that segregate in SI species and in terms of their nucleotide sequence divergence. However, because of these two combined features, documenting polymorphism of these genes also presents important methodological challenges that have so far largely prevented the comprehensive analysis of complete allelic series in natural populations, and have also precluded obtaining complete genic sequences for many S-alleles. Here, we develop a powerful methodological approach based on a computationally optimized comparison of short Illumina sequencing reads from genomic DNA to a database of known nucleotide sequences of the extracellular domain of SRK (eSRK). By examining mapping patterns along the reference sequences, we obtain highly reliable predictions of S-genotypes from individuals collected from natural populations of Arabidopsis halleri. Furthermore, using a de novo assembly approach of the filtered short reads, we obtain full-length sequences of eSRK even when the initial sequence in the database was only partial, and we discover putative new SRK alleles that were not initially present in the database. When including those new alleles in the reference database, we were able to resolve the complete diploid SI genotypes of all individuals. Beyond the specific case of Brassicaceae S-alleles, our approach can be readily applied to other polymorphic loci, provided that reference allelic sequences are available.


2017 ◽  
Vol 114 (46) ◽  
pp. E9873-E9882 ◽  
Author(s):  
Gal Haimovich ◽  
Christopher M. Ecker ◽  
Margaret C. Dunagin ◽  
Elliott Eggan ◽  
Arjun Raj ◽  
...  

RNAs have been shown to undergo transfer between mammalian cells, although the mechanism behind this phenomenon and its overall importance to cell physiology are not well understood. Numerous publications have suggested that RNAs (microRNAs and incomplete mRNAs) undergo transfer via extracellular vesicles (e.g., exosomes). However, in contrast to a diffusion-based transfer mechanism, we find that full-length mRNAs undergo direct cell–cell transfer via cytoplasmic extensions characteristic of membrane nanotubes (mNTs), which connect donor and acceptor cells. By employing a simple coculture experimental model and using single-molecule imaging, we provide quantitative data showing that mRNAs are transferred between cells in contact. Examples of mRNAs that undergo transfer include those encoding GFP, mouse β-actin, and human Cyclin D1, BRCA1, MT2A, and HER2. We show that intercellular mRNA transfer occurs in all coculture models tested (e.g., between primary cells, immortalized cells, and in cocultures of immortalized human and murine cells). Rapid mRNA transfer is dependent upon actin but is independent of de novo protein synthesis and is modulated by stress conditions and gene-expression levels. Hence, this work supports the hypothesis that full-length mRNAs undergo transfer between cells through a refined structural connection. Importantly, unlike the transfer of miRNA or RNA fragments, this process of communication transfers genetic information that could potentially alter the acceptor cell proteome. This phenomenon may prove important for the proper development and functioning of tissues as well as for host–parasite or symbiotic interactions.


2019 ◽  
Author(s):  
Raúl A. González-Pech ◽  
Timothy G. Stephens ◽  
Yibi Chen ◽  
Amin R. Mohamed ◽  
Yuanyuan Cheng ◽  
...  

Abstract Symbiodiniaceae are predominantly symbiotic dinoflagellates critical to corals and other reef organisms. Symbiodinium is a basal symbiodiniacean lineage and includes symbiotic and free-living taxa. However, the molecular mechanisms underpinning these distinct lifestyles remain poorly understood. Here, we present high-quality de novo genome assemblies for the symbiotic Symbiodinium tridacnidorum CCMP2592 (genome size 1.3 Gbp) and the free-living Symbiodinium natans CCMP2548 (genome size 0.74 Gbp). These genomes display extensive sequence divergence, sharing only ~1.5% conserved regions (≥90% identity). We predicted 45,474 and 35,270 genes for S. tridacnidorum and S. natans, respectively; of the 58,541 homologous gene families, 28.5% are common to both genomes. We recovered a greater extent of gene duplication and higher abundance of repeats, transposable elements and pseudogenes in the genome of S. tridacnidorum than in that of S. natans. These findings demonstrate that genome structural rearrangements are pertinent to distinct lifestyles in Symbiodinium, and may contribute to the vast genetic diversity within the genus, and more broadly in Symbiodiniaceae. Moreover, the results from our whole-genome comparisons against a free-living outgroup support the notion that the symbiotic lifestyle is a derived trait in, and that the free-living lifestyle is ancestral to, Symbiodinium.


Author(s):  
Wei Li ◽  
Kui Li ◽  
Ying Huang ◽  
Cong Shi ◽  
Wu-Shu Hu ◽  
...  

Abstract Asian cultivated rice is believed to have been domesticated from an immediate ancestral progenitor, Oryza rufipogon, which provides promising sources of novel alleles for world rice improvement. Here we first present a high-quality de novo assembly of the typical O. rufipogon genome through the integration of single-molecule sequencing (SMRT), 10× and Hi-C technologies. This chromosome-based reference genome allows a multi-species comparative analysis of the annual selfing O. sativa and its two wild progenitors, the annual selfing O. nivara and the perennial outcrossing O. rufipogon, identifying massive numbers of dispensable genes that are functionally enriched in reproductive processes. Comparative genomic analyses identified millions of genomic variants, of which large-effect mutations (e.g., SVs, CNVs and PAVs) may affect the variation of agronomically significant traits. We demonstrate how lineage-specific expansion of rice gene families may have contributed to the formation of reproductive isolation (e.g., the recognition of pollen and male sterility), thus highlighting their role in driving mating system evolution during recent speciation. We document thousands of positively selected genes that are mainly involved in flower development, ripening, pollination, reproduction and response to biotic and abiotic stresses. We show that selection pressures may serve as crucial forces governing substantial genomic alterations among the three rice species that form the genetic basis of rapid evolution of mating and reproductive systems under diverse habitats. This first chromosome-based wild rice genome in the genus Oryza will be a powerful resource for accelerating the exploration of untapped genomic diversity from wild rice for the enhancement of elite rice cultivars.

