Genome analysis of Plectus murrayi, a nematode from continental Antarctica

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Xia Xue ◽  
Anton Suvorov ◽  
Stanley Fujimoto ◽  
Adler R Dilman ◽  
Byron J Adams

Abstract Plectus murrayi is one of the most common and locally abundant invertebrates of continental Antarctic ecosystems. Because it is readily cultured on artificial medium in the laboratory and highly tolerant to an extremely harsh environment, P. murrayi is emerging as a model organism for understanding the evolutionary origin and maintenance of adaptive responses to multiple environmental stressors, including freezing and desiccation. The de novo assembled genome of P. murrayi contains 225.741 million base pairs and a total of 14,689 predicted genes. Compared to Caenorhabditis elegans, the architectural components of P. murrayi are characterized by a lower number of protein-coding genes, fewer transposable elements, but more exons, than closely related taxa from less harsh environments. We compared the transcriptomes of lab-reared P. murrayi with wild-caught P. murrayi and found genes involved in growth and cellular processing were up-regulated in lab-cultured P. murrayi, while a few genes associated with cellular metabolism and freeze tolerance were expressed at relatively lower levels. Preliminary comparative genomic and transcriptomic analyses suggest that the observed constraints on P. murrayi genome architecture and functional gene expression, including genome decay and intron retention, may be an adaptive response to persisting in a biotically simplified, yet consistently physically harsh environment.

2014 ◽  
Vol 2014 ◽  
pp. 1-10 ◽  
Author(s):  
Dmitrii E. Polev ◽  
Iuliia K. Karnaukhova ◽  
Larisa L. Krukovskaya ◽  
Andrei P. Kozlov

Human gene LOC100505644 (uncharacterized LOC100505644 [Homo sapiens], Entrez Gene ID 100505644) is abundantly expressed in tumors but weakly expressed in few normal tissues. Until now, the function of this gene has remained unknown. Here we identified the chromosomal borders of the transcribed region and the major splice form of the LOC100505644-specific transcript. We characterised the major regulatory motifs of the gene and its splice sites. Analysis of the secondary structure of the major transcript variant revealed a hairpin-like structure characteristic of precursor microRNAs. Comparative genomic analysis of the locus showed that it originated in primates de novo. Taken together, our data indicate that human gene LOC100505644 encodes a non-protein-coding RNA, likely a microRNA. It was assigned the gene symbol ELFN1-AS1 (ELFN1 antisense RNA 1 (non-protein coding)). This gene combines features of evolutionary novelty and predominant expression in tumors.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Andreas Lange ◽  
Prajal H. Patel ◽  
Brennen Heames ◽  
Adam M. Damry ◽  
Thorsten Saenger ◽  
...  

Abstract Comparative genomic studies have repeatedly shown that new protein-coding genes can emerge de novo from noncoding DNA. Still unknown is how and when the structures of encoded de novo proteins emerge and evolve. Combining biochemical, genetic and evolutionary analyses, we elucidate the function and structure of goddard, a gene which appears to have evolved de novo at least 50 million years ago within the Drosophila genus. Previous studies found that goddard is required for male fertility. Here, we show that Goddard protein localizes to elongating sperm axonemes and that in its absence, elongated spermatids fail to undergo individualization. Combining modelling, NMR and circular dichroism (CD) data, we show that Goddard protein contains a large central α-helix, but is otherwise partially disordered. We find similar results for Goddard’s orthologs from divergent fly species and their reconstructed ancestral sequences. Accordingly, Goddard’s structure appears to have been maintained with only minor changes over millions of years.


2015 ◽  
Author(s):  
Katarzyna B Hooks ◽  
Samina Naseeb ◽  
Sam Griffiths-Jones ◽  
Daniela Delneri

The Saccharomyces cerevisiae genome has undergone extensive intron loss during its evolutionary history. It has been suggested that the few remaining introns (in only 5% of protein-coding genes) are retained because of their impact on function under stress conditions. Here, we explore the possibility that novel non-coding RNA structures (ncRNAs) are embedded within intronic sequences and are contributing to phenotype and intron retention in yeast. We employed de novo RNA structure prediction tools to screen intronic sequences in S. cerevisiae and 36 other fungi. We identified and validated 19 new intronic RNAs via RNAseq and RT-PCR. Contrary to the common belief that excised introns are rapidly degraded, we found that, in six cases, the excised introns were maintained intact in the cells. In two other cases, we showed that the ncRNAs were further processed from their introns. RNAseq analysis confirmed higher expression of introns in the ribosomal protein genes containing predicted RNA structures. We deleted the novel intronic RNA structure within the GLC7 intron and showed that this predicted ncRNA, rather than the intron itself, is responsible for the cell's ability to respond to salt stress. We also showed a direct association between the presence of the intronic ncRNA and GLC7 expression. Overall, these data support the notion that some introns may have been maintained in the genome because they harbour functional ncRNAs.


2019 ◽  
Author(s):  
Nabil Girollet ◽  
Bernadette Rubio ◽  
Pierre-François Bert

Abstract Grapevine is one of the most important fruit species in the world. To better understand the genetic basis of trait variation and facilitate the breeding of new genotypes, we sequenced, assembled, and annotated the genome of the American native Vitis riparia, one of the main species used worldwide for rootstock and scion breeding. A total of 164 Gb of raw DNA reads were obtained from Vitis riparia, resulting in a 225X depth of coverage. We generated a de novo genome assembly of the V. riparia grape using PacBio long reads, which was phased with 10x Genomics Chromium linked reads. At the chromosome level, a 500 Mb genome was generated with a scaffold N50 size of 1 Mb. More than 34% of the whole genome was identified as repeat sequences, and 37,207 protein-coding genes were predicted. This genome assembly sets the stage for comparative genomic analysis of the diversification and adaptation of grapevine and will provide a solid resource for further genetic analysis and breeding of this economically important species.


2021 ◽  
Vol 9 (7) ◽  
pp. 1488
Author(s):  
Anna Grankvist ◽  
Daniel Jaén-Luchoro ◽  
Linda Wass ◽  
Per Sikora ◽  
Christine Wennerås

Tick-borne ‘Neoehrlichia (N.) mikurensis’ is the cause of neoehrlichiosis, an infectious vasculitis of humans. This strict intracellular pathogen is a member of the family Anaplasmataceae and has been unculturable until recently. The only available genetic data on this new pathogen are six partially sequenced housekeeping genes. The aim of this study was to advance the knowledge regarding ‘N. mikurensis’ genomic relatedness with other Anaplasmataceae members, intra-species genotypic variability and potential virulence factors explaining its tropism for vascular endothelium. Here, we present the de novo whole-genome sequences of three ‘N. mikurensis’ strains derived from Swedish patients diagnosed with neoehrlichiosis. The genomes were obtained by extraction of DNA from patient plasma, library preparation using 10x Chromium technology, and sequencing by Illumina Hiseq-4500. ‘N. mikurensis’ was found to have the second-smallest genome of the Anaplasmataceae family (1.1 Mbp with 27% GC content), consisting of 845 protein-coding genes, one third of which have unknown function. Comparative genomic analyses revealed that ‘N. mikurensis’ was more closely related to Ehrlichia chaffeensis than to Ehrlichia ruminantium, the opposite of what 16S rRNA sequence-based phylogenetic analyses determined. The genetic variability of the three whole-genome-sequenced ‘N. mikurensis’ strains was extremely low, between 0.14 and 0.22‰, a variation that was associated with geographic origin. No protein-coding genes exclusively shared by N. mikurensis and E. ruminantium were identified to explain their common tropism for vascular endothelium.


Author(s):  
Alex Dornburg ◽  
Zheng Wang ◽  
Junrui Wang ◽  
Elizabeth S Mo ◽  
Francesc Lopez-Giraldez ◽  
...  

Abstract Comparative genomic analyses have enormous potential for identifying key genes central to human health phenotypes, including those that promote cancers. In particular, the successful development of novel therapeutics using model species requires phylogenetic analyses to determine molecular homology. Accordingly, we investigate the evolutionary histories of anaplastic lymphoma kinase (ALK)—which can underlie tumorigenesis in neuroblastoma, non-small cell lung cancer, and anaplastic large-cell lymphoma—its close relative leukocyte tyrosine kinase (LTK) and their candidate ligands. Homology of ligands identified in model organisms to those functioning in humans remains unclear. Therefore, we searched for homologs of the human genes across metazoan genomes, finding that the candidate ligands Jeb and Hen-1 were restricted to non-vertebrate species. In contrast, the ligand AUG was only identified in vertebrates. We found two ALK-like and four AUG-like protein-coding genes in lamprey. Of these six genes, only one ALK-like and two AUG-like genes exhibited early embryonic expression that parallels model mammal systems. Two copies of AUG are present in nearly all jawed vertebrates. Our phylogenetic analysis strongly supports the presence of previously unrecognized functional convergences of ALK and LTK between actinopterygians and sarcopterygians—despite contemporaneous, highly conserved synteny of ALK and LTK. These findings provide critical guidance regarding the propriety of fish and mammal models with regard to model-organism-based investigation of these medically important genes. In sum, our results provide the phylogenetic context necessary for effective investigations of the functional roles and biology of these critically important receptors.


2021 ◽  
Author(s):  
Andreas Lange ◽  
Prajal H. Patel ◽  
Brennen Heames ◽  
Adam M. Damry ◽  
Thorsten Saenger ◽  
...  

Abstract Comparative genomic studies have repeatedly shown that new protein-coding genes can emerge de novo from non-coding DNA. Still unknown is how and when the structures of encoded de novo proteins emerge and evolve. Combining biochemical, genetic and evolutionary analyses, we elucidate the function and structure of goddard, a gene which appears to have evolved de novo at least 50 million years ago within the Drosophila genus. Previous studies found that goddard is required for male fertility. Here, we show that Goddard protein localizes to elongating sperm axonemes and that in its absence, elongated spermatids fail to undergo individualization. Combining modelling, NMR and CD data, we show that Goddard protein contains a large central α-helix, but is otherwise partially disordered. We find similar results for Goddard’s orthologs from divergent fly species and their reconstructed ancestral sequences. Accordingly, Goddard’s structure appears to have been maintained with only minor changes over millions of years.


2015 ◽  
Vol 112 (11) ◽  
pp. E1257-E1262 ◽  
Author(s):  
Yan-Bo Sun ◽  
Zi-Jun Xiong ◽  
Xue-Yan Xiang ◽  
Shi-Ping Liu ◽  
Wei-Wei Zhou ◽  
...  

The development of efficient sequencing techniques has resulted in large numbers of genomes being available for evolutionary studies. However, only one genome is available for all amphibians, that of Xenopus tropicalis, which is distantly related to the majority of frogs. More than 96% of frogs belong to the Neobatrachia, and no genome exists for this group. This dearth of amphibian genomes greatly restricts genomic studies of amphibians and, more generally, our understanding of tetrapod genome evolution. To fill this gap, we provide the de novo genome of a Tibetan Plateau frog, Nanorana parkeri, and compare it to that of X. tropicalis and other vertebrates. This genome encodes more than 20,000 protein-coding genes, a number similar to that of Xenopus. Although the genome size of Nanorana is considerably larger than that of Xenopus (2.3 vs. 1.5 Gb), most of the difference is due to the respective number of transposable elements in the two genomes. The two frogs exhibit considerable conserved whole-genome synteny despite having diverged approximately 266 Ma, indicating a slow rate of DNA structural evolution in anurans. Multigenome synteny blocks further show that amphibians have fewer interchromosomal rearrangements than mammals but have a comparable rate of intrachromosomal rearrangements. Our analysis also identifies 11 Mb of anuran-specific highly conserved elements that will be useful for comparative genomic analyses of frogs. The Nanorana genome offers an improved understanding of the evolution of tetrapod genomes and also provides a genomic reference for other evolutionary studies.


2019 ◽  
Vol 6 (1) ◽  
Author(s):  
Fen Zhang ◽  
Wei Li ◽  
Cheng-wen Gao ◽  
Dan Zhang ◽  
Li-zhi Gao

Abstract Tea is the oldest and most popular non-alcoholic caffeine-containing beverage in the world. In this study, we de novo assembled the chloroplast (cp) and mitochondrial (mt) genomes of C. sinensis var. assamica cv. Yunkang10 into a circular contig of 157,100 bp and two complete circular scaffolds (701,719 bp and 177,329 bp), respectively. We correspondingly annotated a total of 141 cp genes and 71 mt genes. Comparative analysis suggests a repeat-rich nature of the mt genome compared to the cp genome, for example, with the characterization of 37,878 bp and 149 bp of long repeat sequences and 665 and 214 SSRs, respectively. We also detected 478 RNA-editing sites in 42 protein-coding mt genes, which are ~4.4-fold more than the 54 RNA-editing sites detected in 21 protein-coding cp genes. The high-quality cp and mt genomes of C. sinensis var. assamica presented in this study will become an important resource for a range of genetic, functional, evolutionary and comparative genomic studies in tea tree and other Camellia species of the Theaceae family.


2021 ◽  
Author(s):  
Wei-Hsuan Chuang ◽  
Hsueh-Chien Cheng ◽  
Yu-Jung Chang ◽  
Pao-Yin Fu ◽  
Yi-Chen Huang ◽  
...  

Abstract We propose a novel method, GABOLA, which utilizes long-range genomic information provided by accurate linked short reads jointly with long reads to improve the integrity and resolution of whole genome assemblies, especially in complex genetic regions. We validated GABOLA on human and Japanese eel genomes. On the two human samples, we filled in more bases, spanning 23.3 Mbp and 46.2 Mbp, than the Supernova assembler, covering over 3,200 functional genes, which include 8,500 exons and 15,000 transcripts. Among them, multiple genes related to various types of cancer were identified. Moreover, we discovered an additional 11,031,487 base pairs of repeat sequences and 218 exclusive repeat patterns, some of which are known to be linked to several disorders such as neurodegenerative diseases. As for the eel genome, we successfully raised the genetic benchmarking score to 94.6% while adding 24.7 million base pairs. These results demonstrate the capability of GABOLA in the optimization of whole genome assembly and its potential in precise disease diagnosis and high-quality non-model organism breeding. Availability: The docker image and source code of the GABOLA assembler are available at https://hub.docker.com/r/lsbnb/gabola and https://github.com/lsbnb/gabola, respectively.

