Do aye-ayes echolocate? Studying convergent genomic evolution in a primate auditory specialist

2016 ◽  
Author(s):  
Richard J. Bankoff ◽  
Michael Jerjos ◽  
Baily Hohman ◽  
M Elise Lauterbur ◽  
Logan Kistler ◽  
...  

Abstract Several taxonomically distinct mammalian groups – certain microbats and cetaceans (e.g. dolphins) – share both morphological adaptations related to echolocation behavior and strong signatures of convergent evolution at the amino acid level across seven genes related to auditory processing. Aye-ayes (Daubentonia madagascariensis) are nocturnal lemurs with a derived auditory processing system. Aye-ayes tap rapidly along the surfaces of dead trees, listening to reverberations to identify the mines of wood-boring insect larvae; this behavior has been hypothesized to functionally mimic echolocation. Here we investigated whether there are signals of genomic convergence between aye-ayes and known mammalian echolocators. We developed a computational pipeline (BEAT: Basic Exon Assembly Tool) that produces consensus sequences for regions of interest from shotgun genomic sequencing data for non-model organisms without requiring de novo genome assembly. We reconstructed complete coding region sequences for the seven convergent echolocating bat-dolphin genes for aye-ayes and another lemur. Sequences were compared in a phylogenetic framework to those of bat and dolphin echolocators and appropriate non-echolocating outgroups. Our analysis reaffirms the existence of amino acid convergence at these loci among echolocating bats and dolphins; we also detected unexpected signals of convergence between echolocating bats and both mice and elephants. However, we observed no significant signal of amino acid convergence between aye-ayes and echolocating bats and dolphins; our results thus suggest that aye-aye tap-foraging auditory adaptations represent distinct evolutionary innovations. These results are also consistent with a developing consensus that convergent behavioral ecology is not necessarily a reliable guide to convergent molecular evolution.

2018 ◽  
Vol 35 (15) ◽  
pp. 2654-2656 ◽  
Author(s):  
Guoli Ji ◽  
Wenbin Ye ◽  
Yaru Su ◽  
Moliang Chen ◽  
Guangzao Huang ◽  
...  

Abstract Summary Alternative splicing (AS) is a well-established mechanism for increasing transcriptome and proteome diversity; however, detecting AS events and distinguishing among AS types in organisms without available reference genomes remains challenging. We developed a de novo approach called AStrap for AS analysis without using a reference genome. AStrap identifies AS events by extensive pair-wise alignments of transcript sequences and predicts AS types by a machine-learning model integrating more than 500 assembled features. We evaluated AStrap using collected AS events from reference genomes of rice and human as well as single-molecule real-time sequencing data from Amborella trichopoda. Results show that AStrap can identify many more AS events with comparable or higher accuracy than the competing method. AStrap also possesses a unique feature of predicting AS types, which achieves an overall accuracy of ∼0.87 for different species. Extensive evaluation of AStrap using different parameters, sample sizes and machine-learning models on different species also demonstrates the robustness and flexibility of AStrap. AStrap could be a valuable addition to the community for the study of AS in non-model organisms with limited genetic resources. Availability and implementation AStrap is available for download at https://github.com/BMILAB/AStrap. Supplementary information Supplementary data are available at Bioinformatics online.


2019 ◽  
Author(s):  
Mingye (Christina) Wang ◽  
Erik Mohlhenrich

Abstract RNA editing is a post-transcriptional modification process that alters nucleotides of mRNA and consequently the amino acids of the translated protein without changing the original DNA sequence. In humans and other mammals, amino acid recoding from RNA editing is rare, and most edits are non-adaptive and provide no fitness advantage (1). However, it was recently discovered that in soft-bodied cephalopods, which are exceptionally intelligent and include squid, octopus, and cuttlefish, RNA editing is widespread and positively selected (2). To examine the effects of RNA editing on individual genes, we developed a “diversity score” system that quantitatively assesses the amount of diversity generated in each gene, incorporating combinatorial diversity and the radicalness of amino acid changes. Using this metric, we compiled a list of the top 100 genes across the cephalopod species that are most diversified by RNA editing. This list of candidate genes provides directions for future research into the specific functional impact of RNA editing on individual proteins, in terms of protein structure and cellular function. Additionally, considering the connection of RNA editing to the nervous system and the exceptional intelligence of cephalopods, the candidate genes may shed light on the molecular development of behavioral complexity and intelligence. To further investigate global influences of RNA editing on the transcriptome, we investigated changes in nucleotide composition and codon usage biases in edited genes and in the coleoid transcriptome in general. Results show that these features indeed correlate with editing and may correspond to causes or effects of RNA editing. In addition, we characterized the unusual RNA editing in cephalopods by analyzing the ratio of radical to conservative amino acid substitutions (R/C) and the distribution of amino acid recoding from editing.
Our results show that, compared to model organisms, editing in cephalopods has a significantly decreased R/C ratio and a distinct distribution of amino acid substitutions that favors conservative over radical changes, indicating selection at the amino acid level and providing a potential mechanism for the evolution of widespread RNA editing in cephalopods.
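As a rough illustration of the R/C ratio mentioned in this abstract, the sketch below counts radical and conservative substitutions in a list of recoding events. The property classes (grouping residues by side-chain charge and polarity) and the example substitutions are assumptions made for illustration only; the abstract does not state which classification scheme the authors used.

```python
# Hypothetical property classes (an assumption, not the authors' scheme):
# residues are grouped by side-chain charge and polarity.
PROPERTY_CLASS = {
    **dict.fromkeys("AVLIMFWPG", "nonpolar"),
    **dict.fromkeys("STCYNQ", "polar"),
    **dict.fromkeys("KRH", "positive"),
    **dict.fromkeys("DE", "negative"),
}


def is_radical(original: str, edited: str) -> bool:
    """A substitution is radical here if it changes the property class."""
    return PROPERTY_CLASS[original] != PROPERTY_CLASS[edited]


def radical_to_conservative_ratio(substitutions) -> float:
    """Return R/C for an iterable of (original, edited) residue pairs.

    Pairs are assumed to be non-synonymous (original != edited).
    """
    pairs = list(substitutions)
    radical = sum(1 for a, b in pairs if is_radical(a, b))
    conservative = len(pairs) - radical
    if conservative == 0:
        raise ValueError("R/C is undefined without conservative substitutions")
    return radical / conservative


# Made-up recoding events, for illustration only.
example = [("K", "R"), ("I", "V"), ("S", "G"), ("N", "D")]
print(radical_to_conservative_ratio(example))
```

Under this grouping, K to R and I to V stay within a class (conservative), while S to G and N to D cross classes (radical). A lower ratio in one data set than in another would point in the direction the abstract describes, although a real analysis would also need to account for the expected ratio under random editing.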


Author(s):  
Ann McCartney ◽  
Elena Hilario ◽  
Seung-Sub Choi ◽  
Joseph Guhlin ◽  
Jessie Prebble ◽  
...  

We used long read sequencing data generated from Knightia excelsa R.Br., a nectar-producing Proteaceae tree endemic to Aotearoa New Zealand, to explore how sequencing data type, volume and workflows can impact final assembly accuracy and chromosome construction. Establishing a high-quality genome for this species has specific cultural importance to Māori, the indigenous people, as well as commercial importance to honey producers in Aotearoa New Zealand. Assemblies were produced by five long read assemblers using data subsampled based on read lengths, two polishing strategies, and two Hi-C mapping methods. Our results from subsampling the data by read length showed that each assembler tested performed differently depending on the coverage and the read length of the data. Assemblies that used longer read lengths (>30 kb) and lower coverage were the most contiguous, kmer and gene complete. The final genome assembly was constructed into pseudo-chromosomes using all available data assembled with FLYE, polished using Racon/Medaka/Pilon combined, scaffolded using SALSA2 and AllHiC, curated using Juicebox, and validated by synteny with Macadamia. We highlighted the importance of developing assembly workflows based on the volume and type of sequencing data and establishing a set of robust quality metrics for generating high quality assemblies. Scaffolding analyses highlighted that problems found in the initial assemblies could not be resolved accurately by utilizing Hi-C data and that scaffolded assemblies were more accurate when the underlying contig assembly was of higher accuracy. These findings provide insight into what is required for future high-quality de novo assemblies of non-model organisms.
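Contiguity of an assembly is commonly summarized with the N50 statistic. The abstract does not say which contiguity metric was used, so N50 here is an assumption, and the contig lengths are invented for illustration. A minimal sketch:

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembled bases.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


# Invented contig lengths (in kb) for two hypothetical assemblies
# of the same total size.
fragmented = [100, 80, 50, 30, 20, 10]
contiguous = [200, 60, 30]
print(n50(fragmented), n50(contiguous))
```

A higher N50 at the same total size indicates a more contiguous assembly, but it says nothing about correctness, which is why the abstract pairs contiguity with kmer and gene completeness.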


2019 ◽  
Vol 151 (7) ◽  
pp. 944-953
Author(s):  
Jae Seung Lee ◽  
Hae-Jin Kweon ◽  
Hyosang Lee ◽  
Byung-Chang Suh

Acid-sensing ion channels (ASICs), sensory molecules that continuously monitor the concentration of extracellular protons and initiate diverse intracellular responses through an influx of cations, are assembled from six subtypes that can differentially combine to form various trimeric channel complexes and elicit unique electrophysiological responses. For instance, homomeric ASIC1a channels have been shown to exhibit prolonged desensitization, and acid-evoked currents become smaller when the channels are repeatedly activated by extracellular protons, whereas homomeric or heteromeric ASIC2a channels continue to respond to repetitive acidic stimuli without exhibiting such desensitization. Although previous studies have provided evidence that both the desensitization of ASIC1a and rapid resensitization of ASIC2a commonly require domains that include the N terminus and the first transmembrane region of these channels, the biophysical basis of channel gating at the amino acid level has not been clearly determined. Here, we confirm that domain-swapping mutations replacing the N terminus of ASIC2a with that of ASIC2b result in de novo prolonged desensitization in homomeric channels following activation by extracellular protons. Such desensitization of chimeric ASIC2a mutants is due neither to internalization nor to degradation of the channel proteins. We use site-directed mutagenesis to narrow down the relevant portion of the N terminus of ASIC2a, identifying three amino acid residues within the N terminus (T25, T39, and I40) whose mutation is sufficient to phenocopy the desensitization exhibited by the chimeric mutants. A similar desensitization is observed in heteromeric ASICs containing the mutant subunit. These results suggest that T25, T39, and I40 of ASIC2a are key residues determining the rapid resensitization of homomeric and heteromeric ASIC2a channels upon proton activation.


1992 ◽  
Vol 285 (3) ◽  
pp. 985-992 ◽  
Author(s):  
B Rajput ◽  
J Ma ◽  
N Muniappa ◽  
L Schantz ◽  
S L Naylor ◽  
...  

A cDNA encoding UDP-GlcNAc-dolichyl-phosphate N-acetylglucosaminephosphotransferase (GPT; EC 2.7.8.15), an enzyme that catalyses the first step in the synthesis of dolichol-linked oligosaccharides, was isolated from mRNA prepared from mouse mammary glands. The cDNA contains an open reading frame that codes for a protein of 410 amino acids with a predicted molecular mass of 46.472 kDa. Mouse GPT has two copies of a putative dolichol-recognition sequence that has so far been identified in all eukaryotic enzymes which interact with dolichol, and four consensus sites for asparagine-linked glycosylation. It shows a high degree of conservation with yeast and hamster GPTs at the amino acid level. The mouse GPT cDNA recognized a single mRNA species of about 2 kb in mouse mammary glands when used as a probe in Northern blot analysis. An antiserum raised against a 15-residue peptide, derived from the predicted amino acid sequence of the cloned mouse cDNA, specifically precipitated the activity of GPT from solubilized mouse mammary gland microsomes, and detected a protein of about 48 kDa on Western blot. This size is in good agreement with that predicted from the cDNA sequence, and also with that (46 and 50 kDa) of purified bovine GPT. With the use of a panel of mouse/hamster somatic-cell hybrids and a specific probe derived from the 3′-non-coding region of the mouse cDNA, the GPT gene was mapped to mouse chromosome 17.


2007 ◽  
Vol 88 (4) ◽  
pp. 1288-1294 ◽  
Author(s):  
Giulietta Venturi ◽  
Massimo Ciccozzi ◽  
Stefania Montieri ◽  
Alessandro Bartoloni ◽  
Daniela Francisci ◽  
...  

Twenty-seven strains of Toscana virus, collected over a period of 23 years and isolated from several localities and from different hosts (humans, arthropods and a bat), were investigated by sequencing of a portion of the M genomic segment comprising the GN glycoprotein coding region. Sequence data indicated that the divergence among isolates ranged from 0 to 5.7 % at the nucleotide level and from 0 to 3.4 % at the amino acid level. Phylogenetic analysis revealed four main clusters. A close correspondence between viral strains and area/year of isolation could not be demonstrated, whilst co-circulation of different viral strains in the same area and in the same time period was observed for both patients and environmental viral isolates. Alignment of the deduced amino acid sequences and evolutionary analysis indicated that most of the sites along the gene may be invariable because of purifying and/or neutral selection.
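The divergence figures quoted here are percentages of differing positions between aligned sequences. The sketch below computes an uncorrected pairwise distance (p-distance) as a percentage; whether the authors used an uncorrected or a model-corrected distance is not stated, so this is an assumption, and the sequences are invented.

```python
def percent_divergence(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percentage of differing sites between two aligned sequences.

    Columns where either sequence has a gap are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable (ungapped) sites")
    return 100.0 * differences / compared


# Invented aligned fragments: one mismatch over ten compared sites.
print(percent_divergence("ACGTACGTAC", "ACGTACGAAC"))
```

The same function applies to aligned amino acid sequences, which is how nucleotide-level and amino-acid-level divergence can be reported side by side.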


GigaScience ◽  
2020 ◽  
Vol 9 (5) ◽  
Author(s):  
Graham J Etherington ◽  
Darren Heavens ◽  
David Baker ◽  
Ashleigh Lister ◽  
Rose McNelly ◽  
...  

Abstract Background Whilst much sequencing effort has focused on key mammalian model organisms such as mouse and human, little is known about the relationship between genome sequencing techniques for non-model mammals and genome assembly quality. This is especially relevant to non-model mammals, where the samples to be sequenced are often degraded and of low quality. A key aspect when planning a genome project is the choice of sequencing data to generate. This decision is driven by several factors, including the biological questions being asked, the quality of DNA available, and the availability of funds. Cutting-edge sequencing technologies now make it possible to achieve highly contiguous, chromosome-level genome assemblies, but rely on high-quality high molecular weight DNA. However, funding is often insufficient for many independent research groups to use these techniques. Here we use a range of different genomic technologies generated from a roadkill European polecat (Mustela putorius) to assess various assembly techniques on this low-quality sample. We evaluated different approaches for de novo assemblies and discuss their value in relation to biological analyses. Results Generally, assemblies containing more data types achieved better scores in our ranking system. However, when accounting for misassemblies, this was not always the case for Bionano and low-coverage 10x Genomics (for scaffolding only). We also find that the extra cost associated with combining multiple data types is not necessarily associated with better genome assemblies. Conclusions The high degree of variability between each de novo assembly method (assessed from the 7 key metrics) highlights the importance of carefully devising the sequencing strategy to be able to carry out the desired analysis. 
Adding more data to genome assemblies does not always result in better assemblies, so it is important to understand the nuances of genomic data integration explained here, in order to obtain cost-effective value for money when sequencing genomes.


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Rui Xiong ◽  
Ting He ◽  
Yamei Wang ◽  
Shifan Liu ◽  
Yameng Gao ◽  
...  

Abstract Background Panax notoginseng (Burk.) F. H. Chen (P. notoginseng) is a medicinal plant. The cytochrome P450 (CYP450) monooxygenase superfamily is involved in the synthesis of a variety of plant hormones. Studies have shown that CYP450 is involved in the synthesis of saponins, which are the main medicinal component of P. notoginseng. To date, the P. notoginseng CYP450 family has not been systematically studied, and its gene functions remain unclear. Results In this study, a total of 188 PnCYP genes were identified; these genes were divided into 41 subfamilies and clustered into 9 clans. Moreover, we identified 40 paralogous pairs, of which only two had a Ka/Ks ratio greater than 1, demonstrating that most PnCYPs underwent purifying selection during evolution. In chromosome mapping and gene replication analysis, 8 tandem duplication and 11 segmental duplication events demonstrated that PnCYP genes were continuously replicating during their evolution. Gene ontology (GO) analysis annotated the functions of 188 PnCYPs into 21 functional subclasses, suggesting the functional diversity of these gene families. Functional divergence analysis of the members of the three primitive branches (CYP51, CYP74 and CYP97) at the amino acid level identified some critical amino acid sites. The expression pattern of PnCYP450 related to nitrogen treatment was studied using transcriptome sequencing data; 10 genes were significantly up-regulated and 37 genes were significantly down-regulated. Combined with transcriptome sequencing analysis, five potential functional genes were screened. Quantitative real-time PCR (qRT-PCR) indicated that these five genes responded to methyl jasmonate (MEJA) and abscisic acid (ABA) treatment. Conclusions These results provide a valuable basis for comprehending the classification and biological functions of PnCYPs, and offer clues to study their biological functions in response to nitrogen treatment.
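The Ka/Ks reasoning in this abstract follows the usual convention: a ratio below 1 suggests purifying selection, a ratio above 1 suggests positive selection, and a ratio near 1 suggests neutral evolution. The sketch below applies that rule to a few paralogous pairs. The pair names and the Ka and Ks values are hypothetical, and estimating Ka and Ks themselves requires a codon-aware method that is not shown.

```python
from collections import Counter


def classify_selection(ka: float, ks: float) -> str:
    """Label a gene pair by its Ka/Ks ratio using the conventional cutoff of 1."""
    if ks <= 0:
        raise ValueError("Ks must be positive to form a ratio")
    ratio = ka / ks
    if ratio > 1:
        return "positive"
    if ratio < 1:
        return "purifying"
    return "neutral"


# Hypothetical (Ka, Ks) estimates for illustrative paralogous pairs.
pairs = {
    "pairA": (0.02, 0.10),
    "pairB": (0.05, 0.40),
    "pairC": (0.30, 0.20),
}
labels = {name: classify_selection(ka, ks) for name, (ka, ks) in pairs.items()}
print(Counter(labels.values()))
```

Tallying the labels this way mirrors the summary in the abstract, where most pairs fall below 1 and only a couple exceed it.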


2019 ◽  
Author(s):  
Graham J Etherington ◽  
Darren Heavens ◽  
David Baker ◽  
Ashleigh Lister ◽  
Rose McNelly ◽  
...  

Abstract Background Whilst much sequencing effort has focused on key mammalian model organisms such as mouse and human, little is known about the correlation between genome sequencing techniques for non-model mammals and genome assembly quality. This is especially relevant to non-model mammals, where the samples to be sequenced are often degraded and of low quality. A key aspect when planning a genome project is the choice of sequencing data to generate. This decision is driven by several factors, including the biological questions being asked, the quality of DNA available, and the availability of funds. Cutting-edge sequencing technologies now make it possible to achieve highly contiguous, chromosome-level genome assemblies, but rely on good-quality high-molecular-weight DNA. The funds to generate and combine these data are often only available within large consortiums and sequencing initiatives, and are often not affordable for many independent research groups. For many researchers, value-for-money is a key factor when considering the generation of genomic sequencing data. Here we use a range of different genomic technologies generated from a roadkill European Polecat (Mustela putorius) to assess various assembly techniques on this low-quality sample. We evaluated different approaches for de novo assemblies and discuss their value in relation to biological analyses. Results Generally, assemblies containing more data types achieved better scores in our ranking system. However, when accounting for misassemblies, this was not always the case for Bionano and low-coverage 10x Genomics (for scaffolding only).
We also find that the extra cost associated with combining multiple data types is not necessarily associated with better genome assemblies. Conclusions The high degree of variability between each de novo assembly method (assessed from the seven key metrics) highlights the importance of carefully devising the sequencing strategy to be able to carry out the desired analysis. Adding more data to genome assemblies does not always result in better assemblies, so it is important to understand the nuances of genomic data integration explained here, in order to obtain cost-effective value-for-money when sequencing genomes.


2019 ◽  
Author(s):  
Korrawe Karunratanakul ◽  
Hsin-Yao Tang ◽  
David W. Speicher ◽  
Ekapol Chuangsuwanich ◽  
Sira Sriswasdi

Abstract Typical analyses of mass spectrometry data only identify amino acid sequences that exist in reference databases. This restricts the possibility of discovering new peptides such as those that contain uncharacterized mutations or originate from unexpected processing of RNAs and proteins. De novo peptide sequencing approaches address this limitation but often suffer from low accuracy and require extensive validation by experts. Here, we develop SMSNet, a deep learning-based hybrid de novo peptide sequencing framework that achieves >95% amino acid accuracy while retaining good identification coverage. Applications of SMSNet on landmark proteomics and peptidomics studies reveal over 10,000 previously uncharacterized HLA antigens and phosphopeptides and, in conjunction with database-search methods, expand the coverage of peptide identification by almost 30%. The power of SMSNet to accurately identify new peptides would make it an invaluable tool for any future proteomics and peptidomics studies – especially cancer neoantigen discovery and proteome characterization of non-model organisms.

