aYChr-DB: a database of ancient human Y haplogroups

2020 ◽  
Vol 2 (4) ◽  
Author(s):  
Laurence Freeman ◽  
Conrad Stephen Brimacombe ◽  
Eran Elhaik

Ancient Y-Chromosomal DNA is an invaluable tool for dating and discerning the origins of migration routes and demographic processes that occurred thousands of years ago. Driven by the adoption of high-throughput sequencing and capture enrichment methods in paleogenomics, the number of published ancient genomes has nearly quadrupled within the last three years (2018–2020). Whereas ancient mtDNA haplogroup repositories are available, no similar resource exists for ancient Y-Chromosomal haplogroups. Here, we present aYChr-DB, a comprehensive collection of 1797 ancient Eurasian human Y-Chromosome haplogroups ranging from 44 930 BC to 1945 AD. We include descriptors of age, location, genomic coverage and associated archaeological cultures. We also produced a visualization of ancient Y haplogroup distribution over time. The aYChr-DB database is a valuable resource for population genomic and paleogenomic studies.

Viruses ◽  
2021 ◽  
Vol 13 (7) ◽  
pp. 1304
Author(s):  
Nicolás Bejerman ◽  
Ralf G. Dietzgen ◽  
Humberto Debat

Rhabdoviruses infect a large number of plant species and cause significant crop diseases. They have a negative-sense, single-stranded unsegmented or bisegmented RNA genome. The number of plant-associated rhabdovirid sequences has grown in the last few years in concert with the extensive use of high-throughput sequencing platforms. Here, we report the discovery of 27 novel rhabdovirus genomes associated with 25 different host plant species and one insect, which were hidden in public databases. These viral sequences were identified through homology searches in more than 3000 plant and insect transcriptomes from the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) using known plant rhabdovirus sequences as the query. The identification, assembly and curation of raw SRA reads resulted in sixteen viral genome sequences with full-length coding regions and ten partial genomes. Highlights of the obtained sequences include viruses with unique and novel genome organizations among known plant rhabdoviruses. Phylogenetic analysis showed that thirteen of the novel viruses were related to cytorhabdoviruses, one to alphanucleorhabdoviruses, five to betanucleorhabdoviruses, one to dichorhaviruses and seven to varicosaviruses. These findings resulted in the most complete phylogeny of plant rhabdoviruses to date and shed new light on the phylogenetic relationships and evolutionary landscape of this group of plant viruses. Furthermore, this study provided additional evidence for the complexity and diversity of plant rhabdovirus genomes and demonstrated that analyzing SRA public data provides an invaluable tool to accelerate virus discovery, gain evolutionary insights and refine virus taxonomy.


Blood ◽  
2012 ◽  
Vol 120 (21) ◽  
pp. 2326-2326
Author(s):  
Jennifer Doss ◽  
Dereje Jima ◽  
Deepak Voora ◽  
Sandeep Dave ◽  
Jen-Tsan Ashley Chi

Abstract 2326. Human mature red blood cells (RBC) and platelets are both terminally differentiated cells lacking nuclei. However, these two cell types do possess a diverse and abundant set of microRNAs (miRNAs), a set of small, non-coding RNAs acting as post-transcriptional regulators. To identify novel microRNAs associated with differentiation of RBCs and platelets from common progenitors, we performed high-throughput sequencing of these differentiated cell types. In particular, these accessible cells may prove valuable to identify disease biomarkers. We identified an unbiased set of both known and novel microRNAs by preparing small RNA libraries for application to the Illumina GAII high-throughput sequencing platform. We used a modified version of the probabilistic modeling algorithm, miRDeep (Friedländer 2008), to identify many novel and known microRNAs. Genomic loci that overlapped with miRNAs described in miRBase were identified as known miRNAs. The remaining genomic loci were identified as encoding candidate novel miRNAs. In RBCs we identified 253 predicted miRNA precursor loci, with 226 miRNA precursor loci annotated in miRBase (known miRNAs), whereas the remaining 27 precursor loci were identified as novel miRNAs. In platelets we identified 566 predicted miRNA precursor loci, with 488 known miRNAs and 78 novel miRNAs. Other small RNAs that did not pass miRDeep criteria were also analyzed. One of the most abundant RNA sequences in the RBC sample consisted of a distinct fragment of Ro-associated Y4 RNA (hY4). Y RNAs have been shown to be involved in chromosomal DNA replication, and Y1 and Y4 have been shown to be present in mature erythrocytes (O'Brien 1990). These distinct non-coding RNAs may possess a unique role in erythroid cell expansion. In addition, we assessed dynamic changes in the expression level of several selected microRNAs during human erythropoiesis. We are currently investigating relevant targets and regulatory functions of these microRNAs during erythropoiesis and platelet development. This global analysis will enhance our understanding of events dictating red cell and platelet maintenance and development. Disclosures: No relevant conflicts of interest to declare.
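The known-versus-novel split described in this abstract (predicted loci that overlap a miRBase annotation are "known", the rest are candidate "novel") can be illustrated with a minimal Python sketch. The `Locus` type, the `classify_loci` function, the strand-aware overlap rule and all coordinates below are hypothetical and chosen only for illustration; the abstract does not state how overlap was computed.

```python
from typing import List, NamedTuple, Tuple


class Locus(NamedTuple):
    """A genomic interval; coordinates are 1-based and inclusive (an assumption)."""
    chrom: str
    start: int
    end: int
    strand: str


def overlaps(a: Locus, b: Locus) -> bool:
    """True if two loci share at least one base on the same chromosome and strand."""
    return (
        a.chrom == b.chrom
        and a.strand == b.strand
        and a.start <= b.end
        and b.start <= a.end
    )


def classify_loci(
    predicted: List[Locus], annotated: List[Locus]
) -> Tuple[List[Locus], List[Locus]]:
    """Split predicted precursor loci into (known, novel) by overlap with annotations."""
    known, novel = [], []
    for locus in predicted:
        if any(overlaps(locus, ann) for ann in annotated):
            known.append(locus)
        else:
            novel.append(locus)
    return known, novel


if __name__ == "__main__":
    # Hypothetical coordinates, not taken from the study.
    predicted = [Locus("chr1", 100, 180, "+"), Locus("chr2", 500, 570, "-")]
    annotated = [Locus("chr1", 150, 230, "+")]
    known, novel = classify_loci(predicted, annotated)
    print(len(known), "known;", len(novel), "novel")
```

The linear scan over annotations is fine for a few hundred loci; an interval tree or sorted sweep would be the usual choice for genome-scale annotation sets.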


2021 ◽  
Author(s):  
Nicolas Bejerman ◽  
Humberto Debat

Tymovirales is an order of viruses with positive-sense, single-stranded RNA genomes that mostly infect plants, but also fungi and insects. The number of tymovirid sequences has been growing in the last few years with the extensive use of high-throughput sequencing platforms. Here we report the discovery of 31 novel tymovirid genomes associated with 27 different host plant species, which were hidden in public databases. These viral sequences were identified through homology searches in more than 3,000 plant transcriptomes from the NCBI Sequence Read Archive (SRA) using known tymovirid sequences as query. Identification, assembly and curation of raw SRA reads resulted in 29 viral genome sequences with full-length coding regions, and two partial genomes. Highlights of the obtained sequences include viruses with unique and novel genome organizations among known tymovirids. Phylogenetic analysis showed that six of the novel viruses were related to alphaflexiviruses, seventeen to betaflexiviruses, two to deltaflexiviruses and six to tymoviruses. These findings resulted in the most complete phylogeny of tymovirids to date and shed new light on the phylogenetic relationships and evolutionary landscape of this group of viruses. Furthermore, this study illustrates the complexity and diversity of tymovirid genomes and demonstrates that analyzing SRA public data provides an invaluable tool to accelerate virus discovery and refine virus taxonomy.


2017 ◽  
Author(s):  
Jingjing Zhai ◽  
Jie Song ◽  
Qian Cheng ◽  
Yunjia Tang ◽  
Chuang Ma

Motivation: The epitranscriptome, also known as chemical modifications of RNA (CMRs), is a newly discovered layer of gene regulation, the biological importance of which emerged through analysis of only a small fraction of CMRs detected by high-throughput sequencing technologies. Understanding of the epitranscriptome is hampered by the absence of computational tools for the systematic analysis of epitranscriptome sequencing data. In addition, no tools have yet been designed for accurate prediction of CMRs in plants, or to extend epitranscriptome analysis from a fraction of the transcriptome to its entirety. Results: Here, we introduce PEA, an integrated R toolkit to facilitate the analysis of plant epitranscriptome data. The PEA toolkit contains a comprehensive collection of functions required for read mapping, CMR calling, motif scanning and discovery, and gene functional enrichment analysis. PEA also takes advantage of machine learning technologies for transcriptome-scale CMR prediction, with high prediction accuracy, using the Positive Samples Only Learning algorithm, which addresses the two-class classification problem by using only positive samples (CMRs), in the absence of negative samples (non-CMRs). Hence PEA is a versatile epitranscriptome analysis pipeline covering CMR calling, prediction, and annotation, and we describe its application to predict N6-methyladenosine (m6A) modifications in Arabidopsis thaliana. Experimental results demonstrate that the toolkit achieved 71.6% sensitivity and 73.7% specificity, which is superior to existing m6A predictors. PEA is potentially broadly applicable to the in-depth study of epitranscriptomics. Availability: PEA is implemented using R and available at https://github.com/cma2015/PEA.
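For readers unfamiliar with the two metrics reported for PEA, the following Python sketch shows how sensitivity and specificity are conventionally computed from confusion-matrix counts. It is not part of the PEA toolkit (which is written in R), and the function names and the counts in the usage lines are hypothetical, not taken from the paper.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of real positives (e.g. true m6A sites) that the predictor recovers."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of real negatives (non-modified sites) that the predictor rejects."""
    return true_neg / (true_neg + false_pos)


if __name__ == "__main__":
    # Illustrative counts only; they do not come from the study.
    print(f"sensitivity = {sensitivity(true_pos=80, false_neg=20):.3f}")
    print(f"specificity = {specificity(true_neg=90, false_pos=10):.3f}")
```

Reporting both values matters here because a predictor trained without confirmed negatives could otherwise score well on one metric simply by labeling most sites as positive or as negative.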


2021 ◽  
Author(s):  
Nicolas Bejerman ◽  
Ralf Dietzgen ◽  
Humberto Debat

Rhabdoviruses infect a large number of plant species and cause significant crop diseases. They have a negative-sense, single-stranded unsegmented or bisegmented RNA genome. The number of plant-associated rhabdovirid sequences has grown in the last few years in concert with the extensive use of high-throughput sequencing platforms. Here we report the discovery of 26 novel rhabdovirus genomes associated with 24 different host plant species and one insect, which were hidden in public databases. These viral sequences were identified through homology searches in more than 3,000 plant and insect transcriptomes from the NCBI Sequence Read Archive (SRA) using known plant rhabdovirus sequences as query. Identification, assembly and curation of raw SRA reads resulted in sixteen viral genome sequences with full-length coding regions and ten partial genomes. Highlights of the obtained sequences include viruses with unique and novel genome organizations among known plant rhabdoviruses. Phylogenetic analysis showed that thirteen of the novel viruses were related to cytorhabdoviruses, one to alphanucleorhabdoviruses, five to betanucleorhabdoviruses, one to dichorhaviruses, and six to varicosaviruses. These findings resulted in the most complete phylogeny of plant rhabdoviruses to date and shed new light on the phylogenetic relationships and evolutionary landscape of this group of plant viruses. Furthermore, this study provides additional evidence for the complexity and diversity of plant rhabdovirus genomes and demonstrates that analyzing SRA public data provides an invaluable tool to accelerate virus discovery, gain evolutionary insights and refine virus taxonomy.


2018 ◽  
pp. 214-223
Author(s):  
AM Faria ◽  
MM Pimenta ◽  
JY Saab Jr. ◽  
S Rodriguez

Wind energy expansion worldwide faces various limitations, e.g. land availability, the NIMBY (not in my backyard) attitude, and interference with bird migration routes. Over the years this expansion has been pushing wind farms closer to populated areas, where noise regulation is more stringent. This demands solutions from the wind turbine (WT) industry in order to produce quieter WT units. Airfoil noise prediction can help in the assessment and design of quieter wind turbine blades. Airfoil noise is a composition of many sound sources, and the main noise production mechanisms are airfoil self-noise and turbulent inflow (TI) noise; this work concentrates on the latter. TI noise is classified as an interaction noise, produced by the turbulent inflow incident on the airfoil leading edge (LE). Theoretical and semi-empirical methods for TI noise prediction are already available, based on Amiet’s broadband noise theory. The literature review in this work analyzes many TI noise prediction methods, as well as turbulence energy spectrum modeling. This is followed by a qualitative and quantitative comparison of the most reliable TI noise methodologies, with error estimation relative to the Ffowcs Williams-Hawkings solution for computational aeroacoustics. The final goal of this work is to provide a basis for integrating airfoil inflow noise prediction into a wind turbine noise prediction code.


2019 ◽  
Vol 13 (1-2) ◽  
pp. 95-115
Author(s):  
Brandon Plewe

Historical place databases can be an invaluable tool for capturing the rich meaning of past places. However, this richness presents obstacles to success: the daunting need to simultaneously represent complex information such as temporal change, uncertainty, relationships, and thorough sourcing has been an obstacle to historical GIS in the past. The Qualified Assertion Model developed in this paper can represent a variety of historical complexities using a single, simple, flexible data model based on a) documenting assertions of the past world rather than claiming to know the exact truth, and b) qualifying the scope, provenance, quality, and syntactics of those assertions. This model was successfully implemented in a production-strength historical gazetteer of religious congregations, demonstrating its effectiveness and some challenges.
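The two ideas named in this abstract, storing assertions about the past rather than facts, and qualifying each assertion by scope, provenance, quality and syntactics, can be sketched as a small Python data structure. This is only an illustrative reading of the abstract: the class names, field types, the `assertions_about` helper and the sample record are all hypothetical and are not the paper's actual schema.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Qualification:
    """The four qualifiers named in the abstract, stored here as free text (an assumption)."""
    scope: str        # e.g. the time span or extent the assertion applies to
    provenance: str   # the source that makes the assertion
    quality: str      # how reliable or precise the assertion is judged to be
    syntactics: str   # how the value is expressed or encoded


@dataclass(frozen=True)
class Assertion:
    """A claim about a past place, recorded as a claim rather than as settled truth."""
    subject: str
    predicate: str
    value: str
    qualification: Qualification


def assertions_about(assertions: List[Assertion], subject: str) -> List[Assertion]:
    """Return every recorded assertion for one subject, including conflicting ones."""
    return [a for a in assertions if a.subject == subject]


if __name__ == "__main__":
    # Hypothetical record; not drawn from the gazetteer described in the paper.
    record = Assertion(
        subject="Congregation A",
        predicate="located_in",
        value="Town X",
        qualification=Qualification(
            scope="1850 to 1860",
            provenance="hypothetical parish register",
            quality="approximate",
            syntactics="place name as written in the source",
        ),
    )
    print(assertions_about([record], "Congregation A"))
```

Keeping one flat list of qualified assertions means that two sources disagreeing about the same place can sit side by side without either being overwritten.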

