Target enrichment of long open reading frames and ultraconserved elements to link microevolution and macroevolution in non-model organisms

Despite the increasing accessibility of high-throughput sequencing, obtaining high-quality genomic data on non-model organisms without proximate well-assembled and annotated genomes remains challenging. Here we describe a workflow that takes advantage of distant genomic resources and ingroup transcriptomes to select and jointly enrich long open reading frames (ORFs) and ultraconserved elements (UCEs) from genomic samples for integrative studies of microevolutionary and macroevolutionary dynamics. This workflow is applied to samples of the African unionid bivalve tribe Coelaturini (Parreysiinae) at basin and continent-wide scales. Our results indicate that ORFs are efficiently captured without prior identification of intron-exon boundaries. The enrichment of UCEs was less successful, but nevertheless produced a substantial dataset. Exploratory continent-wide phylogenetic analyses with ORF supercontigs (>515,000 parsimony-informative sites) resulted in a fully resolved phylogeny, the backbone of which was also retrieved with UCEs (>11,000 informative sites), although some branches lack support in the latter case. Variant calling on the exome of Coelaturini from the Malawi Basin produced ~2,000 SNPs per population pair. Nucleotide diversity and population differentiation were low compared to previous estimates in mollusks, but comparable to those in recently diversifying Malawi cichlids and other taxa at an early stage of speciation. By skimming non-specific sequence data obtained for Coelaturini of the Malawi Basin, we reconstructed the maternally inherited mitogenome, which displays a gene order identical to that of the most recent common ancestor of Unionidae. Overall, our workflow and results provide exciting perspectives for the development of integrative genomic studies on micro- and macroevolutionary dynamics in non-model organisms.
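The abstract reports nucleotide diversity without stating how it was computed. As a rough illustration only, the sketch below uses the textbook definition (average number of pairwise differences per site among aligned sequences). The function name `nucleotide_diversity` and the toy sequences are hypothetical and are not taken from the study, which worked from SNP calls rather than full alignments.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site for equal-length aligned sequences.

    Hypothetical helper for illustration; gaps and ambiguous bases are
    treated like any other character, which a real pipeline would not do.
    """
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and of equal length")

    pairs = list(combinations(seqs, 2))
    # Count mismatching sites for every unordered pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)


# Toy alignment (invented data): three sequences of eight sites.
toy = ["ACGTACGT", "ACGTACGA", "ACGAACGT"]
print(nucleotide_diversity(toy))
```

For the toy alignment, the three pairs differ at 1, 1 and 2 sites, so the value is 4 differences over 3 pairs and 8 sites.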

Sixty years from the first disease description, a novel badnavirus associated with chestnut mosaic disease

Phytopathology, 2020. DOI: 10.1094/phyto-09-20-0420-r

Author(s): Armelle Marais, Sergio Murolo, Chantal Faure, Yoann Brans, Clement Larue, ...

Keyword(s): High Throughput Sequencing, Molecular Detection, Phylogenetic Analyses, Epidemiological Studies, Mosaic Disease, Open Reading Frames, Sequence Comparisons, High Incidence, The USA, Reading Frames

Although chestnut mosaic disease (ChMD) was described several decades ago, its etiology has still not been elucidated. Here, using classical approaches in combination with high-throughput sequencing (HTS) techniques, we identify a novel Badnavirus that is a strong etiological candidate for ChMD. Two disease sources from Italy and France were submitted to HTS-based viral indexing. Total RNAs were extracted, ribodepleted and sequenced on an Illumina NextSeq500 (2x150 or 2x75 nt). In each source, we identified a single contig of about 7.2 kilobases that corresponds to a complete circular viral genome and shares homology with various badnaviruses. The genomes of the two isolates have an average nucleotide identity of 90.5% with a typical badnaviral genome organization comprising three open reading frames. Phylogenetic analyses and sequence comparisons show that this virus is a novel species for which we propose the name Chestnut mosaic virus (ChMV). Using a newly developed molecular detection test, we systematically detected the virus in symptomatic graft-inoculated indicator plants (chestnut and American oak), as well as in chestnut trees presenting typical ChMD symptoms in the field (100% and 87% in France and Italy surveys, respectively). Data mining of publicly available chestnut SRA transcriptomic data allowed the reconstruction of two additional complete ChMV genomes from two Castanea mollissima sources from the USA, as well as ChMV detection in C. dentata from the USA. Preliminary epidemiological studies, performed in France and in Central Eastern Italy, showed that ChMV has a high incidence in some commercial orchards, with a low within-orchard genetic diversity.

The global phylogeny of Plum pox virus is emerging

Journal of General Virology, Vol 100 (10), pp. 1457-1468, 2019. DOI: 10.1099/jgv.0.001308. Cited By ~ 13

Author(s): Mohammad Hajizadeh, Adrian J. Gibbs, Fahimeh Amirnia, Miroslav Glasa

Keyword(s): Strain Differences, Open Reading Frames, Recent Common Ancestor, Plum Pox Virus, Migrant Populations, Most Recent Common Ancestor, Strong Negative Selection, Show Evidence, Reading Frames, Complete Genomic

The 206 complete genomic sequences of Plum pox virus in GenBank (January 2019) were downloaded. Their main open reading frames (ORFs) were compared by phylogenetic and population genetic methods. All fell into the nine previously recognized strain clusters; the PPV-Rec and PPV-T strain ORFs were all recombinants, whereas most of those in the PPV-C, PPV-CR, PPV-CV, PPV-D, PPV-EA, PPV-M and PPV-W strain clusters were not. The strain clusters ranged in size from 2 (PPV-CV and PPV-EA) to 74 (PPV-D). The isolates of eight of the nine strains came solely from Europe and the Levant (with an exception resulting from a quarantine breach), but many PPV-D strain isolates also came from east and south Asia and the Americas. The estimated time to the most recent common ancestor (TMRCA) of all 134 non-recombinant ORFs was 820 (865–775) BCE. Most strain populations were only a few decades old, and had small intra-strain, but large inter-strain, differences; strain PPV-W was the oldest. Eurasia is clearly the ‘centre of emergence’ of PPV, and the several PPV-D strain populations found elsewhere only show evidence of gene flow with Europe, so have come from separate introductions from Europe. All ORFs and their individual genes show evidence of strong negative selection, except the positively selected pipo gene of the recently migrant populations. The possible ancient origins of PPV are discussed.

Novel Ampeloviruses Infecting Cassava in Central Africa and the South-West Indian Ocean Islands

Viruses, Vol 13 (6), pp. 1030, 2021. DOI: 10.3390/v13061030

Author(s): Yves Kwibuka, Espoir Bisimwa, Arnaud G. Blouin, Claude Bragard, Thierry Candresse, ...

Keyword(s): Heat Shock, Heat Shock Protein, High Throughput Sequencing, Transmembrane Protein, Phylogenetic Analyses, Protein A, Open Reading Frames, Evolutionary Forces, South West Indian Ocean, Reading Frames

Cassava is one of the most important staple crops in Africa and its production is seriously damaged by viral diseases. In this study, we identify for the first time and characterize the genome organization of novel ampeloviruses infecting cassava plants in diverse geographical locations using three high-throughput sequencing protocols [Virion-Associated Nucleic Acid (VANA), dsRNA and total RNA], and we provide a first analysis of the diversity of these agents and of the evolutionary forces acting on them. Thirteen new Closteroviridae isolates were characterized in field-grown cassava plants from the Democratic Republic of Congo (DR Congo), Madagascar, Mayotte, and Reunion islands. The analysis of the sequences of the corresponding contigs (ranging between 10,417 and 13,752 nucleotides in length) revealed seven open reading frames. The replication-associated polyproteins have three expected functional domains: methyltransferase, helicase, and RNA-dependent RNA polymerase (RdRp). Additional open reading frames code for a small transmembrane protein, a heat shock protein 70 homolog (HSP70h), a heat shock protein 90 homolog (HSP90h), and a major and a minor coat protein (CP and CPd, respectively). Defective genomic variants were also identified in some cassava accessions originating from Madagascar and Reunion. The isolates were found to belong to two species tentatively named Manihot esculenta-associated virus 1 and 2 (MEaV-1 and MEaV-2). Phylogenetic analyses showed that MEaV-1 and MEaV-2 belong to the genus Ampelovirus, in particular to its subgroup II. MEaV-1 was found in all of the countries of study, while MEaV-2 was only detected in Madagascar and Mayotte. Recombination analysis provided evidence of intraspecies recombination occurring between the isolates from Madagascar and Mayotte. No clear association with visual symptoms in the cassava host could be identified.

Expanding the Diversity of the IS630-Tc1-mariner Superfamily: Discovery of a Unique DD37E Transposon and Reclassification of the DD37D and DD39D Transposons

Genetics, Vol 159 (3), pp. 1103-1115, 2001. DOI: 10.1093/genetics/159.3.1103. Cited By ~ 1

Author(s): Hongguang Shao, Zhijian Tu

Keyword(s): Phylogenetic Analyses, Mosquito Species, Open Reading Frames, Inverted Repeats, Sequence Comparisons, Wide Range, Multiple Copies, Catalytic Motif, Open The Doors, Reading Frames

A novel transposon named ITmD37E was discovered in a wide range of mosquito species. Sequence analysis of multiple copies in three Aedes species showed similar terminal inverted repeats and common putative TA target site duplications. The ITmD37E transposases contain a conserved DD37E catalytic motif, which is unique among reported transposons of the IS630-Tc1-mariner superfamily. Sequence comparisons and phylogenetic analyses suggest that ITmD37E forms a novel family distinct from the widely distributed Tc1 (DD34E), mariner (DD34D), and pogo (DDxD) families in the IS630-Tc1-mariner superfamily. The inclusion in the phylogenetic analysis of recently reported transposons and transposons uncovered in our database survey provided revisions to previous classifications and identified two additional families, ITmD37D and ITmD39D, which contain DD37D and DD39D motifs, respectively. The above expansion and reorganization may open the doors to the discovery of related transposons in a broad range of organisms and help illustrate the evolution and structure-function relationships among these distinct transposases in the IS630-Tc1-mariner superfamily. The presence of intact open reading frames and highly similar copies in some of the newly characterized transposons suggests recent transposition. Studies of these novel families may add to the limited repertoire of transgenesis and mutagenesis tools for a wide range of organisms, including the medically important mosquitoes.

Evolutionary History of Cucumber Mosaic Virus Deduced by Phylogenetic Analyses

Journal of Virology, Vol 76 (7), pp. 3382-3387, 2002. DOI: 10.1128/jvi.76.7.3382-3387.2002. Cited By ~ 155

Author(s): Marilyn J. Roossinck

Keyword(s): Cucumber Mosaic Virus, Mosaic Virus, Virus Isolate, Phylogenetic Analyses, Open Reading Frames, Broad Host Range, Nontranslated Region, History Of, Evolutionary Success, Reading Frames

Cucumber mosaic virus (CMV) is an RNA plant virus with a tripartite genome and an extremely broad host range. Previous evolutionary analyses with the coat protein (CP) and 5′ nontranslated region (NTR) of RNA 3 suggested subdivision of the virus into three groups, subgroups IA, IB, and II. In this study 15 strains of CMV whose nucleotide sequences have been determined were used for a complete phylogenetic analysis of the virus. The trees estimated for open reading frames (ORFs) located on the different RNAs were not congruent and did not completely support the subgrouping indicated by the CP ORF, indicating that different RNAs had independent evolutionary histories. This is consistent with a reassortment mechanism playing an important role in the evolution of the virus. The evolutionary trees of the 1a and 3a ORFs were more compact and displayed more branching than did those of the 2a and CP ORFs. This may reflect more rigid host-interactive constraints exerted on the 1a and 3a ORFs. In addition, analysis of the 3′ NTR that is conserved among all RNAs indicated that evolutionary constraints on this region are specific to the RNA component rather than the virus isolate. This indicates that functions other than replication are encoded in the 3′ NTR. Reassortment may have led to the genetic diversity found among CMV strains and contributed to its enormous evolutionary success.

Phylogenetic and genetic characterization of Treponema pallidum strains from syphilis patients in Japan by whole-genome sequence analysis from global perspectives

Scientific Reports, Vol 11 (1), 2021. DOI: 10.1038/s41598-021-82337-7

Author(s): Shingo Nishiki, Kenichi Lee, Mizue Kanai, Shu-ichi Nakayama, Makoto Ohnishi

Keyword(s): Genetic Difference, Phylogenetic Analyses, Treponema Pallidum, Whole Genome Sequence, Epidemiological Surveillance, Recent Common Ancestor, Whole Genome, Global Perspectives, Most Recent Common Ancestor, Sex With Men

Japan has had a substantial increase in syphilis cases since 2013. However, research on the genomic features of the Treponema pallidum subspecies pallidum (TPA) strains from these cases has been limited. Here, we elucidated the genetic variations and relationships between TPA strains in Japan (detected between 2014 and 2018) and other countries by whole-genome sequencing and phylogenetic analyses, including syphilis epidemiological surveillance data and information on patient sexual orientation. Seventeen of the 20 strains in Japan were SS14-lineage and the remaining 3 were Nichols-lineage. Sixteen of the 17 SS14-lineage strains were classified into the previously reported Sub-lineage 1B. Sub-lineage 1B strains in Japan have formed distinct sub-clusters of strains from heterosexuals and strains from men who have sex with men. These strains were closely related to reported TPA strains in China, forming an East-Asian cluster. However, the strains in these countries evolved independently after diverging from their most recent common ancestor and expanded their genetic diversity during the time of syphilis outbreak in each country. The genetic difference between the TPA strains in these countries was characterized by single-nucleotide-polymorphism analyses of their penicillin binding protein genes. Taken together, our results elucidated the detailed phylogenetic features and transmission networks of syphilis.

Complete genome sequence of GII.9 norovirus

Archives of Virology, 2021. DOI: 10.1007/s00705-021-05257-x

Author(s): Zilong Zhang, Danlei Liu, Zilei Zhang, Peng Tian, Shenwei Li, ...

Keyword(s): Genome Sequence, Complete Genome Sequence, Complete Genome, High Throughput Sequencing, Open Reading Frames, Viral Sequence, Sequence Comparisons, Rapid Amplification, Genome Level, Reading Frames

Norovirus is recognized as one of the leading causes of acute gastroenteritis outbreaks. Genotype GII.9 was first detected in Norfolk, VA, USA, in 1997. However, the complete genome sequence of this genotype has not yet been determined. In this study, a complete genome sequence of GII.9[P7] norovirus, SCD1878_GII.9[P7], from a patient was determined using high-throughput sequencing and rapid amplification of cDNA ends (RACE) technology. The complete genome sequence of SCD1878_GII.9[P7] is 7544 nucleotides (nt) in length with a 3′ poly(A) tail and contains three open reading frames. Sequence comparisons indicated that SCD1878_GII.9[P7] shares 92.1%-92.3% nucleotide sequence identity with GII.P7 (AB258331 and AB039777) and 96.7%-97.4% identity with GII.9 (AY038599 and DQ379715). The results suggested that SCD1878_GII.9[P7] is a member of P genotype GII.P7 and G genotype GII.9. This viral sequence fills a gap at the whole-genome level for the GII.9 genotype.
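The genotype assignment above rests on pairwise nucleotide sequence identity. The abstract does not say which tool or gap convention was used, so the following is only a minimal sketch under one common convention: identity is the share of matching positions among aligned columns in which neither sequence has a gap. The function name `percent_identity` and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over gap-free columns of a pairwise alignment.

    Assumes both inputs come from the same alignment and therefore have
    equal length. Other tools may count gaps differently.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if x == y:
            matches += 1

    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Toy example (invented data): one mismatch in ten compared positions.
print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
```

Because gap handling changes the denominator, values from this sketch would not necessarily reproduce the percentages reported in the abstract.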

A Founder Effect Led Early SARS-CoV-2 Transmission in Spain

Journal of Virology, 2020. DOI: 10.1128/jvi.01583-20. Cited By ~ 1

Author(s): Francisco Díez-Fuertes, María Iglesias-Caballero, Javier García Pérez, Sara Monzón, Pilar Jiménez, ...

Keyword(s): Founder Effect, Phylogenetic Analyses, Protein A, Spike Protein, Recent Common Ancestor, Whole Genome, Whole Genome Analysis, Most Recent Common Ancestor, Fitness Advantage

SARS-CoV-2 whole-genome analysis has identified five large clades worldwide, which emerged in 2019 (19A and 19B) and in 2020 (20A, 20B and 20C). This study aims to analyze the diffusion of SARS-CoV-2 in Spain using maximum likelihood phylogenetic and Bayesian phylodynamic analyses. The most recent common ancestor (MRCA) of the SARS-CoV-2 pandemic was estimated in Wuhan, China, around November 24, 2019. Phylogenetic analyses of the first 12,511 SARS-CoV-2 whole genome sequences obtained worldwide, including 290 from 11 different regions of Spain, revealed 62 independent introductions of the virus in the country. Most sequences from Spain were distributed in clades characterized by D614G substitution in S gene (20A, 20B and 20C) and L84S substitution in ORF8 (19B) with 163 and 118 sequences, respectively, with the remaining sequences branching in 19A. A total of 110 (38%) sequences from Spain grouped in four different monophyletic clusters of 20A clade (20A-Sp1 and 20A-Sp2) and 19B clade (19B-Sp1 and 19B-Sp2) along with sequences from 29 countries worldwide. The MRCA of 19A-Sp1, 20A-Sp1, 19A-Sp2 and 20A-Sp2 clusters were estimated in Spain around January 21 and 29, and February 6 and 17, 2020, respectively. The prevalence of 19B clade in Spain (40%) was far higher than in any other European country during the first weeks of the epidemic, probably because of a founder effect. However, this variant was replaced by G614-bearing viruses in April. In vitro assays showed an enhanced infectivity of pseudotyped virions displaying G614 substitution compared with D614, suggesting a fitness advantage of D614G.

IMPORTANCE: Multiple SARS-CoV-2 introductions have been detected in Spain and at least four resulted in the emergence of locally transmitted clusters that originated no later than mid-February, with further dissemination to many other countries around the world, a few weeks before the explosion of COVID-19 cases detected in Spain during the first week of March. The majority of the earliest variants detected in Spain branched in 19B clade (D614 viruses), which was the most prevalent clade during the first weeks of March, pointing to a founder effect. However, from mid-March to June 2020, G614-bearing viruses (20A, 20B and 20C clades) overcame D614 variants in Spain, probably as a consequence of an evolutionary advantage of this substitution in the spike protein. A higher infectivity of G614-bearing viruses compared to D614 variants was detected, suggesting that this substitution in SARS-CoV-2 spike protein could be behind the variant shift observed in Spain.

Comprehensive pathogen detection in sera of Kawasaki disease patients by high-throughput sequencing: a retrospective exploratory study

BMC Pediatrics, Vol 20 (1), 2020. DOI: 10.1186/s12887-020-02380-7

Author(s): Yuka Torii, Kazuhiro Horiba, Satoshi Hayano, Taichi Kato, Takako Suzuki, ...

Keyword(s): Kawasaki Disease, High Throughput, High Throughput Sequencing, Sequence Data, Early Stage, RNA Virus, Systemic Vasculitis, Human Herpesvirus, Acute Stage, Serum Samples

Background: Kawasaki disease (KD) is an idiopathic systemic vasculitis that predominantly damages coronary arteries in children. Various pathogens have been investigated as triggers for KD, but no definitive causative pathogen has been determined. As KD is diagnosed by symptoms, several days are needed for diagnosis. Therefore, at the time of diagnosis of KD, the triggering pathogen may already be diminished. The aim of this study was to comprehensively explore pathogens in sera at the acute stage of KD using high-throughput sequencing (HTS).

Methods: Sera of 12 patients at an extremely early stage of KD and 12 controls were investigated. DNA and RNA sequences were read separately using HTS. Sequence data were imported into the home-brew meta-genomic analysis pipeline, PATHDET, to identify the pathogen sequences.

Results: No RNA virus reads were detected in any KD case except for those of equine infectious anemia virus, which is known as a contaminant of commercial reverse transcriptase. Concerning DNA viruses, human herpesvirus 6B (HHV-6B, two cases) and Anelloviridae (eight cases) were detected among KD cases as well as controls. Multiple bacterial reads were obtained from KD and controls. Bacteria of the genera Acinetobacter, Pseudomonas, Delftia, Roseomonas, and Rhodocyclaceae appeared to be more common in KD sera than in the controls.

Conclusion: No single pathogen was identified in serum samples of patients at the acute phase of KD. With multiple bacteria detected in the serum samples, it is difficult to exclude the possibility of contamination; however, it is possible that these bacteria might stimulate the immune system and induce KD.

Identification and Characterization of a Novel Robigovirus Species from Sweet Cherry in Turkey

Pathogens, Vol 8 (2), pp. 57, 2019. DOI: 10.3390/pathogens8020057. Cited By ~ 5

Author(s): Kadriye Çağlayan, Vahid Roumi, Mona Gazel, Eminur Elçi, Mehtap Acioğlu, ...

Keyword(s): High Throughput Sequencing, Sweet Cherry, Prunus Avium, Open Reading Frames, Cherry Tree, Coding Regions, Sweet Cherries, Identification And Characterization, Reading Frames

High-throughput sequencing of total RNA isolated from symptomatic leaves of a sweet cherry tree (Prunus avium cv. 0900 Ziraat) from Turkey identified a new member of the genus Robigovirus designated cherry virus Turkey (CVTR). The presence of the virus was confirmed by electron microscopy, and its whole genome was sequenced by overlapping RT-PCR. The virus has a ssRNA genome of 8464 nucleotides which encodes five open reading frames (ORFs) and comprises two non-coding regions, 5′ UTR and 3′ UTR of 97 and 296 nt, respectively. Compared to the five most closely related robigoviruses, RdRp, TGB1, TGB2, TGB3 and CP share amino acid identities ranging from 43–53%, 44–60%, 39–43%, 38–44% and 45–50%, respectively. Unlike the four cherry robigoviruses, CVTR lacks ORFs 2a and 5a. Its genome organization is therefore more similar to African oil palm ringspot virus (AOPRV). Using specific primers, the presence of CVTR was confirmed in 15 sweet cherries and two sour cherries out of 156 tested samples collected from three regions in Turkey. Among them, five samples were showing slight chlorotic symptoms on the leaves. It seems that CVTR infects cherry trees with or without eliciting obvious symptoms, but these data should be confirmed by bioassays in woody and possibly herbaceous hosts in future studies.
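Several abstracts in this listing count open reading frames in an assembled genome, but none states how the ORFs were predicted. As a generic illustration, and not the method of any of these studies, the sketch below scans both strands of a linear sequence in all three frames for ATG-to-stop stretches using the standard stop codons. The names `find_orfs` and `reverse_complement`, the `min_codons` threshold and the toy sequence are all hypothetical. RNA genomes are assumed to be given in the DNA alphabet (T instead of U), and circular genomes are not handled.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string (other characters are left unchanged)."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=3):
    """Return (strand, frame, start, end) tuples for complete ORFs.

    An ORF runs from the first ATG seen in a frame to the next in-frame
    stop codon (stop included). Coordinates are 0-based, end-exclusive,
    and refer to the scanned strand. ORFs lacking a stop are not reported.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, i + 3))
                    start = None
    return orfs


# Toy sequence (invented data) with one short ORF: ATG AAA TTT TAG.
print(find_orfs("CCATGAAATTTTAGCC"))
```

Dedicated gene-prediction tools apply further criteria (alternative start codons, minimum lengths in the hundreds of nucleotides, overlapping frames), so this sketch only conveys the basic idea.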
