High-Throughput Genomic Data Reveal Complex Phylogenetic Relationships in Stylosanthes Sw (Leguminosae)

2021 ◽  
Vol 12 ◽  
Author(s):  
Maria Alice Silva Oliveira ◽  
Tomáz Nunes ◽  
Maria Aparecida Dos Santos ◽  
Danyelle Ferreira Gomes ◽  
Iara Costa ◽  
...  

Allopolyploidy is widespread across plant lineages, but estimating the correct phylogenetic relationships and origin of allopolyploids can be difficult. In the genus Stylosanthes Sw. (Leguminosae), an important legume crop, allopolyploidy is a key speciation force, which complicates species recognition and breeding efforts in the genus. Based on a comparative analysis of nine high-throughput sequencing (HTS) samples, including three allopolyploids (S. capitata Vogel cv. “Campo Grande,” S. capitata “RS024” and S. scabra Vogel) and six diploids (S. hamata Taub, S. viscosa (L.) Sw., S. macrocephala M. B. Ferreira and Sousa Costa, S. guianensis (Aubl.) Sw., S. pilosa M. B. Ferreira and Sousa Costa and S. seabrana B. L. Maass & 't Mannetje), we provide a working pipeline to identify organelle and nuclear genome signatures that allowed us to trace the origin of the allopolyploids and recognize their parental genomes. First, organelle genomes were de novo assembled and used to identify maternal genome donors by alignment-based phylogenies and synteny analysis. Second, nuclear-derived reads were subjected to repetitive DNA identification with RepeatExplorer2. The identified repeats were then compared, by comparative repeat analysis, for their abundance and presence in diploids relative to allopolyploids. Third, reads were extracted and grouped into the following sets: chloroplast, mitochondrial, satellite DNA, ribosomal DNA, repeat-clustered, and total genomic reads. These sets of reads were then subjected to alignment- and assembly-free phylogenetic analyses, and the results were compared with classical alignment-based phylogenetic methods. Comparative analysis of shared and unique satellite repeats also allowed the tracing of allopolyploid origin in Stylosanthes, especially for highly abundant repeats such as StyloSat1 in the Scabra complex. This satellite was mapped in situ to the proximal region of the chromosomes and made it possible to identify the previously proposed parents. Hence, with simple genome skimming data we were able to provide evidence for the recognition of parental genomes and to understand the genome evolution of two Stylosanthes allopolyploids.
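
The alignment- and assembly-free comparison of read sets described in this abstract can be illustrated with a k-mer profile distance. The following is only a minimal sketch, not the authors' pipeline: the abstract does not name the tool used for this step, so the function names, the value of k, and the choice of a Jaccard distance on canonical k-mers are all assumptions made for illustration.

```python
from itertools import combinations


def revcomp(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def canonical_kmers(reads, k):
    """Collect strand-independent k-mers from a list of reads.

    Each k-mer is stored as the smaller of itself and its reverse
    complement, so both strands give the same profile.
    """
    kmers = set()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if set(kmer) <= set("ACGT"):  # skip k-mers with N or other symbols
                kmers.add(min(kmer, revcomp(kmer)))
    return kmers


def jaccard_distance(a, b):
    """1 - |A & B| / |A | B|; defined as 0.0 for two empty sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def distance_matrix(samples, k=5):
    """Pairwise distances for samples given as {name: [reads]} (hypothetical input layout)."""
    profiles = {name: canonical_kmers(reads, k) for name, reads in samples.items()}
    return {
        (x, y): jaccard_distance(profiles[x], profiles[y])
        for x, y in combinations(sorted(profiles), 2)
    }
```

A distance matrix of this kind could then be passed to a distance-based tree method such as neighbor joining; whether that matches the method used in the study is not stated in the abstract.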

2015 ◽  
Author(s):  
Simon Uribe-Convers ◽  
Matthew L Settles ◽  
David C Tank

Advances in high-throughput sequencing (HTS) have allowed researchers to obtain large amounts of biological sequence information at speeds and costs unimaginable only a decade ago. Phylogenetics, and the study of evolution in general, is quickly migrating towards using HTS to generate larger and more complex molecular datasets. In this paper, we present a method that utilizes microfluidic PCR and HTS to generate large amounts of sequence data suitable for phylogenetic analyses. The approach uses a Fluidigm microfluidic PCR array and two sets of PCR primers to simultaneously amplify 48 target regions across 48 samples, incorporating sample-specific barcodes and HTS adapters (2,304 unique amplicons per microfluidic array). The final product is a pooled set of amplicons ready to be sequenced, and thus, there is no need to construct separate, costly genomic libraries for each sample. Further, we present a bioinformatics pipeline to process the raw HTS reads to either generate consensus sequences (with or without ambiguities) for every locus in every sample or—more importantly—recover the separate alleles from heterozygous target regions in each sample. This is important because it adds allelic information that is well suited for coalescent-based phylogenetic analyses that are becoming very common in conservation and evolutionary biology. To test our subgenomic method and bioinformatics pipeline, we sequenced 576 samples across 96 target regions belonging to the South American clade of the genus Bartsia L. in the plant family Orobanchaceae. After sequencing cleanup and alignment, the experiment resulted in ~25,300bp across 486 samples for a set of 48 primer pairs targeting the plastome, and ~13,500bp for 363 samples for a set of primers targeting regions in the nuclear genome. Finally, we constructed a combined concatenated matrix from all 96 primer combinations, resulting in a combined aligned length of ~40,500bp for 349 samples.
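
The consensus step mentioned in this abstract (consensus sequences "with or without ambiguities") can be sketched as a column-wise vote over aligned reads. This is a hedged illustration rather than the published pipeline: the function name, the `min_fraction` threshold, and the assumption that reads for a locus are already aligned to equal length are introduced here for the example. The ambiguity symbols are the standard IUPAC nucleotide codes.

```python
from collections import Counter

# Standard IUPAC nucleotide codes, keyed by the set of bases each one represents.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D", frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def consensus(aligned_reads, min_fraction=0.25, ambiguities=True):
    """Column-wise consensus of equal-length aligned reads.

    With ambiguities=True, every base reaching min_fraction of the column
    (an assumed threshold) is kept and the set is encoded as an IUPAC code.
    With ambiguities=False, only the most frequent base is reported.
    """
    if not aligned_reads:
        return ""
    length = len(aligned_reads[0])
    if any(len(r) != length for r in aligned_reads):
        raise ValueError("reads must be aligned to the same length")
    out = []
    for column in zip(*aligned_reads):
        counts = Counter(b for b in column if b in "ACGT")
        total = sum(counts.values())
        if total == 0:
            out.append("N")  # no callable base in this column
            continue
        top = counts.most_common(1)[0][0]
        if ambiguities:
            kept = frozenset(b for b, n in counts.items() if n / total >= min_fraction)
            out.append(IUPAC[kept] if kept else top)
        else:
            out.append(top)
    return "".join(out)
```

For a heterozygous site covered equally by C and T, `consensus(["ACGT", "ACGT", "ATGT", "ATGT"])` encodes the second column as Y. Recovering the separate alleles, which the abstract highlights as the more important output, requires phasing reads and is not covered by this sketch.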


2021 ◽  
Vol 167 ◽  
pp. 104077
Author(s):  
Yunhe Ban ◽  
Xiang Li ◽  
Yuqi Li ◽  
Xinyu Li ◽  
Xu Li ◽  
...  

Plants ◽  
2021 ◽  
Vol 10 (4) ◽  
pp. 753
Author(s):  
Miroslav Glasa ◽  
Richard Hančinský ◽  
Katarína Šoltys ◽  
Lukáš Predajňa ◽  
Jana Tomašechová ◽  
...  

In recent years, high-throughput sequencing (HTS) has brought new possibilities to the study of the diversity and complexity of plant viromes. Mixed infection of a single plant with several viruses is frequently observed in such studies. We analyzed the virome of 10 tomato and sweet pepper samples from Slovakia, all showing the presence of potato virus Y (PVY) infection. Most datasets allowed the determination of the nearly complete sequence of a single-variant PVY genome belonging to one of the PVY recombinant strains (N-Wi, NTNa, or NTNb). However, in three tomato samples (T1, T40, and T62), N-type and O-type sequences spanning the same genome region were detected. This indicates mixed infections involving different PVY strain variants and hampers the automated assembly of the PVY genomes present in the sample. The N- and O-type in silico data were further confirmed by specific RT-PCR assays targeting the UTR-P1 and NIa genomic regions. Although full genomes could not be de novo assembled directly in this situation, their deep coverage by relatively long paired reads allowed their manual re-assembly using very stringent mapping parameters. These results highlight the complexity of PVY infection in some host plants and the challenges that can arise when trying to precisely identify the PVY isolates involved in a mixed infection.


2019 ◽  
Vol 42 (4) ◽  
pp. 601-611 ◽  
Author(s):  
Yan Li ◽  
Liukun Jia ◽  
Zhihua Wang ◽  
Rui Xing ◽  
Xiaofeng Chi ◽  
...  

Saxifraga sinomontana J.-T. Pan & Gornall belongs to Saxifraga sect. Ciliatae subsect. Hirculoideae, a lineage containing ca. 110 species whose phylogenetic relationships are largely unresolved due to recent rapid radiations. Analyses of complete chloroplast genomes have the potential to significantly improve the resolution of phylogenetic relationships in this young plant lineage. The complete chloroplast genome of S. sinomontana was de novo sequenced, assembled and then compared with those of six other Saxifragaceae species. The S. sinomontana chloroplast genome is 147,240 bp in length with a typical quadripartite structure, including a large single-copy region of 79,310 bp and a small single-copy region of 16,874 bp separated by a pair of inverted repeats (IRs) of 25,528 bp each. The chloroplast genome contains 113 unique genes, including 79 protein-coding genes, four rRNAs and 30 tRNAs, with 18 duplicates in the IRs. The gene content and organization are similar to those of other Saxifragaceae chloroplast genomes. Sixty-one simple sequence repeats were identified in the S. sinomontana chloroplast genome, mostly mononucleotide repeats of polyadenine or polythymine. Comparative analysis revealed 12 highly divergent regions in the intergenic spacers, as well as in the coding genes matK, ndhK, accD, cemA, rpoA, rps19, ndhF, ccsA, ndhD and ycf1. Phylogenetic reconstruction of seven Saxifragaceae species based on 66 protein-coding genes received high bootstrap support values for nearly all identified nodes, suggesting a promising opportunity to resolve infrasectional relationships within Ciliatae, the most species-rich section of Saxifraga.
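
Two figures in this abstract lend themselves to a quick check: the quadripartite lengths should add up to the reported genome size, and mononucleotide simple sequence repeats can be located with a regular expression. The sketch below is illustrative only; the function names and the `min_repeats=10` threshold are assumptions, since the abstract does not state which SSR tool or cutoffs were used.

```python
import re


def quadripartite_length(lsc, ssc, ir):
    """Total plastome length: LSC + SSC + two copies of the inverted repeat."""
    return lsc + ssc + 2 * ir


def find_mono_ssrs(seq, min_repeats=10):
    """Find mononucleotide runs of at least min_repeats bases.

    Returns a list of (0-based start, base, run length) tuples.
    The threshold is an assumed value, not one taken from the study.
    """
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_repeats - 1))
    return [
        (m.start(), m.group(1), len(m.group(0)))
        for m in pattern.finditer(seq.upper())
    ]


# Lengths reported in the abstract: 79,310 + 16,874 + 2 * 25,528 = 147,240 bp.
assert quadripartite_length(79_310, 16_874, 25_528) == 147_240
```

The region lengths quoted in the abstract are therefore internally consistent with the stated total of 147,240 bp.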


Author(s):  
Yuansheng Liu ◽  
Xiaocai Zhang ◽  
Quan Zou ◽  
Xiangxiang Zeng

Summary: Removing duplicate and near-duplicate reads generated by high-throughput sequencing technologies can reduce the computational resources needed in downstream applications. Here we develop minirmd, a de novo tool to remove duplicate reads via multiple rounds of clustering using minimizers of different lengths. Experiments demonstrate that minirmd removes more near-duplicate reads than existing clustering approaches and is faster than existing multi-core tools. To the best of our knowledge, minirmd is the first tool to remove near-duplicates on the reverse-complementary strand. Availability and implementation: https://github.com/yuansliu/minirmd. Supplementary information: Supplementary data are available at Bioinformatics online.
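
The idea of clustering reads by minimizer before comparing them can be sketched in a few lines. This is not minirmd's algorithm or code: it is a single-round toy version under stated assumptions (reads at least k bases long, only A/C/G/T, near-duplicates defined by Hamming distance between equal-length reads), and every name in it is hypothetical.

```python
def revcomp(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def minimizer(read, k):
    """Lexicographically smallest k-mer over both strands of the read.

    Assumes len(read) >= k. Using both strands means a read and its
    reverse complement always share the same minimizer.
    """
    strands = (read, revcomp(read))
    return min(s[i:i + k] for s in strands for i in range(len(s) - k + 1))


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def remove_near_duplicates(reads, k=8, max_mismatches=1):
    """Keep the first read of each group of near-duplicates.

    Reads are bucketed by minimizer. A read is dropped if it, or its
    reverse complement, is within max_mismatches of an already kept
    read of the same length in the same bucket.
    """
    buckets = {}
    kept = []
    for read in reads:
        bucket = buckets.setdefault(minimizer(read, k), [])
        rc = revcomp(read)
        is_duplicate = any(
            len(read) == len(other)
            and min(hamming(read, other), hamming(rc, other)) <= max_mismatches
            for other in bucket
        )
        if not is_duplicate:
            bucket.append(read)
            kept.append(read)
    return kept
```

A mismatch that falls inside the minimizer itself sends two near-identical reads to different buckets, so a single round can miss some pairs. Repeating the procedure with minimizers of different lengths, as the abstract describes, is one plausible way to reduce such misses.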


Viruses ◽  
2020 ◽  
Vol 12 (7) ◽  
pp. 749 ◽  
Author(s):  
Melanie Hiltbrunner ◽  
Gerald Heckel

Research on the ecology and evolution of viruses is often hampered by the limitation of sequence information to short parts of the genomes or single genomes derived from cultures. In this study, we use hybrid sequence capture enrichment in combination with high-throughput sequencing to provide efficient access to full genomes of European hantaviruses from rodent samples obtained in the field. We applied this methodology to Tula (TULV) and Puumala (PUUV) orthohantaviruses for which analyses from natural host samples are typically restricted to partial sequences of their tri-segmented RNA genome. We assembled a total of ten novel hantavirus genomes de novo with very high coverage (on average >99%) and sequencing depth (average >247×). A comparison with partial Sanger sequences indicated an accuracy of >99.9% for the assemblies. An analysis of two common vole (Microtus arvalis) samples infected with two TULV strains each allowed for the de novo assembly of all four TULV genomes. Combining the novel sequences with all available TULV and PUUV genomes revealed very similar patterns of sequence diversity along the genomes, except for remarkably higher diversity in the non-coding region of the S-segment in PUUV. The genomic distribution of polymorphisms in the coding sequence was similar between the species, but differed between the segments with the highest sequence divergence of 0.274 for the M-segment, 0.265 for the S-segment, and 0.248 for the L-segment (overall 0.258). Phylogenetic analyses showed the clustering of genome sequences consistent with their geographic distribution within each species. Genome-wide data yielded extremely high node support values, despite the impact of strong mutational saturation that is expected for hantavirus sequences obtained over large spatial distances. We conclude that genome sequencing based on capture enrichment protocols provides an efficient means for ecological and evolutionary investigations of hantaviruses with unprecedented completeness and depth.
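
The two assembly metrics quoted in this abstract, coverage and sequencing depth, can be made concrete with a small calculation over per-base read depths. This is a sketch under assumptions: the abstract does not define how the averages were computed, so the function name, the `min_depth` threshold, and the choice to average depth over all reference positions are illustrative.

```python
def coverage_and_depth(per_base_depth, min_depth=1):
    """Return (breadth of coverage, mean depth) for a list of per-base depths.

    Breadth is the fraction of positions with depth >= min_depth;
    mean depth is averaged over all positions, covered or not.
    """
    n = len(per_base_depth)
    if n == 0:
        raise ValueError("per_base_depth must not be empty")
    covered = sum(1 for d in per_base_depth if d >= min_depth)
    return covered / n, sum(per_base_depth) / n
```

For a four-base toy reference with depths 0, 10, 20 and 30, this gives a breadth of 0.75 and a mean depth of 15.0.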


Blood ◽  
2015 ◽  
Vol 126 (23) ◽  
pp. 1383-1383
Author(s):  
Kezhi Huang ◽  
Min Yang ◽  
Zengkai Pan ◽  
Florian H. Heidel ◽  
Michaela Scherr ◽  
...  

Using high-throughput sequencing, an increased number of gene mutations has been identified in cancer. Among the up to hundreds of acquired mutations in cancer clones, only a few cooperating mutations are believed to be needed for initiation of the malignant disease. Recently, we reported a single amino acid substitution at position 676 (N676K) within the FLT3 kinase domain as the sole cause of resistance to PKC412 in one patient with FLT3-ITD associated acute myeloid leukemia (AML). The FLT3-N676K mutation was more recently identified independently in up to 6% of de novo AML patients with inv(16) by other groups. As FLT3-TKD mutations are strongly associated with inv(16) in AML and particularly FLT3-N676K was found almost exclusively in AML patients with inv(16), this prompted us to investigate the transforming activity of FLT3-N676K and to test whether FLT3-N676K would cooperate with inv(16) to promote AML. First, we analyzed in vivo leukemogenesis mediated by FLT3-N676K. Retroviral expression of FLT3-N676K in myeloid 32D cells induced AML in syngeneic C3H/HeJ mice (n=11/13, latency ~8 weeks), with a transforming activity similar to FLT3-ITD (n=8/8), FLT3-TKD D835Y (n=8/9), and FLT3-ITD-N676K (n=9/9) mutations. Three out of 14 C57BL/6J mice transplanted with FLT3-N676K-transduced primary lineage negative (Lin-) bone marrow cells died of acute leukemia (latency of 68, 77, and 273 days), while none of 16 animals in the control groups including FLT3-ITD and CBFβ-SMMHC developed any hematological malignancy. Secondly, co-expression of FLT3-N676K and CBFβ-SMMHC did not promote acute leukemia in 3 independent experiments using C3H/HeJ and C57BL/6J mice (n=16). So far only 1 out of 11 C57BL/6J mice co-expressing FLT3-N676K and CBFβ-SMMHC developed acute leukemia (AML with latency of 166 days). In comparison with FLT3-ITD, FLT3-N676K tended to result in stronger phosphorylation of FLT3, MAPK and AKT, and diseased animals carrying FLT3-N676K demonstrated a much lower frequency of leukemic stem cells in the majority of analyzed cases. Importantly, leukemic cells co-expressing FLT3-N676K and CBFβ-SMMHC were still highly sensitive to the FLT3 inhibitor AC220. Taken together, we show that the FLT3-N676K mutant can potently transform murine hematopoietic stem/progenitor cells in vivo independently of the inv(16) chimeric gene CBFB-MYH11. This is the first report of acute leukemia induced by an activating FLT3 mutation in C57BL/6J mice. Moreover, our data suggest that targeting the FLT3-N676K mutation may be an attractive treatment option for FLT3-N676K-positive patients without concurrent ITD. Our data emphasize the need for more careful analysis of the cooperating network of mutations identified in AML by high-throughput sequencing. This work was supported by DJCLS (grant: 13/22) and the Deutsche Forschungsgemeinschaft (grant: Li 1608/2-1). KH and ZP were supported by the China Scholarship Council (2011638024 and 201406100008). Disclosures: Heidel: Novartis: Consultancy, Honoraria, Membership on an entity's Board of Directors or advisory committees, Research Funding.

