Improvement of the banana “Musa acuminata” reference sequence using NGS data and semi-automated bioinformatics methods

BMC Genomics ◽  
2016 ◽  
Vol 17 (1) ◽  
Author(s):  
Guillaume Martin ◽  
Franc-Christophe Baurens ◽  
Gaëtan Droc ◽  
Mathieu Rouard ◽  
Alberto Cenci ◽  
...  
BMC Genomics ◽  
2020 ◽  
Vol 21 (S6) ◽  
Author(s):  
Chi-Ming Leung ◽  
Dinghua Li ◽  
Yan Xin ◽  
Wai-Chun Law ◽  
Yifan Zhang ◽  
...  

Abstract Background Next-generation sequencing (NGS) enables unbiased detection of pathogens by mapping the sequencing reads of a patient sample to the known reference sequences of bacteria and viruses. However, for a new pathogen without a reference sequence of a close relative, or with a high load of mutations compared to its predecessors, read mapping fails due to the low similarity between the pathogen and the reference sequence, which in turn leads to insensitive and inaccurate pathogen detection. Results We developed MegaPath, which runs fast and provides high sensitivity in detecting new pathogens. In MegaPath, we have implemented and tested a combination of polishing techniques to remove non-informative human reads and spurious alignments. MegaPath applies a global optimization to the read alignments and reassigns reads incorrectly aligned to multiple species to a unique species. The reassignment not only significantly increased the number of reads aligned to distant pathogens, but also significantly reduced incorrect alignments. For speed, MegaPath implements an enhanced maximum-exact-match prefix seeding strategy and a SIMD-accelerated Smith-Waterman algorithm. Conclusions In our benchmarks, MegaPath demonstrated superior sensitivity by detecting eight times more reads from a low-similarity pathogen than other tools. Meanwhile, MegaPath ran much faster than the other state-of-the-art alignment-based pathogen detection tools (and was comparable in speed to the less sensitive profile-based pathogen detection tools). The running time of MegaPath is about 20 min on a typical 1 Gb dataset.
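For readers unfamiliar with the Smith-Waterman algorithm named in the abstract above, the following is a minimal scalar sketch of local alignment scoring. It is not MegaPath's SIMD-accelerated implementation; the function name `smith_waterman_score` and the scoring parameters (match, mismatch, linear gap penalty) are assumptions chosen for illustration only.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Return the best local alignment score between sequences a and b.

    Illustrative scalar version with a linear gap penalty; the scoring
    values are hypothetical defaults, not those of any specific tool.
    """
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1]
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    # Two reads sharing the local segment "ACGT"
    print(smith_waterman_score("TTACGTTT", "GGACGTGG"))
```

Keeping only two rows of the matrix is enough when just the score is needed; recovering the alignment itself would require storing the full matrix or traceback pointers.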


Plants ◽  
2020 ◽  
Vol 9 (4) ◽  
pp. 439 ◽  
Author(s):  
Hanna Marie Schilbert ◽  
Andreas Rempel ◽  
Boas Pucker

High-throughput sequencing technologies have rapidly developed during the past years and have become an essential tool in plant sciences. However, the analysis of genomic data remains challenging and relies mostly on the performance of automatic pipelines. Frequently applied pipelines involve the alignment of sequence reads against a reference sequence and the identification of sequence variants. Since most benchmarking studies of bioinformatics tools for this purpose have been conducted on human datasets, there is a lack of benchmarking studies in plant sciences. In this study, we evaluated the performance of 50 different variant calling pipelines, including five read mappers and ten variant callers, on six real plant datasets of the model organism Arabidopsis thaliana. Sets of variants were evaluated based on various parameters including sensitivity and specificity. We found that all investigated tools are suitable for analysis of NGS data in plant research. When looking at different performance metrics, BWA-MEM and Novoalign were the best mappers and GATK returned the best results in the variant calling step.
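As a rough illustration of how a variant call set can be scored for sensitivity and specificity, the sketch below compares called variants against a truth set. The abstract does not state the exact definitions used in the study, so the function name `evaluate_calls`, the variant tuple layout, and the use of a fixed number of evaluated sites (`n_sites`) to derive true negatives are all assumptions.

```python
def evaluate_calls(called: set, truth: set, n_sites: int) -> dict:
    """Compute sensitivity and specificity for a variant call set.

    called, truth: sets of hashable variant keys, e.g. (chrom, pos, ref, alt).
    n_sites: total number of evaluated sites (an assumed denominator,
             needed to count true negatives).
    """
    tp = len(called & truth)        # variants called and present in truth
    fp = len(called - truth)        # variants called but absent from truth
    fn = len(truth - called)        # true variants that were missed
    tn = n_sites - tp - fp - fn     # remaining sites correctly left uncalled
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "sensitivity": sensitivity, "specificity": specificity}


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the study
    truth = {("Chr1", 100, "A", "G"), ("Chr1", 250, "C", "T"),
             ("Chr2", 40, "G", "A"), ("Chr3", 900, "T", "C")}
    called = {("Chr1", 100, "A", "G"), ("Chr1", 250, "C", "T"),
              ("Chr2", 40, "G", "A"), ("Chr4", 10, "A", "T")}
    print(evaluate_calls(called, truth, n_sites=100))
```

Keying variants on the full (chromosome, position, reference, alternate) tuple means a call at the right position with the wrong allele counts as both a false positive and a false negative, which is one common but not universal convention.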


2019 ◽  
Vol 124 (2) ◽  
pp. 319-329 ◽  
Author(s):  
Marion Dupouy ◽  
Franc-Christophe Baurens ◽  
Paco Derouault ◽  
Catherine Hervouet ◽  
Céline Cardi ◽  
...  

Abstract Background and Aims Banana cultivars are derived from hybridizations involving Musa acuminata subspecies. The latter diverged following geographical isolation in distinct South-east Asian continental regions and islands. Observation of chromosome pairing irregularities in meiosis of hybrids between these subspecies suggested the presence of large chromosomal structural variations. The aim of this study was to characterize such rearrangements. Methods Marker (single nucleotide polymorphism) segregation in a self-progeny of the ‘Calcutta 4’ accession and mate-pair sequencing were used to search for chromosomal rearrangements in comparison with the M. acuminata ssp. malaccensis genome reference sequence. Signature segment junctions of the revealed chromosome structures were identified and searched in whole-genome sequencing data from 123 wild and cultivated Musa accessions. Key Results Two large reciprocal translocations were characterized in the seedy banana M. acuminata ssp. burmannicoides ‘Calcutta 4’ accession. One consisted of an exchange of a 240 kb distal region of chromosome 2 with a 7.2 Mb distal region of chromosome 8. The other involved an exchange of a 20.8 Mb distal region of chromosome 1 with an 11.6 Mb distal region of chromosome 9. Both translocations were found only in wild accessions belonging to the burmannicoides/burmannica/siamea subspecies. Only two of the 87 cultivars analysed displayed the 2/8 translocation, while none displayed the 1/9 translocation. Conclusion Two large reciprocal translocations were identified that probably originated in the burmannica genetic group. Accurate characterization of these translocations should enhance the use of this disease resistance-rich burmannica group in breeding programmes.


Cancers ◽  
2019 ◽  
Vol 11 (12) ◽  
pp. 2030 ◽  
Author(s):  
Guy Froyen ◽  
Marie Le Mercier ◽  
Els Lierman ◽  
Karl Vandepoele ◽  
Friedel Nollet ◽  
...  

In most diagnostic laboratories, targeted next-generation sequencing (NGS) is currently the default assay for the detection of somatic variants in solid as well as haematological tumours. Independent of the method, the final outcome is a list of variants that differ from the human genome reference sequence of which some may relate to the establishment of the tumour in the patient. A critical point towards a uniform patient management is the assignment of the biological contribution of each variant to the malignancy and its subsequent clinical impact in a specific malignancy. These so-called biological and clinical classifications of somatic variants are currently not standardized and are vastly dependent on the subjective analysis of each laboratory. This subjectivity can thus result in a different classification and subsequent clinical interpretation of the same variant. Therefore, the ComPerMed panel of Belgian experts in cancer diagnostics set up a working group with the goal to harmonize the biological classification and clinical interpretation of somatic variants detected by NGS. This effort resulted in the establishment of a uniform, two-level classification workflow system that should enable high consistency in diagnosis, prognosis, treatment and follow-up of cancer patients. Variants are first classified into a tumour-independent biological five class system and subsequently in a four tier ACMG clinical classification. Here, we describe the ComPerMed workflow in detail including examples for each step of the pipeline. Moreover, this workflow can be implemented in variant classification software tools enabling automatic reporting of NGS data, independent of panel, method or analysis software.


2020 ◽  
Author(s):  
Hanna Marie Schilbert ◽  
Andreas Rempel ◽  
Boas Pucker

Abstract High-throughput sequencing technologies have rapidly developed during the past years and have become an essential tool in plant sciences. However, the analysis of genomic data remains challenging and relies mostly on the performance of automatic pipelines. Frequently applied pipelines involve the alignment of sequence reads against a reference sequence and the identification of sequence variants. Since most benchmarking studies of bioinformatics tools for this purpose have been conducted on human datasets, there is a lack of benchmarking studies in plant sciences. In this study, we evaluated the performance of 50 different variant calling pipelines, including five read mappers and ten variant callers, on six real plant datasets of the model organism Arabidopsis thaliana. Sets of variants were evaluated based on various parameters including sensitivity and specificity. We found that all investigated tools are suitable for analysis of NGS data in plant research. When looking at different performance metrics, BWA-MEM and Novoalign were the best mappers and GATK returned the best results in the variant calling step.


2018 ◽  
Vol 34 (2) ◽  
pp. 173-183 ◽  
Author(s):  
Arpit V. Joshi ◽  
Nilanjana S. Baraiya ◽  
Pinal B. Vyas ◽  
T. V. Ramana Rao ◽  
...  

2020 ◽  

The banana agro-export sector in Ecuador generates millions of dollars in income, but with this development a series of quality standards has been established that must be met to enter the export system. This has contributed to establishing good production and post-harvest management practices that guarantee the optimal production of bananas and plantains. The objective of this study was to determine the factors involved in the rejection of bananas (Musa acuminata) destined for international commercialization. The study used a non-experimental, cross-sectional (transactional) research design with a quantitative approach. The methodological design was developed in three phases at Finca 6 Hermanas, located in the Barraganete sector of the San Juan parish in the Puebloviejo canton of Los Ríos Province, Ecuador. The results show that the main causes of banana rejection are abiotic factors (damage, dry latex, scar, insect damage, broken neck, overgrowth), accounting for 79.55 %, and biotic factors (twins, diseases, short finger), accounting for 20.45 %. The average rejection was 6 361 fingers and 1 269 kilograms (kg) over the 6-week study duration. The analysis of variance was significant for variable 1 (biotic and abiotic factors): H0 is rejected, with a p-value < 0.0001 and F = 13.17 > F critical, where F critical (9; 45) = 2.10. For variable 2 (“work weeks”), H0 is accepted, with a p-value of 0.7694 and F = 0.51 < F critical, where F critical (5; 45) = 2.4; it is concluded that, at a significance level of 5 %, the null hypothesis is accepted. These figures support the development of strategies that systematically mitigate the damage by correcting each of the causes of banana deterioration, thereby increasing the economic gains of the commercialization process.
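The two analysis-of-variance decisions reported above follow the usual rule of comparing the observed F statistic with the critical value at the chosen significance level. The short sketch below applies that rule to the figures given in the abstract; the function name `f_test_decision` is hypothetical, and the critical values are taken from the text rather than recomputed.

```python
def f_test_decision(f_observed: float, f_critical: float) -> str:
    """Apply the F-test decision rule at a fixed significance level.

    H0 is rejected when the observed statistic exceeds the critical value.
    """
    return "reject H0" if f_observed > f_critical else "fail to reject H0"


if __name__ == "__main__":
    # Variable 1 (biotic and abiotic factors): F = 13.17, F critical (9; 45) = 2.10
    print("factors:", f_test_decision(13.17, 2.10))
    # Variable 2 (work weeks): F = 0.51, F critical (5; 45) = 2.4
    print("weeks:", f_test_decision(0.51, 2.4))
```

Under this rule the rejection causes differ significantly from one another, while the week of sampling shows no detectable effect, which matches the conclusions stated in the abstract.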


Author(s):  
Anne Krogh Nøhr ◽  
Kristian Hanghøj ◽  
Genis Garcia Erill ◽  
Zilong Li ◽  
Ida Moltke ◽  
...  

Abstract Estimation of relatedness between pairs of individuals is important in many genetic research areas. When estimating relatedness, it is important to account for admixture if it is present. However, the methods that can account for admixture all require genotype data as input, which is a problem for low-depth next-generation sequencing (NGS) data, from which genotypes are called with high uncertainty. Here we present a software tool, NGSremix, for maximum likelihood estimation of relatedness between pairs of admixed individuals from low-depth NGS data, which takes the uncertainty of the genotypes into account via genotype likelihoods. Using both simulated and real NGS data for admixed individuals with an average depth of 4x or below, we show that our method works well and clearly outperforms the commonly used state-of-the-art relatedness estimation methods PLINK, KING, relateAdmix, and ngsRelate, all of which perform quite poorly. Hence, NGSremix is a useful new tool for estimating relatedness in admixed populations from low-depth NGS data. NGSremix is implemented in C/C++ as multi-threaded software and is freely available on GitHub at https://github.com/KHanghoj/NGSremix.


Cells ◽  
2021 ◽  
Vol 10 (2) ◽  
pp. 416
Author(s):  
Lorena Landuzzi ◽  
Maria Cristina Manara ◽  
Pier-Luigi Lollini ◽  
Katia Scotlandi

Osteosarcoma (OS) is a rare malignant primary tumor of mesenchymal origin affecting bone. It is characterized by a complex genotype, mainly due to the high frequency of chromothripsis, which leads to multiple somatic copy number alterations and structural rearrangements. Any effort to design genome-driven therapies must therefore consider such high inter- and intra-tumor heterogeneity. Accordingly, many laboratories and international networks are developing and sharing OS patient-derived xenografts (OS PDXs) to broaden the availability of models that reproduce the complex clinical heterogeneity of OS. OS PDXs, and new cell lines derived from PDXs, faithfully preserve tumor heterogeneity and genetic and epigenetic features, and are thus valuable tools for predicting drug responses. Here, we review recent achievements concerning OS PDXs, summarizing the methods used to obtain ectopic and orthotopic xenografts and to fully characterize these models. The availability of OS PDXs across the many international PDX platforms and their possible use in PDX clinical trials are also described. We recommend the coupling of next-generation sequencing (NGS) data analysis with functional studies in OS PDXs, as well as the setup of OS PDX clinical trials and co-clinical trials, to enhance the predictive power of experimental evidence and to accelerate the clinical translation of effective genome-guided therapies for this aggressive disease.

