original genome
Recently Published Documents


Total documents: 21 (five years: 9)
H-index: 4 (five years: 1)

mSystems ◽  
2021 ◽  
Author(s):  
Rocío Aguilar Suárez ◽  
Minia Antelo-Varela ◽  
Sandra Maaß ◽  
Jolanda Neef ◽  
Dörte Becher ◽  
...  

Our present study showcases a genome-minimized nonpathogenic bacterium, the so-called midi Bacillus, as a chassis for the development of future industrial strains that serve in the production of high-value, difficult-to-produce proteins. In particular, we explain how midi Bacillus, which lacks about one-third of the original genome, effectively secretes a protein of the major human pathogen Staphylococcus aureus that cannot be produced by the parental Bacillus subtilis strain.


2021 ◽  
Vol 53 (8) ◽  
pp. 1229-1237 ◽  
Author(s):  
Kijong Yi ◽  
Su Yeon Kim ◽  
Thomas Bleazard ◽  
Taewoo Kim ◽  
Jeonghwan Youk ◽  
...  

Viruses accumulate mutations under the influence of natural selection and host–virus interactions. Through a systematic comparison of 351,525 full viral genome sequences collected during the recent COVID-19 pandemic, we reveal the spectrum of SARS-CoV-2 mutations. Unlike those of other viruses, the mutational spectrum of SARS-CoV-2 exhibits extreme asymmetry, with a much higher rate of C>U than U>C substitutions, as well as a higher rate of G>U than U>G substitutions. This suggests directional genome sequence evolution during transmission. The substantial asymmetry and directionality of the mutational spectrum enable pseudotemporal tracing of SARS-CoV-2 without prior information about the root sequence, collection time, and sampling region. This shows that the viral genome sequences collected in Asia are similar to the original genome sequence. Adjusted estimation of the dN/dS ratio accounting for the asymmetrical mutational spectrum also shows evidence of negative selection on viral genes, consistent with previous reports. Our findings provide deep insights into the mutational processes in SARS-CoV-2 viral infection and advance the understanding of the history and future evolution of the virus.
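The asymmetry described above can be illustrated with a minimal sketch that tallies directional substitutions between a reference and one sample. It is not the authors' pipeline: the function names and the two toy sequences are hypothetical, the sequences are assumed to be pre-aligned and of equal length, and the output is raw counts rather than rates normalized by base composition.

```python
from collections import Counter


def substitution_spectrum(reference: str, sample: str) -> Counter:
    """Count ref>alt substitutions between two pre-aligned RNA sequences."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGU")
    counts = Counter()
    for ref_base, alt_base in zip(reference.upper(), sample.upper()):
        # Skip gaps and ambiguity codes; count only true base changes.
        if ref_base in valid and alt_base in valid and ref_base != alt_base:
            counts[f"{ref_base}>{alt_base}"] += 1
    return counts


def asymmetry(counts: Counter, forward: str, reverse: str) -> float:
    """Ratio of forward to reverse substitution counts (inf if reverse is 0)."""
    reverse_count = counts[reverse]
    return counts[forward] / reverse_count if reverse_count else float("inf")


# Toy sequences (made up for illustration, not real SARS-CoV-2 data).
reference = "ACGUCCGUAC"
sample = "AUGCCUGUAU"
spectrum = substitution_spectrum(reference, sample)
print(spectrum["C>U"], spectrum["U>C"], asymmetry(spectrum, "C>U", "U>C"))
```

In this toy case there are three C>U changes and one U>C change, so the ratio is 3.0; a ratio well above 1 across many genomes is the kind of directional signal the abstract refers to.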


2021 ◽  
Author(s):  
Kleber Padovani ◽  
Roberto Xavier ◽  
André Carvalho ◽  
Anna Reali ◽  
Annie Chateau ◽  
...  

Genome assembly is one of the most relevant and computationally complex tasks in genomics projects. It aims to reconstruct a genome through the analysis of several small textual fragments of that genome, called reads. Ideally, besides ignoring any errors contained in the reads, the reconstructed genome should also optimally combine these reads, thus reaching the original genome. The quality of the genome assembly is relevant because the more reliable the genomes, the more accurate the understanding of the characteristics and functions of living beings, which in turn has many positive impacts on society, including the prevention and treatment of diseases. The assembly becomes even more complex (and is termed de novo in this case) when the assembler software is not supplied with a similar genome to be used as a reference. Current assemblers have predominantly used heuristic strategies on computational graphs. Despite being widely used in genomics projects, there is still no irrefutably best assembler for any genome, and the proper choice of these assemblers and their configurations depends on bioinformatics experts. Reinforcement learning has proven to be very promising for solving complex activities without human supervision during the learning process. However, its successful applications are predominantly focused on fictional and entertainment problems, such as games. Based on the above, this work aims to shed light on the application of reinforcement learning to this relevant real-world problem, genome assembly. By expanding the only approach found in the literature that addresses this problem, we carefully explored the aspects of intelligent agent learning, performed by the Q-learning algorithm, to understand its suitability for scenarios whose characteristics are more similar to those faced by real genome projects.
The improvements proposed here include changing the previously proposed reward system and including state space exploration optimization strategies based on dynamic pruning and mutual collaboration with evolutionary computing. These investigations were carried out on 23 new environments with larger inputs than those used previously. All these environments are freely available on the internet so that the scientific community can build on this research. The results suggest consistent performance progress with the proposed improvements; however, they also demonstrate their limitations, especially those related to the high dimensionality of the state and action spaces. We also present paths that can be followed to tackle genome assembly efficiently in real scenarios, considering recent successful reinforcement learning applications, including deep reinforcement learning, from other domains dealing with high-dimensional inputs.
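To make the assembly objective concrete, the following minimal sketch scores an ordering of reads by the total suffix-prefix overlap between consecutive reads and merges the best ordering. It rests on several assumptions: the overlap-based score is an illustrative stand-in for a reward and is not taken from the paper, an exhaustive search over permutations replaces the Q-learning agent, the reads are error-free, and all names and the toy reads are hypothetical.

```python
from itertools import permutations


def overlap(left: str, right: str) -> int:
    """Length of the longest suffix of `left` that is a prefix of `right`."""
    for size in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def total_overlap(order) -> int:
    """Sum of overlaps between consecutive reads (illustrative score)."""
    return sum(overlap(a, b) for a, b in zip(order, order[1:]))


def merge(order) -> str:
    """Concatenate reads in order, collapsing each consecutive overlap."""
    genome = order[0]
    for previous, read in zip(order, order[1:]):
        genome += read[overlap(previous, read):]
    return genome


def best_order(reads):
    """Brute-force search over all read orderings (factorial cost)."""
    return max(permutations(reads), key=total_overlap)


# Toy error-free reads sampled from the made-up genome "ACGTTACGGAT".
reads = ["GTTACG", "ACGTTA", "ACGGAT"]
order = best_order(reads)
print(order, total_overlap(order), merge(order))
```

The factorial growth of the number of orderings is one way to see why the state and action spaces become unmanageable for larger inputs, and why a learning agent or heuristic is needed instead of exhaustive search.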


PLoS ONE ◽  
2021 ◽  
Vol 16 (5) ◽  
pp. e0252414
Author(s):  
Mônica Silva de Oliveira ◽  
Jorianne Thyeska Castro Alves ◽  
Pablo Henrique Caracciolo Gomes de Sá ◽  
Adonney Allan de Oliveira Veras

Advances in next-generation sequencing (NGS) platforms have had a positive impact on biological research, leading to the development of numerous omics approaches, including genomics, transcriptomics, metagenomics, and pangenomics. These analyses provide insights into the gene contents of various organisms. However, to understand the evolutionary processes of these genes, comparative analysis, which is an important tool for annotation, is required. Using comparative analysis, it is possible to infer the functions of gene contents and identify orthologous and paralogous genes via their homology. Although several comparative analysis tools currently exist, most of them are limited to complete genomes. PAN2HGENE, a computational tool that allows identification of gene products missing from the original genome sequence, with automated comparative analysis for both complete and draft genomes, can be used to address this limitation. In this study, PAN2HGENE was used to identify new products, which altered the behavior of the alpha value in the pangenome without altering the original genomic sequence. Our findings indicate that this tool represents an efficient alternative for comparative analysis, with a simple and intuitive graphical interface. PAN2HGENE has been uploaded to SourceForge and is available at: https://sourceforge.net/projects/pan2hgene-software


2020 ◽  
Vol 11 ◽  
Author(s):  
Soma S. Marla ◽  
Pallavi Mishra ◽  
Ranjeet Maurya ◽  
Mohar Singh ◽  
Dhammaprakash Pandhari Wankhede ◽  
...  

Genome assembly of short reads from large plant genomes remains a challenge in computational biology despite major developments in next-generation sequencing. Of late, several draft assemblies have been reported for sequenced plant genomes. The reported draft genome assemblies of Cajanus cajan have different levels of genome completeness and a large number of repeats, gaps, and segmental duplications. Draft assemblies with portions of the genome missing are shorter than the referenced original genome. These assemblies come with low map accuracy, affecting further functional annotation and the prediction of gene components as desired by crop researchers. Genome coverage, i.e., the number of sequenced raw reads mapped onto a certain location of the genome, is an important quality indicator of completeness and assembly quality in draft assemblies. The present work aimed to improve the coverage in reported de novo sequenced draft genomes (GCA_000340665.1 and GCA_000230855.2) of pigeonpea, a legume widely cultivated in India. The two recently sequenced assemblies, A1 and A2, comprised 72% and 75% of the estimated coverage of the genome, respectively. We employed an assembly reconciliation approach to compare the draft assemblies and merge them, filling the gaps by employing an algorithm size sorting mate-pair library to generate a high-quality, near-complete assembly with enhanced contiguity. The majority of gaps present within scaffolds were filled with right-sized mate-pair reads. The improved assembly had fewer gaps than those reported in the draft assemblies, resulting in an improved genome coverage of 82.4%. Map accuracy of the improved assembly was evaluated using various quality metrics and for the presence of specific trait-related functional genes. The pair-end and mate-pair local libraries employed helped us to reduce gaps, repeats, and other sequence errors, resulting in longer scaffolds than in the two draft assemblies.
We reported the prediction of putative host resistance genes against Fusarium wilt disease by their performance and evaluated them both in wet laboratory and field phenotypic conditions.
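The abstract uses "coverage" both for the number of reads mapped at a location and for the percentage of the genome represented. A minimal sketch, assuming read alignments are given as 0-based half-open (start, end) intervals, separates the two quantities as per-position depth and breadth of coverage. The function names and the toy intervals are hypothetical and unrelated to the pigeonpea data.

```python
def depth_per_position(genome_length, alignments):
    """Number of aligned reads covering each position of the genome."""
    depth = [0] * genome_length
    for start, end in alignments:
        # Clip each interval to the genome bounds before counting.
        for position in range(max(start, 0), min(end, genome_length)):
            depth[position] += 1
    return depth


def breadth_of_coverage(depth):
    """Fraction of positions covered by at least one read."""
    return sum(1 for d in depth if d > 0) / len(depth)


def mean_depth(depth):
    """Average number of reads per position."""
    return sum(depth) / len(depth)


# Toy example: a 10-base genome with three aligned reads.
alignments = [(0, 4), (2, 6), (8, 10)]
depth = depth_per_position(10, alignments)
print(depth, breadth_of_coverage(depth), mean_depth(depth))
```

Here positions 6 and 7 receive no reads, so breadth is 0.8 even though the mean depth is 1.0; uncovered stretches of this kind are what gap filling aims to reduce.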


eLife ◽  
2020 ◽  
Vol 9 ◽  
Author(s):  
Jun Kurushima ◽  
Nathalie Campo ◽  
Renske van Raaphorst ◽  
Guillaume Cerckel ◽  
Patrice Polard ◽  
...  

The spread of antimicrobial resistance and vaccine escape in the human pathogen Streptococcus pneumoniae can be largely attributed to competence-induced transformation. Here, we studied this process at the single-cell level. We show that within isogenic populations, all cells become naturally competent and bind exogenous DNA. We find that transformation is highly efficient and that the chromosomal location of the integration site or whether the transformed gene is encoded on the leading or lagging strand has limited influence on recombination efficiency. Indeed, we have observed multiple recombination events in single recipients in real-time. However, because of saturation and because a single-stranded donor DNA replaces the original allele, transformation efficiency has an upper threshold of approximately 50% of the population. The fixed mechanism of transformation results in a fail-safe strategy for the population as half of the population generally keeps an intact copy of the original genome.


2020 ◽  
Author(s):  
Soma Marla ◽  
Pallavi Mishra ◽  
Ranjeet Maurya ◽  
Mohar Singh ◽  
D. P. Wankhede ◽  
...  

Genome assembly of short reads from large plant genomes remains a challenge in computational biology despite major developments in next-generation sequencing. Of late, multiple draft assemblies of plant genomes have been reported for many organisms. The draft assemblies of Cajanus cajan have different levels of genome completeness and contain a large number of repeats, gaps, and segmental duplications. Draft assemblies with portions of the genome missing are shorter than the referenced original genome. These assemblies come with low map accuracy, affecting further functional annotation and prediction of gene components as desired by crop researchers. Genome coverage, i.e., the number of sequenced raw reads mapped onto certain locations of the genome, is an important quality indicator of completeness and assembly quality in draft assemblies. The present work aimed to improve coverage in reported de novo sequenced draft genomes (GCA_000340665.1 and GCA_000230855.2) of pigeonpea, a legume widely cultivated in India. The two assemblies comprised 72% and 75% of the estimated coverage of the genome, respectively. We employed an assembly reconciliation approach to compare the draft assemblies and merged them to generate a high-quality, near-complete assembly with enhanced contiguity. The finished assembly has fewer gaps than reported in the draft assemblies and an improved genome coverage of 82.4%. The quality of the finished assembly was evaluated using various quality metrics and for the presence of specific trait-related functional genes. The pair-end and mate-pair local library data sets employed made it possible to resolve gaps, repeats, and other sequence errors, yielding longer scaffolds than the two draft assemblies. We report the prediction of putative host resistance genes against Fusarium wilt disease from the improved sequence and evaluated them in both wet laboratory and field phenotypic conditions.


Author(s):  
Jun Kurushima ◽  
Nathalie Campo ◽  
Renske van Raaphorst ◽  
Guillaume Cerckel ◽  
Patrice Polard ◽  
...  

The rapid spread of antimicrobial resistance and vaccine escape in the opportunistic human pathogen Streptococcus pneumoniae can be largely attributed to competence-induced transformation. To better understand the dynamics of competence-induced transformation, we studied this process at the single-cell level. We show that within isogenic populations, all cells become naturally competent and bind exogenous DNA. In addition, we find that transformation is highly efficient and that the chromosomal location of the integration site or whether the transformed gene is encoded on the leading or lagging strand has limited influence on recombination efficiency. Indeed, we have observed multiple recombination events in single recipients in real time. However, because of saturation of the DNA uptake and integration machinery and because a single-stranded donor DNA replaces the original allele, we find that transformation efficiency has an upper threshold of approximately 50% of the population. Counterintuitively, in the presence of multiple transforming DNAs, the fraction of untransformed cells increases to more than 50%. The fixed mechanism of transformation results in a fail-safe strategy for the population as half of the population generally keeps an intact copy of the original genome. Together, this work advances our understanding of pneumococcal genome plasticity.


Genes ◽  
2020 ◽  
Vol 11 (2) ◽  
pp. 170 ◽  
Author(s):  
Marcelo R. J. Castro ◽  
Clément Goubert ◽  
Fernando A. Monteiro ◽  
Cristina Vieira ◽  
Claudia M. A. Carareto

Transposable elements (TEs) are widely distributed repetitive sequences in the genomes across the tree of life, and represent an important source of genetic variability. Their distribution among genomes is specific to each lineage. A phenomenon associated with this feature is the sudden expansion of one or several TE families, called bursts of transposition. We previously proposed that bursts of the Mariner family (DNA transposons) contributed to the speciation of Rhodnius prolixus Stål, 1859. This hypothesis motivated us to study two additional species of the R. prolixus complex: Rhodnius montenegrensis da Rosa et al., 2012 and Rhodnius marabaensis Souza et al., 2016, together with a new, de novo annotation of the R. prolixus repeatome using unassembled short reads. Our analysis reveals that the total amount of TEs present in Rhodnius genomes (19% to 23.5%) is three to four times higher than that expected based on the original quantifications performed for the original genome description of R. prolixus. We confirm here that the repeatome of the three species is dominated by Class II elements of the superfamily Tc1-Mariner, as well as members of the LINE order (Class I). In addition to R. prolixus, we also identified a recent burst of transposition of the Mariner family in R. montenegrensis and R. marabaensis, suggesting that this phenomenon may not be exclusive to R. prolixus. Rather, we hypothesize that whilst the expansion of Mariner elements may have contributed to the diversification of the R. prolixus-R. robustus species complex, the distinct ecological characteristics of these new species did not drive the general evolutionary trajectories of these TEs.


2018 ◽  
Author(s):  
Shaun D Jackman ◽  
Lauren Coombe ◽  
Justin Chu ◽  
Rene L Warren ◽  
Benjamin P Vandervalk ◽  
...  

Genome sequencing yields the sequence of many short snippets of DNA (reads) from a genome. Genome assembly attempts to reconstruct the original genome from which these reads were derived. This task is difficult due to gaps and errors in the sequencing data, repetitive sequence in the underlying genome, and heterozygosity, and assembly errors are common. These misassemblies may be identified by comparing the sequencing data to the assembly, and by looking for discrepancies between the two. Once identified, these misassemblies may be corrected, improving the quality of the assembly. Although tools exist to identify and correct misassemblies using Illumina pair-end and mate-pair sequencing, no such tool yet exists that makes use of the long distance information of the large molecules provided by linked reads, such as those offered by the 10x Genomics Chromium platform. We have developed the tool Tigmint for this purpose. To demonstrate the effectiveness of Tigmint, we corrected assemblies of a human genome using short reads assembled with ABySS 2.0 and other assemblers. Tigmint reduced the number of misassemblies identified by QUAST in the ABySS assembly by 216 (27%). While scaffolding with ARCS alone more than doubled the scaffold NGA50 of the assembly from 3 to 8 Mbp, the combination of Tigmint and ARCS improved the scaffold NGA50 of the assembly over five-fold to 16.4 Mbp. This notable improvement in contiguity highlights the utility of assembly correction in refining assemblies. We demonstrate its usefulness in correcting the assemblies of multiple tools, as well as in using Chromium reads to correct and scaffold assemblies of long single-molecule sequencing. The source code of Tigmint is available for download from https://github.com/bcgsc/tigmint, and is distributed under the GNU GPL v3.0 license.
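The contiguity figures above are reported as NGA50, which depends on alignments to a reference genome. As a simpler illustration of the same family of metrics, the sketch below computes plain N50 from a list of sequence lengths; it is not NGA50 and is not part of Tigmint, ARCS, or QUAST, and the function name and toy lengths are hypothetical.

```python
def n50(lengths):
    """Length L such that sequences of length >= L hold at least half the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("no sequence lengths given")


# Toy example: scaffolding joins the two longest contigs (8 and 5) into one.
contigs = [8, 5, 4, 3]
scaffolds = [13, 4, 3]
print(n50(contigs), n50(scaffolds))
```

Joining sequences raises N50 whether or not the join is correct, which is why a contiguity metric is usually read alongside a misassembly count such as the one reported in the abstract.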

