InMut-finder: a software tool for insertion identification in mutagenesis using Nanopore long reads

BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Rui Song ◽  
Ziyao Wang ◽  
Hui Wang ◽  
Han Zhang ◽  
Xuemeng Wang ◽  
...  

Abstract Background: Biological mutagens (such as transposons) that insert sequences into the genome play a crucial role in linking observed phenotypes to genotypes in reverse genetic studies. For this reason, accurate and efficient software tools that identify insertion sites from sequencing reads are desired. Results: We developed a bioinformatics tool to identify genome-wide Insertions in Mutagenesis (named “InMut-Finder”), based on target sequences and flanking sequences from long reads, such as those from Oxford Nanopore sequencing. InMut-Finder succeeded in identifying > 100 insertion sites in Medicago truncatula and soybean mutants based on sequencing reads of whole-genome DNA or enriched insertion-site DNA fragments. Insertion sites discovered by InMut-Finder were validated by PCR experiments. Conclusion: InMut-Finder is a comprehensive and powerful tool for automated insertion detection from Nanopore long reads. The simplicity, efficiency, and flexibility of InMut-Finder make it a valuable tool for functional genomics and forward and reverse genetics. InMut-Finder was implemented with Perl, R, and Shell scripts, which are independent of the OS. The source code and instructions can be accessed at https://github.com/jsg200830/InMut-Finder.
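
The general idea of locating a known target (insert) sequence in a long read and taking the sequence on either side as flanks can be illustrated with a minimal sketch. This is not InMut-Finder's implementation (which the abstract says is written in Perl, R, and Shell); the function name `find_insertion_flanks`, the `flank_len` parameter, and the toy sequences are hypothetical. Exact string matching is also an assumption made only for illustration: real Nanopore reads contain errors, so a practical tool would rely on an aligner rather than exact matches.

```python
def find_insertion_flanks(read, target, flank_len=20):
    """Return (left_flank, right_flank) around the first exact match of
    `target` in `read`, or None if the target is not found.

    Hypothetical helper for illustration only; exact matching is assumed.
    """
    start = read.find(target)
    if start == -1:
        return None
    end = start + len(target)
    # Clamp the left boundary so a match near the read start does not wrap.
    left = read[max(0, start - flank_len):start]
    right = read[end:end + flank_len]
    return left, right


# Toy read: 10 bp host sequence + 12 bp "insert" + 10 bp host sequence.
read = "ACGTACGTAC" + "TTTTGGGGCCCC" + "GATTACAGAT"
flanks = find_insertion_flanks(read, "TTTTGGGGCCCC", flank_len=10)
print(flanks)
```

In a real workflow, the flanking sequences would then be mapped back to the reference genome to locate the insertion site.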

2019 ◽  
Author(s):  
Maria Artesi ◽  
Vincent Hahaut ◽  
Fereshteh Ashrafi ◽  
Ambroise Marçais ◽  
Olivier Hermine ◽  
...  

Abstract Retroviral infections create a large population of cells, each defined by a unique proviral insertion site. Methods based on short-read high throughput sequencing can identify thousands of insertion sites, but the proviruses within remain unobserved. We have developed Pooled CRISPR Inverse PCR sequencing (PCIP-seq), a method that leverages long reads on the Oxford Nanopore MinION platform to sequence the insertion site and its associated provirus. We have applied the technique to three exogenous retroviruses, HTLV-1, HIV-1 and BLV, as well as endogenous retroviruses in both cattle and sheep. The long reads of PCIP-seq improved the accuracy of insertion site identification in repetitive regions of the genome. The high efficiency of the method facilitated the identification of tens of thousands of insertion sites in a single sample. We observed thousands of SNPs and dozens of structural variants within proviruses and uncovered evidence of viral hypermutation, recombination and recurrent selection.


2019 ◽  
Author(s):  
Doruk Beyter ◽  
Helga Ingimundardottir ◽  
Asmundur Oddsson ◽  
Hannes P. Eggertsson ◽  
Eythor Bjornsson ◽  
...  

Long-read sequencing (LRS) promises to improve characterization of structural variants (SVs), a major source of genetic diversity. We generated LRS data on 3,622 Icelanders using Oxford Nanopore Technologies, and identified a median of 22,636 SVs per individual (a median of 13,353 insertions and 9,474 deletions), spanning a median of 10 Mb per haploid genome. We discovered a set of 133,886 reliably genotyped SV alleles and imputed them into 166,281 individuals to explore their effects on diseases and other traits. We discovered an association with a rare (AF = 0.037%) deletion of the first exon of PCSK9. Carriers of this deletion have 0.93 mmol/L (1.31 SD) lower LDL cholesterol levels than the population average (p-value = 7.0 × 10^-20). We also discovered an association with a multi-allelic SV inside a large repeat region, contained within single long reads, in an exon of ACAN. Within this repeat region we found 11 alleles that differ in the number of a 57 bp-motif repeat, and observed a linear relationship (0.016 SD per motif inserted, p = 6.2 × 10^-18) between the number of repeats carried and height. These results show that SVs can be accurately characterized at population scale using long read sequence data in a genome-wide non-targeted approach and demonstrate how SVs impact phenotypes.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Vahid Akbari ◽  
Jean-Michel Garant ◽  
Kieran O’Neill ◽  
Pawan Pandoh ◽  
Richard Moore ◽  
...  

Abstract The ability of nanopore sequencing to simultaneously detect modified nucleotides while producing long reads makes it ideal for detecting and phasing allele-specific methylation. However, there is currently no complete software for detecting SNPs, phasing haplotypes, and mapping methylation to these from nanopore sequence data. Here, we present NanoMethPhase, a software tool to phase 5-methylcytosine from nanopore sequencing. We also present SNVoter, which can post-process nanopore SNV calls to improve accuracy in low coverage regions. Together, these tools can accurately detect allele-specific methylation genome-wide using nanopore sequence data with low coverage of about ten-fold redundancy.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Maria Artesi ◽  
Vincent Hahaut ◽  
Basiel Cole ◽  
Laurens Lambrechts ◽  
Fereshteh Ashrafi ◽  
...  

Abstract The integration of a viral genome into the host genome has a major impact on the trajectory of the infected cell. Integration location and variation within the associated viral genome can influence both clonal expansion and persistence of infected cells. Methods based on short-read sequencing can identify viral insertion sites, but the sequence of the viral genomes within remains unobserved. We develop PCIP-seq, a method that leverages long reads to identify insertion sites and sequence their associated viral genome. We apply the technique to exogenous retroviruses HTLV-1, BLV, and HIV-1, endogenous retroviruses, and human papillomavirus.


2021 ◽  
Vol 3 (2) ◽  
Author(s):  
Jean-Marc Aury ◽  
Benjamin Istace

Abstract Single-molecule sequencing technologies have recently been commercialized by Pacific Biosciences and Oxford Nanopore with the promise of sequencing long DNA fragments (on the order of kilobases to megabases) and, with efficient algorithms, providing high quality assemblies in terms of contiguity and completeness of repetitive regions. However, the error rate of long-read technologies is higher than that of short-read technologies. This has a direct consequence on the base quality of genome assemblies, particularly in coding regions where sequencing errors can disrupt the coding frame of genes. In the case of diploid genomes, the consensus of a given gene can be a mixture of the two haplotypes and can lead to premature stop codons. Several methods have been developed to polish genome assemblies using short reads; generally, they inspect nucleotides one by one and provide a correction for each nucleotide of the input assembly. As a result, these algorithms are not able to properly process diploid genomes and they typically switch from one haplotype to another. Herein we propose Hapo-G (Haplotype-Aware Polishing Of Genomes), a new algorithm capable of incorporating phasing information from high-quality reads (short or long reads) to polish genome assemblies, in particular assemblies of diploid and heterozygous genomes.


2002 ◽  
Vol 46 (8) ◽  
pp. 2337-2343 ◽  
Author(s):  
Julien Haroche ◽  
Jeanine Allignet ◽  
Névine El Solh

ABSTRACT We characterized a new transposon, Tn5406 (5,467 bp), in a clinical isolate of Staphylococcus aureus (BM3327). It carries a variant of vgaA, which encodes a putative ABC protein conferring resistance to streptogramin A but not to mixtures of streptogramins A and B. It also carries three putative genes, the products of which exhibit significant similarities (61 to 73% amino acid identity) to the three transposases of the staphylococcal transposon Tn554. Like Tn554, Tn5406 failed to generate target repeats. In BM3327, the single copy of Tn5406 was inserted into the chromosomal att554 site, which is the preferential insertion site of Tn554. In three other independent S. aureus clinical isolates, Tn5406 was either present as a single plasmid copy (BM3318), as two chromosomal copies (BM3252), or both in the chromosome and on a plasmid (BM3385). The Tn5406-carrying plasmids also contain two other genes, vgaB and vatB. The insertion sites of Tn5406 in BM3252 were studied: one copy was in att554, and one copy was in the additional SCCmec element. Amplification experiments revealed circular forms of Tn5406, indicating that this transposon might be active. To our knowledge, a transposon conferring resistance to streptogramin A and related compounds has not been previously described.


Gut ◽  
2021 ◽  
pp. gutjnl-2020-323585
Author(s):  
Long V. Pham ◽  
Martin Schou Pedersen ◽  
Ulrik Fahnøe ◽  
Carlota Fernandez-Antunez ◽  
Daryl Humes ◽  
...  

Objective: HCV-genotype 4 infections are a major cause of liver diseases in the Middle East/Africa with certain subtypes associated with increased risk of direct-acting antiviral (DAA) treatment failures. We aimed at developing infectious genotype 4 cell culture systems to understand the evolutionary genetic landscapes of antiviral resistance, which can help preserve the future efficacy of DAA-based therapy. Design: HCV recombinants were tested in liver-derived cells. Long-term coculture with DAAs served to induce antiviral-resistance phenotypes. Next-generation sequencing (NGS) of the entire HCV-coding sequence identified mutation networks. Resistance-associated substitutions (RAS) were studied using reverse-genetics. Result: The in-vivo infectious ED43(4a) clone was adapted in Huh7.5 cells, using substitutions identified in ED43(Core-NS5A)/JFH1-chimeric viruses combined with selected NS5B-changes. NGS, and linkage analysis, permitted identification of multiple genetic branches emerging during culture adaptation, one of which had 31 substitutions leading to robust replication/propagation. Treatment of culture-adapted ED43 with nine clinically relevant protease-DAA, NS5A-DAA and NS5B-DAA led to complex dynamics of drug-target-specific RAS with coselection of genome-wide substitutions. Approved DAA combinations were efficient against the original virus, but not against variants with RAS in corresponding drug targets. However, retreatment with glecaprevir/pibrentasvir remained efficient against NS5A inhibitor and sofosbuvir resistant variants. Recombinants with specific RAS at NS3-156, NS5A-28, 30, 31 and 93 and NS5B-282 were viable, but NS3-A156M and NS5A-L30Δ (deletion) led to attenuated phenotypes. Conclusion: Rapidly emerging complex evolutionary landscapes of mutations define the persistence of HCV-RASs conferring resistance levels leading to treatment failure in genotype 4. The high barrier to resistance of glecaprevir/pibrentasvir could prevent persistence and propagation of antiviral resistance.


2015 ◽  
Vol 2015 ◽  
pp. 1-7 ◽  
Author(s):  
Yue-miao Zhang ◽  
Fa-juan Cheng ◽  
Xu-jie Zhou ◽  
Yuan-yuan Qi ◽  
Ping Hou ◽  
...  

Objectives. Numerous loci were identified to perturb gene expression in trans. As elevated ATG5 expression was observed in systemic lupus erythematosus (SLE), the study was conducted to analyze the genome-wide genetic regulatory mechanisms associated with ATG5 expression in a Chinese population with lupus nephritis (LN). Methods. The online expression quantitative trait loci database was searched for trans-expression single nucleotide polymorphisms (trans-eSNPs) of ATG5. Tagging trans-eSNPs were genotyped by a custom-made genotyping chip in 280 patients and 199 controls. For positive findings, clinical information and bioinformation analyses were performed. Results. Four trans-eSNPs were observed to be associated with susceptibility to LN (P < 0.05), including ANKRD50 rs17008504, AGA rs2271100, PAK7 rs6056923, and TET2 rs1391441, while seven other trans-eSNPs showed marginally significant associations (0.05 < P < 0.1). Correlations between the trans-eSNPs and ATG5 expression and different expression levels of ATG5 in SLE patients and controls were validated, and their regulatory effects were annotated. However, no significant associations were observed between different genotypes of trans-eSNPs and severity or outcome of the patients. Conclusion. Using the new systemic genetics approach, we identified 10 loci potentially associated with susceptibility to LN, which may be complementary to future pathway based genetic studies.

