An integrated strategy for target SSR genotyping with toleration of nucleotide variations in the SSRs and flanking regions

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Yongxue Huo ◽  
Yikun Zhao ◽  
Liwen Xu ◽  
Hongmei Yi ◽  
Yunlong Zhang ◽  
...  

Abstract Background With the broad application of high-throughput sequencing and its reduced cost, simple sequence repeat (SSR) genotyping by sequencing (SSR-GBS) has been widely used for interpreting genetic data across different fields, including population genetic diversity and structure analysis, the construction of genetic maps, and the investigation of intraspecies relationships. The development of accurate and efficient typing strategies for SSR-GBS is urgently needed, and several tools have been published. However, to date, no accurate genotyping method can tolerate single nucleotide variations (SNVs) in SSRs and flanking regions. These SNVs may be caused by PCR and sequencing errors or by SNPs among varieties, and they directly affect sequence alignment and genotyping accuracy. Results Here, we report a new integrated strategy named the accurate microsatellite genotyping tool based on targeted sequencing (AMGT-TS) and provide a user-friendly web-based platform and a command-line version of AMGT-TS. To handle SNVs in the SSRs or flanking regions, we developed a broad matching algorithm (BMA) that quickly and accurately types SSRs in ultradeep-coverage, high-throughput data, tolerates SNVs at the loci, and groups the typed reads for further in-depth information mining. To evaluate this tool, we tested 21 randomly sampled loci in eight maize varieties, accompanied by experimental validation on actual and simulated sequencing data. Our evaluation showed that, compared to other tools, AMGT-TS presented extremely accurate typing results with single-base resolution for both homozygous and heterozygous samples. Conclusion This integrated strategy can achieve accurate SSR genotyping based on targeted sequencing, and it can tolerate single nucleotide variations in the SSRs and flanking regions. 
This method can be readily applied to divergent sequencing platforms and species and has excellent application prospects in genetic and population biology research. The web-based platform and command-line version of AMGT-TS are available at https://amgt-ts.plantdna.site:8445 and https://github.com/plantdna/amgt-ts, respectively.
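
The idea of typing an SSR while tolerating SNVs in the flanking regions can be illustrated with a small sketch. This is not the BMA implemented in AMGT-TS, whose details the abstract does not give; it is a minimal stand-in that assumes substitutions only (no indels in the flanks), and the function names, the `max_mismatch` threshold, and the `min_fraction` cutoff are all hypothetical.

```python
from collections import Counter


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_tolerant(read, pattern, max_mismatch, start=0):
    """First index >= start where pattern matches with at most
    max_mismatch substitutions, or -1 if there is no such index."""
    for i in range(start, len(read) - len(pattern) + 1):
        if hamming(read[i:i + len(pattern)], pattern) <= max_mismatch:
            return i
    return -1


def ssr_allele_length(read, left_flank, right_flank, max_mismatch=1):
    """Length of the sequence between the two flanks, or None when
    either flank cannot be located within the mismatch budget."""
    left = find_tolerant(read, left_flank, max_mismatch)
    if left == -1:
        return None
    ssr_start = left + len(left_flank)
    right = find_tolerant(read, right_flank, max_mismatch, ssr_start)
    if right == -1:
        return None
    return right - ssr_start


def call_genotype(reads, left_flank, right_flank,
                  max_mismatch=1, min_fraction=0.3):
    """Report up to two allele lengths, each supported by at least
    min_fraction of the typed reads (hypothetical cutoff)."""
    lengths = Counter()
    for read in reads:
        n = ssr_allele_length(read, left_flank, right_flank, max_mismatch)
        if n is not None:
            lengths[n] += 1
    total = sum(lengths.values())
    if total == 0:
        return ()
    alleles = [n for n, c in lengths.most_common(2) if c / total >= min_fraction]
    return tuple(sorted(alleles))
```

With `max_mismatch=0` the sketch behaves like exact flank matching, so a read carrying a single SNV in a flank is dropped rather than typed; raising the budget recovers such reads at the cost of a higher risk of spurious flank hits in repetitive sequence.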

2014 ◽  
Vol 18 (1) ◽  
pp. 86-91 ◽  
Author(s):  
Aniket Mishra ◽  
Stuart Macgregor

Gene-based tests such as versatile gene-based association study (VEGAS) are commonly used following per-single nucleotide polymorphism (SNP) GWAS (genome-wide association studies) analysis. Two limitations of VEGAS were that the HapMap2 reference set was used to model the correlation between SNPs and only autosomal genes were considered. HapMap2 has now been superseded by the 1,000 Genomes reference set, and whereas early GWASs frequently ignored the X chromosome, it is now commonly included. Here we have developed VEGAS2, an extension that uses 1,000 Genomes data to model SNP correlations across the autosomes and chromosome X. VEGAS2 allows greater flexibility when defining gene boundaries. VEGAS2 offers both a user-friendly, web-based front end and a command line Linux version. The online version of VEGAS2 can be accessed through https://vegas2.qimrberghofer.edu.au/. The command line version can be downloaded from https://vegas2.qimrberghofer.edu.au/zVEGAS2offline.tgz. The command line version is developed in Perl, R and shell scripting languages; source code is available for further development.


2015 ◽  
Vol 8 (2) ◽  
pp. 192-199 ◽  
Author(s):  
Maulik R. Upadhyay ◽  
Anand B. Patel ◽  
Ramalingam B. Subramanian ◽  
Tejas M. Shah ◽  
Subhash J. Jakhesara ◽  
...  

F1000Research ◽  
2013 ◽  
Vol 2 ◽  
pp. 258 ◽  
Author(s):  
Ilya Minkin ◽  
Hoa Pham ◽  
Ekaterina Starostina ◽  
Nikolay Vyahhi ◽  
Son Pham

We present C-Sibelia, a highly accurate and easy-to-use software tool for comparing two closely related bacterial genomes, which can be presented as either finished sequences or fragmented assemblies. C-Sibelia takes as input two FASTA files and produces: (1) a VCF file containing all identified single nucleotide variations and indels; (2) an XMFA file containing alignment information. The software also produces Circos diagrams visualizing high level genomic architecture for rearrangement analyses. C-Sibelia is a part of the Sibelia comparative genomics suite, which is freely available under the GNU GPL v.2 license at http://sourceforge.net/projects/sibelia-bio. C-Sibelia is compatible with Unix-like operating systems. A web-based version of the software is available at http://etool.me/software/csibelia.


Author(s):  
Alexandre Yahi ◽  
Paul Hoffman ◽  
Margot Brandt ◽  
Pejman Mohammadi ◽  
Nicholas P. Tatonetti ◽  
...  

Abstract Genome editing experiments are generating an increasing amount of targeted sequencing data with specific mutational patterns indicating the success of the experiments and genotypes of clonal cell lines. We present EdiTyper, a high-throughput command line tool specifically designed for analysis of sequencing data from polyclonal and monoclonal cell populations from CRISPR gene editing. It requires simple inputs of sequencing data and reference sequences, and provides comprehensive outputs including summary statistics, plots, and SAM/BAM alignments. Analysis of simulated data showed that EdiTyper is highly accurate for detection of both single nucleotide mutations and indels, robust to sequencing errors, as well as fast and scalable to large experimental batches. EdiTyper is available on GitHub (https://github.com/LappalainenLab/edityper) under the MIT license.


2020 ◽  
Vol 49 (D1) ◽  
pp. D660-D666
Author(s):  
Rafael Mamede ◽  
Pedro Vila-Cerqueira ◽  
Mickael Silva ◽  
João A Carriço ◽  
Mário Ramirez

Abstract Chewie Nomenclature Server (chewie-NS, https://chewbbaca.online/) allows users to share genome-based gene-by-gene typing schemas and to maintain a common nomenclature, simplifying the comparison of results. The combination of local analyses and a public repository of allelic data strikes a balance between potential confidentiality issues and the need to compare results. The possibility of deploying private instances of chewie-NS facilitates the creation of nomenclature servers with a restricted user base to allow compliance with the strictest data policies. Chewie-NS allows users to easily share their own schemas and to explore publicly available schemas, including informative statistics on schemas and loci presented in interactive charts and tables. Users can retrieve all the information necessary to run a schema locally or all the alleles identified at a particular locus. The integration with the chewBBACA suite enables users to directly upload new schemas to chewie-NS, download existing schemas and synchronize local and remote schemas from the chewBBACA command-line version, allowing an easier integration into high-throughput analysis pipelines. The same REST API linking chewie-NS and the chewBBACA suite supports the interaction of other interfaces or pipelines with the databases available at chewie-NS, facilitating the reusability of the stored data.


2014 ◽  
Vol 2014 ◽  
pp. 1-7 ◽  
Author(s):  
Majid Masso ◽  
Iosif I. Vaisman

The AUTO-MUTE 2.0 stand-alone software package includes a collection of programs for predicting functional changes to proteins upon single residue substitutions, developed by combining structure-based features with trained statistical learning models. Three of the predictors evaluate changes to protein stability upon mutation, each complementing a distinct experimental approach. Two additional classifiers are available, one for predicting activity changes due to residue replacements and the other for determining the disease potential of mutations associated with nonsynonymous single nucleotide polymorphisms (nsSNPs) in human proteins. These five command-line driven tools, as well as all the supporting programs, complement those that run our AUTO-MUTE web-based server. Nevertheless, all the code has been rewritten and substantially altered for the new portable software, and it incorporates several new features based on user feedback. Included among these upgrades is the ability to perform three highly requested tasks: to run “big data” batch jobs; to generate predictions using modified Protein Data Bank (PDB) structures and unpublished personal models prepared using standard PDB file formatting; and to utilize NMR structure files that contain multiple models.


BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Xue Lin ◽  
Yingying Hua ◽  
Shuanglin Gu ◽  
Li Lv ◽  
Xingyu Li ◽  
...  

Abstract Background Genomic localized hypermutation regions have been found in cancers and have been reported to be related to cancer prognosis. This genomic localized hypermutation is quite different from the usual somatic mutations in frequency of occurrence and genomic density. It resembles a “violent storm” of mutations, which is just what the Greek word “kataegis” means. Results There is a need for a lightweight and simple-to-use toolkit to identify and visualize the localized hypermutation regions in a genome. Thus we developed the R package “kataegis” to meet this need. The package uses only three steps to identify the genomic hypermutation regions, i.e., i) read in the variation files in standard formats; ii) calculate the inter-mutational distances; iii) identify the hypermutation regions with appropriate parameters, and finally one step to visualize the nucleotide contents and spectra of both the foci and flanking regions, and the genomic landscape of these regions. Conclusions The kataegis package is available on Bioconductor/GitHub (https://github.com/flosalbizziae/kataegis) and provides a lightweight and simple-to-use toolkit for quickly identifying and visualizing the genomic hypermutation regions.
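
Steps ii) and iii) above can be sketched in a few lines. This is an illustrative Python sketch, not the R package's code or API: the function names are hypothetical, and the thresholds (`max_gap=1000` bp between neighbouring mutations, `min_count=6` mutations per region) are assumed defaults chosen only for the example. Positions are taken to lie on a single chromosome.

```python
def inter_mutation_distances(positions):
    """Distances between consecutive mutations after sorting by position."""
    ordered = sorted(positions)
    return [b - a for a, b in zip(ordered, ordered[1:])]


def find_hypermutation_regions(positions, max_gap=1000, min_count=6):
    """Return (start, end, n_mutations) for every run of consecutive
    mutations whose neighbouring distances are all <= max_gap and that
    contains at least min_count mutations."""
    ordered = sorted(positions)
    regions = []
    run_start = 0  # index of the first mutation in the current run
    for i in range(1, len(ordered) + 1):
        # A run ends at the end of the input or at a gap larger than max_gap.
        if i == len(ordered) or ordered[i] - ordered[i - 1] > max_gap:
            if i - run_start >= min_count:
                regions.append((ordered[run_start], ordered[i - 1], i - run_start))
            run_start = i
    return regions
```

Other definitions are possible, for instance thresholding the mean inter-mutational distance within a window instead of every individual gap; the per-gap rule is used here only because it is the simplest to state.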


2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Jiao Fan ◽  
Yige Ding ◽  
Chao Ren ◽  
Ziguo Song ◽  
Jie Yuan ◽  
...  

Abstract Cytosine or adenine base editors (CBEs or ABEs) hold great promise in therapeutic applications because they enable the precise conversion of targeted bases without generating double-strand breaks. However, both CBEs and ABEs induce substantial off-target DNA editing, and extensive off-target RNA single nucleotide variations in transfected cells. Therefore, the potential effects of deaminases induced by DNA base editors are of great importance for their clinical applicability. Here, the transcriptome-wide deaminase effects on gene expression and splicing are examined. Differentially expressed genes (DEGs) and differential alternative splicing (DAS) events, induced by base editors, are identified. Both CBEs and ABEs generated thousands of DEGs and hundreds of DAS events. For engineered CBEs or ABEs, base editor-induced variants had little effect on the elimination of DEGs and DAS events. Interestingly, more DEGs and DAS events are observed as a result of overexpression of cytosine and adenine deaminases. This study reveals a previously overlooked aspect of deaminase effects in transcriptome-wide gene expression and splicing, and underscores the need to fully characterize such effects of deaminase enzymes in base editor platforms.

