command line version
Recently Published Documents


Total documents: 13 (five years: 8)
H-index: 4 (five years: 2)

2021 ◽ Vol 22 (1)
Author(s): Yongxue Huo ◽ Yikun Zhao ◽ Liwen Xu ◽ Hongmei Yi ◽ Yunlong Zhang ◽ ...

Abstract Background: With the broad application of high-throughput sequencing and its reduced cost, simple sequence repeat (SSR) genotyping by sequencing (SSR-GBS) has been widely used for interpreting genetic data across different fields, including population genetic diversity and structure analysis, the construction of genetic maps, and the investigation of intraspecies relationships. Accurate and efficient typing strategies for SSR-GBS are urgently needed, and several tools have been published. However, to date, no accurate genotyping method can tolerate single nucleotide variations (SNVs) in SSRs and flanking regions. These SNVs may be caused by PCR and sequencing errors or by SNPs among varieties, and they directly affect sequence alignment and genotyping accuracy. Results: Here, we report a new integrated strategy named the accurate microsatellite genotyping tool based on targeted sequencing (AMGT-TS) and provide a user-friendly web-based platform and a command-line version of AMGT-TS. To handle SNVs in the SSRs or flanking regions, we developed a broad matching algorithm (BMA) that quickly and accurately achieves SSR typing for ultradeep coverage and high-throughput analysis of loci, with SNV compatibility and grouping of typed reads for further in-depth information mining. To evaluate this tool, we tested 21 randomly sampled loci in eight maize varieties, accompanied by experimental validation on actual and simulated sequencing data. Our evaluation showed that, compared to other tools, AMGT-TS presented extremely accurate typing results with single-base resolution for both homozygous and heterozygous samples. Conclusion: This integrated strategy can achieve accurate SSR genotyping based on targeted sequencing, and it can tolerate single nucleotide variations in the SSRs and flanking regions. This method can be readily applied to different sequencing platforms and species and has excellent application prospects in genetic and population biology research. The web-based platform and command-line version of AMGT-TS are available at https://amgt-ts.plantdna.site:8445 and https://github.com/plantdna/amgt-ts, respectively.
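To illustrate why tolerating SNVs in flanking regions matters for SSR typing, here is a minimal sketch in Python. It is not the BMA described in the abstract, whose details are not given here; it is a simple stand-in that locates each flank with a bounded number of substitutions and reports the length of the region in between. The function names, the flank sequences, and the `max_mismatches` parameter are all hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_flank(read: str, flank: str, max_mismatches: int, start: int = 0) -> int:
    """Return the first index >= start where `flank` matches `read` with at
    most `max_mismatches` substitutions, or -1 if no such position exists."""
    for i in range(start, len(read) - len(flank) + 1):
        if hamming(read[i:i + len(flank)], flank) <= max_mismatches:
            return i
    return -1


def call_ssr_length(read, left_flank, right_flank, max_mismatches=1):
    """Return the number of bases between the two flanks, or None if either
    flank cannot be located within the mismatch budget."""
    left = find_flank(read, left_flank, max_mismatches)
    if left < 0:
        return None
    ssr_start = left + len(left_flank)
    right = find_flank(read, right_flank, max_mismatches, ssr_start)
    if right < 0:
        return None
    return right - ssr_start


# Hypothetical read: the left flank carries one SNV (ACGTAC -> ACGAAC),
# followed by five AG repeat units and an intact right flank.
read = "GG" + "ACGAAC" + "AG" * 5 + "TTGCAA" + "CC"
tolerant = call_ssr_length(read, "ACGTAC", "TTGCAA", max_mismatches=1)
strict = call_ssr_length(read, "ACGTAC", "TTGCAA", max_mismatches=0)
print(tolerant, strict)
```

With exact matching the read is discarded because its left flank cannot be found, whereas allowing one substitution recovers the allele length. A real genotyper would also need to handle indels, repeat interruptions, and allele calling across many reads, none of which this sketch attempts.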


2021
Author(s): Leo Kaindl ◽ Corinn Small ◽ Remco Stam

Abstract Multi-gene phylogenies constructed from multiplexed and Sanger sequencing data are regularly used in mycology and other disciplines as a cost-effective way of identifying species and as a first means of investigating the genetic diversity of samples. Today, a number of tools exist for each of the steps in this analysis, including quality control and trimming, the generation of a multiple sequence alignment (MSA), extraction of informative sites, and the construction of the final phylogenetic tree. A BLAST search in a reference database is often performed to identify sequences of type specimens with which to compare the samples in the phylogeny. Developed over the past decades, these tools are all independent of one another and often not perfectly adapted to one another. We present AB12PHYLO, an integrated pipeline that can perform all necessary steps from reading in raw Sanger sequencing data through visualizing and editing phylogenies. In addition, AB12PHYLO can calculate basic summary statistics for each gene in the phylogeny. AB12PHYLO is designed as a wrapper of several open-access and commonly used tools for each of the intermediate stages, and is intended to simplify the phylogenetic pipeline while still allowing a high degree of access. It comes as a command-line version for the highest reproducibility and with an intuitive graphical user interface (GUI) for easy adoption by IT-agnostic end-users. The use of AB12PHYLO significantly reduces the hands-on working time for these analyses.


2020 ◽ Vol 49 (D1) ◽ pp. D660-D666
Author(s): Rafael Mamede ◽ Pedro Vila-Cerqueira ◽ Mickael Silva ◽ João A Carriço ◽ Mário Ramirez

Abstract Chewie Nomenclature Server (chewie-NS, https://chewbbaca.online/) allows users to share genome-based gene-by-gene typing schemas and to maintain a common nomenclature, simplifying the comparison of results. The combination of local analyses and a public repository of allelic data strikes a balance between potential confidentiality issues and the need to compare results. The possibility of deploying private instances of chewie-NS facilitates the creation of nomenclature servers with a restricted user base to allow compliance with the strictest data policies. Chewie-NS allows users to easily share their own schemas and to explore publicly available schemas, including informative statistics on schemas and loci presented in interactive charts and tables. Users can retrieve all the information necessary to run a schema locally or all the alleles identified at a particular locus. The integration with the chewBBACA suite enables users to directly upload new schemas to chewie-NS, download existing schemas and synchronize local and remote schemas from the chewBBACA command-line version, allowing easier integration into high-throughput analysis pipelines. The same REST API linking chewie-NS and the chewBBACA suite supports the interaction of other interfaces or pipelines with the databases available at chewie-NS, facilitating the reusability of the stored data.


Author(s): Alessandro Pedretti ◽ Angelica Mazzolari ◽ Silvia Gervasoni ◽ Laura Fumagalli ◽ Giulio Vistoli

Abstract The purpose of the article is to offer an overview of the latest release of the VEGA suite of programs. This software has been constantly developed and freely released during the last 20 years and has now reached a significant diffusion and technology level, as confirmed by about 22 500 registered users. While being primarily developed for drug design studies, the VEGA package includes cheminformatics and modeling features, which can be fruitfully utilized in various contexts of computational chemistry. To offer a glimpse of the remarkable potential of the software, some examples of the implemented features in the cheminformatics field and for structure-based studies are discussed. Finally, the flexible architecture of the VEGA program, which can be expanded and customized by plug-in technology or scripting languages, is described, with a focus on the HyperDrive library, which includes highly optimized functions. Availability and implementation: The VEGA suite of programs and the source code of the VEGA command-line version are available free of charge for non-profit organizations at http://www.vegazz.net.


2020
Author(s): Marie Gramm ◽ Eduardo Pérez-Palma ◽ Sarah Schumacher-Bass ◽ Jarrod Dalton ◽ Costin Leu ◽ ...

Abstract Literature exploration in PubMed on a large number of biomedical entities (e.g., genes, diseases, experiments) can be time-consuming and challenging, especially when comparing many entities to one another. Here, we describe SimText, a user-friendly toolset that provides customizable and systematic workflows for the analysis of similarities among a set of entities based on words from abstracts and/or other text. SimText can be used for (i) data generation: text collection from PubMed and extraction of words with different text mining approaches, and (ii) interactive analysis of data using unsupervised learning techniques and visualization in a Shiny web application. Availability and Implementation: We developed SimText as open-source R software and integrated it into Galaxy, an online data analysis platform. A command-line version of the toolset is available for download from GitHub at https://github.com/mgramm1/simtext.
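The idea of comparing entities by the words associated with them can be sketched with a bag-of-words cosine similarity. This is an illustrative sketch in Python, not SimText's implementation (SimText itself is R software, and its text mining steps are not detailed here). The entity names and example texts below are made up.

```python
from collections import Counter
import math
import re


def word_counts(text: str) -> Counter:
    """Lowercase the text and count alphabetic word tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical entities, each mapped to text collected for it.
texts = {
    "gene_A": "sodium channel epilepsy seizure",
    "gene_B": "seizure epilepsy sodium channel",
    "gene_C": "collagen bone fracture",
}
vectors = {name: word_counts(text) for name, text in texts.items()}

for first, second in [("gene_A", "gene_B"), ("gene_A", "gene_C")]:
    score = cosine_similarity(vectors[first], vectors[second])
    print(first, second, round(score, 3))
```

Entities whose texts share the same vocabulary score close to 1, and entities with no words in common score 0. A pairwise matrix of such scores is the kind of input that clustering or other unsupervised methods could then work on.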


2020 ◽ Vol 36 (11) ◽ pp. 3336-3342
Author(s): Kewei Liu ◽ Wei Chen

Abstract Motivation: RNA modifications play critical roles in a series of cellular and developmental processes. Knowledge about the distributions of RNA modifications in the transcriptomes will provide clues to revealing their functions. Since experimental methods for detecting RNA modifications are time-consuming and laborious, computational methods have been proposed for this aim in the past five years. However, both experimental and computational methods have drawbacks in simultaneously identifying modifications occurring on different nucleotides. Results: To address this challenge, we developed a new predictor called iMRM, which is able to simultaneously identify m6A, m5C, m1A, ψ and A-to-I modifications in Homo sapiens, Mus musculus and Saccharomyces cerevisiae. In iMRM, a feature selection technique was used to pick out the optimal features. The results from both 10-fold cross-validation and the jackknife test demonstrated that the performance of iMRM is superior to existing methods for identifying RNA modifications. Availability and implementation: A user-friendly web server for iMRM was established at http://www.bioml.cn/XG_iRNA/home. The off-line command-line version is available at https://github.com/liukeweiaway/iMRM. Supplementary information: Supplementary data are available at Bioinformatics online.


2019 ◽ Vol 47 (W1) ◽ pp. W610-W613
Author(s): Kun Wang ◽ Haiwei Li ◽ Yue Xu ◽ Qianzhi Shao ◽ Jianming Yi ◽ ...

Abstract Quality control (QC) for lab-designed primers is crucial for the success of a polymerase chain reaction (PCR). Here, we present MFEprimer-3.0, a functional primer quality control program for checking non-specific amplicons, dimers, hairpins and other parameters. The new features of the current version include: (i) more sensitive binding site search using the updated k-mer algorithm that allows mismatches within the k-mer, except for the first base at the 3′ end. The binding sites of each primer with a stable 3′ end are listed in the output; (ii) new algorithms for rapidly identifying self-dimers, cross-dimers and hairpins; (iii) the command-line version, which has an added option of JSON output to enhance the versatility of MFEprimer by acting as a QC step in the ‘primer design → quality control → redesign’ pipeline; (iv) a function for checking whether the binding sites contain single nucleotide polymorphisms (SNPs), which will affect the consistency of binding efficiency among different samples. In summary, MFEprimer-3.0 is updated with the well-tested PCR primer QC program and it can be integrated into various PCR primer design applications as a QC module. The MFEprimer-3.0 server is freely accessible without any login requirement at: https://mfeprimer3.igenetech.com/ and https://www.mfeprimer.com/. The source code for the command-line version is available upon request.
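A binding site search that allows mismatches within the k-mer but not at the 3′ end can be sketched as follows. This is not MFEprimer's algorithm or its output schema: it is a brute-force Python illustration that assumes "the first base at the 3′ end" means the 3′-terminal base of the k-mer, scans only one strand, and uses made-up JSON field names.

```python
import json


def find_binding_sites(template: str, kmer: str, max_mismatches: int = 1):
    """Scan `template` for windows matching `kmer` (written 5'->3') with at most
    `max_mismatches` substitutions, requiring the last (3'-terminal) base to
    match exactly. Returns a list of dicts, one per candidate site."""
    k = len(kmer)
    sites = []
    for i in range(len(template) - k + 1):
        window = template[i:i + k]
        if window[-1] != kmer[-1]:
            continue  # a mismatch at the 3' end is never tolerated
        mismatches = sum(a != b for a, b in zip(window, kmer))
        if mismatches <= max_mismatches:
            sites.append({"position": i, "site": window, "mismatches": mismatches})
    return sites


# Hypothetical template with one exact site and one site with an internal mismatch.
template = "TTACGTGCAAACGAGCTT"
sites = find_binding_sites(template, "ACGTGC", max_mismatches=1)

# Emit JSON so a downstream design step could parse the result.
print(json.dumps(sites, indent=2))
```

Machine-readable output of this kind is what lets a QC step sit inside a "primer design → quality control → redesign" loop: the design step can parse the reported sites and decide whether a primer needs to be redesigned.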


2018 ◽ Vol 35 (14) ◽ pp. 2523-2524
Author(s): S Castillo-Lara ◽ J F Abril

Abstract Motivation: Protein–protein interactions (PPIs) are very important for building models to understand many biological processes. Although several databases hold many of these interactions, exploring them, selecting those relevant for a given subject and contextualizing them can be a difficult task for researchers. Extracting PPIs directly from the scientific literature can be very helpful for providing such context, as the sentences describing these interactions may give researchers useful insights. Results: We have developed PPaxe, a Python module and a web application that allow users to extract PPIs and protein occurrences from a given set of PubMed and PubMedCentral articles. It presents the results of the analysis in different ways to help researchers export, filter and analyze the results easily. Availability and implementation: PPaxe web demo is freely available at https://compgen.bio.ub.edu/PPaxe. All the software can be downloaded from https://compgen.bio.ub.edu/PPaxe/download, including a command-line version and docker containers for an easy installation. Supplementary information: Supplementary data are available at Bioinformatics online.


2017 ◽ Vol 46 (D1) ◽ pp. D762-D769
Author(s): Jonathan Casper ◽ Ann S Zweig ◽ Chris Villarreal ◽ Cath Tyner ◽ Matthew L Speir ◽ ...

Abstract The UCSC Genome Browser (https://genome.ucsc.edu) provides a web interface for exploring annotated genome assemblies. The assemblies and annotation tracks are updated on an ongoing basis—12 assemblies and more than 28 tracks were added in the past year. Two recent additions are a display of CRISPR/Cas9 guide sequences and an interactive navigator for gene interactions. Other upgrades from the past year include a command-line version of the Variant Annotation Integrator, support for Human Genome Variation Society variant nomenclature input and output, and a revised highlighting tool that now supports multiple simultaneous regions and colors.

