command line tool Latest Research Papers

Automatic lexical collocate extraction for corpus-based ontology building and refinement

Abstract: Traditional corpus-based methods rely on manual inspection and extraction of lexical collocates in the study of selection preferences, which is a very costly, labor-intensive, and time-consuming task. Devising automatic methods for lexical collocate extraction becomes necessary to handle this task and the immensity of corpora available. With a view to leveraging the Sketch Engine platform and in-built corpora, we propose a working prototype of a Lexical Collocate Extractor (LeCoExt) command-line tool that mines lexical collocates from all types of verbs according to their syntactic constituents and Collocate Frequency Score (CFS). This might be the first tool that performs comprehensive corpus-based studies of the selection preferences of individual verbs or groups of verbs by exploiting the capabilities offered by Sketch Engine. This tool might facilitate the task of extracting rich lexico-semantic knowledge from diverse corpora in a few seconds and with a single click. We test its performance for ontology building and refinement, departing from a previous detailed analysis of stealing verbs carried out by Fernández-Martínez & Faber (2020). We show how the proposed tool is used to extract conceptual-cognitive knowledge from the THEFT scenario and implement it into FunGramKB Core Ontology through the creation and modification of theft-related conceptual units.

Research on Active Control of Rotating Motion and Vibration of Flexible Manipulator

Shock and Vibration ◽

10.1155/2021/6661910 ◽

2021 ◽

Vol 2021 ◽

pp. 1-8

Author(s):

Qingpeng Han ◽

Wenwen Dong ◽

Bin Wu ◽

Xinhang Shen ◽

Meilin Zhang ◽

...

Keyword(s):

Control Strategy ◽

Piezoelectric Actuators ◽

Flexible Manipulator ◽

Joint Control ◽

Command Line ◽

Control Torque ◽

Pd Control ◽

Lagrange Principle ◽

Lyapunov Approach ◽

Command Line Tool

In this study, PZT (piezoelectric) actuators and a PD control method (PDs’ command line tool) are selected to control the vibration of the flexible manipulator. The dynamic equations of the flexible manipulator system are established based on the Lagrange principle. The control strategy of the PZT actuators and the joint control torque are designed. A Lyapunov approach is used to show that a combined scheme of PD feedback and command voltages can be applied to segmented PZT actuators. For comparison, PD feedback control alone is also considered for the flexible manipulator. The numerical simulations show that the designed PZT actuator control strategy combined with PD control is effective in suppressing the vibration of the flexible manipulator.
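
As a rough illustration of how PD feedback damps vibration, the sketch below applies the control force u = -kp*x - kd*x' to a single lightly damped mass-spring mode and compares the residual amplitude with and without control. The single-mode model, all parameter values, and the function name `simulate_mode` are assumptions made for illustration; this is not the paper's Lagrange-derived manipulator model or its PZT control law.

```python
def simulate_mode(kp, kd, steps=20000, dt=0.001):
    """Integrate m*x'' + c*x' + k*x = u with PD feedback u = -kp*x - kd*x'.

    Returns the peak |x| over the final 10% of the run (residual vibration).
    """
    m, c, k = 1.0, 0.05, 100.0   # hypothetical mass, damping, stiffness
    x, v = 0.01, 0.0             # initial deflection and velocity
    tail_start = int(0.9 * steps)
    tail_peak = 0.0
    for i in range(steps):
        u = -kp * x - kd * v             # PD control force
        a = (u - c * v - k * x) / m      # acceleration from the equation of motion
        v += a * dt                      # semi-implicit Euler: update velocity first,
        x += v * dt                      # then position with the new velocity
        if i >= tail_start:
            tail_peak = max(tail_peak, abs(x))
    return tail_peak


open_loop = simulate_mode(kp=0.0, kd=0.0)
closed_loop = simulate_mode(kp=20.0, kd=4.0)
print(f"residual amplitude without control: {open_loop:.2e}")
print(f"residual amplitude with PD control: {closed_loop:.2e}")
```

In this toy model the derivative gain `kd` adds directly to the structural damping, which is why the controlled response decays so much faster than the uncontrolled one.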

Paratype: A genotyping framework and an open-source tool for Salmonella Paratyphi A

10.1101/2021.11.13.21266165 ◽

2021 ◽

Author(s):

Arif M. Tanmoy ◽

Yogesh Hooda ◽

Mohammad S. I. Sajib ◽

Kesia E. da Silva ◽

Junaid Iqbal ◽

...

Keyword(s):

Antimicrobial Resistance ◽

Large Scale ◽

Vaccine Development ◽

Global Analysis ◽

Treatment Strategies ◽

Genomic Analysis ◽

Genomic Variation ◽

Global Database ◽

Command Line Tool ◽

Conserved Gene

Background: Salmonella enterica serovar Paratyphi A (Salmonella Paratyphi A) is the primary causative agent of paratyphoid fever, which is responsible for an estimated 3.4 million infections annually. However, little genomic information is available on population structure, antimicrobial resistance (AMR), and spatiotemporal distribution of the pathogen. With rising antimicrobial resistance and no licensed vaccines, genomic surveillance is important to track the evolution of this pathogen and monitor transmission. Results: We performed whole-genome sequencing of 817 Salmonella Paratyphi A isolates collected from Bangladesh, Nepal, and Pakistan and added 562 publicly available genomes to build a global database representing 37 countries, covering 1917-2019. To track the evolution of Salmonella Paratyphi A, we used the existing lineage scheme, developed earlier based on a small dataset, but certain sub-lineages were not homologous, and many isolates could not be assigned a lineage. Therefore, we developed a single-nucleotide-polymorphism-based genotyping scheme, Paratype, a tool that segregates Salmonella Paratyphi A into three primary and nine secondary clades, and 18 genotypes. Each genotype has been assigned a unique allele definition located on a conserved gene. Using Paratype, we identified genomic variation between different sampling locations and specific AMR markers, and mutations in the O2-polysaccharide synthesis locus, a candidate for vaccine development. Conclusions: This large-scale global analysis proposes the first genotyping tool for Salmonella Paratyphi A. Paratype has already been released (https://github.com/CHRF-Genomics/Paratype) as an open-access, command-line tool and is being adopted for large-scale genomic analysis. This tool will assist future genomic surveillance and help inform prevention and treatment strategies.

Sentiment analysis of tweets on prior authorization.

Journal of Clinical Oncology ◽

10.1200/jco.2020.39.28_suppl.322 ◽

2021 ◽

Vol 39 (28_suppl) ◽

pp. 322-322

Author(s):

Shivika Prasanna ◽

Naveen Premnath ◽

Suveen Angraal ◽

Ramy Sedhom ◽

Rohan Khera ◽

...

Keyword(s):

Sentiment Analysis ◽

Language Processing ◽

Disease Outbreaks ◽

Ground Truth ◽

Inclusion Criteria ◽

Shared Content ◽

Appropriate Care ◽

Command Line Tool ◽

Human Interpretation ◽

Care Worker

Background: Natural language processing (NLP) algorithms can be leveraged to better understand prevailing themes in healthcare conversations. Sentiment analysis, an NLP technique to analyze and interpret sentiments from text, has been validated on Twitter in tracking natural disasters and disease outbreaks. To establish its role in healthcare discourse, we sought to explore the feasibility and accuracy of sentiment analysis on Twitter posts ("tweets") related to prior authorizations (PAs), a common occurrence in oncology built to curb payer concerns about costs of cancer care, but which can obstruct timely and appropriate care and increase administrative burden and clinician frustration. Methods: We identified tweets related to PAs between 03/09/2021-04/29/2021 using pre-specified keywords [e.g., #priorauth etc.] and used Twarc, a command-line tool and Python library for archiving Twitter JavaScript Object Notation data. We performed sentiment analysis using two NLP models: (1) TextBlob (trained on movie reviews); and (2) VADER (trained on social media). These models provide results as polarity, a score between 0-1, and a sentiment as "positive" (>0), "neutral" (exactly 0), or "negative" (<0). We (AG, NP) manually reviewed all tweets to give the ground truth (human interpretation of reality) including a notation for sarcasm since models are not trained to detect sarcasm. We calculated the precision (positive predictive value), recall (sensitivity), and the F1-Score (measure of accuracy, range 0-1, 0=failure, 1=perfect) for the models vs. the ground truth. Results: After preprocessing, 964 tweets (mean 137/week) met our inclusion criteria for sentiment analysis. The two existing NLP models labeled 42.4%-43.3% of tweets as positive, as compared to the ground truth (5.6% of tweets positive). F1 scores of models across labels ranged from 0.18-0.54. We noted sarcasm in 2.8% of tweets. Detailed results in Table.
Conclusions: We demonstrate the feasibility of performing sentiment analysis on a topic of high interest within clinical oncology and the deficiency of existing NLP models to capture sentiment within oncologic Twitter discourse. Ongoing iterations of this work further train these models through better identification of the tweeter (patient vs. health care worker) and other analytics from shared content. [Table: see text]
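
The evaluation metrics named in the abstract (precision, recall, F1 per sentiment label) can be sketched as follows. The labels below are made up purely for illustration and are not data from the study; the function name `precision_recall_f1` is likewise hypothetical.

```python
def precision_recall_f1(truth, predicted, label):
    """Per-label precision, recall and F1 from two parallel lists of labels."""
    pairs = list(zip(truth, predicted))
    tp = sum(t == label and p == label for t, p in pairs)   # true positives
    fp = sum(t != label and p == label for t, p in pairs)   # false positives
    fn = sum(t == label and p != label for t, p in pairs)   # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)      # harmonic mean
    return precision, recall, f1


# Invented labels for illustration only; not data from the study.
truth = ["negative", "negative", "neutral", "positive", "negative", "neutral"]
predicted = ["positive", "negative", "neutral", "positive", "positive", "negative"]

for label in ("positive", "neutral", "negative"):
    p, r, f = precision_recall_f1(truth, predicted, label)
    print(f"{label:>8}: precision={p:.2f} recall={r:.2f} F1={f:.2f}")
```

A model that over-predicts "positive", as the abstract reports for both tools, shows up here as low precision for that label even when recall is high.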

GRAFIMO: Variant and haplotype aware motif scanning on pangenome graphs

PLoS Computational Biology ◽

10.1371/journal.pcbi.1009444 ◽

2021 ◽

Vol 17 (9) ◽

pp. e1009444

Author(s):

Manuel Tognon ◽

Vincenzo Bonnici ◽

Erik Garrison ◽

Rosalba Giugno ◽

Luca Pinello

Keyword(s):

Dna Sequences ◽

Binding Sites ◽

Specific Binding ◽

Genomic Variation ◽

Link Type ◽

Scanning Procedure ◽

Expression Of Genes ◽

Potential Binding ◽

Command Line Tool ◽

Reference Genomes

Transcription factors (TFs) are proteins that promote or reduce the expression of genes by binding short genomic DNA sequences known as transcription factor binding sites (TFBS). While several tools have been developed to scan for potential occurrences of TFBS in linear DNA sequences or reference genomes, no tool exists to find them in pangenome variation graphs (VGs). VGs are sequence-labelled graphs that can efficiently encode collections of genomes and their variants in a single, compact data structure. Because VGs can losslessly compress large pangenomes, TFBS scanning in VGs can efficiently capture how genomic variation affects the potential binding landscape of TFs in a population of individuals. Here we present GRAFIMO (GRAph-based Finding of Individual Motif Occurrences), a command-line tool for the scanning of known TF DNA motifs represented as Position Weight Matrices (PWMs) in VGs. GRAFIMO extends the standard PWM scanning procedure by considering variations and alternative haplotypes encoded in a VG. Using GRAFIMO on a VG based on individuals from the 1000 Genomes project we recover several potential binding sites that are enhanced, weakened or missed when scanning only the reference genome, and which could constitute individual-specific binding events. GRAFIMO is available as an open-source tool, under the MIT license, at https://github.com/pinellolab/GRAFIMO and https://github.com/InfOmics/GRAFIMO.
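
For readers unfamiliar with the standard PWM scanning procedure the abstract refers to, here is a minimal sketch on a plain linear sequence. It does not reproduce GRAFIMO's graph traversal or its scoring details; the motif counts, pseudocount, uniform background, and function names are all assumptions chosen for illustration.

```python
import math


def pwm_from_counts(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2-odds scores."""
    pwm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        pwm.append({
            base: math.log2(((column.get(base, 0) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def scan(sequence, pwm):
    """Score every window of len(pwm) in `sequence`; return (start, score) pairs."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[i][base] for i, base in enumerate(window))
        hits.append((start, score))
    return hits


# Hypothetical 4-bp motif counts (consensus TGCA), invented for illustration.
counts = [
    {"T": 9, "A": 1},
    {"G": 8, "C": 2},
    {"C": 10},
    {"A": 9, "G": 1},
]
pwm = pwm_from_counts(counts)
reference = "AATGCATT"
best_start, best_score = max(scan(reference, pwm), key=lambda hit: hit[1])
print(best_start, reference[best_start:best_start + len(pwm)], round(best_score, 2))
```

On a variation graph, the same window score would have to be computed along every path through the graph rather than along one string, which is how a variant can strengthen, weaken, or create a hit that the reference alone misses.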

clipplotr - a comparative visualisation and analysis tool for CLIP data

10.1101/2021.09.10.459763 ◽

2021 ◽

Author(s):

Anob M Chakrabarti ◽

Charlotte Capitanchik ◽

Jernej Ule ◽

Nicholas Luscombe

Keyword(s):

Protein Interactions ◽

High Performance ◽

Genomic Data ◽

Analysis Tool ◽

Functional Genomic ◽

Functional Genomic Data ◽

Laptop Computer ◽

Data Repositories ◽

Command Line Tool ◽

Biological Insight

CLIP technologies are now widely used to study RNA-protein interactions and many datasets are now publicly available. An important first step in CLIP data exploration is the visual inspection and assessment of processed genomic data on selected genes or regions and performing comparisons: either across conditions within a particular project, or incorporating publicly available data. However, the output files produced by data processing pipelines or preprocessed files available to download from data repositories are often not suitable for direct comparison and usually need further processing. Furthermore, to derive biological insight it is usually necessary to visualise CLIP signal alongside other data such as annotations, or orthogonal functional genomic data (e.g. RNA-seq). We have developed a simple, but powerful, command-line tool: clipplotr, which facilitates these visual comparative and integrative analyses with normalisation and smoothing options for CLIP data and the ability to show these alongside reference annotation tracks and functional genomic data. These data can be supplied to clipplotr in a range of file formats, and the tool outputs a publication-quality figure. It is written in R and can either run independently on a laptop computer or be integrated into computational workflows on a high-performance cluster. Releases, source code and documentation are freely available at: https://github.com/ulelab/clipplotr.

Systematic and quantitative view of the antiviral arsenal of prokaryotes

10.1101/2021.09.02.458658 ◽

2021 ◽

Author(s):

Florian Tesson ◽

Alexandre Herve ◽

Marie Touchon ◽

Camille d'Humieres ◽

Jean Cury ◽

...

Keyword(s):

Open Source Software ◽

Large Scale ◽

Bacterial Species ◽

Detailed Comparison ◽

Command Line ◽

Comprehensive Description ◽

The Past ◽

Prokaryotic Genomes ◽

Command Line Tool ◽

Abundance And Diversity

Facing the abundance and diversity of phages, bacteria have developed multiple anti-phage mechanisms. In the past three years, the number of known anti-phage mechanisms has been expanded by at least 5-fold, rendering our view of prokaryotic immunity obsolete. Most anti-phage systems have been studied as standalone mechanisms; however, many examples demonstrate that strains encode not one but several anti-viral mechanisms. How these different systems integrate into an anti-viral arsenal at the strain level remains to be elucidated. Much could be learned from establishing a fundamental description of features such as the number and diversity of anti-phage systems encoded in a given genome. To address this question, we developed DefenseFinder, a tool that automatically detects known anti-phage systems in prokaryotic genomes. We applied DefenseFinder to >20 000 fully sequenced genomes, generating a systematic and quantitative view of the anti-viral arsenal of prokaryotes. We show that prokaryotic genomes encode on average five anti-phage systems from three different families of systems. This number varies drastically from one strain to another and is influenced by the genome size and the number of prophages encoded. Distributions of different systems are also very heterogeneous, with some systems being enriched in prophages and in specific clades. Finally, we provide a detailed comparison of the anti-viral arsenal of 15 common bacterial species, revealing drastic differences in anti-viral strategies. Overall, our work provides free and open-source software, available as a command-line tool or on a web server. It allows the rapid detection of anti-phage systems, enables a comprehensive description of the anti-viral arsenal of prokaryotes and paves the way for large-scale genomic studies in the field of anti-phage defense.

Easy-Prime: a machine learning–based prime editor design tool

Genome Biology ◽

10.1186/s13059-021-02458-0 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Yichao Li ◽

Jingjing Chen ◽

Shengdar Q. Tsai ◽

Yong Cheng

Keyword(s):

Machine Learning ◽

Genome Editing ◽

Rna Folding ◽

Design Tool ◽

Data Sources ◽

Published Data ◽

Command Line ◽

Wide Range ◽

Command Line Tool ◽

Folding Structure

Abstract: Prime editing is a revolutionary genome-editing technology that can make a wide range of precise edits in DNA. However, designing highly efficient prime editors (PEs) remains challenging. We develop Easy-Prime, a machine learning–based program trained with multiple published data sources. Easy-Prime captures both known and novel features, such as RNA folding structure, and optimizes feature combinations to improve editing efficiency. We provide optimized PE design for installation of 89.5% of 152,351 GWAS variants. Easy-Prime is available both as a command line tool and an interactive PE design server at: http://easy-prime.cc/.

Pout2Prot: an efficient tool to create protein (sub)groups from Percolator output files

10.1101/2021.08.11.455803 ◽

2021 ◽

Author(s):

Tim Van Den Bossche ◽

Kay Schallert ◽

Pieter Verschaffelt ◽

Bart Mesuere ◽

Dirk Benndorf ◽

...

Keyword(s):

Web Application ◽

Software Library ◽

Homologous Proteins ◽

Efficient Tool ◽

Inference Problem ◽

Closely Related Species ◽

Reusable Software ◽

Protein Inference ◽

Grouping Strategies ◽

Command Line Tool

The protein inference problem is complicated in metaproteomics due to the presence of homologous proteins from closely related species. Nevertheless, this process is vital to assign taxonomy and functions to identified proteins of microbial species, a task for which specialized tools such as Prophane have been developed. We here present Pout2Prot, which takes Percolator Output (.pout) files from multiple experiments and creates protein (sub)group output files (.tsv) that can be used directly with Prophane. Pout2Prot offers different grouping strategies, allows distinction between sample categories and replicates for multiple files, and uses a weighted spectral count for protein (sub)groups to reflect (sub)group abundance. Pout2Prot is available as a web application at https://pout2prot.ugent.be and is installable via pip as a standalone command line tool and reusable software library. All code is open source under the Apache License 2.0 and is available at https://github.com/compomics/pout2prot.
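
To make the protein grouping idea concrete, the sketch below merges proteins supported by exactly the same peptide set and then computes a weighted spectral count per group. Both conventions (identical-set grouping, and splitting a shared peptide's count equally among the groups that contain it) are assumptions for illustration and are not claimed to be Pout2Prot's own strategies; the identifications and function names are hypothetical.

```python
from collections import defaultdict


def group_by_peptide_set(protein_peptides):
    """Merge proteins that are supported by exactly the same set of peptides."""
    groups = defaultdict(list)
    for protein, peptides in protein_peptides.items():
        groups[frozenset(peptides)].append(protein)
    # Key each group by its sorted member names; value is the shared peptide set.
    return {tuple(sorted(members)): peptides for peptides, members in groups.items()}


def weighted_spectral_counts(groups, spectral_counts):
    """Split each peptide's spectral count equally among the groups containing it."""
    sharing = defaultdict(int)
    for peptides in groups.values():
        for peptide in peptides:
            sharing[peptide] += 1
    return {
        members: sum(spectral_counts[p] / sharing[p] for p in peptides)
        for members, peptides in groups.items()
    }


# Hypothetical identifications, invented for illustration.
protein_peptides = {
    "protA": {"PEPTIDEK", "SAMPLER"},
    "protB": {"PEPTIDEK", "SAMPLER"},   # indistinguishable from protA
    "protC": {"SAMPLER", "UNIQUEK"},
}
spectral_counts = {"PEPTIDEK": 4, "SAMPLER": 6, "UNIQUEK": 2}

groups = group_by_peptide_set(protein_peptides)
weighted = weighted_spectral_counts(groups, spectral_counts)
for members, count in weighted.items():
    print(members, count)
```

Weighting shared peptides this way keeps the total spectral count constant across groups, so group abundances stay comparable between samples.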

MiSDEED: a synthetic multi-omics engine for microbiome power analysis and study design

10.1101/2021.08.09.455682 ◽

2021 ◽

Author(s):

Philippe Chlenski ◽

Melody Hsu ◽

Itsik Pe'er

Keyword(s):

Study Design ◽

Relative Abundance ◽

Arbitrary Number ◽

Power Analysis ◽

Command Line ◽

Omics Data ◽

Command Line Tool ◽

Simulation Parameters ◽

Python Package ◽

Analysis And Study

MiSDEED is a command-line tool for generating synthetic longitudinal multi-omics data from simulated microbial environments. It generates relative-abundance timecourses under perturbations for an arbitrary number of samples and patients. All simulation parameters are exposed to the user to facilitate rapid power analysis and aid in study design. Users who want additional flexibility may also use MiSDEED as a Python package. Availability and implementation: MiSDEED is written in Python and is freely available at https://github.com/pchlenski/misdeed.

command line tool
Recently Published Documents

Automatic lexical collocate extraction for corpus-based ontology building and refinement

Research on Active Control of Rotating Motion and Vibration of Flexible Manipulator

Paratype: A genotyping framework and an open-source tool for Salmonella Paratyphi A

Sentiment analysis of tweets on prior authorization.

GRAFIMO: Variant and haplotype aware motif scanning on pangenome graphs

clipplotr - a comparative visualisation and analysis tool for CLIP data

Systematic and quantitative view of the antiviral arsenal of prokaryotes

Easy-Prime: a machine learning–based prime editor design tool

Pout2Prot: an efficient tool to create protein (sub)groups from Percolator output files

MiSDEED: a synthetic multi-omics engine for microbiome power analysis and study design
