command line tool
Recently Published Documents


TOTAL DOCUMENTS

154
(FIVE YEARS 109)

H-INDEX

11
(FIVE YEARS 6)

Author(s):  
Nicolás José Fernández-Martínez ◽  
Ángel Miguel Felices-Lago

Abstract: Traditional corpus-based methods rely on manual inspection and extraction of lexical collocates in the study of selection preferences, which is a costly, labor-intensive, and time-consuming task. Automatic methods for lexical collocate extraction are therefore needed to handle this task and the sheer size of the corpora available. With a view to leveraging the Sketch Engine platform and its in-built corpora, we propose a working prototype of a Lexical Collocate Extractor (LeCoExt) command-line tool that mines lexical collocates from all types of verbs according to their syntactic constituents and Collocate Frequency Score (CFS). This might be the first tool that performs comprehensive corpus-based studies of the selection preferences of individual verbs or groups of verbs by exploiting the capabilities offered by Sketch Engine. The tool might facilitate the extraction of rich lexico-semantic knowledge from diverse corpora in a few seconds and with a single click. We test its performance for ontology building and refinement, building on a previous detailed analysis of stealing verbs carried out by Fernández-Martínez & Faber (2020). We show how the proposed tool is used to extract conceptual-cognitive knowledge from the THEFT scenario and implement it in the FunGramKB Core Ontology through the creation and modification of theft-related conceptual units.


2021 ◽  
Vol 2021 ◽  
pp. 1-8
Author(s):  
Qingpeng Han ◽  
Wenwen Dong ◽  
Bin Wu ◽  
Xinhang Shen ◽  
Meilin Zhang ◽  
...  

In this study, PZT (piezoelectric) actuators and a PD control (PDs’ command line tool) method are selected to control the vibration of a flexible manipulator. The dynamic equations of the flexible manipulator system are established based on the Lagrange principle. The control strategy for the PZT actuators and the joint control torque are designed. A Lyapunov approach is used to investigate a combined scheme of PD feedback and command voltages applied to the segmented PZT actuators. For comparison, PD feedback control alone is also considered. The numerical simulations show that the designed PZT actuator control strategy combined with PD control is effective in suppressing the vibration of the flexible manipulator.
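As a rough illustration of the PD feedback idea in the abstract above, the following Python sketch applies the control law u = -Kp*x - Kd*v to a single lightly damped vibration mode. The one-mode oscillator, all parameter values, and the function name simulate_peak are assumptions made for illustration only; the study itself uses a Lagrange-based flexible manipulator model with PZT actuators, which this sketch does not reproduce.

```python
def simulate_peak(kp, kd, steps=5000, dt=0.001):
    """Simulate m*x'' + c*x' + k*x = u with PD feedback u = -kp*x - kd*x'.

    Returns the largest |x| seen during the last 1000 steps, used here as a
    crude measure of residual vibration. All constants are hypothetical.
    """
    m, c, k = 1.0, 0.05, 100.0   # assumed mass, damping, stiffness
    x, v = 1.0, 0.0              # initial displacement and velocity
    peak_late = 0.0
    for i in range(steps):
        u = -kp * x - kd * v             # PD feedback law
        a = (u - c * v - k * x) / m      # acceleration from the equation of motion
        v += a * dt                      # semi-implicit Euler: update velocity first
        x += v * dt                      # then position, using the new velocity
        if i >= steps - 1000:
            peak_late = max(peak_late, abs(x))
    return peak_late


uncontrolled = simulate_peak(kp=0.0, kd=0.0)
controlled = simulate_peak(kp=20.0, kd=5.0)
print(f"residual amplitude without control: {uncontrolled:.4f}")
print(f"residual amplitude with PD control: {controlled:.6f}")
```

In this toy setting the derivative gain adds damping, so the residual amplitude with feedback is far smaller than without it.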


2021 ◽  
Author(s):  
Arif M. Tanmoy ◽  
Yogesh Hooda ◽  
Mohammad S. I. Sajib ◽  
Kesia E. da Silva ◽  
Junaid Iqbal ◽  
...  

Background: Salmonella enterica serovar Paratyphi A (Salmonella Paratyphi A) is the primary causative agent of paratyphoid fever, which is responsible for an estimated 3.4 million infections annually. However, little genomic information is available on the population structure, antimicrobial resistance (AMR), and spatiotemporal distribution of the pathogen. With rising antimicrobial resistance and no licensed vaccines, genomic surveillance is important to track the evolution of this pathogen and monitor transmission. Results: We performed whole-genome sequencing of 817 Salmonella Paratyphi A isolates collected from Bangladesh, Nepal, and Pakistan and added 562 publicly available genomes to build a global database representing 37 countries and covering 1917-2019. To track the evolution of Salmonella Paratyphi A, we used the existing lineage scheme, developed earlier based on a small dataset, but certain sub-lineages were not homologous, and many isolates could not be assigned a lineage. Therefore, we developed a single-nucleotide-polymorphism-based genotyping scheme, Paratype, a tool that segregates Salmonella Paratyphi A into three primary clades, nine secondary clades, and 18 genotypes. Each genotype has been assigned a unique allele definition located on a conserved gene. Using Paratype, we identified genomic variation between different sampling locations, specific AMR markers, and mutations in the O2-polysaccharide synthesis locus, a candidate for vaccine development. Conclusions: This large-scale global analysis proposes the first genotyping tool for Salmonella Paratyphi A. Paratype has already been released (https://github.com/CHRF-Genomics/Paratype) as an open-access, command-line tool and is being adopted for large-scale genomic analysis. This tool will assist future genomic surveillance and help inform prevention and treatment strategies.


2021 ◽  
Vol 39 (28_suppl) ◽  
pp. 322-322
Author(s):  
Shivika Prasanna ◽  
Naveen Premnath ◽  
Suveen Angraal ◽  
Ramy Sedhom ◽  
Rohan Khera ◽  
...  

322 Background: Natural language processing (NLP) algorithms can be leveraged to better understand prevailing themes in healthcare conversations. Sentiment analysis, an NLP technique to analyze and interpret sentiments from text, has been validated on Twitter in tracking natural disasters and disease outbreaks. To establish its role in healthcare discourse, we sought to explore the feasibility and accuracy of sentiment analysis on Twitter posts ("tweets") related to prior authorizations (PAs), a common occurrence in oncology built to curb payer concerns about the costs of cancer care, but which can obstruct timely and appropriate care and increase administrative burden and clinician frustration. Methods: We identified tweets related to PAs between 03/09/2021 and 04/29/2021 using pre-specified keywords [e.g., #priorauth etc.] and used Twarc, a command-line tool and Python library for archiving Twitter JavaScript Object Notation data. We performed sentiment analysis using two NLP models: (1) TextBlob (trained on movie reviews); and (2) VADER (trained on social media). These models provide results as polarity, a score between -1 and 1, and a sentiment as "positive" (>0), "neutral" (exactly 0), or "negative" (<0). We (AG, NP) manually reviewed all tweets to give the ground truth (human interpretation of reality), including a notation for sarcasm, since the models are not trained to detect sarcasm. We calculated the precision (positive predictive value), recall (sensitivity), and the F1 score (measure of accuracy, range 0-1, 0=failure, 1=perfect) for the models vs. the ground truth. Results: After preprocessing, 964 tweets (mean 137/week) met our inclusion criteria for sentiment analysis. The two existing NLP models labeled 42.4%-43.3% of tweets as positive, compared with 5.6% of tweets positive in the ground truth. F1 scores of the models across labels ranged from 0.18 to 0.54. We noted sarcasm in 2.8% of tweets. Detailed results are given in the Table.
Conclusions: We demonstrate the feasibility of performing sentiment analysis on a topic of high interest within clinical oncology and the deficiency of existing NLP models in capturing sentiment within oncologic Twitter discourse. Ongoing iterations of this work further train these models through better identification of the tweeter (patient vs. health care worker) and other analytics from shared content. [Table: see text]
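The evaluation described in this abstract (thresholding a polarity score into a sentiment label, then comparing against manual labels) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names and the five toy polarity scores and ground-truth labels are invented for the example, and no TextBlob or VADER call is made.

```python
def label_from_polarity(score):
    """Map a polarity score to a label using the thresholds stated in the abstract."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def precision_recall_f1(truth, predicted, label):
    """One-vs-rest precision, recall, and F1 for a single label."""
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical model scores and manual labels for five tweets.
polarities = [0.5, 0.0, -0.3, 0.8, 0.2]
ground_truth = ["positive", "neutral", "negative", "negative", "negative"]

predicted = [label_from_polarity(s) for s in polarities]
for label in ("positive", "neutral", "negative"):
    p, r, f1 = precision_recall_f1(ground_truth, predicted, label)
    print(f"{label}: precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```

In the toy data the model over-predicts the positive label, which lowers precision for that label even though recall is perfect. That is the same kind of mismatch the abstract reports between model output and ground truth.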


2021 ◽  
Vol 17 (9) ◽  
pp. e1009444
Author(s):  
Manuel Tognon ◽  
Vincenzo Bonnici ◽  
Erik Garrison ◽  
Rosalba Giugno ◽  
Luca Pinello

Transcription factors (TFs) are proteins that promote or reduce the expression of genes by binding short genomic DNA sequences known as transcription factor binding sites (TFBS). While several tools have been developed to scan for potential occurrences of TFBS in linear DNA sequences or reference genomes, no tool exists to find them in pangenome variation graphs (VGs). VGs are sequence-labelled graphs that can efficiently encode collections of genomes and their variants in a single, compact data structure. Because VGs can losslessly compress large pangenomes, TFBS scanning in VGs can efficiently capture how genomic variation affects the potential binding landscape of TFs in a population of individuals. Here we present GRAFIMO (GRAph-based Finding of Individual Motif Occurrences), a command-line tool for the scanning of known TF DNA motifs represented as Position Weight Matrices (PWMs) in VGs. GRAFIMO extends the standard PWM scanning procedure by considering variations and alternative haplotypes encoded in a VG. Using GRAFIMO on a VG based on individuals from the 1000 Genomes project we recover several potential binding sites that are enhanced, weakened or missed when scanning only the reference genome, and which could constitute individual-specific binding events. GRAFIMO is available as an open-source tool, under the MIT license, at https://github.com/pinellolab/GRAFIMO and https://github.com/InfOmics/GRAFIMO.
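For readers unfamiliar with the standard PWM scanning procedure that GRAFIMO extends, here is a minimal Python sketch of the linear-sequence case only. It is not GRAFIMO's implementation and does not handle variation graphs; the function names, the pseudocount, the uniform background, and the toy three-column motif are all assumptions for illustration. The sequence is assumed to contain only uppercase A, C, G, and T.

```python
import math


def log_odds_pwm(counts, pseudocount=1.0, background=0.25):
    """Turn per-position base counts into log2-odds scores against a uniform background."""
    pwm = []
    for column in counts:
        total = sum(column.get(b, 0) for b in "ACGT") + 4 * pseudocount
        pwm.append({
            b: math.log2(((column.get(b, 0) + pseudocount) / total) / background)
            for b in "ACGT"
        })
    return pwm


def scan(sequence, pwm):
    """Score every window of motif width along a linear sequence."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[i][base] for i, base in enumerate(window))
        hits.append((start, window, score))
    return hits


# Hypothetical motif counts: a strong preference for T, A, T at three positions.
toy_counts = [{"T": 8}, {"A": 8}, {"T": 8}]
pwm = log_odds_pwm(toy_counts)
hits = scan("GGTATCC", pwm)
best = max(hits, key=lambda h: h[2])
print(best)
```

A graph-based scanner would score the sequences spelled by paths through the graph instead of sliding a window over one linear string, which is how alternative haplotypes can gain or lose a binding site relative to the reference.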


2021 ◽  
Author(s):  
Anob M Chakrabarti ◽  
Charlotte Capitanchik ◽  
Jernej Ule ◽  
Nicholas Luscombe

CLIP technologies are now widely used to study RNA-protein interactions, and many datasets are publicly available. An important first step in CLIP data exploration is the visual inspection and assessment of processed genomic data on selected genes or regions, together with comparisons either across conditions within a particular project or with publicly available data. However, the output files produced by data processing pipelines, and the preprocessed files available to download from data repositories, are often not suitable for direct comparison and usually need further processing. Furthermore, to derive biological insight it is usually necessary to visualise CLIP signal alongside other data such as annotations or orthogonal functional genomic data (e.g. RNA-seq). We have developed a simple but powerful command-line tool, clipplotr, which facilitates these visual comparative and integrative analyses with normalisation and smoothing options for CLIP data and the ability to show these alongside reference annotation tracks and functional genomic data. These data can be supplied to clipplotr in a range of file formats, and the tool outputs a publication-quality figure. It is written in R and can either run independently on a laptop computer or be integrated into computational workflows on a high-performance cluster. Releases, source code and documentation are freely available at: https://github.com/ulelab/clipplotr.


2021 ◽  
Author(s):  
Florian Tesson ◽  
Alexandre Herve ◽  
Marie Touchon ◽  
Camille d'Humieres ◽  
Jean Cury ◽  
...  

Facing the abundance and diversity of phages, bacteria have developed multiple anti-phage mechanisms. In the past three years, the number of known anti-phage mechanisms has expanded at least 5-fold, rendering our view of prokaryotic immunity obsolete. Most anti-phage systems have been studied as standalone mechanisms; however, many examples demonstrate that strains encode not one but several anti-viral mechanisms. How these different systems integrate into an anti-viral arsenal at the strain level remains to be elucidated. Much could be learned from establishing a fundamental description of features such as the number and diversity of anti-phage systems encoded in a given genome. To address this question, we developed DefenseFinder, a tool that automatically detects known anti-phage systems in prokaryotic genomes. We applied DefenseFinder to >20 000 fully sequenced genomes, generating a systematic and quantitative view of the anti-viral arsenal of prokaryotes. We show that prokaryotic genomes encode on average five anti-phage systems from three different families of systems. This number varies drastically from one strain to another and is influenced by the genome size and the number of prophages encoded. Distributions of different systems are also very heterogeneous, with some systems being enriched in prophages and in specific clades. Finally, we provide a detailed comparison of the anti-viral arsenal of 15 common bacterial species, revealing drastic differences in anti-viral strategies. Overall, our work provides free and open-source software, available as a command line tool or on a web server. It allows the rapid detection of anti-phage systems, enables a comprehensive description of the anti-viral arsenal of prokaryotes, and paves the way for large-scale genomics studies in the field of anti-phage defense.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Yichao Li ◽  
Jingjing Chen ◽  
Shengdar Q. Tsai ◽  
Yong Cheng

Abstract: Prime editing is a revolutionary genome-editing technology that can make a wide range of precise edits in DNA. However, designing highly efficient prime editors (PEs) remains challenging. We develop Easy-Prime, a machine learning–based program trained with multiple published data sources. Easy-Prime captures both known and novel features, such as RNA folding structure, and optimizes feature combinations to improve editing efficiency. We provide optimized PE design for installation of 89.5% of 152,351 GWAS variants. Easy-Prime is available both as a command line tool and an interactive PE design server at: http://easy-prime.cc/.


2021 ◽  
Author(s):  
Tim Van Den Bossche ◽  
Kay Schallert ◽  
Pieter Verschaffelt ◽  
Bart Mesuere ◽  
Dirk Benndorf ◽  
...  

The protein inference problem is complicated in metaproteomics due to the presence of homologous proteins from closely related species. Nevertheless, this process is vital to assign taxonomy and functions to identified proteins of microbial species, a task for which specialized tools such as Prophane have been developed. We here present Pout2Prot, which takes Percolator Output (.pout) files from multiple experiments and creates protein (sub)group output files (.tsv) that can be used directly with Prophane. Pout2Prot offers different grouping strategies, allows distinction between sample categories and replicates for multiple files, and uses a weighted spectral count for protein (sub)groups to reflect (sub)group abundance. Pout2Prot is available as a web application at https://pout2prot.ugent.be and is installable via pip as a standalone command line tool and reusable software library. All code is open source under the Apache License 2.0 and is available at https://github.com/compomics/pout2prot.


2021 ◽  
Author(s):  
Philippe Chlenski ◽  
Melody Hsu ◽  
Itsik Pe'er

MiSDEED is a command-line tool for generating synthetic longitudinal multi-omics data from simulated microbial environments. It generates relative-abundance timecourses under perturbations for an arbitrary number of samples and patients. All simulation parameters are exposed to the user to facilitate rapid power analysis and aid in study design. Users who want additional flexibility may also use MiSDEED as a Python package. Availability and implementation: MiSDEED is written in Python and is freely available at https://github.com/pchlenski/misdeed.

