CRISPRCasTyper: An automated tool for the identification, annotation and classification of CRISPR-Cas loci

Author(s):  
Jakob Russel ◽  
Rafael Pinilla-Redondo ◽  
David Mayo-Muñoz ◽  
Shiraz A. Shah ◽  
Søren J. Sørensen

Abstract: CRISPR-Cas loci encode highly diversified prokaryotic adaptive defense systems that have recently become popular for their applications in gene editing and beyond. The increasing demand for bioinformatic tools that systematically detect and classify CRISPR-Cas systems has been largely challenged by their complex, dynamic nature and rapidly expanding classification. Here, we developed CRISPRCasTyper, a new automated software tool with improved capabilities for identifying and typing CRISPR arrays and cas loci across prokaryotic sequences, based on the latest classification and nomenclature (39 subtypes/variants) (Makarova et al. 2020; Pinilla-Redondo et al. 2019). As a novel feature, CRISPRCasTyper uses a machine learning approach to subtype CRISPR arrays based on the sequences of the direct repeats. This allows the typing of orphan and distant arrays which, for example, are commonly observed in fragmented metagenomic assemblies. Furthermore, the tool provides a graphical output in which CRISPRs and cas operon arrangements are visualized as colored gene maps, thus aiding annotation of partial and novel systems through synteny. Moreover, CRISPRCasTyper can resolve hybrid CRISPR-Cas systems and detect loci spanning the ends of sequences with a circular topology, such as complete genomes and plasmids. CRISPRCasTyper was benchmarked against a manually curated set of 31 subtypes/variants with a median accuracy of 98.6%. Altogether, we present an up-to-date and freely available software pipeline for significantly improved automated predictions of CRISPR-Cas loci across genomic sequences.

Implementation: CRISPRCasTyper is available through conda and PyPI under the MIT license (https://github.com/Russel88/CRISPRCasTyper), and is also available as a web server (http://cctyper.crispr.dk).
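Subtyping arrays from their direct repeats requires turning each repeat sequence into numeric features before any model can be trained. The abstract does not say how this is done, so the sketch below assumes plain k-mer counting purely for illustration; the function name, the value of k, and the example repeat are all hypothetical and not taken from the tool.

```python
from collections import Counter
from itertools import product
from typing import List


def kmer_counts(repeat: str, k: int = 4) -> List[int]:
    """Count every DNA k-mer in a repeat, reported in a fixed alphabetical order."""
    repeat = repeat.upper()
    # Slide a window of width k across the repeat and tally each k-mer seen.
    counts = Counter(repeat[i:i + k] for i in range(len(repeat) - k + 1))
    # Enumerate all 4**k possible k-mers so every repeat yields a vector of equal length.
    vocabulary = ("".join(p) for p in product("ACGT", repeat=k))
    return [counts[kmer] for kmer in vocabulary]


# Hypothetical 22-nt repeat used only to exercise the function.
features = kmer_counts("GTTTTAGAGCTATGCTGTTTTG", k=2)
```

A fixed-length vector of this kind could be passed to any standard classifier; which features and model the tool actually uses should be checked against its documentation.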

2020 ◽  
Author(s):  
David L Gibbs

Abstract: As part of the ‘immune landscape of cancer’, six immune subtypes were defined that categorize tumor-immune states. A number of phenotypic variables were found to associate with immune subtypes, such as nonsilent mutation rates, regulation of immunomodulator genes, and cytokine network structures. An ensemble classifier based on XGBoost is introduced with the goal of classifying tumor samples into one of six immune subtypes. Robust performance was accomplished through feature engineering; quartile levels, binary gene-pair features, and gene-set-pair features were computed for each sample independently. The classifier is robust to software pipeline and normalization scheme, making it applicable to any expression data format from raw count data to TPMs, since the classification is essentially based on simple binary gene-gene level comparisons within a given sample. The classifier is available as an R package or as part of the CRI iAtlas portal.

Code / Tool availability: Source code: https://github.com/Gibbsdavidl/ImmuneSubtypeClassifier Web app tool: https://www.cri-iatlas.org/
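The robustness claim rests on features that depend only on the ordering of genes within one sample. A minimal sketch of two such features follows, assuming a sample is a mapping from gene name to expression value; the function names, gene list, and values are illustrative and are not taken from the R package.

```python
from itertools import combinations
from typing import Dict, List


def gene_pair_features(expression: Dict[str, float], genes: List[str]) -> Dict[str, int]:
    """Return 1 when the first gene of a pair exceeds the second in this sample, else 0."""
    return {
        f"{a}>{b}": int(expression[a] > expression[b])
        for a, b in combinations(genes, 2)
    }


def quartile_levels(expression: Dict[str, float]) -> Dict[str, int]:
    """Map each gene to a level 0..3 according to its rank quartile within the sample."""
    ranked = sorted(expression, key=expression.get)
    n = len(ranked)
    return {gene: min(3, (4 * rank) // n) for rank, gene in enumerate(ranked)}


# Hypothetical sample; the second copy simulates a different normalization scale.
genes = ["CD8A", "GZMB", "FOXP3", "ACTB"]
raw = {"CD8A": 120.0, "GZMB": 40.0, "FOXP3": 5.0, "ACTB": 900.0}
scaled = {gene: value * 0.5 for gene, value in raw.items()}
```

Because rescaling preserves the within-sample ordering, `gene_pair_features(raw, genes)` and `gene_pair_features(scaled, genes)` are identical, which is the property the abstract relies on.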


Author(s):  
Padmavathi S. ◽  
M. Chidambaram

Text classification has become increasingly important for managing and organizing text data, owing to the tremendous growth of online information. It assigns documents to a fixed number of predefined categories. There are two main approaches to text classification: rule-based and machine learning. In the rule-based approach, documents are classified according to manually defined rules. In the machine learning approach, classification rules or classifiers are derived automatically from example documents, which offers higher recall and faster processing. This paper presents an investigation of text classification using different machine learning techniques.
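The contrast between the two approaches can be sketched in a few lines. The example below is a toy illustration under stated assumptions: the keyword rules, the categories, and the word-count scoring are all hypothetical and stand in for whatever rules or learning algorithm a real system would use.

```python
from collections import Counter, defaultdict

# Rule-based: a person writes the keyword lists by hand.
RULES = {"sports": {"match", "goal"}, "finance": {"stock", "bank"}}


def rule_based(text):
    """Return the first category whose hand-written keywords appear in the text."""
    words = set(text.lower().split())
    for category, keywords in RULES.items():
        if words & keywords:
            return category
    return "unknown"


def train(examples):
    """Learn per-category word counts from labeled example documents."""
    model = defaultdict(Counter)
    for text, category in examples:
        model[category].update(text.lower().split())
    return model


def learned(model, text):
    """Pick the category whose training documents share the most words with the text."""
    words = text.lower().split()
    return max(model, key=lambda category: sum(model[category][w] for w in words))


examples = [
    ("the team scored a late goal", "sports"),
    ("the striker won the match", "sports"),
    ("the bank raised interest rates", "finance"),
    ("stock prices fell sharply", "finance"),
]
model = train(examples)
```

For the text "interest rates fell", the hand-written rules find no keyword and return "unknown", while the learned model returns "finance" because those words occurred in its finance examples.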


2021 ◽  
Vol 9 (5) ◽  
pp. 1034
Author(s):  
Carlos Sabater ◽  
Lorena Ruiz ◽  
Abelardo Margolles

This study aimed to recover metagenome-assembled genomes (MAGs) from human fecal samples to characterize the glycosidase profiles of Bifidobacterium species exposed to different prebiotic oligosaccharides (galacto-oligosaccharides, fructo-oligosaccharides and human milk oligosaccharides, HMOs) as well as high-fiber diets. A total of 1806 MAGs were recovered from 487 infant and adult metagenomes. Unsupervised and supervised classification of the glycosidases encoded in MAGs using machine-learning algorithms allowed characteristic hydrolytic profiles to be established for B. adolescentis, B. bifidum, B. breve, B. longum and B. pseudocatenulatum, yielding classification rates above 90%. Glycosidase families GH5_44, GH32, and GH110 were characteristic of B. bifidum. The presence or absence of GH1, GH2, GH5 and GH20 was characteristic of B. adolescentis, B. breve and B. pseudocatenulatum, while families GH1 and GH30 were relevant in MAGs from B. longum. These characteristic profiles allowed bifidobacteria to be discriminated regardless of prebiotic exposure. Correlation analysis of glycosidase activities suggests strong associations between glycosidase families comprising HMO-degrading enzymes, which are often found in MAGs from the same species. The mathematical models proposed here may contribute to a better understanding of the carbohydrate metabolism of some common bifidobacteria species and could be extrapolated to other microorganisms of interest in future studies.


Author(s):  
Alexis Falcin ◽  
Jean-Philippe Métaxian ◽  
Jérôme Mars ◽  
Éléonore Stutzmann ◽  
Jean-Christophe Komorowski ◽  
...  

Mekatronika ◽  
2020 ◽  
Vol 2 (2) ◽  
pp. 1-12
Author(s):  
Muhammad Nur Aiman Shapiee ◽  
Muhammad Ar Rahim Ibrahim ◽  
Muhammad Amirul Abdullah ◽  
Rabiu Muazu Musa ◽  
Noor Azuan Abu Osman ◽  
...  

Skateboarding has reached new heights, particularly with its first appearance at the now-delayed Tokyo Summer Olympic Games. Given the scale of such competitive events, advanced and innovative assessment approaches have increasingly gained the attention of relevant stakeholders, particularly those interested in more objective evaluation. This study aims to classify skateboarding tricks, specifically Frontside 180, Kickflip, Ollie, Nollie Front Shove-it, and Pop Shove-it, by integrating image processing, Transfer Learning (TL) for feature extraction, and a traditional Machine Learning (ML) classifier. A male skateboarder performed five repetitions of each type of trick, and a YI Action camera captured the movement from a distance of 1.26 m. Features were then engineered and extracted from the image dataset by means of three TL models and subsequently classified with a k-Nearest Neighbor (k-NN) classifier. Initial experiments showed that MobileNet, NASNetMobile, and NASNetLarge coupled with optimized k-NN classifiers attained a classification accuracy (CA) of 95%, 92% and 90%, respectively, on the test dataset. In addition, the robustness evaluation showed that the MobileNet+k-NN pipeline is the most robust, as it provided a better average CA than the other pipelines. The study demonstrates that the proposed approach can classify skateboarding tricks adequately and could, in the long run, support judges in making more objective decisions.
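The final stage of such a pipeline, a k-NN vote over extracted feature vectors, can be sketched without any deep learning library. The two-dimensional vectors below are hypothetical stand-ins for the embeddings a pretrained network would produce; the data and the choice of k are assumptions for illustration only.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Label a query vector by majority vote among its k nearest training vectors."""
    # Sort training items by Euclidean distance to the query and keep the k closest.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical feature vectors; real ones would come from the TL feature extractor.
train = [
    ((0.0, 0.1), "Ollie"),
    ((0.3, 0.0), "Ollie"),
    ((0.1, 0.4), "Ollie"),
    ((5.0, 5.2), "Kickflip"),
    ((4.7, 5.0), "Kickflip"),
    ((5.3, 4.8), "Kickflip"),
]
prediction = knn_predict(train, (0.2, 0.1))
```

In practice k and the distance metric would be tuned on validation data, which is presumably what "optimized" refers to in the abstract.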


Author(s):  
Xue Zhang ◽  
Lida Zhang ◽  
XiaoYan Yu ◽  
Jing Zhang ◽  
Yanjie Jiao ◽  
...  

A novel actinobacterium, designated strain NEAU-351T, was isolated from cow dung collected from Shangzhi, Heilongjiang Province, northeast PR China and characterized using a polyphasic approach. Phylogenetic analysis based on 16S rRNA gene sequences indicated that strain NEAU-351T belonged to the genus Nocardia, with the highest similarity (98.96 %) to Nocardia takedensis DSM 44801T and less than 98.0 % identity with other type strains of the genus Nocardia. The polar lipids consisted of diphosphatidylglycerol, phosphatidylethanolamine and phosphatidylinositol. The major menaquinone was MK-8(H4, ω-cycl) (78.2 %). The fatty acid profile mainly consisted of C16 : 0, C18 : 1 ω9c and 10-methyl C18 : 0. Mycolic acids were present. The genomic DNA G+C content of strain NEAU-351T was 68.1 mol%. In addition, the average nucleotide identity values between strain NEAU-351T and its reference strains, Nocardia takedensis DSM 44801T and Nocardia arizonensis NBRC 108935T, were found to be 81.4 and 82.9 %, respectively, and the levels of digital DNA–DNA hybridization between them were 24.8 % (22.5–27.3 %) and 26.3 % (24–28.8 %), respectively. Here we report on the taxonomic characterization and classification of the isolate and propose that strain NEAU-351T represents a new species of the genus Nocardia, for which the name Nocardia bovistercoris is proposed. The type strain is NEAU-351T (=CCTCC AA 2019090T=DSM 110681T).


2018 ◽  
Vol 25 (11) ◽  
pp. 1481-1487 ◽  
Author(s):  
Vivek Kumar Singh ◽  
Utkarsh Shrivastava ◽  
Lina Bouayad ◽  
Balaji Padmanabhan ◽  
Anna Ialynytchev ◽  
...  

Abstract
Objective: Develop an approach, One-class-at-a-time, for triaging psychiatric patients using machine learning on textual patient records. Our approach aims to automate the triaging process and reduce expert effort while providing high classification reliability.
Materials and Methods: The One-class-at-a-time approach is a multistage cascading classification technique that achieves higher triage classification accuracy than traditional multiclass classifiers by 1) classifying one class at a time (or stage), and 2) identifying and applying the highest-accuracy classifier at each stage. The approach was evaluated using a unique dataset of 433 psychiatric patient records with a triage class label provided by the “I2B2 challenge,” a recent competition in the medical informatics community.
Results: The One-class-at-a-time cascading classifier outperformed state-of-the-art classification techniques with an overall classification accuracy of 77% among 4 classes, exceeding the accuracies of existing multiclass classifiers. The approach also enabled highly accurate classification of individual classes: severe and mild with 85% accuracy, moderate with 64% accuracy, and absent with 60% accuracy.
Discussion: The triaging of psychiatric cases is a challenging problem due to the lack of clear guidelines and protocols. Our work presents a machine learning approach that uses psychiatric records to triage patients based on the severity of their condition.
Conclusion: The One-class-at-a-time cascading classifier can be used as a decision aid to reduce the triaging effort of physicians and nurses, while providing a unique opportunity to involve experts at each stage to reduce false positives and further improve the system’s accuracy.
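The cascading idea, one binary decision per stage with unclaimed records passed to the next stage, can be sketched as follows. This is only an illustration of the control flow: the stage order, the fallback class, and the keyword checks standing in for trained binary classifiers are all hypothetical and do not reproduce the authors' models.

```python
from typing import Callable, List, Tuple

# Each stage pairs a class label with a binary decision function for that class.
Stage = Tuple[str, Callable[[str], bool]]


def cascade_predict(stages: List[Stage], fallback: str, record: str) -> str:
    """Try each stage in order; the first stage that claims the record decides its label."""
    for label, is_member in stages:
        if is_member(record):
            return label
    # Records rejected by every stage fall through to the remaining class.
    return fallback


# Toy keyword checks used in place of the best-performing classifier per stage.
stages: List[Stage] = [
    ("severe", lambda text: "hospitalized" in text.lower()),
    ("absent", lambda text: "no symptoms" in text.lower()),
    ("mild", lambda text: "occasional" in text.lower()),
]

label = cascade_predict(stages, "moderate", "Patient was hospitalized last week")
```

One consequence of this design is that each stage can use a different model, and an expert can review the records a stage claims before the remainder moves on.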


2018 ◽  
Vol 483 (4) ◽  
pp. 5077-5104 ◽  
Author(s):  
Stavros Akras ◽  
Marcelo L Leal-Ferreira ◽  
Lizette Guzman-Ramirez ◽  
Gerardo Ramos-Larios
