Enhancing filter-based parenthetic abbreviation extraction methods

Author(s):  
Houcemeddine Turki ◽  
Mohamed Ali Hadj Taieb ◽  
Mohamed Ben Aouicha

Abstract This letter discusses the limitations of using filters to improve the accuracy of parenthetic abbreviation extraction from scholarly publications. It proposes using the parentheses level count algorithm to efficiently extract entities between parentheses from raw text, together with supervised machine-learning classification techniques for identifying biomedical abbreviations, in order to significantly reduce the removal of acronyms that include disallowed punctuation.
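The abstract names a parentheses level count algorithm without spelling it out. The following is a minimal sketch of one plausible reading, assuming the idea is to track nesting depth with a counter and emit each outermost parenthesized span; the function name `extract_parenthesized` is hypothetical and is not taken from the letter.

```python
def extract_parenthesized(text):
    """Return the contents of every outermost (...) span in text.

    Assumption: a depth counter is incremented on "(" and decremented
    on ")"; a span is emitted whenever the depth returns to zero.
    """
    spans = []
    level = 0      # current nesting depth
    start = None   # index just after the "(" that opened the current span
    for i, ch in enumerate(text):
        if ch == "(":
            if level == 0:
                start = i + 1
            level += 1
        elif ch == ")":
            if level == 0:
                continue  # unmatched closing parenthesis: ignore it
            level -= 1
            if level == 0:
                spans.append(text[start:i])
    return spans


sample = "metagenome-assembled genomes (MAGs) and beta-glucan (BG (fungal))"
print(extract_parenthesized(sample))
```

Unlike a simple regular expression, the counter keeps nested parentheses inside a candidate together, so a span such as `BG (fungal)` is returned whole and can then be passed to a classifier rather than being discarded by a punctuation filter.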

Author(s):  
G. RAMASUBBA REDDY ◽  
B. SRINIVASULU ◽  
M. ROSHINI ◽  
V. RAJYA LAKSHMI ◽  
◽  
...  

2021 ◽  
Vol 9 (5) ◽  
pp. 1034
Author(s):  
Carlos Sabater ◽  
Lorena Ruiz ◽  
Abelardo Margolles

This study aimed to recover metagenome-assembled genomes (MAGs) from human fecal samples to characterize the glycosidase profiles of Bifidobacterium species exposed to different prebiotic oligosaccharides (galacto-oligosaccharides, fructo-oligosaccharides and human milk oligosaccharides, HMOs) as well as high-fiber diets. A total of 1806 MAGs were recovered from 487 infant and adult metagenomes. Unsupervised and supervised classification of the glycosidases encoded in MAGs using machine-learning algorithms made it possible to establish characteristic hydrolytic profiles for B. adolescentis, B. bifidum, B. breve, B. longum and B. pseudocatenulatum, yielding classification rates above 90%. Glycosidase families GH5_44, GH32, and GH110 were characteristic of B. bifidum. The presence or absence of GH1, GH2, GH5 and GH20 was characteristic of B. adolescentis, B. breve and B. pseudocatenulatum, while families GH1 and GH30 were relevant in MAGs from B. longum. These characteristic profiles allowed bifidobacteria to be discriminated regardless of prebiotic exposure. Correlation analysis of glycosidase activities suggests strong associations between glycosidase families comprising HMO-degrading enzymes, which are often found in MAGs from the same species. The mathematical models proposed here may contribute to a better understanding of the carbohydrate metabolism of some common bifidobacteria species and could be extrapolated to other microorganisms of interest in future studies.


MethodsX ◽  
2021 ◽  
Vol 8 ◽  
pp. 101166
Author(s):  
Timothy J. Fawcett ◽  
Chad S. Cooper ◽  
Ryan J. Longenecker ◽  
Joseph P. Walton

Author(s):  
Sarmad Mahar ◽  
Sahar Zafar ◽  
Kamran Nishat

Headnotes are precise explanations and summaries of the legal points in an issued judgment. Law journals hire experienced lawyers to write these headnotes, which help the reader quickly determine the issue discussed in a case. A headnote comprises two parts: the first gives the topic discussed in the judgment, and the second contains a summary of that judgment. In this thesis, we design, develop and evaluate headnote prediction using machine learning, without human involvement. We divide the task into a two-step process. In the first step, we predict the law points used in the judgment with text classification algorithms. In the second step, we generate a summary of the judgment using text summarization techniques. To this end, we created a databank by extracting data from different law sources in Pakistan and labelled training data generated from Pakistani law websites. We tested different feature extraction methods on judiciary data to improve our system. Using these feature extraction methods, we developed a dictionary of terminology for ease of reference and utility. Our approach achieves 65% accuracy using Linear Support Vector Classification with tri-grams and without a stemmer. With active learning, our system can continuously improve its accuracy as users of the system provide more labelled examples.
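The tri-gram features mentioned above can be illustrated with a short sketch. It assumes word-level tri-grams, lowercasing, and whitespace tokenization with no stemming; the abstract does not say whether the tri-grams are word-level or character-level, and the helper names `word_ngrams` and `trigram_counts` are hypothetical. The resulting counts would then be fed to a linear classifier, which is not shown here.

```python
from collections import Counter


def word_ngrams(text, n=3):
    """Return the list of word n-grams in text (no stemming applied)."""
    tokens = text.lower().split()
    # range() is empty when the text has fewer than n tokens.
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def trigram_counts(text):
    """Count word tri-grams; each distinct tri-gram is one feature."""
    return Counter(word_ngrams(text, 3))


features = trigram_counts("The court held that the appeal fails")
for trigram, count in features.items():
    print(count, trigram)
```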


2021 ◽  
Vol 6 (22) ◽  
pp. 51-59
Author(s):  
Mustazzihim Suhaidi ◽  
Rabiah Abdul Kadir ◽  
Sabrina Tiun

Extracting features from input data is vital for successful classification and machine learning tasks. Classification is the process of assigning an object to one of a set of predefined categories. Many different feature selection and feature extraction methods exist and are widely used. Feature extraction transforms large input data into a low-dimensional feature vector, which serves as input to a classification or machine learning algorithm. The task poses major challenges, which are discussed in this paper; the central one is to learn and extract knowledge from text datasets in order to make correct decisions. The objective of this paper is to give an overview of the methods used in feature extraction for various applications, using a dataset that contains a collection of texts taken from social media.
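As an illustration of turning text into a fixed, low-dimensional feature vector, here is a minimal sketch using feature hashing (the "hashing trick"). This technique is chosen only as an example and is not claimed to be one of the methods surveyed in the paper; the function name `hashed_features`, the tokenization pattern, and the default dimension are assumptions.

```python
import re
import zlib


def hashed_features(text, dim=16):
    """Map text of any length to a count vector with exactly dim entries.

    zlib.crc32 is used instead of the built-in hash() because it gives
    the same value on every run, so vectors are reproducible.
    """
    vec = [0] * dim
    for token in re.findall(r"[a-z0-9']+", text.lower()):
        index = zlib.crc32(token.encode("utf-8")) % dim
        vec[index] += 1  # different tokens may collide in one bucket
    return vec


post = "Loving the new phone, the camera is great!"
print(hashed_features(post))
```

The vector length stays at `dim` regardless of vocabulary size, at the cost of occasional collisions between unrelated tokens.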


Author(s):  
Maiyuren Srikumar ◽  
Charles Daniel Hill ◽  
Lloyd Hollenberg

Abstract Quantum machine learning (QML) is a rapidly growing area of research at the intersection of classical machine learning and quantum information theory. One area of considerable interest is the use of QML to learn information contained within quantum states themselves. In this work, we propose a novel approach in which the extraction of information from quantum states is undertaken in a classical representational-space, obtained through the training of a hybrid quantum autoencoder (HQA). Hence, given a set of pure states, this variational QML algorithm learns to identify – and classically represent – their essential distinguishing characteristics, subsequently giving rise to a new paradigm for clustering and semi-supervised classification. The analysis and employment of the HQA model are presented in the context of amplitude encoded states – which in principle can be extended to arbitrary states for the analysis of structure in non-trivial quantum data sets.


Author(s):  
Alex Freitas ◽  
André C.P.L.F. de Carvalho

In machine learning and data mining, most work on classification problems deals with flat classification, where each instance is assigned to one of a set of possible classes and there is no hierarchical relationship between the classes. There are, however, more complex classification problems where the classes to be predicted are hierarchically related. This chapter presents a tutorial on the hierarchical classification techniques found in the literature. We also discuss how hierarchical classification techniques have been applied to the area of bioinformatics (particularly the prediction of protein function), where hierarchical classification problems are often found.
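The difference from flat classification can be sketched with a top-down approach, in which a local decision is made at each level of the class tree. Everything below is a toy assumption for illustration: the hierarchy, the keyword sets standing in for trained local classifiers, and the function name `classify_top_down` are hypothetical and are not taken from the chapter.

```python
# Hypothetical class hierarchy: each parent maps to its child classes.
HIERARCHY = {
    "root": ["enzyme", "transporter"],
    "enzyme": ["hydrolase", "transferase"],
}

# Hypothetical keyword sets standing in for trained local classifiers.
KEYWORDS = {
    "enzyme": {"catalytic", "substrate"},
    "transporter": {"membrane", "channel"},
    "hydrolase": {"hydrolysis", "water"},
    "transferase": {"transfer", "donor"},
}


def classify_top_down(features, node="root"):
    """Walk down the hierarchy, picking the best-scoring child each time.

    features is a set of strings; the score of a class is the number of
    its keywords present in features. Ties go to the first child listed.
    """
    path = []
    while node in HIERARCHY:  # stop once a leaf class is reached
        children = HIERARCHY[node]
        node = max(children, key=lambda c: len(KEYWORDS[c] & features))
        path.append(node)
    return path


print(classify_top_down({"catalytic", "hydrolysis", "water"}))
```

The prediction is a path through the hierarchy, not a single label, so an error at an upper level propagates to every level below it; this is a known trade-off of the top-down strategy.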


Author(s):  
Monali Gulhane ◽  
T. Sajana

Predicting human behaviour is a growing trend in medicine, and the analysis of patient behaviour is being actively studied; the proposed methodology addresses the technical difficulty of finding a cost-efficient method for predicting user behaviour. Good mental health can support a strong immune system, which helps people stay healthy during the COVID-19 pandemic. A detailed study of different techniques for classifying human diseases found that machine learning techniques are reliable for feature extraction and for analysing different human parameters. A convolutional neural network (CNN) is the best choice for disease classification. Feature extraction and feature selection are handled automatically by the CNN layers, which reduces the training time. Sensor-based feature extraction techniques, such as EEG and ECG, will be explored further in this research using machine learning algorithms for the early detection of diseases from human behaviour on different platforms. Social behaviour and eating habits play a vital role in disease detection. A system that combines such a wide variety of features with effective classification techniques at each stage is needed. This paper contributes a review of human behaviour analysis through different body parameters, food habits, and social media influences, together with the person's social behaviour. The main objective of the research is to analyse these parameters across different areas to predict the early signs of diseases.

