Analysis of Machine Learning Models by Solving the Text Data Classification Problem

2021, Vol 8 (2), pp. 33-45
Author(s): A.V. Pchelin, N.A. Kononov, V.S. Serova, E.V. Bunova, ...

2021, Vol 2078 (1), pp. 012056
Author(s): Shuang Wu, Zeyu Li, Xinqiong Chen, Peiwen Zhong, Liangcai Mei, ...

Abstract To better promote garbage classification, machine learning models are used to identify and address garbage classification problems. First, factor analysis is used in a field investigation and data analysis of residents' perception of waste classification. Second, a convolutional neural network (CNN) is used to classify and recognize garbage images, which assists in judging how garbage should be sorted. Finally, we put forward reasonable suggestions for better promoting garbage classification.


DOI: 10.2196/17984
2020, Vol 8 (3), pp. e17984
Author(s): Irena Spasic, Goran Nenadic

Background: Clinical narratives represent the main form of communication within health care, providing a personalized account of patient history and assessments, and offering rich information for clinical decision making. Natural language processing (NLP) has repeatedly demonstrated its feasibility to unlock evidence buried in clinical narratives. Machine learning can facilitate rapid development of NLP tools by leveraging large amounts of text data.
Objective: The main aim of this study was to provide systematic evidence on the properties of text data used to train machine learning approaches to clinical NLP. We also investigated the types of NLP tasks that have been supported by machine learning and how they can be applied in clinical practice.
Methods: Our methodology was based on the guidelines for performing systematic reviews. In August 2018, we used PubMed, a multifaceted interface, to perform a literature search against MEDLINE. We identified 110 relevant studies and extracted information about text data used to support machine learning, NLP tasks supported, and their clinical applications. The data properties considered included their size, provenance, collection methods, annotation, and any relevant statistics.
Results: The majority of datasets used to train machine learning models included only hundreds or thousands of documents. Only 10 studies used tens of thousands of documents, with a handful of studies utilizing more. Relatively small datasets were utilized for training even when much larger datasets were available. The main reason for such poor data utilization is the annotation bottleneck faced by supervised machine learning algorithms. Active learning was explored to iteratively sample a subset of data for manual annotation as a strategy for minimizing the annotation effort while maximizing the predictive performance of the model. Supervised learning was successfully used where clinical codes integrated with free-text notes into electronic health records were utilized as class labels. Similarly, distant supervision was used to utilize an existing knowledge base to automatically annotate raw text. Where manual annotation was unavoidable, crowdsourcing was explored, but it remains unsuitable because of the sensitive nature of data considered. Besides the small volume, training data were typically sourced from a small number of institutions, thus offering no hard evidence about the transferability of machine learning models. The majority of studies focused on text classification. Most commonly, the classification results were used to support phenotyping, prognosis, care improvement, resource management, and surveillance.
Conclusions: We identified the data annotation bottleneck as one of the key obstacles to machine learning approaches in clinical NLP. Active learning and distant supervision were explored as a way of saving the annotation efforts. Future research in this field would benefit from alternatives such as data augmentation and transfer learning, or unsupervised learning, which do not require data annotation.
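
The active learning step described in the abstract above (iteratively sampling a subset of data for manual annotation) can be illustrated with a minimal sketch. The review does not say which sampling strategy the surveyed studies used; uncertainty sampling is assumed here as one common choice, and the function name `select_most_uncertain` and the pool probabilities are hypothetical.

```python
def select_most_uncertain(probabilities, k):
    """Return the indices of the k pool items whose predicted
    positive-class probability is closest to 0.5 (least certain)."""
    ranked = sorted(
        range(len(probabilities)),
        key=lambda i: abs(probabilities[i] - 0.5),
    )
    return ranked[:k]


# Hypothetical model outputs for six unlabeled clinical notes.
pool_probs = [0.95, 0.52, 0.10, 0.45, 0.70, 0.03]

# Pick the two notes that an annotator should label next.
to_annotate = select_most_uncertain(pool_probs, 2)
print(to_annotate)
```

In a full loop, the selected notes would be annotated, added to the training set, and the model retrained before the next round of selection.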


Author(s): Sandhya Vidyashankar, Rakshit Vahi, Yash Karkhanis, Gowri Srinivasa, ...

We present VisQuelle, an automated companion based on visual question answering, to facilitate elementary learning of word-object associations. In particular, we attempt to harness the power of machine learning models for object recognition, together with the combined processing of images and text from visual question answering, to provide variety and nuance in the images associated with letters or words presented to the elementary learner. We incorporate elements such as gamification to motivate the learner by recording scores, errors, etc., to track the learner’s progress. Translation is also provided to reinforce word-object associations in the user’s native tongue if the learner is using VisQuelle to learn a second language.


2021, Vol 11 (1)
Author(s): Naoki Miyaguchi, Koh Takeuchi, Hisashi Kashima, Mizuki Morita, Hiroshi Morimatsu

Abstract Recently, research has been conducted to automatically control anesthesia using machine learning, with the aim of alleviating the shortage of anesthesiologists. In this study, we address the problem of predicting decisions made by anesthesiologists during surgery using machine learning; specifically, we formulate the decision of whether to increase the flow rate at each time point in the continuous administration of the analgesic remifentanil as a supervised binary classification problem. Experiments were conducted to evaluate the prediction performance of six machine learning models: logistic regression, support vector machine, random forest, LightGBM, artificial neural network, and long short-term memory (LSTM), using data from 210 cases collected during actual surgeries. The results demonstrated that when predicting an increase in the flow rate of remifentanil 1 min ahead, the LSTM model achieved scores of 0.659 for sensitivity, 0.732 for specificity, and 0.753 for ROC-AUC; this demonstrates the potential to predict the decisions made by anesthesiologists using machine learning. Furthermore, we examined the importance and contribution of the features of each model using Shapley additive explanations (SHAP), a method for interpreting predictions made by machine learning models. The trends indicated by the results were partially consistent with known clinical findings.
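
For readers unfamiliar with the three scores reported in this abstract, the sketch below shows how sensitivity, specificity, and ROC-AUC are computed for a binary classifier. It is not the authors' evaluation code: the labels, scores, and the 0.5 decision threshold are made-up toy values, and the helper names are hypothetical.

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true, scores):
    """ROC-AUC as the probability that a random positive example is
    scored above a random negative one (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data: 1 = "flow rate was increased", 0 = "it was not".
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

# Assumed decision threshold of 0.5 to turn scores into predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(sens, spec, auc)
```

Sensitivity and specificity depend on the chosen threshold, whereas ROC-AUC summarizes ranking quality across all thresholds, which is why studies such as this one typically report all three.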


2020, Vol 2 (1), pp. 3-6
Author(s): Eric Holloway

Imagination Sampling is the use of a person as an oracle for generating or improving machine learning models. Previous work demonstrated a general system for using Imagination Sampling to obtain multibox models. Here, the possibility of importing such models as the starting point for further automatic enhancement is explored.

