Automatic Tuning of Rule-Based Evolutionary Machine Learning via Problem Structure Identification

Text classification has become increasingly important for managing and organizing text data, owing to the tremendous growth of online information. It assigns documents to a fixed number of predefined categories. There are two main approaches to text classification: rule-based and machine learning. In the rule-based approach, documents are classified according to manually defined rules. In the machine learning approach, the classification rules (the classifier) are learned automatically from example documents. The machine learning approach offers higher recall and faster processing. This paper presents an investigation of text classification using different machine learning techniques.
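The contrast between the two approaches can be illustrated with a minimal sketch. It is not taken from the paper: the keyword rules, the four training sentences, and the choice of a multinomial Naive Bayes learner with add-one smoothing are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical hand-written rules (assumption): keyword -> category.
RULES = {"goal": "sports", "match": "sports",
         "election": "politics", "vote": "politics"}

def rule_based_classify(text, default="unknown"):
    """Return the category of the first keyword rule that fires."""
    for token in text.lower().split():
        if token in RULES:
            return RULES[token]
    return default

def train_naive_bayes(examples):
    """Learn per-category word counts from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocab

def nb_classify(text, model):
    """Pick the label with the highest smoothed log-probability."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for token in text.lower().split():
            # Add-one smoothing so unseen words do not zero out a class.
            score += math.log((word_counts[label][token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy example documents (assumption, for illustration only).
TRAIN = [
    ("the striker scored a late goal", "sports"),
    ("the team won the match", "sports"),
    ("voters back the new election law", "politics"),
    ("parliament will vote on the budget", "politics"),
]
MODEL = train_naive_bayes(TRAIN)
```

In this toy setting, a document such as "the striker scored" contains none of the hand-written keywords, so the rule-based classifier falls back to its default, while the learned classifier can still assign a category from the words it saw in the examples.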

An Experimental Study of Diversity of Diabetes Disease Features by Bagging and Boosting Ensemble Method with Rule Based Machine Learning Classifier Algorithms

SN Computer Science ◽

10.1007/s42979-020-00446-y ◽

2021 ◽

Vol 2 (1) ◽

Author(s):

Dhyan Chandra Yadav ◽

Saurabh Pal

Keyword(s):

Machine Learning ◽

Experimental Study ◽

Ensemble Method ◽

Rule Based ◽

Learning Classifier ◽

Classifier Algorithms

A Comparison of Rule-Based and Machine Learning Models for Classification of Human Factors Aviation Safety Event Reports

Proceedings of the Human Factors and Ergonomics Society Annual Meeting ◽

10.1177/1071181320641034 ◽

2020 ◽

Vol 64 (1) ◽

pp. 129-133

Author(s):

Katherine Darveau ◽

Daniel Hannon ◽

Chad Foster

Keyword(s):

Machine Learning ◽

Human Factors ◽

Human Error ◽

Data Science ◽

Aircraft Engine ◽

Rule Based ◽

Root Cause ◽

Textual Data ◽

Safety Event

There is growing interest in the study and practice of applying data science (DS) and machine learning (ML) to automate decision making in safety-critical industries. As an alternative or augmentation to human review, there are opportunities to explore these methods for classifying aviation operational events by root cause. This study seeks to apply a thoughtful approach to design, compare, and combine rule-based and ML techniques to classify events caused by human error in aircraft/engine assembly, maintenance or operation. Event reports contain a combination of continuous parameters, unstructured text entries, and categorical selections. A Human Factors approach to classifier development prioritizes the evaluation of distinct data features and entry methods to improve modeling. Findings, including the performance of tested models, led to recommendations for the design of textual data collection systems and classification approaches.

Rule-based explanations based on ensemble machine learning for detecting sink mark defects in the injection moulding process

Journal of Manufacturing Systems ◽

10.1016/j.jmsy.2021.07.001 ◽

2021 ◽

Vol 60 ◽

pp. 392-405

Author(s):

Josue Obregon ◽

Jihoon Hong ◽

Jae-Yoon Jung

Keyword(s):

Machine Learning ◽

Injection Moulding ◽

Rule Based ◽

Moulding Process ◽

Ensemble Machine Learning ◽

Injection Moulding Process ◽

Sink Mark

Radiology reports automated annotation performance: rule-based machine learning vs deep learning

3rd Smart Cities Symposium (SCS 2020) ◽

10.1049/icp.2021.0893 ◽

2021 ◽

Author(s):

A. Sahl ◽

S. Hasan

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Rule Based ◽

Automated Annotation ◽

Radiology Reports

Triage and diagnosis of COVID-19 from medical social media (Preprint)

10.2196/preprints.30397 ◽

2021 ◽

Author(s):

Abul Hasan ◽

Mark Levene ◽

David Weston ◽

Renate Fromson ◽

Nicolas Koslover ◽

...

Keyword(s):

Machine Learning ◽

Social Media ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Learning Models ◽

Rule Based ◽

Additional Information ◽

Processing Pipeline ◽

Machine Learning Models

BACKGROUND: The COVID-19 pandemic has created a pressing need for integrating information from disparate sources in order to assist decision makers. Social media is important in this respect; however, natural language processing methods are needed to make sense of the textual information it provides and to automate the processing of large amounts of data. Social media posts are often noisy, yet they may provide valuable insights regarding the severity and prevalence of the disease in the population. In particular, machine learning techniques for triage and diagnosis could allow for a better understanding of what social media may offer in this respect. OBJECTIVE: This study aims to develop an end-to-end natural language processing pipeline for triage and diagnosis of COVID-19 from patient-authored social media posts, in order to provide researchers and other interested parties with additional information on the symptoms, severity and prevalence of the disease. METHODS: The text processing pipeline first extracts COVID-19 symptoms and related concepts such as severity, duration, negations, and body parts from patients’ posts using conditional random fields. An unsupervised rule-based algorithm is then applied to establish relations between concepts in the next step of the pipeline. The extracted concepts and relations are subsequently used to construct two different vector representations of each post. These vectors are applied separately to build support vector machine learning models to triage patients into three categories and diagnose them for COVID-19. RESULTS: We report macro- and micro-averaged F1 scores in the range of 71-96% and 61-87%, respectively, for the triage and diagnosis of COVID-19 when the models are trained on human-labelled data. Our experimental results indicate that similar performance can be achieved when the models are trained using predicted labels from concept extraction and rule-based classifiers, thus yielding end-to-end machine learning. We also highlight important features uncovered by our diagnostic machine learning models and compare them with the most frequent symptoms revealed in another COVID-19 dataset. In particular, we found that the most important features are not always the most frequent ones. CONCLUSIONS: Our preliminary results show that it is possible to automatically triage and diagnose patients for COVID-19 from natural language narratives using a machine learning pipeline, in order to provide additional information on the severity and prevalence of the disease through the eyes of social media.
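For readers unfamiliar with the two averaging schemes behind the reported scores, the sketch below computes macro- and micro-averaged F1 for a single-label, multi-class task. It is a generic illustration and not the authors' evaluation code; the category names "A", "B", "C" and the example predictions are hypothetical.

```python
def f1_from_counts(tp, fp, fn):
    """F1 = 2TP / (2TP + FP + FN); defined as 0.0 when there is nothing to count."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def macro_micro_f1(y_true, y_pred):
    """Return (macro F1, micro F1) for single-label multi-class predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp = {label: 0 for label in labels}
    fp = {label: 0 for label in labels}
    fn = {label: 0 for label in labels}
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    # Macro: average the per-class F1 scores, so every class weighs equally.
    macro = sum(f1_from_counts(tp[l], fp[l], fn[l]) for l in labels) / len(labels)
    # Micro: pool the counts over all classes, so every example weighs equally.
    micro = f1_from_counts(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return macro, micro

# Hypothetical three-category example with one rare class that is always missed.
y_true = ["A", "A", "A", "A", "B", "C"]
y_pred = ["A", "A", "A", "A", "B", "B"]
macro, micro = macro_micro_f1(y_true, y_pred)
```

In this example the missed rare class "C" pulls the macro average down to 5/9, while the micro average stays at 5/6 because most examples are classified correctly. This is why the two figures are usually reported side by side.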

Feature engineering combined with machine learning and rule-based methods for structured information extraction from narrative clinical discharge summaries

Journal of the American Medical Informatics Association ◽

10.1136/amiajnl-2011-000776 ◽

2012 ◽

Vol 19 (5) ◽

pp. 824-832 ◽

Cited By ~ 38

Author(s):

Yan Xu ◽

Kai Hong ◽

Junichi Tsujii ◽

Eric I-Chao Chang

Keyword(s):

Machine Learning ◽

Information Extraction ◽

Feature Engineering ◽

Rule Based ◽

Structured Information ◽

Discharge Summaries

From Possibilistic Rule-Based Systems to Machine Learning - A Discussion Paper

Lecture Notes in Computer Science - Scalable Uncertainty Management ◽

10.1007/978-3-030-58449-8_3 ◽

2020 ◽

pp. 35-51

Author(s):

Didier Dubois ◽

Henri Prade

Keyword(s):

Machine Learning ◽

Discussion Paper ◽

Rule Based ◽

Rule Based Systems

Providing the ‘Best’ Lipophilicity Assessment in a Drug Discovery Environment

10.26434/chemrxiv.14292485 ◽

2021 ◽

Author(s):

George Chang ◽

Nathaniel Woody ◽

Christopher Keefer

Keyword(s):

Machine Learning ◽

Drug Discovery ◽

High Throughput ◽

In Silico ◽

Shake Flask ◽

Chromatographic Method ◽

Learning Approach ◽

Rule Based ◽

Machine Learning Approach ◽

High Throughput Screens

Lipophilicity is a fundamental structural property that influences almost every aspect of drug discovery. Within Pfizer, we have two complementary high-throughput screens for measuring lipophilicity as a distribution coefficient (LogD) – a miniaturized shake-flask method (SFLogD) and a chromatographic method (ELogD). The results from these two assays are not the same (see Figure 1), with each assay being applicable or more reliable in particular chemical spaces. In addition to LogD assays, the ability to predict the LogD value for virtual compounds is equally vital. Here we present an in-silico LogD model, applicable to all chemical spaces, based on the integration of the LogD data from both assays. We developed two approaches towards a single LogD model – a Rule-based and a Machine Learning approach. Ultimately, the Machine Learning LogD model was found to be superior to both internally developed and commercial LogD models.

Introducing Rule-based Machine Learning

Proceedings of the Companion Publication of the 2015 on Genetic and Evolutionary Computation Conference - GECCO Companion '15 ◽

10.1145/2739482.2756590 ◽

2015 ◽

Cited By ~ 1

Author(s):

Ryan Urbanowicz ◽

Will Browne

Keyword(s):

Machine Learning ◽

Rule Based
