From Possibilistic Rule-Based Systems to Machine Learning - A Discussion Paper

In Natural Language Processing, Named Entity Recognition (NER) is one of the most important research areas within Information Extraction (IE). NER is a subtask of IE that seeks to identify named entities in text documents and classify them into predefined categories. A considerable amount of work has been done on NER in recent years, driven by the increasing demand for automated text processing and the wide availability of electronic corpora. While it is relatively easy and natural for a human reader to understand the context of a given article, getting a machine to differentiate between word senses is a big challenge. For instance, the word ‘brown’ may refer to a person called Mr. Brown or to the colour of an item. Human readers can easily discern the meaning of the word from the context of the sentence, but it is almost impossible for a computer to interpret it without additional information. To deal with this issue, NER researchers have proposed various rule-based systems (Wakao, Gaizauskas & Wilks, 1996; Krupka & Hausman, 1998; Maynard, Tablan, Ursu, Cunningham & Wilks, 2001). These systems achieve high recognition accuracy with the help of lists of known named entities called gazetteers. The problem with the rule-based approach is that it lacks robustness and portability. It incurs steep maintenance costs, especially when new rules must be introduced for new information or new domains. A better option is thus a machine learning approach that is trainable and adaptable. Three well-known machine learning approaches that have been used extensively in NER are the Hidden Markov Model (HMM), the Maximum Entropy Model (MEM) and the Decision Tree.
Many of the existing machine learning-based NER systems (Bikel, Schwartz & Weischedel, 1999; Zhou & Su, 2002; Borthwick, Sterling, Agichten & Grisham, 1998; Bender, Och & Ney, 2003; Chieu & Ng, 2002; Sekine, Grisham & Shinnou, 1998) are able to achieve near-human performance for named entity tagging, even though their overall performance is still about 2% short of that of rule-based systems. There have also been many attempts to improve NER performance using a hybrid approach that combines handcrafted rules with statistical models (Mikheev, Moens & Grover, 1999; Srihari & Li, 2000; Seon, Ko, Kim & Seo, 2001). These systems can achieve relatively good performance in their targeted domains owing to comprehensive handcrafted rules. Nevertheless, the portability problem remains unsolved when it comes to NER across various domains. As such, this article presents a hybrid machine learning approach that uses MEM and HMM successively. Two statistical models are used in succession instead of one because of their distinctive natures. HMM is able to achieve better performance than other statistical models and is generally regarded as the most successful machine learning approach. However, it suffers from the data sparseness problem: a considerable amount of data is needed for it to achieve acceptable performance. MEM, on the other hand, maintains reasonable performance even when little training data is available. The idea is therefore to walk through the testing corpus using MEM first to generate a temporary tagging result, a procedure that simultaneously serves as a training process for HMM. During the second walkthrough, HMM is used for the final tagging. In this process, the temporary tagging result generated by MEM is used as a reference for subsequent error checking and correction. When little training data is available, the final result can still be reliable thanks to the contribution of the initial MEM tagging result.
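
The ‘brown’ ambiguity and the gazetteer-based rules mentioned in the abstract above can be illustrated with a minimal sketch. It is not taken from any of the cited systems: the gazetteer, the title list, and the single context rule are hypothetical and chosen only to show how a lookup list combined with context can separate "Mr. Brown" from "a brown coat".

```python
# Hypothetical gazetteer and title list; real systems use far larger resources.
PERSON_GAZETTEER = {"brown", "smith"}
TITLES = {"mr.", "mrs.", "ms.", "dr."}


def tag_tokens(tokens):
    """Label each token PERSON or O using a gazetteer plus one context rule."""
    labels = []
    for i, token in enumerate(tokens):
        in_gazetteer = token.lower() in PERSON_GAZETTEER
        after_title = i > 0 and tokens[i - 1].lower() in TITLES
        # Context rule (an assumption for illustration): a gazetteer surname
        # counts as a person only when it is capitalised and follows a title.
        if in_gazetteer and after_title and token[0].isupper():
            labels.append("PERSON")
        else:
            labels.append("O")
    return labels


print(tag_tokens(["Mr.", "Brown", "bought", "a", "brown", "coat"]))
```

Every new domain would need new list entries and new rules of this kind, which is the maintenance cost the abstract attributes to rule-based systems.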

Intelligent System of the Battle of Honor Board Game with Decision Making and Machine Learning (Sistem Cerdas Permainan Papan The Battle Of Honor dengan Decision Making dan Machine Learning)

Jurnal Buana Informatika ◽ 10.24002/jbi.v12i2.4905 ◽ 2021 ◽ Vol 12 (2) ◽ pp. 136

Author(s): Arnan Dwika Diasmara ◽ Aditya Wikan Mahastama ◽ Antonius Rachmat Chrismanto

Keyword(s): Artificial Intelligence ◽ Machine Learning ◽ Decision Making ◽ Intelligent System ◽ Third Party ◽ Case Based Reasoning ◽ Board Game ◽ Rule Based ◽ Case Based ◽ Rule Based Systems

Abstract. Intelligent System of the Battle of Honor Board Game with Decision Making and Machine Learning. The Battle of Honor is a board game in which 2 players face each other and try to bring down the opponent's flag. The game requires a third party to act as referee because the players cannot see each other's pawns during play. The solution is to implement Rule-Based Systems (RBS) in a system developed with Unity to support the referee's role in making decisions based on the rules of the game. The researchers also developed an Artificial Intelligence (AI) opponent by applying Case-Based Reasoning (CBR). The CBR is supported by the nearest neighbor algorithm, which finds cases with a high degree of similarity. In basic testing, the highest calculated CBR accuracy among the 3 testers was 97.101%. In scenario testing, the AI acting as referee was analyzed through colliding pieces and gave the right decision in determining victory.

Keywords: The Battle of Honor, CBR, RBS, Unity, AI
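
The nearest neighbor retrieval step that the abstract describes for CBR can be sketched as follows. The abstract does not give the case structure or the similarity formula, so the features, weights, and moves below are hypothetical, and the weighted-match similarity is one common choice, not necessarily the one the authors used.

```python
def similarity(case_a, case_b, weights):
    """Weighted fraction of features on which two cases agree (0.0 to 1.0)."""
    total = sum(weights.values())
    matched = sum(w for f, w in weights.items() if case_a.get(f) == case_b.get(f))
    return matched / total


def retrieve(case_base, query, weights):
    """Return the stored case most similar to the query, with its score."""
    best = max(case_base, key=lambda c: similarity(c["features"], query, weights))
    return best, similarity(best["features"], query, weights)


# Hypothetical features and moves; the paper's actual case structure is not given.
WEIGHTS = {"own_rank": 2, "enemy_adjacent": 1, "near_flag": 1}
CASE_BASE = [
    {"features": {"own_rank": "high", "enemy_adjacent": True, "near_flag": False},
     "move": "attack"},
    {"features": {"own_rank": "low", "enemy_adjacent": True, "near_flag": True},
     "move": "retreat"},
]

query = {"own_rank": "high", "enemy_adjacent": True, "near_flag": True}
case, score = retrieve(CASE_BASE, query, WEIGHTS)
print(case["move"], score)  # attack 0.75
```

The retrieved case's move is then reused for the current board situation. A full CBR cycle would also revise the solution and retain new cases, which this sketch omits.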

2010 i2b2/VA challenge on concepts, assertions, and relations in clinical text

Journal of the American Medical Informatics Association ◽ 10.1136/amiajnl-2011-000203 ◽ 2011 ◽ Vol 18 (5) ◽ pp. 552-556 ◽ Cited By ~ 379

Author(s): Özlem Uzuner ◽ Brett R South ◽ Shuying Shen ◽ Scott L DuVall

Keyword(s): Machine Learning ◽ Training Data ◽ Classification Task ◽ Reference Standard ◽ Rule Based ◽ Concept Extraction ◽ Ensembles Of Classifiers ◽ Relation Classification ◽ Clinical Records ◽ Rule Based Systems

Abstract: The 2010 i2b2/VA Workshop on Natural Language Processing Challenges for Clinical Records presented three tasks: a concept extraction task focused on the extraction of medical concepts from patient reports; an assertion classification task focused on assigning assertion types for medical problem concepts; and a relation classification task focused on assigning relation types that hold between medical problems, tests, and treatments. i2b2 and the VA provided an annotated reference standard corpus for the three tasks. Using this reference standard, 22 systems were developed for concept extraction, 21 for assertion classification, and 16 for relation classification. These systems showed that machine learning approaches could be augmented with rule-based systems to determine concepts, assertions, and relations. Depending on the task, the rule-based systems can either provide input for machine learning or post-process the output of machine learning. Ensembles of classifiers, information from unlabeled data, and external knowledge sources can help when the training data are inadequate.

A Brief Survey on Text Classification Using Various Machine Learning Techniques

International Journal of Advanced Research in Computer Science and Software Engineering ◽ 10.23956/ijarcsse.v8i1.521 ◽ 2018 ◽ Vol 8 (1) ◽ pp. 14

Author(s): Padmavathi S. ◽ M. Chidambaram

Keyword(s): Machine Learning ◽ Text Classification ◽ Fixed Number ◽ Machine Learning Techniques ◽ Online Information ◽ Rule Based ◽ Learning Techniques ◽ Machine Learning Approach ◽ Rule Based Approach

Text classification has become increasingly important for managing and organizing text data owing to the tremendous growth of online information. It assigns documents to a fixed number of predefined categories. The rule-based approach and the machine learning approach are the two ways of performing text classification. In the rule-based approach, documents are classified using manually defined rules. In the machine learning approach, classification rules, or a classifier, are derived automatically from example documents. This approach offers higher recall and faster processing. This paper surveys text classification using different machine learning techniques.
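
The contrast between the two approaches can be made concrete with a small sketch. The keyword rule, the four example documents, and the choice of a Naive Bayes classifier with add-one smoothing are all assumptions made for illustration; the survey itself does not single out this technique.

```python
import math
from collections import Counter, defaultdict


def rule_based_classify(text):
    """Manually defined rule: a single keyword decides the category."""
    return "sports" if "match" in text.lower().split() else "finance"


def train_naive_bayes(examples):
    """Learn per-class word counts from (text, label) example documents."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    for text, label in examples:
        class_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, class_counts, vocab


def nb_classify(text, model):
    """Pick the class with the highest log-probability (add-one smoothing)."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        n_words = sum(word_counts[label].values())
        score = math.log(class_counts[label] / total_docs)
        for word in text.lower().split():
            score += math.log((word_counts[label][word] + 1) / (n_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical training documents.
EXAMPLES = [
    ("the team won the match", "sports"),
    ("a late goal decided the game", "sports"),
    ("shares fell as the bank cut rates", "finance"),
    ("the market closed higher on bank earnings", "finance"),
]
model = train_naive_bayes(EXAMPLES)
print(rule_based_classify("the striker scored a goal"))  # finance (rule misses it)
print(nb_classify("the striker scored a goal", model))   # sports
```

The hand-written rule fails on a sports sentence that lacks its keyword, while the learned classifier picks up "goal" from the examples. This is the kind of recall difference the abstract refers to, shown here on toy data only.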

Daily Solar Radiation Prediction Using Adaptive Fuzzy Rule Based Systems

Journal of Environmental Science Computer Science and Engineering & Technology ◽ 10.24214/jecet.c.6.2.18498 ◽ 2017 ◽ Vol 6 (2)

Keyword(s): Solar Radiation ◽ Fuzzy Rule ◽ Rule Based ◽ Adaptive Fuzzy ◽ Rule Based Systems

Incorporating Geometric Constraints into Rule-Based Systems Using Nonlinear Optimization

10.21236/ada275093 ◽ 1994 ◽ Cited By ~ 1

Author(s): Jo A. Parikh ◽ Anne Werkheiser

Keyword(s): Nonlinear Optimization ◽ Geometric Constraints ◽ Rule Based ◽ Rule Based Systems

A Review of Multi-Objective Evolutionary Based Fuzzy Classifiers

Recent Advances in Computer Science and Communications ◽ 10.2174/2213275912666190410142052 ◽ 2020 ◽ Vol 13 (1) ◽ pp. 77-85

Author(s): Praveen Kumar Dwivedi ◽ Surya Prakash Tripathi

Keyword(s): Fuzzy Systems ◽ Evolutionary Process ◽ Fuzzy Rule ◽ Fuzzy Classification ◽ Multi Objective Optimization ◽ Fuzzy Classifier ◽ Rule Based ◽ Trade Off ◽ Multi Objective ◽ Rule Based Systems

Background: Fuzzy systems are employed in several fields, such as data processing, regression, pattern recognition, classification and control, because they can handle uncertainty and explain the behaviour of a complex system without involving a precise mathematical model. Fuzzy rule-based systems (FRBS), or fuzzy rule-based classifiers when designed mainly for classification, are fuzzy systems that consist of a set of fuzzy logical rules. FRBS are extensions of classical rule-based systems and contain "If-then" rules. The design of any fuzzy system has two main objectives, interpretability and accuracy, which conflict with each other: an improvement in one causes a decrease in the other. This condition is termed the Interpretability-Accuracy Trade-off. To handle it, Multi-Objective Evolutionary Algorithms (MOEA) are often applied in the design of fuzzy systems. This paper reviews approaches to developing fuzzy systems using Evolutionary Multi-Objective Optimization (EMO) algorithms with respect to the Interpretability-Accuracy Trade-off, current research trends, improvements in fuzzy classifier design using MOEA, and the authors' future scope. Methods: A state-of-the-art review has been conducted of various fuzzy classifier designs, and their optimization is reviewed in multi-objective terms. Results: This article reviews the different Evolutionary Multi-Objective Optimization (EMO) algorithms in the context of the Interpretability-Accuracy trade-off in fuzzy classification. Conclusion: Evolutionary multi-objective algorithms are being deployed in the development of fuzzy systems. Improving designs with these algorithms involves issues such as high dimensionality, exponentially large solution spaces, the I-A trade-off, interpretability quantification, and describing the capability of the system in the fuzzy domain. The authors' future focus is to find the best multi-objective evolutionary algorithm, in terms of efficiency and robustness, for developing an optimized fuzzy system with higher accuracy and higher interpretability. More attention will be given to creating new metrics or parameters for measuring the interpretability of fuzzy systems and new EMO processes or methods for handling the I-A trade-off.
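
The trade-off described above is usually handled by keeping every design that is not beaten on both objectives at once, the Pareto front. The sketch below shows only that selection step, not an evolutionary algorithm. The candidate classifiers are hypothetical, and using the rule count as a proxy for interpretability (fewer rules means more interpretable) is an assumption, since the abstract notes that interpretability measures are still an open question.

```python
def dominates(a, b):
    """True if a is no worse than b on both objectives and better on one."""
    no_worse = a["accuracy"] >= b["accuracy"] and a["rules"] <= b["rules"]
    better = a["accuracy"] > b["accuracy"] or a["rules"] < b["rules"]
    return no_worse and better


def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


# Hypothetical fuzzy classifiers; fewer rules stands in for interpretability.
CANDIDATES = [
    {"name": "A", "accuracy": 0.92, "rules": 40},
    {"name": "B", "accuracy": 0.88, "rules": 12},
    {"name": "C", "accuracy": 0.85, "rules": 15},
    {"name": "D", "accuracy": 0.80, "rules": 5},
]
print([c["name"] for c in pareto_front(CANDIDATES)])  # ['A', 'B', 'D']
```

Classifier C is dropped because B is both more accurate and smaller. A, B and D remain as alternative compromises, and a multi-objective evolutionary algorithm would return such a set instead of a single best design.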

An Experimental Study of Diversity of Diabetes Disease Features by Bagging and Boosting Ensemble Method with Rule Based Machine Learning Classifier Algorithms

SN Computer Science ◽ 10.1007/s42979-020-00446-y ◽ 2021 ◽ Vol 2 (1)

Author(s): Dhyan Chandra Yadav ◽ Saurabh Pal

Keyword(s): Machine Learning ◽ Experimental Study ◽ Ensemble Method ◽ Rule Based ◽ Learning Classifier ◽ Classifier Algorithms

A Comparison of Rule-Based and Machine Learning Models for Classification of Human Factors Aviation Safety Event Reports

Proceedings of the Human Factors and Ergonomics Society Annual Meeting ◽ 10.1177/1071181320641034 ◽ 2020 ◽ Vol 64 (1) ◽ pp. 129-133

Author(s): Katherine Darveau ◽ Daniel Hannon ◽ Chad Foster

Keyword(s): Machine Learning ◽ Human Factors ◽ Human Error ◽ Data Science ◽ Aircraft Engine ◽ Rule Based ◽ Root Cause ◽ Textual Data ◽ Safety Event

There is growing interest in the study and practice of applying data science (DS) and machine learning (ML) to automate decision making in safety-critical industries. As an alternative or augmentation to human review, there are opportunities to explore these methods for classifying aviation operational events by root cause. This study seeks to apply a thoughtful approach to design, compare, and combine rule-based and ML techniques to classify events caused by human error in aircraft/engine assembly, maintenance or operation. Event reports contain a combination of continuous parameters, unstructured text entries, and categorical selections. A Human Factors approach to classifier development prioritizes the evaluation of distinct data features and entry methods to improve modeling. Findings, including the performance of tested models, led to recommendations for the design of textual data collection systems and classification approaches.

Dynamic optimization for real-time rule-based systems using predicate dependency

Proceedings Sixth IEEE Real-Time Technology and Applications Symposium. RTAS 2000 ◽ 10.1109/rttas.2000.852459 ◽ 2002 ◽ Cited By ~ 3

Author(s): Y.-H. Lee ◽ A.M.K. Cheng

Keyword(s): Real Time ◽ Dynamic Optimization ◽ Rule Based ◽ Rule Based Systems
