NP Animacy Identification for Anaphora Resolution

In anaphora resolution for English, animacy identification can play an integral role in the application of agreement restrictions between pronouns and candidates, and as a result, can improve the accuracy of anaphora resolution systems. In this paper, two methods for animacy identification are proposed and evaluated using intrinsic and extrinsic measures. The first method is a rule-based one which uses information about the unique beginners in WordNet to classify NPs on the basis of their animacy. The second method relies on a machine learning algorithm which exploits a WordNet enriched with animacy information for each sense. The effect of word sense disambiguation on the two methods is also assessed. The intrinsic evaluation reveals that the machine learning method reaches human levels of performance. The extrinsic evaluation demonstrates that animacy identification can be beneficial in anaphora resolution, especially in the cases where animate entities are identified with high precision.
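The rule-based idea in the abstract above (classify an NP by the top-level WordNet class its head noun falls under) can be illustrated with a minimal sketch. The hypernym table, the set of animate classes, and the function names are all hypothetical stand-ins chosen for illustration; they are not the paper's actual rules or the real WordNet hierarchy.

```python
# Toy hypernym links (child -> parent). A hypothetical stand-in for WordNet,
# not real WordNet data.
HYPERNYMS = {
    "teacher": "person",
    "person": "organism",
    "dog": "animal",
    "animal": "organism",
    "table": "artifact",
    "artifact": "object",
}

# Classes treated as animate in this sketch (an illustrative assumption).
ANIMATE_CLASSES = {"person", "animal"}


def hypernym_path(noun):
    """Return the noun followed by its chain of hypernyms up to the root."""
    path = [noun]
    while path[-1] in HYPERNYMS:
        path.append(HYPERNYMS[path[-1]])
    return path


def is_animate(head_noun):
    """Label a head noun animate if any node on its path is an animate class."""
    return any(node in ANIMATE_CLASSES for node in hypernym_path(head_noun))


print(is_animate("teacher"))  # True
print(is_animate("table"))    # False
```

A noun missing from the table is treated as inanimate here; a real system would also have to choose among several senses, which is where word sense disambiguation enters.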

Machine Learning

DOI: 10.1093/oxfordhb/9780199276349.013.0020

2012

Author(s): Raymond J. Mooney

Keyword(s): Machine Learning, Word Sense Disambiguation, Anaphora Resolution, Syntactic Parsing, Word Sense, Part Of Speech Tagging, Part Of Speech, Sense Disambiguation, Training Examples, Speech Tagging

This article introduces the type of symbolic machine learning in which decision trees, rules, or case-based classifiers are induced from supervised training examples. It describes the representation of knowledge assumed by each of these approaches and reviews basic algorithms for inducing such representations from annotated training examples and for using the acquired knowledge to classify future instances. Machine learning is the study of computational systems that improve their performance on some task with experience. Most machine learning methods concern the task of categorizing examples described by a set of features. These techniques can be applied to learn the knowledge required for a variety of problems in computational linguistics, ranging from part-of-speech tagging and syntactic parsing to word-sense disambiguation and anaphora resolution. Finally, the article reviews applications of these methods to such problems, including morphology, part-of-speech tagging, word-sense disambiguation, syntactic parsing, semantic parsing, information extraction, and anaphora resolution.
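One common step in inducing a decision tree from labeled examples is choosing the feature to split on. The sketch below uses information gain, as in ID3-style learners; it is offered as an illustration of that general technique, not as the specific algorithm the article presents. The feature names (`prev_tag`, `suffix`) and the toy tagging data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(examples, labels, feature):
    """Reduction in label entropy obtained by splitting on one feature."""
    subsets = {}
    for example, label in zip(examples, labels):
        subsets.setdefault(example[feature], []).append(label)
    remainder = sum(len(s) / len(labels) * entropy(s)
                    for s in subsets.values())
    return entropy(labels) - remainder


def best_feature(examples, labels):
    """Pick the feature with the highest information gain."""
    return max(examples[0].keys(),
               key=lambda f: information_gain(examples, labels, f))


# Hypothetical toy data: guess a word's tag from two features.
examples = [
    {"prev_tag": "DET", "suffix": "s"},
    {"prev_tag": "DET", "suffix": "ed"},
    {"prev_tag": "PRON", "suffix": "s"},
    {"prev_tag": "PRON", "suffix": "ed"},
]
labels = ["NOUN", "NOUN", "VERB", "VERB"]

print(best_feature(examples, labels))  # prev_tag
```

In this toy set `prev_tag` separates the classes perfectly (gain of 1 bit) while `suffix` carries no information, so a tree learner would split on `prev_tag` first.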

Machine Learning Techniques for Myanmar Word-Sense Disambiguation

Advances in Intelligent Systems and Computing - Genetic and Evolutionary Computing, 2015, pp. 175-185

DOI: 10.1007/978-3-319-23204-1_18

Author(s): Phyu Phyu Khaing, Than Nwe Aung

Keyword(s): Machine Learning, Word Sense Disambiguation, Machine Learning Techniques, Word Sense, Learning Techniques, Sense Disambiguation

Urdu word sense disambiguation using machine learning approach

Cluster Computing, 2017, Vol 21 (1), pp. 515-522

DOI: 10.1007/s10586-017-0918-0

Cited By ~ 2

Author(s): Muhammad Abid, Asad Habib, Jawad Ashraf, Abdul Shahid

Keyword(s): Machine Learning, Word Sense Disambiguation, Learning Approach, Word Sense, Machine Learning Approach, Sense Disambiguation

Application of Rule Based approach to Word Sense Disambiguation of Marathi Language text

2015 International Conference on Innovations in Information, Embedded and Communication Systems (ICIIECS), 2015

DOI: 10.1109/iciiecs.2015.7193146

Author(s): Gauri Dhopavkar, Manali Kshirsagar, Latesh Malik

Keyword(s): Word Sense Disambiguation, Word Sense, Rule Based, Sense Disambiguation, Rule Based Approach, Language Text

Implementation of XGBoost Ensemble Learning Model for Detecting Money Laundering

International Journal for Research in Applied Science and Engineering Technology, 2021, Vol 9 (VII), pp. 312-316

DOI: 10.22214/ijraset.2021.36323

Author(s): Abarna Ramprakash

Keyword(s): Machine Learning, Ensemble Learning, Money Laundering, Learning Algorithm, Learning Model, False Positives, Machine Learning Algorithm, Huge Amount, Rule Based, Complex Sequence

Money laundering is the illegal process of concealing the origins of illegally obtained money by passing it through a complex sequence of banking transfers. Banks currently use rule-based systems to identify suspicious transactions that could be part of money laundering. However, these systems generate a large number of false positives, which forces banks to spend a great deal of money and time investigating them. Hence, in this paper, transactions are monitored using the XGBoost machine learning algorithm in order to reduce the number of false positives and to increase the probability of identifying true positives.

Bio-molecular Event Trigger Extraction by Word Sense Disambiguation Based on Supervised Machine Learning Using Wordnet-Based Data Decomposition and Feature Selection

Advances in Intelligent Systems and Computing - Proceedings of the Global AI Congress 2019, 2020, pp. 391-398

DOI: 10.1007/978-981-15-2188-1_31

Author(s): Amit Majumder, Asif Ekbal, Sudip Kumar Naskar

Keyword(s): Machine Learning, Feature Selection, Word Sense Disambiguation, Supervised Machine Learning, Word Sense, Molecular Event, Data Decomposition, Sense Disambiguation, Event Trigger

Interactive medical word sense disambiguation through informed learning

Journal of the American Medical Informatics Association, 2018, Vol 25 (7), pp. 800-808

DOI: 10.1093/jamia/ocy013

Cited By ~ 4

Author(s): Yue Wang, Kai Zheng, Hua Xu, Qiaozhu Mei

Keyword(s): Active Learning, Learning Curve, Domain Knowledge, Learning Algorithm, Interactive Learning, Word Sense Disambiguation, Word Sense, High Quality, Clinical Notes, Sense Disambiguation

Objective: Medical word sense disambiguation (WSD) is challenging and often requires significant training with data labeled by domain experts. This work aims to develop an interactive learning algorithm that makes efficient use of an expert's domain knowledge in building high-quality medical WSD models with minimal human effort.

Methods: We developed an interactive learning algorithm with expert labeling of instances and features. An expert can provide supervision in 3 ways: labeling instances, specifying indicative words of a sense, and highlighting supporting evidence in a labeled instance. The algorithm learns from these labels and iteratively selects the most informative instances to ask for future labels. Our evaluation used 3 WSD corpora: 198 ambiguous terms from Medical Subject Headings (MSH) as MEDLINE indexing terms, 74 ambiguous abbreviations in clinical notes from the University of Minnesota (UMN), and 24 ambiguous abbreviations in clinical notes from Vanderbilt University Hospital (VUH). For each ambiguous term and each learning algorithm, a learning curve that plots the accuracy on the test set against the number of labeled instances was generated. The area under the learning curve was used as the primary evaluation metric.

Results: Our interactive learning algorithm significantly outperformed active learning, the previous fastest learning algorithm for medical WSD. Compared to active learning, it achieved 90% accuracy for the MSH corpus with 42% less labeling effort, 35% less labeling effort for the UMN corpus, and 16% less labeling effort for the VUH corpus.

Conclusions: High-quality WSD models can be efficiently trained with minimal supervision by inviting experts to label informative instances and provide domain knowledge through labeling/highlighting contextual features.
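The evaluation metric named above, the area under a learning curve of test accuracy against number of labeled instances, can be computed with the trapezoidal rule. This is a minimal sketch: the function name and the sample curves are hypothetical, and dividing by the width of the x-range (so the result reads as an average accuracy) is an assumption made here, not something the abstract specifies.

```python
def area_under_learning_curve(num_labeled, accuracy):
    """Trapezoidal area under accuracy vs. number of labeled instances.

    Normalized by the x-range so the result lies on the accuracy scale.
    The normalization is an assumption of this sketch.
    """
    if len(num_labeled) != len(accuracy) or len(num_labeled) < 2:
        raise ValueError("need two or more matching points")
    area = 0.0
    for i in range(1, len(num_labeled)):
        width = num_labeled[i] - num_labeled[i - 1]
        area += width * (accuracy[i] + accuracy[i - 1]) / 2
    return area / (num_labeled[-1] - num_labeled[0])


# Hypothetical curves: both reach 0.9, but the second gets there sooner.
slow = area_under_learning_curve([0, 10, 20], [0.5, 0.7, 0.9])
fast = area_under_learning_curve([0, 10, 20], [0.5, 0.9, 0.9])
print(round(slow, 3), round(fast, 3))  # 0.7 0.8
```

Two learners with the same final accuracy can have different areas; the one that needs fewer labels to reach a given accuracy scores higher, which is why the metric suits comparisons of labeling effort.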

Parameter optimization for machine-learning of word sense disambiguation

Natural Language Engineering, 2002, Vol 8 (4), pp. 311-325

DOI: 10.1017/s1351324902003005

Cited By ~ 24

Author(s): V. Hoste, I. Hendrickx, W. Daelemans, A. van den Bosch

Keyword(s): Machine Learning, Parameter Optimization, Information Sources, Word Sense Disambiguation, Training Data, Learning Material, Word Sense, Performance Measurements, Sense Disambiguation, The Impact

Various Machine Learning (ML) approaches have been demonstrated to produce relatively successful Word Sense Disambiguation (WSD) systems. There are still unexplained differences among the performance measurements of different algorithms; hence it is warranted to investigate more deeply which algorithm has the right ‘bias’ for this task. In this paper, we show that this is not easy to accomplish, due to intricate interactions between information sources, parameter settings, and properties of the training data. We investigate the impact of parameter optimization on generalization accuracy in a memory-based learning approach to English and Dutch WSD. A ‘word-expert’ architecture was adopted, yielding a set of classifiers, each specialized in a single wordform. The experts consist of multiple memory-based learning classifiers, each taking different information sources as input, combined in a voting scheme. We optimized the architectural and parametric settings for each individual word-expert by performing cross-validation experiments on the learning material. The results of these experiments show that varying both the algorithmic parameters and the information sources available to the classifiers leads to large fluctuations in accuracy. We demonstrate that optimization per word-expert leads to an overall significant improvement in the generalization accuracies of the produced WSD systems.
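The per-word parameter optimization described above can be illustrated with a small sketch: a k-nearest-neighbour classifier (one simple form of memory-based learning) whose `k` is chosen for a single word by leave-one-out cross-validation. The overlap distance, the toy data for the word "bank", and all function names are hypothetical; the paper's actual learner, features, and parameter grid are not reproduced here.

```python
from collections import Counter


def overlap_distance(a, b):
    """Number of feature positions on which two instances differ."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(memory, query, k):
    """Majority label among the k stored instances closest to the query."""
    nearest = sorted(memory,
                     key=lambda item: overlap_distance(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(data, k):
    """Classify each instance using all the others as memory."""
    correct = 0
    for i, (features, label) in enumerate(data):
        memory = data[:i] + data[i + 1:]
        correct += knn_predict(memory, features, k) == label
    return correct / len(data)


def best_k(data, candidates):
    """Select the k with the highest leave-one-out accuracy for this word."""
    return max(candidates, key=lambda k: leave_one_out_accuracy(data, k))


# Hypothetical training instances for one word-expert ("bank"):
# (context features, sense label).
bank = [
    (("river", "water"), "shore"),
    (("river", "fish"), "shore"),
    (("money", "loan"), "finance"),
    (("money", "account"), "finance"),
]

print(best_k(bank, [1, 3]))  # 1
```

On this toy set `k=1` classifies every held-out instance correctly, while `k=3` lets the two instances of the other sense outvote the single remaining neighbour, which mirrors the abstract's point that accuracy can swing widely with parameter settings.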
