Event Extraction via Rules and Machine Learning

Author(s):  
Xuepan Gao ◽  
Zhengang Diao ◽  
Kailin Wei ◽  
Yuzhao Yang ◽  
Lei Li
2019 ◽  
Vol 17 ◽  
pp. 100190 ◽  
Author(s):  
Kajal Negi ◽  
Arun Pavuri ◽  
Ladle Patel ◽  
Chirag Jain

2021 ◽  
Vol 7 ◽  
pp. e775
Author(s):  
Malik Daler Ali Awan ◽  
Nadeem Iqbal Kajla ◽  
Amnah Firdous ◽  
Mujtaba Husnain ◽  
Malik Muhammad Saad Missen

The real-time availability of the Internet has engaged millions of users around the world. Regional languages are increasingly preferred for effective and easy communication, which produces multilingual data on social networks and news channels. People share ideas, opinions, and events happening globally (e.g., sports, inflation, protests, explosions, and sexual assault) in regional (local) languages on social media. Extracting and classifying events from multilingual data has become a bottleneck because of a lack of resources. In this paper, we present an event classification task for Urdu-language text from social media and news channels using machine learning classifiers. The dataset contains more than 0.1 million (102,962) labeled instances of twelve (12) different event types. The title, its length, and the last four words of a sentence are used as features to classify the events. Term Frequency-Inverse Document Frequency (tf-idf) gave the best results as a feature vector when evaluating six popular machine learning classifiers. Random Forest (RF) and K-Nearest Neighbor (KNN) outperformed the other classifiers, achieving 98.00% and 99.00% accuracy, respectively. The novelty lies in the fact that, to the best of our knowledge, these features have not previously been applied to event extraction from text written in Urdu.
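
The feature set and tf-idf weighting described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the features are encoded or which tf-idf variant is used, so the `last4=` and `len=` token prefixes, the smoothed idf formula, the cosine-similarity KNN, and the English toy titles (standing in for labelled Urdu text) are all assumptions made for illustration.

```python
import math
from collections import Counter

def extract_features(title):
    """Tokens for the title, its last four words, and its length in words."""
    words = title.lower().split()
    features = list(words)                            # title tokens
    features += ["last4=" + w for w in words[-4:]]    # last four words (assumed encoding)
    features.append("len=" + str(len(words)))         # title length (assumed encoding)
    return features

def fit_idf(docs):
    """Smoothed inverse document frequency for every token seen in training."""
    n_docs = len(docs)
    doc_freq = Counter(tok for doc in docs for tok in set(doc))
    return {tok: math.log((1 + n_docs) / (1 + df)) + 1.0
            for tok, df in doc_freq.items()}

def tfidf_vector(doc, idf):
    """L2-normalised sparse tf-idf vector; tokens unseen in training are dropped."""
    counts = Counter(tok for tok in doc if tok in idf)
    vec = {tok: c * idf[tok] for tok, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {tok: v / norm for tok, v in vec.items()}

def cosine(a, b):
    """Dot product of two normalised sparse vectors (their cosine similarity)."""
    return sum(v * b.get(tok, 0.0) for tok, v in a.items())

def knn_predict(query, train_vecs, train_labels, k=3):
    """Majority label among the k training vectors most similar to the query."""
    ranked = sorted(zip(train_vecs, train_labels),
                    key=lambda pair: cosine(query, pair[0]), reverse=True)
    return Counter(label for _, label in ranked[:k]).most_common(1)[0][0]

# Toy English stand-ins for labelled titles (invented for illustration only).
TRAIN = [
    ("team wins the cricket final match", "sports"),
    ("striker scores twice in football match", "sports"),
    ("captain praises team after cricket match", "sports"),
    ("prices of flour rise across the country", "inflation"),
    ("fuel prices rise again this month", "inflation"),
    ("food prices rise as inflation grows", "inflation"),
]
train_docs = [extract_features(title) for title, _ in TRAIN]
train_labels = [label for _, label in TRAIN]
idf = fit_idf(train_docs)
train_vecs = [tfidf_vector(doc, idf) for doc in train_docs]

def classify(title, k=3):
    """Classify one title with the toy model built above."""
    query = tfidf_vector(extract_features(title), idf)
    return knn_predict(query, train_vecs, train_labels, k)
```

Calling `classify("bowler shines in cricket match")` on this toy data returns `"sports"`, and `classify("wheat prices rise again")` returns `"inflation"`. With only six training titles the sketch says nothing about the accuracy figures reported in the abstract.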


Author(s):  
Tuấn Nguyên Hoài Đức ◽  
Trần Tiện Lợi Long Tứ ◽  
Lê Đình Việt Huy

We built a model that labels the Predicate Argument Structure (PAS) of biomedical documents. PAS is important semantic information for any document because it reveals the main event mentioned in each sentence. Extracting the PAS of a sentence is an important prerequisite for solving a series of other semantic problems in text, such as event extraction, named entity extraction, and question answering. The predicate argument structure is domain dependent. Therefore, the biomedical field requires a Predicate Argument frame that is defined from scratch rather than reused from the general domain. For a machine learning model to work well with a new argument frame, a new feature set must be identified, which is difficult, manual, and requires a lot of expert labor. To address this challenge, we trained our model with a deep learning method based on Bi-directional Long Short-Term Memory. Deep learning is a machine learning method that does not require feature sets to be defined manually. In addition, we integrate Highway Connections between hidden neuron layers to limit the loss of derivative (gradient) signal across layers. To overcome the problem of a small training corpus, we combine deep learning with a Multi-task Learning technique. Multi-task Learning lets the main task (PAS tagging) be complemented with knowledge learnt from a closely related task, Named Entity Recognition (NER). Our model achieved F1 = 75.13% without any manually designed features, showing the promise of deep learning in this domain. The experimental results also show that Multi-task Learning is an appropriate technique for overcoming the shortage of training data in biomedical fields, since it improves the F1 score.
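
The abstract does not give the formulation of its Highway Connection, so the following is only a sketch of the commonly used highway layer, y = t * h + (1 - t) * x, applied to a plain vector rather than to Bi-LSTM hidden states. The function names and the tanh transform are assumptions for illustration. When the gate t is near 0 the input is carried through unchanged, which is the path that helps gradients survive across layers.

```python
import math

def sigmoid(z):
    """Logistic function used for the transform gate."""
    return 1.0 / (1.0 + math.exp(-z))

def affine(x, weights, bias):
    """Compute weights @ x + bias for a list-of-rows weight matrix."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def highway_layer(x, w_h, b_h, w_t, b_t):
    """Highway layer: gate t mixes the transformed input h with the raw input x."""
    h = [math.tanh(z) for z in affine(x, w_h, b_h)]   # candidate transform
    t = [sigmoid(z) for z in affine(x, w_t, b_t)]     # transform gate in (0, 1)
    return [ti * hi + (1.0 - ti) * xi for ti, hi, xi in zip(t, h, x)]

x = [0.5, -1.0]
identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]

# Strongly negative gate bias: the layer carries x through almost unchanged.
carried = highway_layer(x, identity, [0.0, 0.0], zeros, [-20.0, -20.0])
# Strongly positive gate bias: the layer behaves like an ordinary tanh layer.
transformed = highway_layer(x, identity, [0.0, 0.0], zeros, [20.0, 20.0])
```

In a trained network the gate weights are learned, so each unit decides per input how much to transform and how much to carry.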


2015 ◽  
Vol 13 (03) ◽  
pp. 1541001 ◽  
Author(s):  
Yifan Nie ◽  
Wenge Rong ◽  
Yiyuan Zhang ◽  
Yuanxin Ouyang ◽  
Zhang Xiong

Molecular events normally carry significant meaning, since they describe important biological interactions or alterations such as the binding of a protein. As a crucial step in biological event extraction, event trigger identification has attracted much attention, and many methods have been proposed. Traditionally, these methods fall into rule-based and machine learning-based approaches; the latter have demonstrated their potential and outperformed rule-based approaches in many situations. However, machine learning-based approaches still face several challenges, a notable one being how to model the semantic and syntactic information of different words and incorporate it into the prediction model. There are many ways to model semantic and syntactic information, among which word embedding is an effective one. To address this challenge, this study proposes a word-embedding-assisted neural network prediction model for event trigger identification. An experimental study on a commonly used dataset has shown its potential. It is believed that this study could offer researchers insights into semantic-aware solutions for event trigger identification.


Author(s):  
Gilles Jacobs ◽  
Véronique Hoste

We present SENTiVENT, a corpus of fine-grained company-specific events in English economic news articles. The domain of event processing is highly productive, and various general-domain, fine-grained event extraction corpora are freely available, but economically focused resources are lacking. This work fills a large need for a manually annotated dataset for economic and financial text mining applications. A representative corpus of business news is crawled and an annotation scheme developed with an iteratively refined economic event typology. The annotations are compatible with benchmark datasets (ACE/ERE), so state-of-the-art event extraction systems can be readily applied. This results in a gold-standard dataset annotated with event triggers, participant arguments, event co-reference, and event attributes such as type, subtype, negation, and modality. An adjudicated reference test set is created for use in annotator and system evaluation. Agreement scores are substantial and annotator performance adequate, indicating that the annotation scheme produces consistent event annotations of high quality. In an event detection pilot study, satisfactory results were obtained with a macro-averaged $$F_1$$-score of $$59\%$$, validating the dataset for machine learning purposes. This dataset thus provides a rich resource on events as training data for supervised machine learning for economic and financial applications. The dataset and related source code are made available at https://osf.io/8jec2/.
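
For readers unfamiliar with the metric, a macro-averaged F1 score is the unweighted mean of the per-class F1 scores, so rare event types count as much as frequent ones. The sketch below shows the computation on invented labels; the label names and predictions are hypothetical and are not taken from the SENTiVENT corpus or its pilot study.

```python
def macro_f1(gold, predicted):
    """Unweighted mean of per-class F1 over every label seen in gold or predicted."""
    labels = sorted(set(gold) | set(predicted))
    scores = []
    for label in labels:
        pairs = list(zip(gold, predicted))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Hypothetical event-type labels for six sentences.
gold = ["Profit", "Profit", "Profit", "Merger", "None", "None"]
pred = ["Profit", "Profit", "None", "Merger", "None", "Profit"]
score = macro_f1(gold, pred)  # per-class F1: Merger 1.0, None 0.5, Profit 2/3
```

Here `score` is 13/18, roughly 0.72, even though plain accuracy is 4/6: the perfectly predicted minority class pulls the average up, while each of the other two classes is penalised for its own errors.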


2020 ◽  
Vol 43 ◽  
Author(s):  
Myrthe Faber

Gilead et al. state that abstraction supports mental travel, and that mental travel critically relies on abstraction. I propose an important addition to this theoretical framework, namely that mental travel might also support abstraction. Specifically, I argue that spontaneous mental travel (mind wandering), much like data augmentation in machine learning, provides variability in mental content and context necessary for abstraction.


2020 ◽  
Author(s):  
Mohammed J. Zaki ◽  
Wagner Meira, Jr