Rule-Based Crime Information Extraction on Indonesian Digital News

This paper proposes CFP Manager, a system specialized in the IT field and designed to ease the search for conferences suited to one's needs. At present, the handling of calls for papers (CFPs) faces two problems. For emails, the huge quantity of CFPs received means they are easily skimmed over. For websites, a review of the main CFP aggregators available online points to a lack of usable search criteria. The system addresses these problems through an architecture of three components. First, an information extraction module extracts relevant information (date, location, etc.) from CFPs using a rule-based text mining algorithm. Second, an enrichment component augments the extracted data with external data from ontology models. Finally, a display component presents the data and lets end users run complex queries on the CFP dataset, so that they access only the CFPs suitable for them. To validate the proposal, the authors measure precision and recall on the information extraction component, obtaining averages of 0.95 for precision and 0.91 for recall across three different datasets of 100 CFPs each. The paper finally discusses the validity of the approach by comparing the system, on different queries, with two systems already available online (WikiCFP and IEEE Conference Search) and with a basic text search standing in for searching an email inbox. On a 100-CFP dataset, thanks to the wide variety of usable data and the ability to run complex queries, the system surpasses basic text search and WikiCFP by not returning the false positives they usually return, and achieves results close to those of the IEEE system.
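
The rule-based extraction step described in this abstract can be illustrated with a minimal sketch. The abstract does not give the actual rules, so the two regular expressions below, the function name `extract_cfp_fields`, and the sample CFP text are all assumptions made for illustration only, not the authors' implementation.

```python
import re

# Hypothetical rules: the paper's real rule set is not given in the abstract.
MONTHS = (r"(?:January|February|March|April|May|June|July|August|"
          r"September|October|November|December)")
DEADLINE_RULE = re.compile(
    rf"(?:submission\s+deadline|deadline)\s*:\s*({MONTHS}\s+\d{{1,2}},\s+\d{{4}})",
    re.IGNORECASE,
)
LOCATION_RULE = re.compile(r"(?:location|venue)\s*:\s*([^\n]+)", re.IGNORECASE)


def extract_cfp_fields(text):
    """Apply each rule once; fields without a match are returned as None."""
    deadline = DEADLINE_RULE.search(text)
    location = LOCATION_RULE.search(text)
    return {
        "deadline": deadline.group(1) if deadline else None,
        "location": location.group(1).strip() if location else None,
    }


# Invented sample text, used only to exercise the rules above.
sample_cfp = """Call for Papers: Example Workshop on Text Mining
Location: Lyon, France
Submission deadline: March 15, 2019
"""
fields = extract_cfp_fields(sample_cfp)
print(fields)
```

Structured fields of this kind are what would make the complex queries mentioned in the abstract possible, for example filtering CFPs by deadline or location rather than by raw text match.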

RCE-OIE: Open Information Extraction Using a Rule-Based Clause Extraction Engine for Semantic Applications

Advances in Intelligent Systems and Computing - Recent Findings in Intelligent Computing Techniques, 2018, pp. 191-198
DOI: 10.1007/978-981-10-8633-5_20

Author(s): D. Thenmozhi, Chandrabose Aravindan

Keyword(s): Information Extraction, Rule Based, Open Information Extraction

Information Extraction of Protein Phosphorylation from Biomedical Literature

Information Retrieval in Biomedicine, 2010, pp. 163-176
DOI: 10.4018/978-1-60566-274-9.ch009

Author(s): M. Narayanaswamy, K. E. Ravikumar, Z. Z. Hu, K. Vijay-Shanker, C. H. Wu

Keyword(s): Text Mining, Information Extraction, Protein Phosphorylation, Biomedical Literature, Literature Mining, Second Phase, Mining System, Rule Based, Kinase Substrate, Fundamental Biological Process

Protein posttranslational modification (PTM) is a fundamental biological process, yet few text mining systems currently focus on PTM information extraction. A rule-based text mining system, RLIMS-P (Rule-based LIterature Mining System for Protein Phosphorylation), was recently developed by our group to extract protein substrates, kinases and phosphorylated residues/sites from MEDLINE abstracts. This chapter covers the evaluation and benchmarking of RLIMS-P and highlights some novel and unique features of the system. The extraction patterns of RLIMS-P capture a range of lexical, syntactic and semantic constraints found in sentences expressing phosphorylation information. RLIMS-P also has a second phase that puts together information extracted from different sentences. This is an important feature, since the kinase, substrate and site of phosphorylation are rarely all mentioned in the same sentence. Small modifications to the rules for extracting phosphorylation information have also allowed us to develop systems for extracting two other PTMs, acetylation and methylation. A thorough evaluation of these two systems still needs to be completed. Finally, an online version of RLIMS-P with enhanced functionalities, namely phosphorylation annotation ranking, evidence tagging, and protein entity mapping, has been developed and is publicly accessible.
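
The two-phase idea in this abstract (sentence-level patterns followed by a step that combines facts from different sentences) can be sketched roughly as follows. This is only an illustration under strong assumptions: the two regular expressions, the helper names, and the example sentences are hypothetical, and RLIMS-P's real patterns encode far richer lexical, syntactic and semantic constraints.

```python
import re

# Hypothetical sentence-level patterns, not RLIMS-P's actual rules.
KINASE_SUBSTRATE = re.compile(r"(?P<kinase>\w+) phosphorylates (?P<substrate>\w+)")
SUBSTRATE_SITE = re.compile(
    r"(?P<substrate>\w+) is phosphorylated (?:at|on) (?P<site>(?:Ser|Thr|Tyr)-?\d+)"
)


def extract_from_sentence(sentence):
    """Phase 1: collect partial facts (dicts) matched within one sentence."""
    facts = []
    for rule in (KINASE_SUBSTRATE, SUBSTRATE_SITE):
        for match in rule.finditer(sentence):
            facts.append(match.groupdict())
    return facts


def merge_by_substrate(facts):
    """Phase 2: combine partial facts that mention the same substrate."""
    merged = {}
    for fact in facts:
        record = merged.setdefault(
            fact["substrate"],
            {"kinase": None, "substrate": fact["substrate"], "site": None},
        )
        record.update(fact)
    return list(merged.values())


# Illustrative sentences written for this sketch.
sentences = [
    "PKA phosphorylates CREB in vitro.",
    "CREB is phosphorylated at Ser133 after stimulation.",
]
partial = [fact for s in sentences for fact in extract_from_sentence(s)]
records = merge_by_substrate(partial)
print(records)
```

Neither sentence alone names the kinase, substrate and site together; only the merged record does, which is the situation the abstract gives as the reason for the second phase.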

Person Named Entity Recognition in Balinese

JELIKU (Jurnal Elektronik Ilmu Komputer Udayana), 2021, Vol 10 (1), pp. 99
DOI: 10.24843/jlk.2021.v10.i01.p13

Author(s): Kenny Kurniadi, Ngurah Agus Sanjaya ER

Keyword(s): Information Extraction, Named Entity Recognition, Morphological Structure, Entity Recognition, Linguistic Meaning, Rule Based, Named Entity

Named Entity Recognition (NER) is a part of information extraction whose task is to classify text into several classes, such as names of people, organizations, and locations. In this study, the authors propose an NER system that identifies the names of people in Balinese-language documents. The study uses a rule-based method. The rules are built from the morphological structure and linguistic meaning of Balinese names. The evaluation shows that the system achieves an accuracy of 67.41%, a precision of 83.42%, a recall of 77.83%, and an F-score of 80.53%.
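
As a quick consistency check on the reported figures, the F-score can be recomputed from the stated precision and recall. The sketch below assumes the reported F-score is the balanced F1 (the harmonic mean of precision and recall), which the abstract does not state explicitly; the function name `f_score` is chosen here for illustration.

```python
def f_score(precision, recall):
    """Balanced F1: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract.
f1 = f_score(0.8342, 0.7783)
print(f"{f1:.2%}")  # prints 80.53%
```

Under that assumption, the recomputed value agrees with the 80.53% given in the abstract.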

Rule based Text Extraction from a Bibliographic Database

DESIDOC Journal of Library & Information Technology, 2018, Vol 38 (1), pp. 5
DOI: 10.14429/djlit.38.1.12307

Author(s): Veena Makhija, Swapnil Ahuja

Keyword(s): Information Retrieval, Information Extraction, Extraction Process, Controlled Vocabulary, Bibliographic Database, Extraction Techniques, Knowledge Domain, Rule Based, Text Extraction, Selection Of

The emergent concept of 'Big Data' has shifted the paradigm from information retrieval to information extraction techniques. Information extraction techniques enable corpus analysis that yields useful interpretations. Selecting an appropriate information extraction technique depends on the type of data being dealt with and its possible applications. In an R&D environment, published information is considered an authenticated benchmark for studying and analysing the growth pattern in a given field of science, medicine, or business. This paper proposes a rule-based information extraction process applied to selected data extracted from a bibliographic database of published R&D papers. The aim of the study is to build a database of relevant concepts, clean the retrieved data, and automate the process of information retrieval in the local database. For this purpose, concept-based 'subject profiles' in the area of advanced semiconductors were developed, along with rules for text extraction from the metadata retrieved from the bibliographic database. This subset was used as input to the knowledge domain to support R&D in the area of 'advanced semiconductor materials and devices' and to provide information services on the Intranet. The study found that concept-based pattern matching on the downloaded datasets yielded better results than using the controlled vocabulary of the source database.

An Algebraic Approach to Rule-Based Information Extraction

2008 IEEE 24th International Conference on Data Engineering, 2008, Cited By ~ 35
DOI: 10.1109/icde.2008.4497502

Author(s): Frederick Reiss, Sriram Raghavan, Rajasekar Krishnamurthy, Huaiyu Zhu, Shivakumar Vaithyanathan

Keyword(s): Information Extraction, Algebraic Approach, Rule Based
