A RULE-BASED STEMMER FOR PUNJABI ADJECTIVES

This work presents a rule-based stemmer for adjectives in the Punjabi language. Stemming derives the root word from an inflected word. The proposed Punjabi Adjective Stemmer (PAS) uses a rule-based approach to convert inflected Punjabi adjectives to their root forms. A database of 1,762 valid Punjabi root adjectives has been created. When an adjective is given to PAS as input, PAS first compares it with the root database to determine whether it is a root adjective or an inflected one. If it is a root adjective, no stemming is required and the input is returned as the output. Otherwise, the inflected adjective is passed to the suffix-stripping algorithm, which applies a set of predefined rules to obtain the corresponding root adjective. India is a linguistically rich country with 22 officially recognized languages, but computational resources for these languages are scarce. Most stemmers developed for Punjabi so far have concentrated on nouns and proper names. PAS is the only stemmer developed so far that specifically addresses the stemming of Punjabi adjectives. PAS has an overall accuracy of 88.76%.
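
The two-stage flow described in this abstract (root-database lookup first, suffix stripping second) might be sketched as follows. This is a minimal illustration, not the PAS implementation: the transliterated words, the three suffix rules, and the choice to check each candidate against the root set are all assumptions made for the example, and the abstract does not list the actual rules.

```python
# Hypothetical root database (transliterated for readability); the real PAS
# database is reported to hold 1,762 root adjectives.
ROOTS = {"changa", "sohna", "lal"}

# Hypothetical (suffix, replacement) rules, longest suffix first.
RULES = [("ian", "a"), ("i", "a"), ("e", "a")]


def stem_adjective(word, roots=ROOTS, rules=RULES):
    """Return the root form of an adjective, or the word itself if no rule fits."""
    # Stage 1: a word already in the root database needs no stemming.
    if word in roots:
        return word
    # Stage 2: suffix stripping with predefined rules.
    for suffix, replacement in rules:
        if word.endswith(suffix):
            candidate = word[: -len(suffix)] + replacement
            # Assumption: accept a candidate only if it is a known root.
            if candidate in roots:
                return candidate
    # No rule produced a known root; return the input unchanged.
    return word
```

Under these assumed rules, `stem_adjective("changian")` and `stem_adjective("changi")` both return `"changa"`, while a root such as `"lal"` is returned unchanged by the first stage.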

An Efficient Romanization of Gurmukhi Punjabi Proper Nouns for Pattern Matching

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.b2467.098319 ◽

2019 ◽

Vol 8 (3) ◽

pp. 634-640

Keyword(s):

Pattern Matching ◽

Spoken Language ◽

Input Word ◽

Rule Based ◽

Proper Nouns ◽

Direct Mapping ◽

Database Table ◽

Roman Script ◽

Rule Based Approach ◽

Gurmukhi Script

A Romanization system converts text in a source script to the Roman script through word-by-word mapping. The phonological characteristics of the source word are not lost: only the writing script changes, not the spoken language. This paper presents a rule-based approach for the Romanization of proper nouns written in Gurmukhi script. The aim is to develop a lightweight Romanization system that may produce multiple possible results for the same input word. The algorithm uses a list of Gurmukhi script characters along with their equivalent character combinations in Roman script. Direct mapping of Gurmukhi characters to their Roman equivalents does not produce efficient results, so rules are applied to get the correct mappings. The rules essentially place or remove the letter ‘a’ between the mapped consonants. Three different sets of rules are applied to get three different Romanized outputs, all of which are acceptable for information extraction using pattern matching. In Gurmukhi, some words are written differently from how they are pronounced. To handle such words, the words or parts of them are stored in a database table, with their Romanized forms in a second column. The Romanization of these words is picked directly from the table. The result of this Romanization system is a set of possible words that can be generated from the source-script word. It enables an application to pattern match those output words against some text or database to get the required information.
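
The mapping-plus-rules idea in this abstract can be illustrated with a small sketch. Everything here is assumed for the example: the character map covers only a handful of letters, the single rule (insert ‘a’ between two adjacent consonants) stands in for the paper's three rule sets, which the abstract does not specify, and the exception table is passed in as a plain dictionary rather than read from a database.

```python
# Hypothetical, deliberately tiny character map (not the paper's full list).
CHAR_MAP = {"ਕ": "k", "ਮ": "m", "ਲ": "l", "ਰ": "r", "ਨ": "n", "ਾ": "a", "ੀ": "i"}
CONSONANTS = {"ਕ", "ਮ", "ਲ", "ਰ", "ਨ"}


def romanize(word, insert_a=True, exceptions=None):
    """Map each Gurmukhi character to Roman letters, optionally inserting 'a'."""
    # Words written differently from how they are pronounced are looked up first.
    exceptions = exceptions or {}
    if word in exceptions:
        return exceptions[word]
    out = []
    for i, ch in enumerate(word):
        out.append(CHAR_MAP.get(ch, ch))  # unmapped characters pass through
        nxt = word[i + 1] if i + 1 < len(word) else ""
        # Assumed rule: two adjacent consonants get an 'a' between them.
        if insert_a and ch in CONSONANTS and nxt in CONSONANTS:
            out.append("a")
    return "".join(out)


def candidates(word, exceptions=None):
    """Return the set of Romanized forms to use for pattern matching."""
    return {
        romanize(word, insert_a=True, exceptions=exceptions),
        romanize(word, insert_a=False, exceptions=exceptions),
    }
```

With this toy map, `romanize("ਕਮਲ")` gives `"kamal"`, whereas direct mapping alone (`insert_a=False`) gives `"kml"`. Returning a set of candidates mirrors the abstract's point that several outputs can be matched against a text or database.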

A Rule Based Approach for Root Word Identification in Malayalam Language

International Journal of Computer Science and Information Technology ◽

10.5121/ijcsit.2012.4313 ◽

2012 ◽

Vol 4 (3) ◽

pp. 159-166 ◽

Cited By ~ 2

Author(s):

Meera Subhash

Keyword(s):

Word Identification ◽

Rule Based ◽

Root Word ◽

Rule Based Approach ◽

Malayalam Language

Basic Word Extraction Algorithm Based on Morphological Rules for Balinese Texts

JELIKU (Jurnal Elektronik Ilmu Komputer Udayana) ◽

10.24843/jlk.2020.v08.i04.p06 ◽

2020 ◽

Vol 8 (4) ◽

pp. 401

Author(s):

I Made Wahyu Guna Negara ◽

Ngurah Agus Sanjaya ER

Keyword(s):

Previous Method ◽

Simple Problem ◽

High Complexity ◽

Rule Based ◽

Basic Word ◽

Extraction Algorithm ◽

Root Word ◽

Rule Based Approach ◽

Better Than ◽

Do So

Stemming is the process of extracting the root word of an affixed word. The process is intended to reduce the variations of a word. In this research, we are interested in applying stemming to the Balinese language. Previous works on stemming of the Balinese language applied a rule-based method, but only prefixes and suffixes were considered. Moreover, the rules were constructed without paying much attention to the morphology of the Balinese language. Rule-based methods can be verified and validated with ease on simple problems but fail to do so on problems with high complexity such as the Balinese language. To overcome the weaknesses of rule-based stemming on the Balinese language, we propose a method that reduces all variations of affixes in the Balinese language by combining the rule-based approach with Balinese language morphology. Based on the experiments carried out, our proposed method obtained an average stemming accuracy of 99%, which is better than the 96.67% achieved by the previous method. Keywords: Stemming, Balinese language, Rule-based

A Brief Survey on Text Classification Using Various Machine Learning Techniques

International Journal of Advanced Research in Computer Science and Software Engineering ◽

10.23956/ijarcsse.v8i1.521 ◽

2018 ◽

Vol 8 (1) ◽

pp. 14

Author(s):

Padmavathi .S ◽

M. Chidambaram

Keyword(s):

Machine Learning ◽

Text Classification ◽

Fixed Number ◽

Machine Learning Techniques ◽

Online Information ◽

Rule Based ◽

Learning Techniques ◽

Machine Learning Approach ◽

Rule Based Approach

Text classification has become more significant in managing and organizing text data due to the tremendous growth of online information. It classifies documents into a fixed number of predefined categories. The rule-based approach and the machine learning approach are the two ways of performing text classification. In the rule-based approach, documents are classified using manually defined rules. In the machine learning approach, the classification rules or classifier are defined automatically from example documents. The machine learning approach has higher recall and is quicker. This paper presents an investigation of text classification using different machine learning techniques.
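
The contrast between the two approaches can be made concrete with a small sketch. It is illustrative only: the categories and keywords are invented, and the "learned" classifier is a plain word-count scorer standing in for a real machine learning technique, not any method surveyed in the paper.

```python
from collections import Counter, defaultdict

# Rule-based: keyword rules written by hand (hypothetical categories and terms).
RULES = {
    "sports": {"match", "goal", "team"},
    "finance": {"stock", "bank", "market"},
}


def classify_by_rules(text, rules=RULES, default="other"):
    """Pick the category whose hand-written keywords overlap the text most."""
    tokens = set(text.lower().split())
    scores = {cat: len(tokens & keywords) for cat, keywords in rules.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else default


def train(examples):
    """Learn per-category word counts from (text, label) example documents."""
    counts = defaultdict(Counter)
    for text, label in examples:
        counts[label].update(text.lower().split())
    return counts


def classify_learned(text, counts, default="other"):
    """Score each category by how often it saw the text's words in training."""
    tokens = text.lower().split()
    scores = {label: sum(c[t] for t in tokens) for label, c in counts.items()}
    if not scores:
        return default
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else default
```

In the first function a person supplies the vocabulary for each category. In the second, the vocabulary is derived from labeled examples, so adding a category only requires adding example documents.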

MRI Brain Tumor Image Analysis Using Fuzzy Rule based Approach

Journal of Research on the Lepidoptera ◽

10.36872/lepi/v50i2/201012 ◽

2019 ◽

Vol 50 (2) ◽

pp. 98-112 ◽

Cited By ~ 5

Author(s):

KALYAN KUMAR JENA ◽

SASMITA MISHRA ◽

SAROJANANDA MISHRA ◽

SOURAV KUMAR BHOI ◽

SOUMYA RANJAN NAYAK

Keyword(s):

Image Analysis ◽

Brain Tumor ◽

Fuzzy Rule ◽

Rule Based ◽

MRI Brain ◽

Rule Based Approach

Rule-based Approach to Semantic Resolution of Chinese Addresses

Geo-information Science ◽

10.3724/sp.j.1047.2010.00009 ◽

2010 ◽

Vol 12 (1) ◽

pp. 9-16 ◽

Cited By ~ 3

Author(s):

Xueying ZHANG ◽

Guonian LV ◽

Boqiu LI ◽

Wenjun CHEN

Keyword(s):

Rule Based ◽

Semantic Resolution ◽

Rule Based Approach

An Automatic Question Generation System using Rule-Based Approach in Bloom’s Taxonomy

Recent Advances in Computer Science and Communications ◽

10.2174/2213275912666191113143335 ◽

2019 ◽

Vol 13 ◽

Author(s):

G Deena ◽

K Raja ◽

K Kannan

Keyword(s):

Language Processing ◽

Learning Process ◽

Question Generation ◽

Test Question ◽

Rule Based ◽

Part Of Speech ◽

Core Idea ◽

Rule Based Approach ◽

Teaching Learning ◽

Automatic Question Generation

In this competitive world, education has become part of everyday life. Imparting knowledge to the learner through education is the core idea of the Teaching-Learning Process (TLP). Assessment is one way to identify the learner’s weak spots in the area under discussion, and assessment questions play a major role in judging the learner’s skill. Manually prepared questions are not assured of the quality and fairness needed to assess the learner’s cognitive skill. Question generation is the most important part of the teaching-learning process, and generating test questions is its toughest part. Methods: An Automatic Question Generation (AQG) system is proposed that dynamically generates assessment questions from an input file. Objective: The proposed system generates test questions that are mapped to Bloom’s taxonomy to determine the learner’s cognitive level. Cloze-type questions are generated using part-of-speech tags and a random function. Rule-based approaches and Natural Language Processing (NLP) techniques are used to generate procedural questions at the lowest cognitive levels of Bloom’s taxonomy. Analysis: The outputs are dynamic, producing a different set of questions at each execution. The input paragraphs are selected from the computer science domain, and output efficiency is measured using precision and recall.
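
A cloze question built from part-of-speech tags and a random choice, together with the precision and recall measures mentioned in this abstract, might look like the sketch below. The input is assumed to be already tagged (Penn Treebank style tags are used here as an assumption), blanking a noun is an illustrative choice, and the function names are hypothetical. The paper's actual rules are not given in the abstract.

```python
import random


def make_cloze(tagged_tokens, target_tags=("NN", "NNS", "NNP"), rng=None):
    """Blank out one randomly chosen token whose tag is in target_tags.

    tagged_tokens is a list of (word, tag) pairs. Returns (question, answer),
    or None when no token carries a target tag.
    """
    rng = rng or random.Random()
    positions = [i for i, (_, tag) in enumerate(tagged_tokens) if tag in target_tags]
    if not positions:
        return None
    blank = rng.choice(positions)  # the random function makes each run differ
    words = ["_____" if i == blank else w for i, (w, _) in enumerate(tagged_tokens)]
    return " ".join(words), tagged_tokens[blank][0]


def precision_recall(generated, relevant):
    """Precision and recall of generated questions against a reference set."""
    generated, relevant = set(generated), set(relevant)
    true_positives = len(generated & relevant)
    precision = true_positives / len(generated) if generated else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall
```

Passing a seeded generator such as `random.Random(0)` makes a run repeatable, while the default unseeded generator gives a different blank on each execution, which matches the dynamic behaviour the abstract describes.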