Gujarati Language
Recently Published Documents

TOTAL DOCUMENTS: 65 (FIVE YEARS: 35)
H-INDEX: 4 (FIVE YEARS: 1)

Author(s): Parita Shah, Priya Swaminarayan, Maitri Patel

Opinion analysis (sentiment analysis) is among the most important areas of natural language processing. It deals with representing text so that the intent of its source can be determined: the intent may be appreciation (positive) or criticism (negative). This paper compares the results obtained by applying classification using different classifiers, namely K-nearest neighbors and multinomial naive Bayes. These techniques are used to label a given review as either positive or negative. The data considered are the polarity movie-review datasets, and a comparison with previously published results is provided for a thorough evaluation. The paper investigates the influence of a word-level count vectorizer and of term frequency-inverse document frequency (TF-IDF) on movie sentiment analysis. We conclude that the multinomial naive Bayes (MNB) classifier generates more accurate results with the TF-IDF vectorizer than with the count vectorizer, while the K-nearest neighbors (KNN) classifier achieves the same accuracy with both.
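The comparison the abstract describes can be sketched with scikit-learn: cross the two vectorizers with the two classifiers and score each pipeline. This is a minimal sketch on invented toy reviews, not the paper's polarity movie dataset or its actual settings.

```python
# Cross CountVectorizer / TfidfVectorizer with MultinomialNB / KNN and
# score each pipeline. The toy reviews below are illustrative stand-ins
# for the polarity movie-review datasets used in the paper.
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

train_texts = [
    "a wonderful, moving film", "brilliant acting and a great story",
    "an awful, boring mess", "terrible plot and dull characters",
    "great direction and a moving score", "boring, dull and awful pacing",
]
train_labels = [1, 1, 0, 0, 1, 0]          # 1 = positive, 0 = negative
test_texts = ["a great and moving story", "a dull and terrible film"]
test_labels = [1, 0]

results = {}
for vec_name, make_vec in [("count", CountVectorizer), ("tfidf", TfidfVectorizer)]:
    for clf_name, make_clf in [("mnb", MultinomialNB),
                               ("knn", lambda: KNeighborsClassifier(n_neighbors=3))]:
        model = make_pipeline(make_vec(), make_clf()).fit(train_texts, train_labels)
        results[(vec_name, clf_name)] = model.score(test_texts, test_labels)

print(results)  # accuracy per (vectorizer, classifier) pair
```

On the real dataset, this is the grid whose accuracies the paper compares.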


2021, Vol 6 (4), pp. 52-56
Author(s): Hemang Jani, Gauravi Dhruva, Dinesh Sorani

Background: The Short Form 36 Item Survey is the most commonly used instrument for assessing health-related quality of life [1]. Two identical versions of the original instrument are currently available: the public-domain, license-free RAND-36 and the commercial SF-36 [2]. The RAND-36 is not available in the Gujarati language. The aim of this study was to translate and culturally adapt the RAND-36 into Gujarati and to measure its reliability and validity.
Methods: Following the guidelines of the International Quality of Life Assessment project, a sequence of translation, pilot testing, and validation, including a test of item-scale correlation, was implemented for the Gujarati version of the RAND-36. After pilot testing, the English and Gujarati versions of the RAND-36 were administered to a random sample of 120 apparently healthy individuals to test validity, and 96 respondents completed the Gujarati RAND-36 again after two weeks to test reliability. Data were analyzed using one-way analysis of variance, multi-trait scaling analysis, Pearson's product-moment correlation analysis, and the Intra-Class Correlation coefficient (ICC) at p < 0.05.
Results: The median Cronbach's alphas for the Gujarati RAND-36 in multiple subgroups exceeded 0.70 for every scale except one. Two of the English RAND-36 scales had median Cronbach's alphas that exceeded 0.70; the rest exceeded 0.50. Test-retest correlations were statistically significant for both versions. Product-moment correlations testing the equivalence of the corresponding Gujarati and English versions ranged from 0.73 to 0.92. The Gujarati RAND-36 showed high internal consistency (Cronbach's α = 0.809) and test-retest reliability (ICC = 0.746, 95% CI: 0.58, 0.94).
Conclusions: The Gujarati version of the RAND-36 performed well, and the findings suggest that it is a reliable and valid measure of health-related quality of life in the general Gujarati population.
Keywords: RAND-36, cross-cultural translation, quality of life, health status assessment, Gujarati.
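The internal-consistency statistic reported above, Cronbach's alpha, is straightforward to compute: alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). A minimal sketch on a made-up 4x3 score matrix (not the study's RAND-36 responses):

```python
# Cronbach's alpha from a respondents-by-items score matrix.
# The toy matrix is invented illustrative data.
from statistics import variance

def cronbach_alpha(scores):
    """scores: list of respondents, each a list of item scores."""
    k = len(scores[0])                          # number of items
    items = list(zip(*scores))                  # one tuple of scores per item
    item_vars = sum(variance(item) for item in items)
    total_var = variance([sum(resp) for resp in scores])
    return (k / (k - 1)) * (1 - item_vars / total_var)

toy = [[3, 4, 3], [4, 5, 4], [2, 3, 2], [5, 5, 5]]
alpha = cronbach_alpha(toy)
print(round(alpha, 2))   # 0.98 for this toy matrix
```

Values above 0.70, as reported for most Gujarati RAND-36 scales, are conventionally taken to indicate acceptable internal consistency.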


2021
Author(s): Charmi Jobanputra, Nihit Parikh, Vishwa Vora, Santosh Kumar Bharti

Author(s): Nasrin Aasofwala, Shanti Verma, Kalyani Patel

2021, Vol 13 (3), pp. 23-34
Author(s): Chandrakant D. Patel, Jayesh M. Patel

With the large quantity of information available online, it is essential to retrieve accurate information for a user query. A large amount of data is available in digital form in multiple languages. Various approaches try to increase the effectiveness of online information retrieval, but the standard approach to retrieving information for a user query is to search the documents in the corpus word by word against the query. This approach is very time-consuming, and it may miss many related documents that are equally important. To avoid these issues, stemming has been used extensively in Information Retrieval Systems (IRS) to increase retrieval accuracy across languages. This paper addresses the problem of stemming for web page categorization in the Gujarati language, deriving stem words using the GUJSTER algorithm [1]. The GUJSTER algorithm is based on morphological rules and derives the root or stem word from inflected words of the same class. In particular, we consider the influence of the extracted stem or root words on the integrity of web page classification using supervised machine learning algorithms. This research work focuses on the analysis of Web Page Categorization (WPC) for the Gujarati language and verifies the influence of a stemming algorithm in a WPC application, improving accuracy from 63% to 98% with supervised machine learning models and a standard 80/20 train/test split.
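A rule-based suffix stripper of the kind the abstract describes can be sketched in a few lines. The suffix list below is a small illustrative sample, not the paper's actual GUJSTER rule set, and the gloss on the example word is an assumption:

```python
# Longest-match suffix stripping, in the spirit of rule-based Gujarati
# stemmers. SUFFIXES is an illustrative sample, NOT the GUJSTER rules.
SUFFIXES = sorted(["ઓ", "ે", "ના", "ની", "નો", "માં", "થી"], key=len, reverse=True)

def stem(word, min_stem_len=2):
    """Strip the longest matching suffix, keeping at least min_stem_len chars."""
    for suf in SUFFIXES:                       # longest suffixes tried first
        if word.endswith(suf) and len(word) - len(suf) >= min_stem_len:
            return word[: -len(suf)]
    return word                                # no rule applies: word is its own stem

print(stem("છોકરાઓ"))   # છોકરા  (plural marker stripped)
```

In the WPC pipeline, such a stemmer runs over every token before feature extraction, so that inflected variants of a word map to one feature.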


Author(s): Stuti Mehta, Suman K. Mitra

Text classification is an extremely important area of Natural Language Processing (NLP). This paper studies various methods of embedding and classification for the Gujarati language. The dataset consists of Gujarati news headlines classified into various categories. Different embedding methods for Gujarati and various classifiers are used to classify the headlines into the given categories. Gujarati is a low-resource language that is not commonly worked on. This paper deals with one of the most important NLP tasks, classification, and alongside it gives an overview of embedding techniques for Gujarati, since these provide the feature extraction for classification. The paper first performs embedding to obtain a valid representation of the textual data and then applies existing robust classifiers to the embedded data. Additionally, the paper provides insight into how various NLP tasks can be performed on a low-resource language like Gujarati. Finally, it carries out a comparative analysis of the performance of existing embedding and classification methods to determine which combination gives the better outcome.
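The embedding-times-classifier comparison described above amounts to a grid of pipelines, each scored on the same headline data. This sketch uses invented English placeholder headlines and two simple embeddings (word-level and character-level TF-IDF, the latter a common choice for low-resource, morphologically rich languages); the paper's actual Gujarati dataset and embedding methods differ.

```python
# Cross several embeddings with several classifiers and collect scores.
# Headlines/categories are invented placeholders for the Gujarati news data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

headlines = [
    "team wins the cricket final", "captain scores a century",
    "new budget cuts taxes", "parliament passes finance bill",
    "star announces new film", "director wins film award",
]
labels = ["sports", "sports", "politics", "politics", "film", "film"]

embeddings = {
    "word-tfidf": lambda: TfidfVectorizer(),
    "char-tfidf": lambda: TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
}
classifiers = {"logreg": lambda: LogisticRegression(max_iter=1000),
               "svm": lambda: LinearSVC()}

scores = {}
for e_name, make_emb in embeddings.items():
    for c_name, make_clf in classifiers.items():
        model = make_pipeline(make_emb(), make_clf()).fit(headlines, labels)
        # training accuracy only, for illustration; a real study holds out a test set
        scores[(e_name, c_name)] = model.score(headlines, labels)
print(scores)
```

The cell with the highest held-out score identifies the winning embedding-classifier combination, which is exactly the comparative question the paper asks.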


Author(s): Uttam Chauhan, Apurva Shah

A topic model is one of the best stochastic models for summarizing an extensive collection of text and has achieved considerable success in text analysis and text summarization. It can be applied to a set of documents represented as bags of words, without considering grammar or word order. We modeled the topics of a Gujarati news-article corpus. Because Gujarati has a diverse morphological structure and is inflectionally rich, processing Gujarati text is more complex. The size of the vocabulary plays an important role in the inference process and in the quality of topics: as the vocabulary grows, inference becomes slower and topic semantic coherence decreases. If the vocabulary is reduced, the topic inference process can be accelerated, and the quality of topics may also improve. In this work, a list was prepared of suffixes that occur very frequently at the ends of words in Gujarati text, and inflectional forms were reduced to their root words according to the suffixes in this list. Moreover, Gujarati single-letter words were eliminated for faster inference and better topic quality. Experiments showed that reducing inflectional forms to their root words shrinks the vocabulary to a significant extent and makes the topic-formation process quicker. Furthermore, the reduction of inflectional forms together with single-letter word removal enhanced the interpretability of topics, which was assessed in terms of semantic coherence, word length, and topic size. The experimental results showed improvements in the topical semantic coherence score; topic size also grew notably as the number of tokens assigned to the topics increased.
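The preprocessing step the abstract credits with shrinking the vocabulary can be sketched directly: map inflected forms to roots via a suffix list, drop single-letter words, and compare vocabulary sizes before and after. The suffixes and tokens below are illustrative inventions, not the paper's rule list or corpus.

```python
# Vocabulary reduction before topic modeling: suffix-based root reduction
# plus single-letter word removal. Suffixes and tokens are toy examples.
SUFFIXES = sorted(["ઓ", "ના", "ની", "માં", "થી"], key=len, reverse=True)

def reduce_token(tok, min_stem_len=2):
    for suf in SUFFIXES:
        if tok.endswith(suf) and len(tok) - len(suf) >= min_stem_len:
            return tok[: -len(suf)]
    return tok

tokens = ["છોકરાઓ", "છોકરા", "શહેરમાં", "શહેર", "ન", "ઘરથી", "ઘર"]
vocab_before = set(tokens)
vocab_after = {reduce_token(t) for t in tokens if len(t) > 1}  # drop 1-letter words

print(len(vocab_before), len(vocab_after))   # 7 3 on this toy list
```

A smaller vocabulary means fewer word types for the topic sampler to assign, which is the mechanism behind the faster inference and higher coherence reported above.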

