GovdeTurk: A Novel Turkish Natural Language Processing Tool for Stemming, Morphological Labelling and Verb Negation

GovdeTurk is a tool for stemming, morphological labeling and verb negation for the Turkish language. We designed comprehensive finite automata to represent Turkish grammar rules. Based on these automata, GovdeTurk finds the stem of a word by removing its inflectional suffixes using a longest-match strategy. Levenshtein distance is used to correct spelling errors that may occur during suffix removal. Morphological labeling identifies the function of a given token. Nine different dictionaries are constructed, one for each specific word type, and are used in both stemming and morphological labeling. A verb negation module is developed for lexicon-based sentiment analysis. GovdeTurk is tested on a dataset of one million words, and the results are compared with Zemberek and the Turkish Snowball algorithm. While the closest competitor, Zemberek, achieves a stemming accuracy of 80%, GovdeTurk achieves 97.3%. The morphological labeling accuracy of GovdeTurk is 93.6%. With these results, our model outperforms its competitors.
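
As an illustration of the two ideas this abstract names, longest-match suffix removal and Levenshtein-based correction, here is a minimal Python sketch. It is not GovdeTurk's implementation: a flat suffix list stands in for the paper's finite automata, and the suffix list, the tiny dictionary, and all function names are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) by dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,           # delete ca
                            curr[j - 1] + 1,       # insert cb
                            prev[j - 1] + cost))   # substitute or match
        prev = curr
    return prev[-1]


def strip_longest_suffix(word: str, suffixes: list[str]) -> str:
    """Remove the longest listed suffix that the word ends with (hypothetical helper)."""
    matches = [s for s in suffixes if word.endswith(s) and len(word) > len(s)]
    if not matches:
        return word
    return word[: -len(max(matches, key=len))]


def closest_entry(candidate: str, dictionary: list[str]) -> str:
    """Pick the dictionary entry with the smallest edit distance to the candidate."""
    return min(dictionary, key=lambda entry: levenshtein(candidate, entry))


# Assumed toy data, not taken from the paper.
SUFFIXES = ["ı", "i", "de", "ler", "lerde"]
DICTIONARY = ["kitap", "ev", "kalem"]

raw_stem = strip_longest_suffix("kitabı", SUFFIXES)   # "kitab" (p softened to b)
print(closest_entry(raw_stem, DICTIONARY))            # prints "kitap"
print(strip_longest_suffix("evlerde", SUFFIXES))      # prints "ev"
```

The example shows why a correction step is useful: stripping the suffix from "kitabı" leaves "kitab", which differs by one character from the dictionary form "kitap".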

Natural language processing for analysis of student online sentiment in a postgraduate program

Pacific Journal of Technology Enhanced Learning ◽

10.24135/pjtel.v2i2.4 ◽

2020 ◽

Vol 2 (2) ◽

pp. 15-30

Author(s):

Truc D Pham ◽

Darcy Vo ◽

Frank Li ◽

Karen Baker ◽

Binglan Han ◽

...

Keyword(s):

Social Media ◽

Natural Language Processing ◽

Natural Language ◽

Sentiment Analysis ◽

Language Processing ◽

Learning Experience ◽

Research Question ◽

The Social ◽

Natural Language Processing Tool ◽

Course Experience

Higher education institutes are continually looking for new and better ways to support and understand the learning experience of their students. One possible option is to use sentiment analysis tools to investigate the attitudes and emotions of students when they are interacting on social media about their course experience. In this study, we analysed the social media posts, from a closed programme-based community, of more than 300 students in a single programme cohort by processing the dataset with the Google cloud-based Natural Language Processing API for sentiment analysis. The sentiment scores and magnitudes were then visualised to help explore the research question ‘How does a natural language processing tool help analyse student online sentiment in a postgraduate program?’ The results have provided a better understanding of students’ online sentiment relating to the activities and assessments of the programme as well as the variation of that sentiment over the timeline of the programme.
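
To make the idea of tracking sentiment over a programme timeline concrete, here is a small Python sketch. It assumes each post has already been scored by a sentiment service that returns a score in [-1, 1] and a non-negative magnitude, as the Google Cloud Natural Language API does. The week numbers, sample values, and the function name `weekly_sentiment` are hypothetical, and the study's actual aggregation may differ.

```python
from collections import defaultdict
from statistics import mean


def weekly_sentiment(posts):
    """Average score and magnitude per week.

    posts: iterable of (week_number, score, magnitude) tuples (assumed layout).
    Returns {week: (mean_score, mean_magnitude)} ordered by week.
    """
    by_week = defaultdict(list)
    for week, score, magnitude in posts:
        by_week[week].append((score, magnitude))
    return {
        week: (mean(s for s, _ in values), mean(m for _, m in values))
        for week, values in sorted(by_week.items())
    }


# Made-up values for illustration only.
sample = [(1, 0.5, 1.0), (1, -0.5, 3.0), (2, 0.8, 0.8)]
print(weekly_sentiment(sample))
```

A week with a mean score near zero but a high mean magnitude suggests mixed rather than neutral sentiment, which is why both values are kept.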

VADER Natural Language Processing in Market Sentiment Analysis

SSRN Electronic Journal ◽

10.2139/ssrn.3676706 ◽

2020 ◽

Author(s):

Jonathan Seror

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Sentiment Analysis ◽

Language Processing ◽

Market Sentiment

Sentiment Analysis Techniques Applied to Raw-Text Data from a CSQ-8 Questionnaire about Mindfulness in Times of COVID-19 to Improve Strategy Generation

International Journal of Environmental Research and Public Health ◽

10.3390/ijerph18126408 ◽

2021 ◽

Vol 18 (12) ◽

pp. 6408

Author(s):

Mario Jojoa Acosta ◽

Gema Castillo-Sánchez ◽

Begonya Garcia-Zapirain ◽

Isabel de la Torre Díez ◽

Manuel Franco-Martín

Keyword(s):

Health Care ◽

Natural Language Processing ◽

Natural Language ◽

Sentiment Analysis ◽

Transfer Learning ◽

Language Processing ◽

Health Care Professionals ◽

Ground Truth ◽

Relevant Information ◽

Free Text

The use of artificial intelligence in health care has grown quickly. In this context, we present our work on applying Natural Language Processing (NLP) techniques as a tool to analyze the sentiment of users who answered two questions from the CSQ-8 questionnaire in raw Spanish free text. Their responses relate to mindfulness, a novel technique used to control stress and anxiety caused by different factors in daily life. We proposed an online course in which this method was applied in order to improve the quality of life of health care professionals during the COVID-19 pandemic. We also evaluated the satisfaction level of the participants, with a view to establishing strategies to improve future experiences. To perform this task automatically, we used NLP models such as Swivel embeddings, neural networks, and transfer learning to classify the inputs into three categories: negative, neutral, and positive. Because of the limited amount of data available (86 records for the first question and 68 for the second), transfer learning techniques were required. Users could write responses of any length, and our approach attained maximum accuracies of 93.02% and 90.53%, respectively, based on ground truth labeled by three experts. Finally, we proposed a complementary analysis, using a graphical text representation based on word frequency, to help researchers identify relevant information about the opinions with an objective approach to sentiment. The main conclusion drawn from this work is that applying NLP techniques to small amounts of data using transfer learning can achieve sufficient accuracy in sentiment analysis and text classification.

Creation of a simple natural language processing tool to support an imaging utilization quality dashboard

International Journal of Medical Informatics ◽

10.1016/j.ijmedinf.2017.02.011 ◽

2017 ◽

Vol 101 ◽

pp. 93-99 ◽

Cited By ~ 10

Author(s):

Jordan Swartz ◽

Christian Koziatek ◽

Jason Theobald ◽

Silas Smith ◽

Eduardo Iturrate

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Natural Language Processing Tool ◽

Imaging Utilization

Machine-learning as a validated tool to characterize individual differences in free recall of naturalistic events.

10.31234/osf.io/uygzv ◽

2021 ◽

Author(s):

Xinxu Shen ◽

Troy Houser ◽

David Victor Smith ◽

Vishnu P. Murty

Keyword(s):

Machine Learning ◽

Natural Language Processing ◽

Natural Language ◽

Individual Difference ◽

Language Processing ◽

Large Scale ◽

High Reliability ◽

Difference Analysis ◽

Universal Sentence ◽

Natural Language Processing Tool

The use of naturalistic stimuli, such as narrative movies, is gaining popularity in many fields that characterize memory, affect, and decision-making. Narrative recall paradigms are often used to capture the complexity and richness of memory for naturalistic events. However, scoring narrative recalls is time-consuming and prone to human biases. Here, we show the validity and reliability of using a natural language processing tool, the Universal Sentence Encoder (USE), to automatically score narrative recall. We compared the reliability of scoring between two independent raters (i.e., hand-scored) and between our automated algorithm and individual raters (i.e., automated) on trial-unique video clips of magic tricks. Study 1 showed that our automated segmentation approaches yielded high reliability and reflected the measures yielded by hand-scoring, and further that the results using USE outperformed another popular natural language processing tool, GloVe. In Study 2, we tested whether our automated approach remained valid when testing individuals varying on clinically relevant dimensions that influence episodic memory: age and anxiety. We found that our automated approach was equally reliable across both age groups and anxiety groups, which shows the efficacy of our approach for assessing narrative recall in large-scale individual difference analysis. In sum, these findings suggest that machine learning approaches implementing USE are a promising tool for scoring large-scale narrative recalls and performing individual difference analysis in research using naturalistic stimuli.
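
One common way to score text with sentence embeddings is to compare vectors by cosine similarity. The Python sketch below illustrates that general idea only. The abstract does not state which similarity measure the authors used, the short vectors are placeholders rather than real USE embeddings, and the function names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def score_recall(recall_vec, event_vecs):
    """Return (index, similarity) of the event description closest to the recall."""
    sims = [cosine_similarity(recall_vec, event) for event in event_vecs]
    best = max(range(len(sims)), key=sims.__getitem__)
    return best, sims[best]


# Placeholder 2-d vectors standing in for sentence embeddings.
events = [[0.0, 1.0], [1.0, 0.0]]
recall = [1.0, 0.1]
index, similarity = score_recall(recall, events)
print(index, round(similarity, 3))
```

In a real pipeline the vectors would come from an encoder applied to each recalled sentence and each reference event description, and a threshold on the similarity could decide whether the event counts as recalled.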

Coarse-Grained Sentiment Analysis Berbasis Natural Language Processing – Ulasan Hotel [Coarse-Grained Sentiment Analysis Based on Natural Language Processing: Hotel Reviews]

Jurnal Nasional Teknik Elektro dan Teknologi Informasi (JNTETI) ◽

10.22146/jnteti.v10i1.548 ◽

2021 ◽

Vol 10 (1) ◽

pp. 41-48

Author(s):

Warnia Nengsih ◽

M. Mahrus Zein ◽

Nazifa Hayati

Keyword(s):

Natural Language Processing ◽

Random Forest ◽

Natural Language ◽

Receiver Operating Characteristic ◽

Sentiment Analysis ◽

Language Processing ◽

Operating Characteristic ◽

Coarse Grained ◽

Receiver Operating

Sentiment analysis is a method for obtaining data from the various platforms available on the internet. Advances in technology allow machines to recognize whether a term expresses a positive opinion or the opposite. Such data and opinions play an important role as feedback on products, services, and other topics. Without having to gather opinions directly from the public, providers obtain evaluations that are important for their own development. The hotel business is a field concerned with providing services to customers. The sustainability of this business also depends on customer feedback, which serves as a reference for strategic policy-making. Sentiment analysis techniques based on Natural Language Processing can address this problem. In this paper, prediction is performed with a Random Forest (RF) classifier, while the Receiver Operating Characteristic (ROC) curve is used to summarize classifier quality. The higher the curve lies above the diagonal line, the better the prediction; the ROC curve value obtained is 0.90. The review results show that customer opinions of the hotel's services fall into the positive category more often than the negative category. In terms of polarity, 68% of customer reviews fall in the positive region and 32% in the negative region.
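
The ROC value of 0.90 reported in this abstract is presumably the area under the ROC curve (AUC), although the abstract does not say so explicitly. As a worked illustration of what that number means, the Python sketch below computes AUC as the fraction of positive/negative pairs in which the positive example receives the higher score, with ties counted as half. The labels and scores are made up and are not the paper's data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for positive, 0 for negative.
    scores: classifier scores, higher meaning "more likely positive".
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # tie counts as half
    return wins / (len(positives) * len(negatives))


# Made-up example: three of the four positive/negative pairs are ordered correctly.
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]))  # prints 0.75
```

A value of 0.5 corresponds to the diagonal line (random ranking) and 1.0 to perfect separation, which matches the abstract's remark that a curve further above the diagonal indicates better prediction.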

Sentiment Analysis on Twitter Airline Data

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.35807 ◽

2021 ◽

Vol 9 (VI) ◽

pp. 3767-3770

Author(s):

Kirti Jain

Keyword(s):

Machine Learning ◽

Natural Language Processing ◽

Natural Language ◽

Sentiment Analysis ◽

Language Processing ◽

Learning Task ◽

Model Based ◽

Sentiment Mining ◽

General Opinion

Sentiment analysis, also known as sentiment mining, is a machine learning subtask in which we want to determine the overall sentiment of a particular document. With machine learning and natural language processing (NLP), we can extract the information in a text and try to classify it as positive, neutral, or negative according to its polarity. In this project, we classify tweets into positive, negative, and neutral sentiments by building a model based on probabilities. Twitter is a blogging website where people can quickly and spontaneously share their feelings by sending tweets limited to 140 characters. Because of the way it is used, Twitter is a perfect source of data for getting the latest general opinion on anything.
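
The abstract does not name the probabilistic model it uses. One common choice for three-class tweet classification is multinomial naive Bayes, sketched below in Python as an assumption rather than as the author's method. The class name, the whitespace tokenization, and the three training sentences are all hypothetical.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesSentiment:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_logp = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            logp = math.log(n_docs / total_docs)           # class prior
            for word in text.lower().split():
                if word in self.vocab:                     # skip unseen words
                    logp += math.log((counts[word] + 1) / denom)
            if logp > best_logp:
                best_label, best_logp = label, logp
        return best_label


# Made-up training data for illustration only.
model = NaiveBayesSentiment().fit(
    ["great flight friendly crew",
     "terrible delay lost luggage",
     "flight departs at noon"],
    ["positive", "negative", "neutral"],
)
print(model.predict("friendly crew"))   # prints "positive"
```

Log probabilities are summed instead of multiplying raw probabilities so that longer texts do not underflow to zero.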

EMOSIS Sentiment Analysis on Tweets with Emotion and Intensity Level Recognition Considering Ending Punctuation Marks

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.d4518.118419 ◽

2019 ◽

Vol 8 (4) ◽

pp. 10289-10293

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Emotion Recognition ◽

Sentiment Analysis ◽

Language Processing ◽

Significant Role ◽

Language Model ◽

Intensity Level ◽

Processing Stage ◽

Overall Performance

Sentiment Analysis is a tool used for determining the polarity or emotion of a sentence. It is a field of Natural Language Processing that focuses on the study of opinions. In this study, the researchers addressed one key challenge in Sentiment Analysis, which is to consider the ending punctuation marks present in a sentence. Ending punctuation marks play a significant role in Emotion Recognition and Intensity Level Recognition. The research made use of tweets expressing opinions about Philippine President Rodrigo Duterte. These downloaded tweets served as the inputs. They were first subjected to a pre-processing stage to prepare the sentences for processing. A Language Model was created to serve as the classifier for determining the scores of the tweets. The scores give the polarity of the sentence. Accuracy is very important in sentiment analysis. To increase the chance of correctly identifying the polarity of the tweets, the input underwent Intensity Level Recognition, which identifies the intensifiers and negations within the sentences. The system was evaluated with an overall performance of 80.27%.

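Análise de discursos em notícias sobre homofobia, racismo e sexismo em comentários de portais brasileiros de notícias [Discourse analysis of news about homophobia, racism and sexism in comments on Brazilian news portals]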

10.14210/cotb.v12.p467-474 ◽

2021 ◽

Author(s):

Lucas Rodrigues ◽

Antonio Jacob Junior ◽

Fábio Lobato

Keyword(s):

Social Media ◽

Natural Language Processing ◽

Sentiment Analysis ◽

Data Visualization ◽

Language Processing ◽

Topic Modeling ◽

Hate Speech ◽

Psychological Impact ◽

Internet Service ◽

General Law

Posts with defamatory content or hate speech are constantly found on social media. The results for readers are numerous, not restricted only to the psychological impact, but also to the growth of this social phenomenon. With the General Law on the Protection of Personal Data and the Marco Civil da Internet, service providers became responsible for the content in their platforms. Considering the importance of this issue, this paper aims to analyze the content published (news and comments) on the G1 News Portal with techniques based on data visualization and Natural Language Processing, such as sentiment analysis and topic modeling. The results show that even with most of the comments being neutral or negative and classified or not as hate speech, the majority of them were accepted by the users.

Generalized approach to sentiment analysis of short text messages in natural language processing

Information and Control Systems ◽

10.31799/1684-8853-2020-1-2-14 ◽

2020 ◽

pp. 2-14

Author(s):

Evrenii Polyakov ◽

Leonid Voskov ◽

Pavel Abramov ◽

Sergey Polyakov

Keyword(s):

Machine Learning ◽

Natural Language Processing ◽

Data Analysis ◽

Natural Language ◽

Sentiment Analysis ◽

Language Processing ◽

Full Range ◽

Text Messages ◽

Basic Solution ◽

Short Text

Introduction: Sentiment analysis is a complex problem whose solution essentially depends on the context, field of study and amount of text data. Analysis of publications shows that the authors often do not use the full range of possible data transformations and their combinations. Only a part of the transformations is used, limiting the ways to develop high-quality classification models. Purpose: Developing and exploring a generalized approach to building a model, which consists in sequentially passing through the stages of exploratory data analysis, obtaining a basic solution, vectorization, preprocessing, hyperparameter optimization, and modeling. Results: Comparative experiments conducted using a generalized approach for classical machine learning and deep learning algorithms in order to solve the problem of sentiment analysis of short text messages in natural language processing have demonstrated that the classification quality grows from one stage to another. For classical algorithms, such an increase in quality was insignificant, but for deep learning, it was 8% on average at each stage. Additional studies have shown that the use of automatic machine learning which uses classical classification algorithms is comparable in quality to manual model development; however, it takes much longer. The use of transfer learning has a small but positive effect on the classification quality. Practical relevance: The proposed sequential approach can significantly improve the quality of models under development in natural language processing problems.
