Lexicon Based Sentiment Analysis in Indonesia Languages: A Systematic Literature Review

2021 ◽  
Vol 1 (1) ◽  
pp. 363-367
Author(s):  
Yuli Fauziah ◽  
Bambang Yuwono ◽  
Agus Sasmito Aribowo

This systematic literature review (SLR) examines trends in lexicon-based sentiment analysis research on the Indonesian language over the last two years. It focuses on the preprocessing used in these studies, the lexicons they rely on, and the classification accuracy they report. The main question of the SLR is which lexicon-based sentiment analysis techniques provide the highest accuracy. The most widely used preprocessing methods in previous research are tokenization, case conversion, stemming, punctuation removal, stop-word removal, removal or replacement of emoji and emoticons, and normalization (slang-word conversion). In previous studies, sentiment labels were assigned by comparing the number of negative sentiment keywords with the number of positive sentiment keywords in a sentence. The maximum accuracy reported in previous studies is 90%. The most widely used lexicons are NRC and Inset, a lexicon dictionary for Indonesian. These findings can be used to propose a better model for lexicon-based sentiment analysis in Indonesian.
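
The labeling rule described in this abstract (count positive and negative keywords in a sentence and compare) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tiny word lists, the slang map, and the function names `preprocess` and `label` are hypothetical, not taken from the reviewed studies, and stemming and emoji handling are left out for brevity.

```python
import re

# Hypothetical toy resources; a real study would load a full lexicon
# such as Inset or NRC instead of these few illustrative words.
POSITIVE = {"bagus", "senang", "suka"}
NEGATIVE = {"buruk", "kecewa", "benci"}
STOP_WORDS = {"yang", "dan", "di", "ini"}
SLANG = {"bgs": "bagus", "gk": "tidak"}  # slang word -> normalized form


def preprocess(text: str) -> list[str]:
    """Case conversion, punctuation removal, tokenization,
    slang normalization, and stop-word removal, in that order."""
    text = text.lower()
    text = re.sub(r"[^\w\s]", " ", text)        # replace punctuation with spaces
    tokens = text.split()                       # whitespace tokenization
    tokens = [SLANG.get(t, t) for t in tokens]  # normalize slang words
    return [t for t in tokens if t not in STOP_WORDS]


def label(text: str) -> str:
    """Label a sentence by comparing positive and negative keyword counts."""
    tokens = preprocess(text)
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"  # ties, including sentences with no lexicon hits
```

Treating ties as neutral is a design choice made for this sketch; the abstract does not say how the reviewed studies handle equal counts.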

2021 ◽  
pp. 097215092098485
Author(s):  
Sonika Gupta ◽  
Sushil Kumar Mehta

Data mining techniques have proven quite effective not only in detecting financial statement fraud but also in discovering other financial crimes, such as credit card fraud, loan and security fraud, corporate fraud, and bank and insurance fraud. In recent years, classification techniques from data mining have been accepted as one of the most credible methodologies for detecting symptoms of financial statement fraud by scanning the published financial statements of companies. The retrieved literature that has used data mining classification techniques can be broadly categorized, by the type of technique applied, into statistical techniques and machine learning techniques. The biggest challenge in executing the classification process lies in collecting the data sample of fraudulent companies and mapping it against non-fraudulent companies. In this article, a systematic literature review (SLR) of studies on financial statement fraud detection has been conducted. The review considered research articles published between 1995 and 2020. Further, a meta-analysis was performed to establish the effect of mapping the sample of fraudulent companies against non-fraudulent companies on the classification methods, by comparing the overall classification accuracy reported in the literature. The retrieved literature indicates that a fraudulent sample can either be paired equally with a non-fraudulent sample (1:1 data mapping) or be mapped unequally using a 1:many ratio to increase the sample size proportionally. Based on the meta-analysis, it can be concluded that machine learning approaches can achieve better classification accuracy than statistical approaches, particularly when little sample data is available. High classification accuracy can be obtained even with a 1:1 mapped data set using machine learning classification approaches.
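
To make the 1:1 versus 1:many mapping concrete, here is a minimal Python sketch. It pairs each fraudulent firm with a fixed number of non-fraudulent firms chosen at random; the random matching, the function name `map_samples`, and its parameters are assumptions made for illustration, since the abstract does not say how the reviewed studies selected their matches.

```python
import random


def map_samples(fraud, non_fraud, ratio=1, seed=0):
    """Pair each fraudulent firm with `ratio` distinct non-fraudulent firms.

    ratio=1 gives a 1:1 mapping; ratio>1 gives a 1:many mapping.
    Matches are drawn at random without reuse (an illustrative assumption).
    """
    if len(non_fraud) < ratio * len(fraud):
        raise ValueError("not enough non-fraudulent firms for this ratio")
    rng = random.Random(seed)  # seeded so the mapping is reproducible
    pool = list(non_fraud)
    rng.shuffle(pool)
    # Slice the shuffled pool into consecutive groups of size `ratio`.
    return {
        firm: pool[i * ratio:(i + 1) * ratio]
        for i, firm in enumerate(fraud)
    }
```

With two fraudulent firms and `ratio=2`, the resulting data set holds two fraudulent and four non-fraudulent observations, which is how a 1:many mapping enlarges the sample.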


Author(s):  
Karen Mite-Baidal ◽  
Carlota Delgado-Vera ◽  
Evelyn Solís-Avilés ◽  
Ana Herrera Espinoza ◽  
Jenny Ortiz-Zambrano ◽  
...  

SISTEMASI ◽  
2022 ◽  
Vol 11 (1) ◽  
pp. 197
Author(s):  
Okta Fanny ◽  
Heri Suroyo

The research concludes that sentiment analysis can be used to gauge public sentiment, in particular that of Twitter users, towards the Omnibus Law. The analysis found that neutral sentiment had the largest share at 55%, followed by positive sentiment at 35% and negative sentiment at 10%. With the Naïve Bayes Classifier, classification tests on the pro hashtags gave an average accuracy of 92.1%, an average precision of 94.8%, and an average recall of 90.7%. For the con hashtags ("Hashtag Counter"), average accuracy was 98.3%, average precision 97.6%, and average recall 98.7%. In the text cloud analysis of the combined pro and con hashtags, the dominant word was "Omnibus Law", which indicates that all of the scraped hashtags were indeed discussing the main topic, the Omnibus Law.
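
For readers unfamiliar with the three metrics reported here, the following Python sketch shows how accuracy, precision, and recall are conventionally computed from true and predicted labels. The function name and the sample labels are hypothetical, and the abstract does not state how its averages were obtained, so this illustrates the standard definitions rather than the authors' procedure.

```python
def classification_metrics(y_true, y_pred, positive_label):
    """Return (accuracy, precision, recall), treating one class as positive."""
    pairs = list(zip(y_true, y_pred))
    if not pairs:
        raise ValueError("no predictions to evaluate")
    # True positives, false positives, false negatives for the chosen class.
    tp = sum(t == positive_label and p == positive_label for t, p in pairs)
    fp = sum(t != positive_label and p == positive_label for t, p in pairs)
    fn = sum(t == positive_label and p != positive_label for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0  # guard against 0/0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall
```

For example, with true labels `["positive", "positive", "negative", "negative"]` and predictions `["positive", "negative", "negative", "negative"]`, the function returns an accuracy of 0.75, a precision of 1.0, and a recall of 0.5 for the positive class.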


2020 ◽  
Vol 79 (11) ◽  
pp. 1432-1437 ◽  
Author(s):  
Chanakya Sharma ◽  
Samuel Whittle ◽  
Pari Delir Haghighi ◽  
Frada Burstein ◽  
Roee Sa'adon ◽  
...  

Objectives: We hypothesise that patients have a positive sentiment regarding biological/targeted synthetic disease modifying anti-rheumatic drugs (b/tsDMARDs) and a negative sentiment towards conventional synthetic agents (csDMARDs). We analysed discussions on social media platforms regarding DMARDs to understand the collective sentiment expressed towards these medications. Methods: Treato analytics were used to download all available posts on social media about DMARDs in the context of rheumatoid arthritis. Strict filters ensured that user-generated content was downloaded. The sentiment (positive or negative) expressed in these posts was analysed for each DMARD using sentiment analysis. We also analysed the reason(s) for this sentiment for each DMARD, looking specifically at efficacy and side effects. Results: Computer algorithms analysed millions of social media posts and included 54 742 posts about DMARDs. We found that both classes had an overall positive sentiment. The ratio of positive to negative posts was higher for b/tsDMARDs (1.210) than for csDMARDs (1.048). Efficacy was the most commonly mentioned reason in posts with a positive sentiment, and lack of efficacy was the most commonly mentioned reason for a negative sentiment. These were followed by the presence/absence of side effects in negative or positive posts, respectively. Conclusions: Public opinion on social media is generally positive about DMARDs. Lack of efficacy followed by side effects were the most common themes in posts with a negative sentiment. There are clear reasons why a DMARD generates a positive or negative sentiment. As sentiment analysis technology becomes more refined, targeted studies could be done to analyse these reasons and allow clinicians to tailor DMARDs to match patient needs.


2019 ◽  
Vol 2 (2) ◽  
pp. 1-2
Author(s):  
Haniya Ahmed ◽  
Kenny Wong

The purpose of the project is to identify common difficulties that learners face and to understand their emotions as they progress through MOOCs (Massive Open Online Courses). The research uses data from ten different Coursera courses, from which pieces of text written by students are extracted. These texts are then sent to the Google Cloud Natural Language API, which returns a sentiment analysis of a text. The main goal is to help instructors monitor MOOCs and improve their courses so that students can progress more easily. To achieve this, the first step is to gather the data from each course and load it into a single database. The work is done in PyCharm, using Python and SQL to load the data. Once the database is created, code selects only the pieces of information that are needed: the texts in which students make comments or ask questions. Next, the database is queried and these texts are sent to the Google Cloud Natural Language API. There, the program breaks the sentences down into individual words and categorizes each word according to whether its connotation is positive, negative or neutral. The words are then sorted by connotation, and the overall sentiment is the category with the highest count. If positives and negatives balance out, the sentiment is neutral. Sentiment scores range from -1 to 1, where -1 is the most negative, 1 is the most positive, and values near 0 are neutral. Positive sentiment scores indicate to instructors that students are doing well in the course, and neutral scores indicate that the course balances difficult and easy tasks. Negative sentiment, however, is the most important to instructors, since it indicates that students are struggling and that the course needs improvement.
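
The score range described in this abstract (-1 to 1, with values near 0 read as neutral) can be turned into labels with a simple threshold, as in the Python sketch below. The neutral band of 0.25 and the function names are hypothetical choices for illustration; the abstract gives no cut-off, and the sketch does not call the Google Cloud Natural Language API itself.

```python
from collections import Counter

NEUTRAL_BAND = 0.25  # hypothetical cut-off; the abstract only says "near 0"


def label_from_score(score, neutral_band=NEUTRAL_BAND):
    """Map a sentiment score in [-1, 1] to a coarse label."""
    if not -1.0 <= score <= 1.0:
        raise ValueError("score must lie between -1 and 1")
    if score > neutral_band:
        return "positive"
    if score < -neutral_band:
        return "negative"
    return "neutral"


def summarize(scores):
    """Count how many student comments fall under each label."""
    return Counter(label_from_score(s) for s in scores)
```

An instructor-facing report could then flag a course when the count of negative comments is high, which matches the abstract's point that negative sentiment matters most.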


2019 ◽  
Vol 6 (1) ◽  
pp. 20-34 ◽  
Author(s):  
Aam Slamet Rusydiana ◽  
Irman Firmansyah ◽  
Lina Marlina

Research on public sentiment towards the presence of microtakaful in a country is important for understanding the public response to it. This study aimed to determine public sentiment towards microtakaful in Indonesia and Malaysia. Data were collected from 40 articles, journals and other writings, and were analyzed using the text analytics software Semantria. The results showed that, for microtakaful in Indonesia, 52% of the community showed positive sentiment, 28% negative sentiment and 20% neutral sentiment. In Malaysia, 62% showed positive sentiment, 23% negative sentiment and 15% neutral sentiment.


IEEE Access ◽  
2021 ◽  
pp. 1-1
Author(s):  
Ruba Obiedat ◽  
Duha Al-Darras ◽  
Esra Alzaghoul ◽  
Osama Harfoushi
