negative class
Recently Published Documents

Saurabh R. Sangwan ◽  
M. P. S. Bhatia

Cyberspace has been recognized as a conducive environment for various hostile, direct, and indirect behavioural tactics that target individuals or groups. Denigration is one of the most frequently used cyberbullying ploys: it actively damages, humiliates, and disparages the online reputation of a target by sending, posting, or publishing cruel rumours, gossip, and untrue statements. Previous studies primarily report detecting profane, vulgar, and offensive words in the English language. This research puts forward a model to detect online denigration bullying in the low-resource Hindi language using attention residual networks. The proposed model, Hindi Denigrate Comment–Attention Residual Network (HDC-ARN), is intended to uncover defamatory posts (denigrate comments) written in Hindi that publicly vilify a person or an entity. A dataset of 942 denigrate comments and 1499 non-denigrate comments was scraped using certain hashtags from two recent trending events in India: the Tablighi Jamaat Covid-19 spike (April 2020, Event 1) and the death of Sushant Singh Rajput (June 2020, Event 2). Only text-based features, that is, the actual content of the post, are considered. Pre-trained fastText word embeddings for Hindi are used. The model has three ResNet blocks with an attention layer that generates a post vector for a single input; this vector is passed through a sigmoid activation function to produce the final output, either denigrate (positive class) or non-denigrate (negative class). An F1 score of 0.642 is achieved on the dataset.
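The final step described above (a sigmoid output mapped to the positive or negative class, then scored with F1) can be sketched as follows. This is a minimal illustration, not the HDC-ARN implementation: the 0.5 threshold, the function names, and the toy labels are assumptions, since the abstract does not state them.

```python
import math


def sigmoid(z: float) -> float:
    """Map a raw model score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def to_label(score: float, threshold: float = 0.5) -> int:
    """Return 1 (denigrate, positive class) or 0 (non-denigrate, negative class).

    The 0.5 threshold is an assumption; the abstract does not give one.
    """
    return 1 if sigmoid(score) >= threshold else 0


def f1_score(y_true: list[int], y_pred: list[int]) -> float:
    """F1 for the positive class: harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example with made-up raw scores and labels (not data from the paper).
raw_scores = [2.0, -1.5, 0.3, -0.2]
true_labels = [1, 0, 0, 1]
predicted = [to_label(s) for s in raw_scores]
print(predicted, f1_score(true_labels, predicted))
```

Because F1 ignores true negatives, it is a common choice when the positive class is the minority, as it is in this dataset (942 positive versus 1499 negative comments).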

2021 ◽  
Vol 8 (1) ◽  
pp. 2
Chanunya Loraksa ◽  
Sirima Mongkolsomlit ◽  
Nitikarn Nimsuk ◽  
Meenut Uscharapong ◽  
Piya Kiatisevi

Osteosarcoma is a rare bone cancer that is more common in children than in adults and has a high chance of metastasizing to the patient's lungs. In its initial stages, the disease is difficult to diagnose, and nodules in the lung are hard to detect early. Convolutional Neural Networks (CNNs) have been applied effectively to early-stage detection from CT-scanned images. Transferring patients from small hospitals to the specialized cancer hospital, Lerdsin Hospital, poses difficulties in information sharing because of privacy and safety regulations. CD-ROM media was allowed for transferring patients' data to Lerdsin Hospital. Digital Imaging and Communications in Medicine (DICOM) files cannot be stored on a CD-ROM, so DICOM must be converted into other common image formats, such as BMP, JPG and PNG. Image quality can affect the accuracy of CNN models. In this research, the effect of different image formats is studied experimentally. Three popular medical CNN models, VGG-16, ResNet-50 and MobileNet-V2, are used for osteosarcoma detection. The positive and negative class images are collected from Lerdsin Hospital; 80% of all images are used as a training dataset, while the rest are used to validate the trained models. Limited training data is simulated by reducing the number of images in the training dataset. Each model is trained and validated on three different image formats, resulting in 54 test cases. F1-score and accuracy are calculated to compare the models' performance. VGG-16 is the most robust model across all formats. PNG is the most preferred image format, followed by BMP and JPG, respectively.

2021 ◽  
Vol 4 (1) ◽  
pp. 17-22
Zetta Nillawati Reyka Putri ◽  
Muhammad Muhajir

At the end of 2020, Habib Rizieq's return to Indonesia drew public criticism for causing crowds during the Covid-19 pandemic. News and opinions about Habib Rizieq filled internet platforms, including Twitter. This study classifies Twitter opinion text about Habib Rizieq's return into positive and negative sentiments using the Support Vector Machine (SVM) method. Because the opinion data comes from Twitter, it is analyzed by text mining after a preprocessing stage. SVM classification on the unbalanced positive and negative classes resulted in 95.06% accuracy. For the negative class, precision was 84%, higher than its recall of 72%; for the positive class, precision was 96%, two percentage points below its recall of 98%. SVM classification with the oversampling method reached 100% accuracy, precision, and recall. The positive sentiments indicate that the public continues to support Rizieq and wants his freedom, while the negative sentiments show that many people are disappointed with Rizieq over the lies about his swab test results.
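The abstract does not say which oversampling variant was used. As one common possibility, random oversampling duplicates minority-class examples until every class matches the size of the largest one. The sketch below assumes that variant; the function name `random_oversample` and the toy tweets are hypothetical.

```python
import random
from collections import Counter


def random_oversample(samples: list[str], labels: list[str], seed: int = 0):
    """Duplicate randomly chosen minority-class samples until classes are balanced.

    Returns new lists; the inputs are left unchanged.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    counts = Counter(labels)
    target = max(counts.values())  # size of the largest class

    out_samples, out_labels = list(samples), list(labels)
    for label, count in counts.items():
        pool = [s for s, l in zip(samples, labels) if l == label]
        for _ in range(target - count):
            out_samples.append(rng.choice(pool))
            out_labels.append(label)
    return out_samples, out_labels


# Toy, made-up data: five positive and two negative examples.
texts = ["p1", "p2", "p3", "p4", "p5", "n1", "n2"]
sentiments = ["positive"] * 5 + ["negative"] * 2
balanced_texts, balanced_sentiments = random_oversample(texts, sentiments)
print(Counter(balanced_sentiments))
```

A common practice is to oversample only the training split, so that duplicated examples cannot appear in both the training and test sets.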

Maciej A. Wujec

This article uses the BERT deep neural network model (Bidirectional Encoder Representations from Transformers) together with stocks' cumulative abnormal return to analyze the sentiment of financial texts. The proposed approach, unlike those used so far, does not require the creation of dictionaries, takes into account the broad context of words and their meaning in financial texts, eliminates the problem of word ambiguity across contexts, does not require manual labelling of data, and is free from the researcher's subjective assessment. The sentiment of financial texts, in the sense used in this paper, is directly related to the market reaction to the information contained in these texts. For texts assigned to one of the two classes (positive or negative) with the highest probability, the BERT model's predictions reach a precision of 62.38% for the positive class and 55% for the negative class. Results at this level can be used in event studies, market efficiency research, investment strategy development, or support of investment analysts using fundamental analysis.

2021 ◽  
Vol 13 (3) ◽  
pp. 128-133
Attala Rafid Abelard ◽  
Yuliant Sibaroni

Among the many film streaming platforms that have sprung up, Netflix has the most subscribers. However, not all reviews provided by Netflix users are good reviews. In this study, reviews written on the Google Play Store are analyzed with the Latent Dirichlet Allocation (LDA) method to determine which aspects users review. The Support Vector Machine (SVM) method is then used to classify each review into the positive or negative class (sentiment analysis). Two scenarios were carried out. The first showed that the best number of LDA topics is 40, and the second showed that applying a filtering process in the preprocessing stage reduces the F1-score. Thus, the best performance in LDA and SVM testing, a score of 78.15%, was obtained with 40 topics and without the filtering process.

2021 ◽  
Vol 4 ◽  
pp. 76-82
Wilma Latuny ◽  
Victor O. Lawalata ◽  
Daniel B. Paillin ◽  
Rahman Ohoirenan

UD Sinar Baru sells eucalyptus oil products in sizes from 30 ml to 550 ml, of which the 550 ml size is the most consumed. However, this product has been criticized by consumers for packaging that does not meet their expectations. This study aims to obtain an accurate method of classifying consumer sentiment and to identify the features that affect the redesign of the 550 ml eucalyptus oil packaging. Data were collected through an online survey on Facebook, with consumer comments retrieved using power queries. Data analysis uses the Support Vector Machine (SVM) method, supported by the WEKA application, to provide sentiment analysis of consumer comments and its accuracy. The results present the tendency of comments on each attribute, with an overall accuracy of 83%, 3% of comments in the positive class, and 57% of comments in the negative class. Comments indicating that the packaging is normal, interpreted as neutral, account for 20%. The study concludes that SMO (WEKA's SVM implementation) predicts consumer sentiment about the features of the 550 ml eucalyptus oil packaging very accurately, and that the current packaging needs to be redesigned with attention to shape, color, size, and efficiency.

2021 ◽  
Vol 778 (1) ◽  
pp. 012009
M Tamrin ◽  
L Septianasari

TripAdvisor has become a credible travel platform that tourists worldwide use to set travel plans. The spread of big data on online platforms encourages the use of text mining to benefit various sectors, including the tourism industry. This study aimed to investigate information extraction from online TripAdvisor reviews of Gili Trawangan tourist destinations. The method used was text mining with a Support Vector Machine (SVM) to classify the online reviews into two classes, a positive class and a negative class. The results of the information extraction show that horse cruelty, bad waste management, and ecosystem vulnerability dominated the negative sentiments. These negative sentiments need to be handled professionally by tourism enterprises to boost the tourism industry in Gili Trawangan.

2021 ◽  
Vol 5 (1) ◽  
pp. 123-131
Ni Luh Putu Merawati Putu ◽  
Ahmad Zuli Amrullah

Lombok Island is one of the favorite tourist destinations. The many topics and comments about Lombok tourism experiences posted on social media make it difficult to identify public sentiments and topics manually. The opinions expressed by tourists through social media are therefore interesting for further research. This study aims to classify tourists' opinions into two classes, positive and negative, using the Naive Bayes method, and to model their topics using Latent Dirichlet Allocation (LDA). The stages of this research include data collection, data cleaning, data transformation, and data classification. Performance testing of the Naive Bayes classification model shows an accuracy of 92%, precision of 100%, recall of 84%, and specificity of 100%. For LDA topic modelling within each class, the highest coherence value for the positive class was obtained with 8 topics (0.613) and for the negative class with 12 topics (0.528). The Naive Bayes and LDA algorithms are considered effective for sentiment analysis and topic modelling of Lombok tourism.
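The four reported metrics all derive from the counts in a binary confusion matrix. The sketch below shows the standard formulas. The counts are hypothetical, because the abstract does not give the actual confusion matrix; they were chosen as one configuration (two equal-sized classes, no false positives) that is consistent with the reported figures.

```python
def binary_metrics(tp: int, fn: int, fp: int, tn: int) -> dict[str, float]:
    """Standard metrics from confusion-matrix counts.

    tp/fn: positive-class items predicted positive/negative.
    fp/tn: negative-class items predicted positive/negative.
    """
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "precision": tp / (tp + fp),    # how many predicted positives are correct
        "recall": tp / (tp + fn),       # how many actual positives are found
        "specificity": tn / (tn + fp),  # how many actual negatives are found
    }


# Hypothetical counts, assuming 100 positive and 100 negative test items.
metrics = binary_metrics(tp=84, fn=16, fp=0, tn=100)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With zero false positives, precision and specificity both equal 100%, so every error is a missed positive and appears only in recall and accuracy.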

Sensors ◽  
2020 ◽  
Vol 21 (1) ◽  
pp. 57
Bruno Machado Rocha ◽  
Diogo Pessoa ◽  
Alda Marques ◽  
Paulo Carvalho ◽  
Rui Pedro Paiva

(1) Background: Patients with respiratory conditions typically exhibit adventitious respiratory sounds (ARS), such as wheezes and crackles. ARS events have variable duration. In this work we studied the influence of event duration on automatic ARS classification, namely, how the creation of the Other class (negative class) affected the classifiers’ performance. (2) Methods: We conducted a set of experiments where we varied the durations of the other events on three tasks: crackle vs. wheeze vs. other (3 Class); crackle vs. other (2 Class Crackles); and wheeze vs. other (2 Class Wheezes). Four classifiers (linear discriminant analysis, support vector machines, boosted trees, and convolutional neural networks) were evaluated on those tasks using an open access respiratory sound database. (3) Results: While on the 3 Class task with fixed durations, the best classifier achieved an accuracy of 96.9%, the same classifier reached an accuracy of 81.8% on the more realistic 3 Class task with variable durations. (4) Conclusion: These results demonstrate the importance of experimental design on the assessment of the performance of automatic ARS classification algorithms. Furthermore, they also indicate, unlike what is stated in the literature, that the automatic classification of ARS is not a solved problem, as the algorithms’ performance decreases substantially under complex evaluation scenarios.
