Sentiment Analysis of Film Reviews Using the Modified Balanced Random Forest Method and Mutual Information

Most information exchange now takes place on the internet. It can happen in many ways, such as expressing opinions on social media; reviewing a film is one example. When people review a film, they draw on their emotions to express their feelings, which may be positive or negative. The rapid growth of the internet has made information more diverse, plentiful, and unstructured. Sentiment analysis can handle this, because it is a classification process, carried out automatically by a computer system, for understanding the opinions, interactions, and emotions in a document or text. One suitable machine learning method is the Modified Balanced Random Forest. To deal with the varied data, Mutual Information is used for feature selection. With these two methods, the system achieves an accuracy of 79% and an F1-score of 75%.
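
As a rough illustration of feature selection by mutual information, the sketch below scores each vocabulary term by the mutual information between its presence in a review and the sentiment label, then keeps the highest-scoring terms. The toy reviews, the whitespace tokenizer, and the helper names `mutual_information` and `select_top_terms` are assumptions made for this example; it does not reproduce the paper's pipeline or the Modified Balanced Random Forest classifier.

```python
import math
from collections import Counter


def mutual_information(feature_values, labels):
    """Mutual information I(X; Y), in bits, between two discrete sequences."""
    n = len(labels)
    joint = Counter(zip(feature_values, labels))
    count_x = Counter(feature_values)
    count_y = Counter(labels)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # p_xy / (p_x * p_y) simplifies to count * n / (count_x * count_y).
        mi += p_xy * math.log2(count * n / (count_x[x] * count_y[y]))
    return mi


def select_top_terms(documents, labels, k):
    """Rank terms by MI between term presence and label; keep the top k."""
    tokenized = [set(doc.lower().split()) for doc in documents]
    vocabulary = sorted(set().union(*tokenized))
    scores = {
        term: mutual_information([term in doc for doc in tokenized], labels)
        for term in vocabulary
    }
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(vocabulary, key=lambda term: (-scores[term], term))[:k]


# Toy reviews, invented for illustration only.
reviews = [
    "great film great acting",
    "boring film weak plot",
    "great plot",
    "boring acting",
]
sentiments = ["pos", "neg", "pos", "neg"]
top_terms = select_top_terms(reviews, sentiments, 2)
```

In this toy data, terms such as "film" appear equally often in both classes and therefore carry no information about the label, which is why a selection step of this kind would discard them before training a classifier.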

Cross-domain sentiment analysis model on Indonesian YouTube comment

International Journal of Advances in Intelligent Informatics ◽

10.26555/ijain.v7i1.554 ◽

2021 ◽

Vol 7 (1) ◽

pp. 12

Author(s):

Agus Sasmito Aribowo ◽

Halizah Basiron ◽

Noor Fazilla Abd Yusof ◽

Siti Khomsah

Keyword(s):

Machine Learning ◽

Random Forest ◽

Sentiment Analysis ◽

Machine Learning Method ◽

Learning Method ◽

Analysis Model ◽

Language Form ◽

Cross Domain ◽

Ensemble Machine Learning ◽

Stop Word

A cross-domain sentiment analysis (CDSA) study combining the Indonesian language and tree-based ensemble machine learning is of considerable interest. CDSA is useful for supporting the labeling of cross-domain sentiment and for reducing dependence on experts; however, how it behaves on opinions made unstructured by stop words, language expressions, and Indonesian slang words has not yet been identified. This study aimed to obtain the best CDSA model for opinions in the Indonesian language, which are commonly full of stop words and slang words in Indonesian dialects. It also aimed to observe the benefits of stop word cleaning and slang word conversion for CDSA in the Indonesian language, and to find out which machine learning method is suitable for this model. The study started by crawling five datasets of YouTube comments from five different domains. Each dataset was copied into two groups: one without any stop word cleaning or slang word conversion, and one with stop word cleaning and slang word conversion applied. A CDSA model was built for each dataset group and tested using two types of tree-based ensemble machine learning, i.e., the Random Forest (RF) and Extra Tree (ET) classifiers, with three types of non-ensemble machine learning, namely Naïve Bayes (NB), SVM, and Decision Tree (DT), as the comparison. The results suggest that the accuracy of CDSA in the Indonesian language increases when stop words are removed and slang words are converted. The best classifier model was built using tree-based ensemble machine learning, particularly ET, which achieved the highest accuracy in this study, 91.19%. This model is expected to serve as an alternative CDSA technique for the Indonesian language.
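
The two preprocessing steps compared in the study, slang word conversion and stop word cleaning, can be sketched as follows. The tiny slang dictionary, the stop word set, the whitespace tokenizer, and the choice to convert slang before removing stop words are all assumptions made for this illustration; the abstract does not specify the resources or the order the authors used.

```python
# Hypothetical, deliberately tiny resources; real lists would be far larger.
SLANG_TO_STANDARD = {"gak": "tidak", "bgt": "banget", "yg": "yang"}
STOP_WORDS = {"yang", "dan", "di", "ini", "itu"}


def convert_slang(tokens):
    """Replace each known slang token with its standard form."""
    return [SLANG_TO_STANDARD.get(token, token) for token in tokens]


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop word set."""
    return [token for token in tokens if token not in STOP_WORDS]


def preprocess(comment):
    """Lowercase, split on whitespace, convert slang, then drop stop words."""
    tokens = comment.lower().split()
    # Assumed order: converting slang first lets slang spellings of stop
    # words (for example "yg" for "yang") be removed in the second step.
    return remove_stop_words(convert_slang(tokens))


# Invented example comment.
raw_comment = "Video yg ini bagus bgt dan gak membosankan"
cleaned_tokens = preprocess(raw_comment)
```

The cleaned token list would then be vectorized and passed to a classifier; the untreated dataset group in the study corresponds to skipping `preprocess` altogether.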

Application of Twin Objective Function SVM in Sentiment Analysis

Machine Learning and Artificial Intelligence - Frontiers in Artificial Intelligence and Applications ◽

10.3233/faia200786 ◽

2020 ◽

Author(s):

Qiaoman Yang ◽

Chunyu Liu

Keyword(s):

Machine Learning ◽

Decision Making ◽

Objective Function ◽

Sentiment Analysis ◽

Support Vector ◽

Machine Learning Method ◽

Learning Method ◽

Key Issues ◽

Single Objective ◽

Accuracy And Stability

Classification modeling is one of the key issues in sentiment analysis. The support vector machine (SVM) has been widely used in classification as an effective machine learning method. Generally, a standard SVM focuses only on the decision boundary and disregards the distribution of the data. In practice, sentiment data are large and complex, which leads to deficient accuracy and stability when a standard SVM is used. This study investigates sentiment analysis by applying twin objective function SVMs, including the nonparallel SVM (NPSVM) and the twin SVM (TWSVM). From the experiments, we concluded that twin objective function SVMs are superior to NB and the single objective function SVM in accuracy and stability.

Blockchain Leveraged Cyberbullying Preventing framework

10.21203/rs.3.rs-21075/v2 ◽

2021 ◽

Author(s):

Md Anawar Hossen Wadud ◽

Md Ashraf Uddin

Keyword(s):

Machine Learning ◽

Social Media ◽

Language Processing ◽

Processing Technique ◽

The Internet ◽

Machine Learning Method ◽

Learning Method ◽

Natural Language Processing Technique ◽

Being Bullied ◽

Cyberbullying Detection

The popularity of social media has exploded worldwide over the last few decades, and it has become the most preferred mode of social interaction. The internet also provides a new platform through which adolescents are being bullied. Appropriate means of cyberbullying detection are still partial and, in some cases, very limited. Moreover, research on cyberbullying detection focuses extensively on surveys and on the psychological impacts on victims, whereas prevention has not been widely addressed. To bridge this gap, this paper aims to detect cyberbullying efficiently. It employs a standard machine learning method and a natural language processing technique as part of the detection process in a decentralized, blockchain-leveraged architecture. We provide a fog-based architecture for cyberbullying detection that aims to relieve the server's load by placing the detection and prevention processes at the fog layer. The proposal might offer a probable solution for protecting users, particularly adolescents, from the severe consequences of cyberbullying.

Comparative Study: The Implementation of Machine Learning Method for Sentiment Analysis in Social Media. A Recommendation for Future Research

Advanced Science Letters ◽

10.1166/asl.2014.5631 ◽

2014 ◽

Vol 20 (10) ◽

pp. 2009-2013

Author(s):

Miftah Andriansyah ◽

Adang Suhendra ◽

I Wayan Simri Wicaksana

Keyword(s):

Machine Learning ◽

Social Media ◽

Comparative Study ◽

Sentiment Analysis ◽

Future Research ◽

Machine Learning Method ◽

Learning Method

Random forest machine learning method outperforms prehospital National Early Warning Score for predicting one-day mortality: A retrospective study

Resuscitation Plus ◽

10.1016/j.resplu.2020.100046 ◽

2020 ◽

Vol 4 ◽

pp. 100046

Author(s):

Jussi Pirneskoski ◽

Joonas Tamminen ◽

Antti Kallonen ◽

Jouni Nurmi ◽

Markku Kuisma ◽

...

Keyword(s):

Machine Learning ◽

Retrospective Study ◽

Random Forest ◽

Early Warning ◽

Early Warning Score ◽

Machine Learning Method ◽

Learning Method ◽

National Early Warning Score

Quality Prediction of Drilled and Reamed Bores Based on Torque Measurements and the Machine Learning Method of Random Forest

Procedia Manufacturing ◽

10.1016/j.promfg.2020.05.127 ◽

2020 ◽

Vol 48 ◽

pp. 894-901

Author(s):

Sebastian Schorr ◽

Matthias Möller ◽

Jörg Heib ◽

Dirk Bähre

Keyword(s):

Machine Learning ◽

Random Forest ◽

Quality Prediction ◽

Machine Learning Method ◽

Learning Method ◽

Torque Measurements

Effective Compatibility and Reduction of Data for Bigdata Applications

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.a9821.109119 ◽

2019 ◽

Vol 9 (1) ◽

pp. 3781-3784

Keyword(s):

Machine Learning ◽

Random Forest ◽

Language Processing ◽

Processing Technique ◽

Unstructured Data ◽

Machine Learning Method ◽

Learning Method ◽

Source File ◽

Natural Language Processing Technique ◽

Input And Output

The system identifies duplicate records in a database using a machine learning method. Unstructured data are passed in as input and prepared using a natural language processing technique such as text similarity. The prepared data are then fed into a machine learning method, Random Forest. After this data collection, the target file is compared with the source file using these files, and input and output files are produced. This is repeated until accurate efficiency is obtained.
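
To make the text similarity step concrete, the following sketch flags candidate duplicate records using Jaccard similarity over word sets. The choice of Jaccard similarity, the 0.8 threshold, the sample records, and the function names are assumptions made for this example. In the system described above, similarity scores of this kind would be fed to a Random Forest; here a fixed threshold stands in for that classifier.

```python
def jaccard_similarity(text_a, text_b):
    """Jaccard similarity between the lowercase word sets of two texts."""
    tokens_a = set(text_a.lower().split())
    tokens_b = set(text_b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0  # Two empty records are treated as identical.
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def find_duplicate_pairs(records, threshold=0.8):
    """Return index pairs (i, j), i < j, whose similarity meets the threshold."""
    pairs = []
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            if jaccard_similarity(records[i], records[j]) >= threshold:
                pairs.append((i, j))
    return pairs


# Invented records for illustration only.
records = [
    "John Smith 12 Main Street Springfield",
    "john smith 12 main street springfield",
    "Jane Doe 99 Oak Avenue Riverside",
]
duplicate_pairs = find_duplicate_pairs(records)
```

The pairwise loop is quadratic in the number of records, so a larger database would need some form of blocking or indexing before comparison.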

A Predictive Model for Kidney Transplant Graft Survival using Machine Learning

10.5121/csit.2020.101609 ◽

2020 ◽

Author(s):

Eric S. Pahl ◽

W. Nick Street ◽

Hans J. Johnson ◽

Alan I. Reed

Keyword(s):

Machine Learning ◽

Random Forest ◽

Cox Regression ◽

Risk Index ◽

Error Rates ◽

Machine Learning Method ◽

Learning Method ◽

Kaplan Meier ◽

End Stage

Kidney transplantation is the best treatment for patients with end-stage renal failure. The predominant method used for kidney quality assessment is the Cox regression-based kidney donor risk index. A machine learning method may provide improved prediction of transplant outcomes and help decision-making. A popular tree-based machine learning method, random forest, was trained and evaluated with the same data originally used to develop the risk index (70,242 observations from 1995-2005). The random forest successfully predicted 2,148 more transplants than the risk index, with equal type II error rates of 10%. Predicted results were analyzed against follow-up survival outcomes up to 240 months after transplant using Kaplan-Meier analysis, which confirmed that the random forest performed significantly better than the risk index (p<0.05). The random forest predicted significantly more successful and longer-surviving transplants than the risk index. Random forests and other machine learning models may improve transplant decisions.
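
For readers unfamiliar with Kaplan-Meier analysis, the sketch below computes the standard product-limit estimate of the survival curve from follow-up times with censoring. The follow-up data and the function name `kaplan_meier` are invented for illustration; this is not the authors' analysis code and it omits confidence intervals and group comparison tests.

```python
def kaplan_meier(durations, event_observed):
    """Product-limit survival estimate at each distinct event time.

    durations: follow-up time of each subject.
    event_observed: True if the event occurred, False if censored.
    Returns a list of (time, survival_probability) pairs.
    """
    event_times = sorted(
        {t for t, observed in zip(durations, event_observed) if observed}
    )
    survival = 1.0
    curve = []
    for t in event_times:
        # Subjects still under observation just before time t.
        at_risk = sum(1 for d in durations if d >= t)
        # Events that occur exactly at time t.
        events = sum(
            1 for d, observed in zip(durations, event_observed)
            if observed and d == t
        )
        survival *= 1.0 - events / at_risk
        curve.append((t, survival))
    return curve


# Invented follow-up data in months; False marks a censored observation.
follow_up_months = [6, 12, 12, 30, 48]
graft_failed = [True, True, False, True, False]
survival_curve = kaplan_meier(follow_up_months, graft_failed)
```

Censored subjects still count toward the at-risk total up to their last follow-up, which is what lets the estimator use incomplete follow-up records rather than discarding them.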

Research of machine learning method for specific information recognition on the Internet

Proceedings. Fourth IEEE International Conference on Multimodal Interfaces ◽

10.1109/icmi.2002.1166998 ◽

2003 ◽

Cited By ~ 1

Author(s):

Dequan Zheng ◽

Yi Hu ◽

Tiejun Zhao ◽

Hao Yu ◽

Sheng Li

Keyword(s):

Machine Learning ◽

The Internet ◽

Specific Information ◽

Machine Learning Method ◽

Learning Method ◽

Information Recognition

Tropical Overshooting Cloud-Top Height Retrieval from Himawari-8 Imagery Based on Random Forest Model

Atmosphere ◽

10.3390/atmos12020173 ◽

2021 ◽

Vol 12 (2) ◽

pp. 173

Author(s):

Gaoyun Wang ◽

Hongqing Wang ◽

Yizhou Zhuang ◽

Qiong Wu ◽

Siyue Chen ◽

...

Keyword(s):

Machine Learning ◽

Random Forest ◽

Brightness Temperature ◽

Zenith Angle ◽

Single Channel ◽

Interpolation Method ◽

Moisture Distribution ◽

Strong Impact ◽

Machine Learning Method ◽

Learning Method

Tropical overshooting convection has a strong impact on both the heat budget and the moisture distribution in the upper troposphere and lower stratosphere, and it can pose a great risk to aviation safety. Cloud-top height is one of the essential properties of overshooting convection for both the climate system and aviation weather forecasting. The main purpose of our work is to verify the application of machine learning methods, taking the random forest (RF) model as an example, to overshooting cloud-top height retrieval from Himawari-8 data. Using collocated CloudSat observations as a reference, we use several infrared indicators from Himawari-8 that are commonly recognized to relate to cloud-top height, along with some temporal and geographical parameters (latitude, month, satellite zenith angle, etc.), as predictors to construct and validate the model. Analysis of variable importance shows that the 6.2 μm brightness temperature is the dominant predictor, followed by satellite zenith angle, the 13.3 μm brightness temperature, latitude, and month. In the comparison between the RF model and the traditional single-channel interpolation method, retrievals from the RF model agree well with observations, with a high correlation coefficient (0.92), small RMSE (222 m), and small MAE (164 m), while the traditional single-channel interpolation method shows lower skill on the same metrics (0.70, 1305 m, and 1179 m). This work offers a new perspective on overshooting cloud-top height retrieval based on machine learning.
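
The three evaluation metrics quoted above (correlation coefficient, RMSE, and MAE) can be computed as in the sketch below. The cloud-top heights are invented numbers used only to exercise the functions, and the function names are chosen for this example; they are not taken from the paper.

```python
import math


def rmse(predicted, observed):
    """Root-mean-square error between paired values."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def mae(predicted, observed):
    """Mean absolute error between paired values."""
    absolute = [abs(p - o) for p, o in zip(predicted, observed)]
    return sum(absolute) / len(absolute)


def pearson_r(predicted, observed):
    """Pearson correlation coefficient between paired values."""
    n = len(observed)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    covariance = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    variance_p = sum((p - mean_p) ** 2 for p in predicted)
    variance_o = sum((o - mean_o) ** 2 for o in observed)
    return covariance / math.sqrt(variance_p * variance_o)


# Invented cloud-top heights in metres, for illustration only.
retrieved_heights = [15000.0, 16200.0, 17100.0, 15800.0]
reference_heights = [15200.0, 16000.0, 17000.0, 16000.0]

height_rmse = rmse(retrieved_heights, reference_heights)
height_mae = mae(retrieved_heights, reference_heights)
height_r = pearson_r(retrieved_heights, reference_heights)
```

RMSE penalizes large individual errors more heavily than MAE, so reporting both, as the abstract does, gives some indication of whether the error is dominated by a few large misses.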
