Sentiment classification for employees reviews using regression vector-stochastic gradient descent classifier (RV-SGDC)

2021 ◽  
Vol 7 ◽  
pp. e712
Author(s):  
Babacar Gaye ◽  
Dezheng Zhang ◽  
Aziguli Wulamu

Employee satisfaction is important for any organization that wants to make sufficient progress in production and achieve its goals. Organizations try to keep their employees satisfied by shaping policies around employees’ demands, which helps create a good environment for the collective. It is therefore beneficial for organizations to conduct and analyze staff satisfaction surveys, which allow them to gauge satisfaction levels among employees. Sentiment analysis can assist in this regard because it categorizes the sentiments of reviews as positive or negative. In this study, we perform experiments on employee reviews of the world’s big six companies and classify the reviews by sentiment. We propose an approach that uses lexicon-based and machine learning-based techniques. First, we extract the sentiments of employees from text reviews and label the dataset as positive or negative using TextBlob. Then we propose a hybrid voting model named Regression Vector-Stochastic Gradient Descent Classifier (RV-SGDC) for sentiment classification. RV-SGDC combines logistic regression, support vector machines, and stochastic gradient descent under a majority voting criterion. We also compare the performance of RV-SGDC with that of other machine learning models. Three feature extraction techniques are used to train the learning models: term frequency-inverse document frequency (TF-IDF), bag of words, and global vectors. We evaluate all models in terms of accuracy, precision, recall, and F1 score. The results show that RV-SGDC outperforms the other models, achieving an accuracy of 0.97 with TF-IDF features, a result we attribute to its hybrid architecture.
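The majority voting rule that combines the three base models can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the function name `majority_vote` and the toy prediction lists are assumptions, and the lists merely stand in for the outputs of a logistic regression, a support vector machine, and a stochastic gradient descent classifier.

```python
from collections import Counter


def majority_vote(*model_predictions):
    """Combine per-model label lists by hard (majority) voting.

    Each argument is a list of predicted labels, one per review.
    With three binary classifiers a tie cannot occur.
    """
    combined = []
    for votes in zip(*model_predictions):
        # most_common(1) returns [(label, count)] for the winning label.
        label, _count = Counter(votes).most_common(1)[0]
        combined.append(label)
    return combined


# Hypothetical predictions from the three base models on four reviews.
lr_pred = ["pos", "neg", "pos", "neg"]
svm_pred = ["pos", "pos", "neg", "neg"]
sgd_pred = ["neg", "pos", "pos", "neg"]

final = majority_vote(lr_pred, svm_pred, sgd_pred)
print(final)  # ['pos', 'pos', 'pos', 'neg']
```

Hard voting needs only the predicted labels, so the base models do not have to produce calibrated probabilities.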

Author(s):  
L. S. Koriashkina ◽  
H. V. Symonets

Purpose. To detect toxic comments under training videos on the YouTube video hosting service by classifying unstructured text with a combination of machine learning methods. Methodology. To work with this type of data, machine learning methods were used to clean, normalize, and represent the textual data in a form suitable for computer processing. To classify comments as “toxic”, we used a logistic regression classifier, a linear support vector classifier trained with and without stochastic gradient descent, a random forest classifier, and a gradient boosting classifier. To assess the classifiers, we computed the confusion matrix, accuracy, recall, and F-measure. For a more general assessment, cross-validation was used. The work was implemented in Python. Findings. Based on the assessment indicators, the best-performing methods were the linear support vector machine (Linear SVM) variants, trained with and without stochastic gradient descent. The described techniques can be used to analyze text comments under any training videos in order to detect toxic reviews. The approach can also be useful for identifying unwanted or even aggressive content on social networks or on services where reviews are provided. Originality. The originality lies in combining preprocessing methods for a specific type of text, taking into account features such as timecodes, emoji, and links, and in adapting machine learning classification methods to the analysis of Russian-language comments. Practical value. The practical value lies in optimizing (simplifying) the comment analysis process. The need for such processing arises from the growing volume of text data, especially in education, owing to quarantine conditions and the transition to distance learning. The volume of educational Internet content already calls for automated processing and analysis of feedback, and this need will only grow over time.
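Evaluation of a binary toxic/non-toxic classifier usually starts from confusion-matrix counts, from which accuracy, precision, recall, and the F-measure follow. The sketch below shows that arithmetic under stated assumptions: the label names, helper names, and toy data are hypothetical and are not taken from the study.

```python
def confusion_counts(y_true, y_pred, positive="toxic"):
    """Return (tp, fp, fn, tn) for a binary labelling task."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def scores(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 derived from the four counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, precision, recall, f1


# Hypothetical ground truth and predictions for six comments.
y_true = ["toxic", "toxic", "ok", "ok", "toxic", "ok"]
y_pred = ["toxic", "ok", "ok", "toxic", "toxic", "ok"]

counts = confusion_counts(y_true, y_pred)
print(counts)
print(scores(*counts))
```

In a cross-validation setting, the same counts would be computed per fold and the resulting scores averaged.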


Author(s):  
Russell Tsuchida ◽  
Fred Roosta ◽  
Marcus Gallagher

In the analysis of machine learning models, it is often convenient to assume that the parameters are IID. This assumption is not satisfied when the parameters are updated through training processes such as Stochastic Gradient Descent. A relaxation of the IID condition is a probabilistic symmetry known as exchangeability. We show the sense in which the weights in MLPs are exchangeable. This yields the result that in certain instances, the layer-wise kernel of fully-connected layers remains approximately constant during training. Our results shed light on such kernel properties throughout training while limiting the use of unrealistic assumptions.


Symmetry ◽  
2021 ◽  
Vol 13 (9) ◽  
pp. 1652
Author(s):  
Wanida Panup ◽  
Rabian Wangkeeree

In this paper, we propose a stochastic gradient descent algorithm, called stochastic gradient descent method-based generalized pinball support vector machine (SG-GPSVM), to solve data classification problems. This approach was developed by replacing the hinge loss function in the conventional support vector machine (SVM) with a generalized pinball loss function. We show that SG-GPSVM is convergent and that it approximates the conventional generalized pinball support vector machine (GPSVM). Further, the symmetric kernel method was adopted to evaluate the performance of SG-GPSVM as a nonlinear classifier. Our suggested algorithm surpasses existing methods in terms of noise insensitivity, resampling stability, and accuracy for large-scale data scenarios, according to the experimental results.
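To illustrate the idea of replacing the hinge loss and training by stochastic gradient steps, here is a minimal sketch that uses the standard pinball loss, L_tau(u) = u for u >= 0 and -tau * u otherwise, with u = 1 - y(w·x + b). It is not SG-GPSVM itself: the generalized pinball loss has additional parameters that the abstract does not specify, and the step-size schedule, the regularization constant `lam`, and the toy data are all assumptions.

```python
import random


def hinge_loss(u):
    """Hinge loss on u = 1 - y * f(x): zero once the margin exceeds 1."""
    return max(0.0, u)


def pinball_loss(u, tau):
    """Standard pinball loss: also penalizes margins larger than 1."""
    return u if u >= 0 else -tau * u


def pinball_sgd(X, y, tau=0.5, lam=0.01, epochs=200, seed=0):
    """Stochastic subgradient descent on lam/2 * ||w||^2 + pinball loss."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    step = 0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            step += 1
            eta = 0.1 / (1.0 + 0.01 * step)  # assumed decaying step size
            score = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            u = 1.0 - y[i] * score
            # Subgradient of the loss with respect to u.
            g = 1.0 if u >= 0 else -tau
            # Chain rule: du/dw = -y * x and du/db = -y.
            w = [wj - eta * (lam * wj - g * y[i] * xj)
                 for wj, xj in zip(w, X[i])]
            b += eta * g * y[i]
    return w, b


def predict(w, b, x):
    """Classify x by the sign of the linear score."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Hypothetical, linearly separable toy data with labels in {-1, +1}.
X = [(2.0, 1.0), (1.0, 2.0), (2.0, 2.0),
     (-1.0, -2.0), (-2.0, -1.0), (-2.0, -2.0)]
y = [1, 1, 1, -1, -1, -1]

w, b = pinball_sgd(X, y)
print([predict(w, b, x) for x in X])
```

Unlike the hinge loss, the pinball loss keeps a nonzero slope for points beyond the margin, which is the property usually cited for its lower sensitivity to noise near the decision boundary.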


Author(s):  
Tsehay Admassu Assegie

Machine learning approaches have become widely applicable in disease diagnosis and prediction because of their accuracy and precision. However, different machine learning models achieve different accuracy and precision in disease prediction, and selecting the model that yields the best accuracy and precision remains an open research problem. In this study, we propose a machine learning model for liver disease prediction using the Support Vector Machine (SVM) and K-Nearest Neighbors (KNN) learning algorithms, and we evaluate the accuracy and precision of the models using the Indian liver disease data repository. The analysis showed 82.90% accuracy for SVM and 72.64% accuracy for KNN. Based on these test results, SVM performs better than KNN on liver disease prediction.
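For readers unfamiliar with KNN, the following is a minimal sketch of the classification rule: a query is assigned the majority label among its k nearest training points. The two-dimensional feature vectors and labels are hypothetical and are not drawn from the Indian liver disease data; the value k = 3 and the Euclidean distance are likewise assumptions.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k nearest training points."""
    # Pair each training point's Euclidean distance with its label.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    nearest_labels = [label for _dist, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical training set: label 0 = healthy, label 1 = disease.
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
           (3.0, 3.2), (3.1, 2.9), (2.8, 3.0)]
train_y = [0, 0, 0, 1, 1, 1]

print(knn_predict(train_X, train_y, (1.1, 1.0)))  # 0
print(knn_predict(train_X, train_y, (3.0, 3.0)))  # 1
```

In practice the features would be scaled first, since KNN distances are dominated by whichever feature has the largest numeric range.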


Author(s):  
Pawar A B ◽  
Jawale M A ◽  
Kyatanavar D N

This research paper analyzes the use of Natural Language Processing techniques for detecting fake news. Fake news is misleading content spread by unreliable sources that can harm individuals and society. The analysis uses a dataset obtained from the web resource OpenSources.co, which is mainly part of Signal Media. TF-IDF of bi-grams was used together with PCFG (Probabilistic Context-Free Grammar) features on a set of 11,000 documents extracted as news articles. This set was tested on the following classification algorithms: SVM (Support Vector Machines), Stochastic Gradient Descent, Bounded Decision Trees, and the Gradient Boosting algorithm with Random Forests. The experimental analysis found that Stochastic Gradient Descent combined with TF-IDF of bi-grams gives an accuracy of 77.2% in detecting fake content, while PCFGs showed slight deficiencies in recall.
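The TF-IDF weighting of bi-grams can be sketched as follows. This uses the textbook formula tf × log(N / df) with whitespace tokenization; the paper's exact preprocessing and weighting variant are not stated in the abstract, so both are assumptions, as are the function names and the toy documents.

```python
import math
from collections import Counter


def bigrams(text):
    """Split on whitespace and return adjacent token pairs."""
    tokens = text.lower().split()
    return list(zip(tokens, tokens[1:]))


def tfidf_bigrams(documents):
    """Return one {bigram: weight} dict per document."""
    doc_counts = [Counter(bigrams(doc)) for doc in documents]
    n_docs = len(documents)
    # Document frequency: in how many documents each bigram occurs.
    df = Counter()
    for counts in doc_counts:
        df.update(counts.keys())
    vectors = []
    for counts in doc_counts:
        total = sum(counts.values())
        vectors.append({
            bg: (count / total) * math.log(n_docs / df[bg])
            for bg, count in counts.items()
        })
    return vectors


# Hypothetical headlines, for illustration only.
docs = [
    "the senate passed the bill",
    "the senate rejected the bill",
    "aliens passed the bill",
]
vectors = tfidf_bigrams(docs)
print(vectors[1])
```

A bi-gram that occurs in every document, such as ("the", "bill") here, receives weight zero, while a bi-gram unique to one document receives the largest weight.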


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Arvin Hansrajh ◽  
Timothy T. Adeliyi ◽  
Jeanette Wing

The exponential growth in fake news and its inherent threat to democracy, public trust, and justice have escalated the necessity for fake news detection and mitigation. Detecting fake news is a complex challenge because it is intentionally written to mislead and hoodwink. Humans are not good at identifying fake news: the detection of fake news by humans is reported to be at a rate of 54%, and an additional 4% is reported in the literature as being speculative. The significance of fighting fake news is exemplified during the present pandemic. Consequently, social networks are ramping up the usage of detection tools and educating the public in recognising fake news. In the literature, several machine learning algorithms have been applied to the detection of fake news with limited and mixed success. However, several advanced machine learning models are not being applied, although recent studies demonstrate the efficacy of the ensemble machine learning approach; hence, the purpose of this study is to assist in the automated detection of fake news. An ensemble approach is adopted to help resolve the identified gap. This study proposes a blended machine learning ensemble model developed from logistic regression, support vector machine, linear discriminant analysis, stochastic gradient descent, and ridge regression, which is then used on a publicly available dataset to predict whether a news report is true or not. The proposed model is appraised against popular classical machine learning models, with performance measured by AUC, ROC, recall, accuracy, precision, and F1-score. The results show that the proposed model outperformed the other popular classical machine learning models.
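Of the metrics listed, AUC is the least obvious to compute by hand. One way to read it is as the probability that a randomly chosen positive example is scored above a randomly chosen negative one, which the sketch below computes by direct pairwise comparison. The function name, the 0/1 label encoding, and the toy scores are assumptions; a real evaluation would typically rely on a library routine.

```python
def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison.

    y_true holds 1 for the positive class and 0 for the negative class;
    ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels and model scores for five news reports.
y_true = [1, 0, 1, 0, 1]
model_scores = [0.9, 0.4, 0.35, 0.2, 0.8]

auc = roc_auc(y_true, model_scores)
print(round(auc, 4))
```

The pairwise form is quadratic in the number of examples, which is fine for a small illustration but not for large test sets.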

