Sentiment classification for employees reviews using regression vector-stochastic gradient descent classifier (RV-SGDC)

PeerJ Computer Science ◽

10.7717/peerj-cs.712 ◽

2021 ◽

Vol 7 ◽

pp. e712

Author(s):

Babacar Gaye ◽

Dezheng Zhang ◽

Aziguli Wulamu

Keyword(s):

Machine Learning ◽

Gradient Descent ◽

Stochastic Gradient ◽

Sentiment Classification ◽

Majority Voting ◽

Stochastic Gradient Descent ◽

Support Vector ◽

Hybrid Architecture ◽

Accuracy Score ◽

Learning Models

Employee satisfaction is important for any organization that wants to sustain progress in production and achieve its goals. Organizations try to keep their employees satisfied by shaping policies around employees’ demands, which helps create a good collective working environment. It is therefore beneficial for organizations to run staff satisfaction surveys and analyze them to gauge satisfaction levels among employees. Sentiment analysis can assist in this regard, as it categorizes the sentiment of reviews as positive or negative. In this study, we perform experiments on the world’s big six companies and classify their employees’ reviews by sentiment. For this, we propose an approach that combines lexicon-based and machine learning-based techniques. First, we extracted employee sentiment from text reviews and labeled the dataset as positive or negative using TextBlob. Then we proposed a hybrid voting model named Regression Vector-Stochastic Gradient Descent Classifier (RV-SGDC) for sentiment classification. RV-SGDC combines logistic regression, support vector machines, and stochastic gradient descent under a majority voting criterion. We also compared RV-SGDC against other machine learning models. Three feature extraction techniques were used to train the models: term frequency-inverse document frequency (TF-IDF), bag of words, and global vectors (GloVe). We evaluated all models in terms of accuracy, precision, recall, and F1 score. The results show that RV-SGDC outperforms the other models, reaching an accuracy of 0.97 with TF-IDF features, which we attribute to its hybrid architecture.
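The majority voting step described in this abstract can be illustrated with a minimal sketch. It assumes the three base models (logistic regression, SVM, SGD) have already produced one label per review; the function name `majority_vote` and the sample predictions are hypothetical and are not taken from the paper.

```python
from collections import Counter


def majority_vote(*model_predictions):
    """Combine per-model label lists by hard (majority) voting."""
    if not model_predictions:
        raise ValueError("need predictions from at least one model")
    n_samples = len(model_predictions[0])
    if any(len(p) != n_samples for p in model_predictions):
        raise ValueError("all models must predict the same number of samples")
    combined = []
    for votes in zip(*model_predictions):
        # most_common(1) returns the label with the highest vote count;
        # with three voters and two classes a tie cannot occur.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


# Hypothetical outputs of the three base models on four reviews.
lr_pred = ["pos", "neg", "pos", "neg"]
svm_pred = ["pos", "pos", "pos", "neg"]
sgd_pred = ["neg", "neg", "pos", "neg"]

ensemble_pred = majority_vote(lr_pred, svm_pred, sgd_pred)
print(ensemble_pred)
```

With an odd number of binary voters every sample has a strict majority, which is one practical reason to combine exactly three models this way.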

APPLICATION OF MACHINE LEARNING ALGORITHMS FOR PROCESSING COMMENTS FROM THE YOUTUBE VIDEO HOSTING UNDER TRAINING VIDEOS

Science and Transport Progress Bulletin of Dnipropetrovsk National University of Railway Transport ◽

10.15802/stp2020/225264 ◽

2021 ◽

pp. 33-42

Author(s):

L. S. Koriashkina ◽

H. V. Symonets

Keyword(s):

Machine Learning ◽

Gradient Descent ◽

Russian Language ◽

Stochastic Gradient ◽

Machine Learning Algorithms ◽

Stochastic Gradient Descent ◽

Support Vector ◽

Learning Methods ◽

Machine Learning Methods ◽

Gradient Enhancement

Purpose. To detect toxic comments under training videos on the YouTube video hosting service by classifying unstructured text with a combination of machine learning methods. Methodology. To work with this type of data, machine learning methods were used for cleaning, normalizing, and representing textual data in a form suitable for computer processing. To classify comments as “toxic”, we used a logistic regression classifier, a linear support vector classifier with and without stochastic gradient descent training, a random forest classifier, and a gradient boosting classifier. To assess the classifiers, we computed the confusion matrix, accuracy, recall (completeness), and F-measure. For a more general assessment, cross-validation was used. The implementation language was Python. Findings. Based on the assessment indicators, the best-performing methods were selected: the linear support vector machine (Linear SVM), with and without stochastic gradient descent training. The described techniques can be used to analyze text comments under any training videos to detect toxic reviews. The approach can also be useful for identifying unwanted or even aggressive content on social networks or services where reviews are posted. Originality. The work combines methods for preprocessing a specific type of text, taking into account features such as timecodes, emoji, and links, and adapts machine learning classification methods to the analysis of Russian-language comments. Practical value. The work optimizes (simplifies) the comment analysis process. The need for such processing stems from the growing volume of text data, especially in education, owing to quarantine conditions and the transition to distance learning. The volume of educational Internet content already calls for automated processing and analysis of feedback, and this need will only grow over time.
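The kind of preprocessing mentioned in this abstract (handling timecodes, emoji, and links) might look like the sketch below. The regular expressions, the emoji code-point ranges, and the function name `clean_comment` are assumptions made for illustration; the paper's actual cleaning rules are not given in the abstract.

```python
import re

# Timecodes such as 12:34 or 1:02:03.
TIMECODE = re.compile(r"\b\d{1,2}:\d{2}(?::\d{2})?\b")
# Links starting with http(s):// or www.
URL = re.compile(r"https?://\S+|www\.\S+")
# Rough emoji coverage (assumption): pictograph blocks plus misc symbols.
EMOJI = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")


def clean_comment(text):
    """Strip links, timecodes, and emoji, then lowercase and squeeze spaces."""
    text = URL.sub(" ", text)       # remove links first (they may contain digits)
    text = TIMECODE.sub(" ", text)  # then remove video timecodes
    text = EMOJI.sub(" ", text)     # then remove emoji characters
    return " ".join(text.lower().split())


sample = "Great lesson! 12:34 was unclear, see https://example.com/notes \U0001F44D"
print(clean_comment(sample))
```

Lowercasing via `str.lower()` also works for Cyrillic text, so the same function can be applied to Russian-language comments before vectorization.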

Exchangeability and Kernel Invariance in Trained MLPs

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/498 ◽

2019 ◽

Author(s):

Russell Tsuchida ◽

Fred Roosta ◽

Marcus Gallagher

Keyword(s):

Machine Learning ◽

Gradient Descent ◽

Stochastic Gradient ◽

Stochastic Gradient Descent ◽

Learning Models ◽

Fully Connected ◽

Shed Light ◽

Machine Learning Models

In the analysis of machine learning models, it is often convenient to assume that the parameters are IID. This assumption is not satisfied when the parameters are updated through training processes such as Stochastic Gradient Descent. A relaxation of the IID condition is a probabilistic symmetry known as exchangeability. We show the sense in which the weights in MLPs are exchangeable. This yields the result that in certain instances, the layer-wise kernel of fully-connected layers remains approximately constant during training. Our results shed light on such kernel properties throughout training while limiting the use of unrealistic assumptions.
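For reference, the textbook definition of finite exchangeability is sketched below. The abstract does not state the precise sense in which the paper applies it to MLP weights, so the symbols $W_1, \dots, W_n$ here are generic random variables, not the paper's notation.

```latex
% W_1, ..., W_n are exchangeable if their joint distribution is
% invariant under every permutation \pi of the indices.
\[
  (W_1, \dots, W_n) \overset{d}{=} (W_{\pi(1)}, \dots, W_{\pi(n)})
  \quad \text{for all permutations } \pi \text{ of } \{1, \dots, n\}.
\]
% Every IID sequence is exchangeable, but an exchangeable sequence
% need not be independent, which is why this condition is weaker.
```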

Linear Support Vector Machine (SVM) with Stochastic Gradient Descent (SGD) training and multinomial Naïve Bayes (NB) in News Classification

International Journal of Computer Sciences and Engineering ◽

10.26438/ijcse/v7i4.360363 ◽

2019 ◽

Vol 7 (4) ◽

pp. 360-363

Author(s):

Feroz Ahmed ◽

Shabina Ghafir

Keyword(s):

Support Vector Machine ◽

Gradient Descent ◽

Stochastic Gradient ◽

Stochastic Gradient Descent ◽

Support Vector ◽

Linear Support Vector Machine

Hyperspectral Image Classification Using Stochastic Gradient Descent Based Support Vector Machine

Learning and Analytics in Intelligent Systems - Biologically Inspired Techniques in Many-Criteria Decision Making ◽

10.1007/978-3-030-39033-4_8 ◽

2020 ◽

pp. 78-84

Author(s):

Pattem Sampurnima ◽

Sandeep Kumar Satapathy ◽

Shruti Mishra ◽

Pradeep Kumar Mallick

Keyword(s):

Support Vector Machine ◽

Image Classification ◽

Gradient Descent ◽

Hyperspectral Image ◽

Stochastic Gradient ◽

Stochastic Gradient Descent ◽

Support Vector ◽

Hyperspectral Image Classification

Large-Scale Machine Learning with Stochastic Gradient Descent

Statistical Learning and Data Science ◽

10.1201/b11429-6 ◽

2011 ◽

pp. 33-42 ◽

Cited By ~ 1

Author(s):

Léon Bottou

Keyword(s):

Machine Learning ◽

Gradient Descent ◽

Large Scale ◽

Stochastic Gradient ◽

Stochastic Gradient Descent

Large-Scale Machine Learning with Stochastic Gradient Descent

Proceedings of COMPSTAT'2010 ◽

10.1007/978-3-7908-2604-3_16 ◽

2010 ◽

pp. 177-186 ◽

Cited By ~ 1247

Author(s):

Léon Bottou

Keyword(s):

Machine Learning ◽

Gradient Descent ◽

Large Scale ◽

Stochastic Gradient ◽

Stochastic Gradient Descent

Stochastic Subgradient for Large-Scale Support Vector Machine Using the Generalized Pinball Loss Function

Symmetry ◽

10.3390/sym13091652 ◽

2021 ◽

Vol 13 (9) ◽

pp. 1652

Author(s):

Wanida Panup ◽

Rabian Wangkeeree

Keyword(s):

Support Vector Machine ◽

Loss Function ◽

Gradient Descent ◽

Large Scale ◽

Stochastic Gradient ◽

Stochastic Gradient Descent ◽

Support Vector ◽

Hinge Loss ◽

Gradient Descent Algorithm ◽

Pinball Loss

In this paper, we propose a stochastic gradient descent algorithm, called stochastic gradient descent method-based generalized pinball support vector machine (SG-GPSVM), to solve data classification problems. This approach was developed by replacing the hinge loss function in the conventional support vector machine (SVM) with a generalized pinball loss function. We show that SG-GPSVM is convergent and that it approximates the conventional generalized pinball support vector machine (GPSVM). Further, the symmetric kernel method was adopted to evaluate the performance of SG-GPSVM as a nonlinear classifier. Our suggested algorithm surpasses existing methods in terms of noise insensitivity, resampling stability, and accuracy for large-scale data scenarios, according to the experimental results.
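To illustrate the idea of swapping the hinge loss for a pinball-type loss and training by stochastic subgradient steps, here is a minimal sketch. It uses the standard pinball loss with a single parameter `tau`, not the generalized pinball loss of SG-GPSVM, whose exact form is not given in the abstract. The linear model without bias, the constant step size, and the toy data are likewise assumptions.

```python
import random


def hinge_loss(margin):
    """Hinge loss on the margin m = y * f(x): zero once m >= 1."""
    return max(0.0, 1.0 - margin)


def pinball_loss(margin, tau=0.5):
    """Standard pinball loss: also penalizes margins above 1, with slope tau."""
    u = 1.0 - margin
    return u if u >= 0 else -tau * u


def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))


def train_pinball_svm(X, y, tau=0.5, lam=0.01, lr=0.05, steps=500, seed=0):
    """Stochastic subgradient descent on lam/2*||w||^2 + pinball(y * w.x)."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    for _ in range(steps):
        i = rng.randrange(len(X))  # draw one training sample at random
        margin = y[i] * dot(w, X[i])
        # Subgradient of the loss w.r.t. w: -y*x if margin <= 1, else tau*y*x.
        coef = -y[i] if 1.0 - margin >= 0 else tau * y[i]
        w = [wj - lr * (lam * wj + coef * xj) for wj, xj in zip(w, X[i])]
    return w


def predict(w, x):
    return 1 if dot(w, x) >= 0 else -1


# Toy linearly separable data (hypothetical).
X = [[2.0, 1.0], [1.0, 2.0], [-2.0, -1.0], [-1.0, -2.0]]
y = [1, 1, -1, -1]
w = train_pinball_svm(X, y)
predictions = [predict(w, x) for x in X]
print(predictions)
```

Unlike the hinge loss, the pinball loss keeps a nonzero subgradient for points beyond the margin, so every sample influences the solution. This is commonly cited as the source of its noise insensitivity.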

Support Vector Machine And K-Nearest Neighbor Based Liver Disease Classification Model

Indonesian Journal of electronics, electromedical engineering, and medical informatics ◽

10.35882/ijeeemi.v3i1.2 ◽

2021 ◽

Vol 3 (1) ◽

pp. 9-14

Author(s):

Tsehay Admassu Assegie

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Liver Disease ◽

Classification Model ◽

Support Vector ◽

Disease Prediction ◽

Accuracy Score ◽

Learning Models ◽

Accuracy And Precision ◽

Machine Learning Models

Machine learning approaches have become widely applicable in disease diagnosis and prediction because of the accuracy and precision of machine learning models. However, different machine learning models achieve different accuracy and precision in disease prediction. Selecting the model that yields better disease prediction accuracy and precision is an open research problem. In this study, we propose machine learning models for liver disease prediction using the Support Vector Machine (SVM) and K-Nearest Neighbors (KNN) learning algorithms, and we evaluate their accuracy and precision on the Indian liver disease data repository. The results showed 82.90% accuracy for SVM and 72.64% accuracy for KNN. Based on these accuracy scores on the experimental test results, SVM performs better than KNN on liver disease prediction.
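The accuracy and precision comparison described here can be sketched as follows. The labels and the two prediction lists are made-up toy values, not the paper's data or results, and the function names are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for a binary classification task."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def accuracy(y_true, y_pred):
    """Fraction of all samples that were classified correctly."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + tn + fn)


def precision(y_true, y_pred):
    """Fraction of predicted positives that are truly positive."""
    tp, fp, _, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fp) if (tp + fp) else 0.0


# Toy ground truth and hypothetical predictions of two models.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
model_a_pred = [1, 1, 0, 0, 0, 1, 1, 0]
model_b_pred = [1, 0, 0, 0, 1, 1, 1, 0]

for name, pred in [("model A", model_a_pred), ("model B", model_b_pred)]:
    print(name, accuracy(y_true, pred), precision(y_true, pred))
```

Reporting both metrics matters when classes are imbalanced, because a model can reach high accuracy while its positive predictions are frequently wrong.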

Analyzing Fake News Based on Machine Learning Algorithms

Intelligent Systems and Computer Technology - Advances in Parallel Computing ◽

10.3233/apc200146 ◽

2020 ◽

Author(s):

Pawar A B ◽

Jawale M A ◽

Kyatanavar D N

Keyword(s):

Language Processing ◽

Gradient Descent ◽

Human Life ◽

Stochastic Gradient ◽

Machine Learning Algorithms ◽

Stochastic Gradient Descent ◽

Gradient Boosting ◽

Support Vector ◽

Fake News ◽

Processing Techniques

This paper analyzes the use of Natural Language Processing techniques for fake news detection. Fake news is misleading content spread by unreliable sources that can harm human life and society. The analysis uses a dataset obtained from the web resource OpenSources.co, which is mainly part of Signal Media. TF-IDF features of bi-grams were used together with PCFG (Probabilistic Context-Free Grammar) features on a set of 11,000 documents extracted as news articles. This set was tested with several classification algorithms, namely SVM (Support Vector Machines), Stochastic Gradient Descent, Bounded Decision Trees, and the Gradient Boosting algorithm with Random Forests. The experimental analysis found that Stochastic Gradient Descent combined with TF-IDF of bi-grams gives an accuracy of 77.2% in detecting fake content, while the PCFG features showed slight deficiencies in recall.
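As an illustration of TF-IDF weighting over bi-grams, here is a minimal sketch. It assumes whitespace tokenization, term frequency normalized by the number of bi-grams in the document, and an unsmoothed `log(N / df)` inverse document frequency. Libraries such as scikit-learn use smoothed variants, and the paper's exact formula is not stated in the abstract. The sample headlines are invented.

```python
import math
from collections import Counter


def bigrams(text):
    """Lowercase, split on whitespace, and pair adjacent tokens."""
    tokens = text.lower().split()
    return list(zip(tokens, tokens[1:]))


def tfidf_bigrams(docs):
    """Return one {bigram: weight} dict per document."""
    doc_counts = [Counter(bigrams(d)) for d in docs]
    n_docs = len(docs)
    # Document frequency: in how many documents each bi-gram occurs.
    df = Counter()
    for counts in doc_counts:
        df.update(counts.keys())
    vectors = []
    for counts in doc_counts:
        total = sum(counts.values())
        vectors.append({
            bg: (c / total) * math.log(n_docs / df[bg])
            for bg, c in counts.items()
        })
    return vectors


docs = [
    "the senator denied the report",
    "the senator confirmed the report",
    "aliens confirmed the report",
]
vectors = tfidf_bigrams(docs)
print(vectors[0])
```

A bi-gram that occurs in every document, such as ("the", "report") here, receives weight zero, while bi-grams unique to one document receive the largest weights.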

Detection of Online Fake News Using Blending Ensemble Learning

Scientific Programming ◽

10.1155/2021/3434458 ◽

2021 ◽

Vol 2021 ◽

pp. 1-10

Author(s):

Arvin Hansrajh ◽

Timothy T. Adeliyi ◽

Jeanette Wing

Keyword(s):

Machine Learning ◽

Performance Metrics ◽

Machine Learning Algorithms ◽

Stochastic Gradient Descent ◽

Support Vector ◽

Fake News ◽

Learning Models ◽

Linear Discriminant ◽

Proposed Model ◽

Machine Learning Models

The exponential growth in fake news and its inherent threat to democracy, public trust, and justice has escalated the necessity for fake news detection and mitigation. Detecting fake news is a complex challenge, as it is intentionally written to mislead and hoodwink. Humans are not good at identifying fake news. The detection of fake news by humans is reported to be at a rate of 54%, and an additional 4% is reported in the literature as being speculative. The significance of fighting fake news is exemplified during the present pandemic. Consequently, social networks are ramping up the usage of detection tools and educating the public in recognising fake news. In the literature, several machine learning algorithms have been applied to the detection of fake news with limited and mixed success. However, several advanced machine learning models are not being applied, although recent studies demonstrate the efficacy of the ensemble machine learning approach; hence, the purpose of this study is to assist in the automated detection of fake news. An ensemble approach is adopted to help resolve the identified gap. This study proposes a blended machine learning ensemble model developed from logistic regression, support vector machine, linear discriminant analysis, stochastic gradient descent, and ridge regression, which is then used on a publicly available dataset to predict whether a news report is true or not. The proposed model is appraised against popular classical machine learning models, with performance metrics such as AUC, ROC, recall, accuracy, precision, and F1-score used to measure its performance. The results show that the proposed model outperformed other popular classical machine learning models.