Opinion mining framework using proposed RB-Bayes model for text classification

Author(s):  
Rajni Bhalla ◽  
Amandeep Bagga

Data mining is a powerful concept with great potential to predict future trends and behaviour. It refers to the extraction of hidden information from large data sets using techniques such as statistical analysis, machine learning, clustering, neural networks and genetic algorithms. Naive Bayes suffers from the zero-likelihood problem: if a feature value never co-occurs with a class in the training data, its estimated conditional probability is zero and wipes out the entire product of probabilities for that class. This paper proposes the RB-Bayes method, based on Bayes' theorem, to remove the zero-likelihood problem in prediction. We also compare our method with existing methods, namely naive Bayes and SVM, and demonstrate that it analyses data sets more effectively. When the proposed approach is tested on real data sets, it improves accuracy in most cases; the RB-Bayes algorithm achieves an accuracy of 83.333%.

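The RB-Bayes construction itself is not reproduced in this abstract. As background, here is a minimal sketch of the zero-likelihood problem it targets, together with the classic Laplace (add-one) smoothing workaround; the function names and toy data are illustrative assumptions, not the authors' method.

```python
from collections import Counter, defaultdict

def train_naive_bayes(docs, labels, alpha=1.0):
    """Train a multinomial naive Bayes model.

    alpha=0 reproduces the zero-likelihood problem: any word never seen
    with a class forces that class's whole probability product to zero.
    alpha=1 is Laplace (add-one) smoothing, one classic remedy.
    """
    vocab = {w for doc in docs for w in doc}
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)          # class -> word -> count
    for doc, label in zip(docs, labels):
        word_counts[label].update(doc)
    priors = {c: n / len(labels) for c, n in class_counts.items()}
    likelihoods = {}
    for c in class_counts:
        total = sum(word_counts[c].values())
        likelihoods[c] = {
            w: (word_counts[c][w] + alpha) / (total + alpha * len(vocab))
            for w in vocab
        }
    return priors, likelihoods

def predict(doc, priors, likelihoods):
    scores = {}
    for c in priors:
        score = priors[c]
        for w in doc:
            if w in likelihoods[c]:             # skip out-of-vocabulary words
                score *= likelihoods[c][w]
        scores[c] = score
    return max(scores, key=scores.get)

# Toy data: "great" never occurs in the negative class, so with alpha=0
# any review containing "great" would get P(neg) = 0 outright.
docs = [["great", "phone"], ["bad", "battery"], ["great", "screen"]]
labels = ["pos", "neg", "pos"]
priors, likelihoods = train_naive_bayes(docs, labels, alpha=1.0)
print(predict(["great", "battery"], priors, likelihoods))  # -> 'pos'
```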


2021 ◽  
Author(s):  
Jessica Röhner ◽  
Philipp Thoss ◽  
Astrid Schütz

Research has shown that even experts cannot detect faking above chance, but recent studies have suggested that machine learning may help in this endeavor. However, faking differs between faking conditions, previous efforts have not taken these differences into account, and faking indices have yet to be integrated into such approaches. We reanalyzed seven data sets (N = 1,039) with various faking conditions (high and low scores, different constructs, naïve and informed faking, faking with and without practice, different measures [self-reports vs. implicit association tests; IATs]). We investigated the extent to which and how machine learning classifiers could detect faking under these conditions and compared different input data (response patterns, scores, faking indices) and different classifiers (logistic regression, random forest, XGBoost). We also explored the features that classifiers used for detection. Our results show that machine learning has the potential to detect faking, but detection success varies between conditions from chance levels to 100%. There were differences in detection (e.g., detecting low-score faking was better than detecting high-score faking). For self-reports, response patterns and scores were comparable with regard to faking detection, whereas for IATs, faking indices and response patterns were superior to scores. Logistic regression and random forest worked about equally well and outperformed XGBoost. In most cases, classifiers used more than one feature (faking occurred over different pathways), and the features varied in their relevance. Our research supports the assumption of different faking processes and explains why detecting faking is a complex endeavor.
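
The study's data and feature sets are not available here, but a hypothetical sketch of the setup it describes might look as follows: logistic regression and random forest classifiers trained on item-level response patterns labeled honest versus faked, compared by cross-validated detection accuracy. The synthetic Likert-style data below are stand-ins, not the reanalyzed data sets.

```python
# Illustrative sketch only: synthetic "response patterns" stand in for
# questionnaire items; honest responders answer around the scale midpoint,
# high-score fakers shift toward the top of the scale with less variance.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n, items = 500, 20
honest = rng.normal(3.0, 1.0, size=(n, items))        # 5-point scale, midpoint
faked = rng.normal(4.2, 0.6, size=(n, items))         # inflated, compressed
X = np.clip(np.vstack([honest, faked]).round(), 1, 5)  # discretize to Likert
y = np.array([0] * n + [1] * n)                        # 0 = honest, 1 = faked

for clf in (LogisticRegression(max_iter=1000),
            RandomForestClassifier(n_estimators=200, random_state=0)):
    acc = cross_val_score(clf, X, y, cv=5).mean()
    print(type(clf).__name__, f"detection accuracy: {acc:.2f}")
```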


2021 ◽  
Vol 9 (2) ◽  
pp. 313-317
Author(s):  
Vanitha Kakollu et al.

Today we have large amounts of textual data to be processed, and classifying such text falls under natural language processing. The basic goal is to identify whether a text is positive or negative; this process is also called opinion mining. In this paper, we consider three different data sets and perform sentiment analysis to find the test accuracy. Three cases arise: (1) if the text contains more positive than negative content, the overall result leans positive; (2) if it contains more negative than positive content, the overall result leans negative; (3) if the amounts of positive and negative content are nearly equal, the output is neutral. Sentiment analysis involves several steps, such as term extraction, feature selection and sentiment classification. The key focus of this paper is a comparison of the machine learning approach and the lexicon-based approach to sentiment analysis, together with their respective accuracy and loss graphs.
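
A minimal sketch of the lexicon-based side of that comparison, implementing the three-case rule described above; the word lists are toy stand-ins, not the paper's lexicon.

```python
# Toy lexicon-based classifier for the three cases above: more positive
# words -> positive, more negative -> negative, tie -> neutral.
POSITIVE = {"good", "great", "excellent", "love", "nice"}
NEGATIVE = {"bad", "poor", "terrible", "hate", "awful"}

def classify(text: str) -> str:
    words = text.lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"

print(classify("great camera but terrible battery"))  # -> 'neutral'
```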


2021 ◽  
Author(s):  
Asghar Ahmadi ◽  
Michael Noetel ◽  
Melissa Schellekens ◽  
Philip David Parker ◽  
Devan Antczak ◽  
...  

Many psychological treatments have been shown to be cost-effective and efficacious, as long as they are implemented faithfully. Assessing fidelity and providing feedback is expensive and time-consuming. Machine learning has been used to assess treatment fidelity, but its reliability and generalisability are unclear. We collated and critiqued all implementations of machine learning for assessing the verbal behaviour of helping professionals, with particular emphasis on treatment fidelity for therapists. We conducted searches using nine electronic databases for automated approaches to coding verbal behaviour in therapy and similar contexts. We completed screening, extraction, and quality assessment in duplicate. Fifty-two studies met our inclusion criteria (65.3% in psychotherapy). Automated coding methods performed better than chance, and some showed near human-level performance; performance tended to be better with larger data sets, a smaller number of codes, conceptually simple codes, and when predicting session-level ratings rather than utterance-level ones. Few studies adhered to best-practice machine learning guidelines. Machine learning demonstrated promising results, particularly where there are large annotated datasets and a modest number of concrete features to code. These methods are novel, cost-effective, scalable ways of assessing fidelity and providing therapists with individualised, prompt, and objective feedback.
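
None of the 52 reviewed pipelines is reproduced here; the hypothetical sketch below only illustrates the common shape of such systems: a text classifier assigns behaviour codes to individual utterances, and a session-level fidelity signal is derived by aggregation. The utterances, codes, and model choice are illustrative assumptions.

```python
# Hypothetical shape of an automated fidelity-coding pipeline:
# classify each utterance into a behaviour code, then aggregate per session.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

utterances = [
    "what brings you here today",       # open question
    "so you felt dismissed at work",    # reflection
    "you should just quit",             # advice
    "tell me more about that",          # open question
]
codes = ["question", "reflection", "advice", "question"]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(utterances, codes)

session = ["how did that make you feel", "it sounds like you were hurt"]
predicted = clf.predict(session)
# Session-level summary, e.g. the proportion of reflections, as one
# possible fidelity signal fed back to the therapist.
print(predicted, sum(c == "reflection" for c in predicted) / len(predicted))
```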


2020 ◽  
Vol 10 (1) ◽  
pp. 461-477
Author(s):  
Umair Younis ◽  
Muhammad Zubair Asghar ◽  
Adil Khan ◽  
Alamsher Khan ◽  
Javed Iqbal ◽  
...  

In recent times, comparative opinion mining applications have attracted both individuals and business organizations to compare the strengths and weaknesses of products. Prior works on comparative opinion mining have focused on applying a single classifier, limited comparative opinion labels, and limited datasets of product reviews, resulting in degraded performance when classifying comparative reviews. In this work, we perform multi-class comparative opinion mining by applying multiple machine learning classifiers using an increased number of comparative opinion labels (9 classes) on 4 datasets of comparative product reviews. The experimental results show that the Random Forest classifier outperformed the competing algorithms in terms of accuracy, precision, recall and F-measure.
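
The 9 comparative-opinion labels and the 4 datasets are not listed in this abstract, so the sketch below is only a schematic of the winning setup: TF-IDF features feeding a multi-class Random Forest. The reviews and placeholder labels are invented for illustration.

```python
# Schematic multi-class comparative opinion mining with Random Forest.
# Example reviews and labels are placeholders, not the paper's datasets.
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

reviews = [
    "phone A is better than phone B",
    "phone B beats phone A in battery life",
    "phone A and phone B are equally good",
    "nothing compares to phone A",
    "phone B is worse than phone A",
    "phone A is the best of all phones",
]
labels = ["A>B", "B>A", "A=B", "superlative", "A>B", "superlative"]

X_train, X_test, y_train, y_test = train_test_split(
    reviews, labels, test_size=0.33, random_state=42)
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                      RandomForestClassifier(n_estimators=300, random_state=42))
model.fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test)))
```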


2021 ◽  
Vol 21 (1) ◽  
pp. 14-22
Author(s):  
Hary Sabita ◽  
Fitria Fitria ◽  
Riko Herwanto

This research was conducted using data provided by Kaggle. The data contain features that describe job vacancies. This study used location-based data in the US, which covered 60% of all the data. Posted job vacancies are categorized as real or fake. The research followed five stages: defining the problem, collecting data, cleaning data (exploration and pre-processing), modeling, and evaluation/validation. The evaluation uses Naïve Bayes as a baseline model and Stochastic Gradient Descent (SGD) as the end model. The Naïve Bayes model obtained an accuracy of 0.971 and an F1-score of 0.743, while Stochastic Gradient Descent obtained an accuracy of 0.977 and an F1-score of 0.81. These final results indicate that SGD performs slightly better than Naïve Bayes.
Keywords: NLP, Machine Learning, Naïve Bayes, SGD, Fake Jobs
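
A minimal sketch of the reported baseline-versus-end-model comparison, assuming TF-IDF text features (the Kaggle columns and the authors' preprocessing are not reproduced); the tiny posting list stands in for the real data.

```python
# Sketch: Naive Bayes baseline vs. SGD end model on job-posting text.
# 'postings' and 'is_fake' are stand-ins for the Kaggle dataset.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

postings = [
    "earn money fast work from home wire transfer fee",
    "senior software engineer java spring 5 years experience",
    "no experience needed instant payout send deposit",
    "registered nurse full time benefits local hospital",
    "quick cash guaranteed just pay a small signup fee",
    "data analyst sql python reporting dashboards",
]
is_fake = [1, 0, 1, 0, 1, 0]

X_train, X_test, y_train, y_test = train_test_split(
    postings, is_fake, test_size=0.33, random_state=1, stratify=is_fake)

for model in (MultinomialNB(), SGDClassifier(random_state=1)):
    pipe = make_pipeline(TfidfVectorizer(), model)
    pipe.fit(X_train, y_train)
    pred = pipe.predict(X_test)
    print(type(model).__name__,
          f"accuracy={accuracy_score(y_test, pred):.3f}",
          f"F1={f1_score(y_test, pred):.3f}")
```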


Author(s):  
Ruchika Malhotra ◽  
Arvinder Kaur ◽  
Yogesh Singh

Metrics are available for predicting fault-prone classes, which may help software organizations plan and perform testing activities through proper allocation of resources to the fault-prone parts of the design and code of the software. The importance and usefulness of such metrics is therefore clear, but their empirical validation is always a great challenge. The Random Forest (RF) algorithm has been successfully applied to regression and classification problems in many applications. In this work, the authors predict faulty classes/modules using object-oriented metrics and static code metrics. This chapter evaluates the capability of the RF algorithm and compares its performance with nine statistical and machine learning methods in predicting fault-prone software classes. The authors applied RF to six case studies based on open source and commercial software and NASA data sets. The results indicate that the prediction performance of RF is generally better than that of the statistical and machine learning models, and the classification of faulty classes/modules using the RF method is better than the other methods in most of the data sets.
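
A schematic of this kind of evaluation, under stated assumptions: the feature columns follow the common CK object-oriented metric suite (WMC, CBO, RFC, LCOM), and the data are synthetic stand-ins for the six case-study datasets.

```python
# Schematic fault-proneness prediction from object-oriented metrics.
# Columns assume CK-style features: WMC, CBO, RFC, LCOM (synthetic here).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(7)
n = 400
X = rng.poisson(lam=(8, 5, 20, 3), size=(n, 4)).astype(float)
# Synthetic ground truth: classes with higher complexity/coupling
# (WMC + CBO) are more likely to be fault prone.
y = (X[:, 0] + X[:, 1] + rng.normal(0, 2, n) > 13).astype(int)

for clf in (RandomForestClassifier(n_estimators=300, random_state=7),
            LogisticRegression(max_iter=1000)):
    auc = cross_val_score(clf, X, y, cv=5, scoring="roc_auc").mean()
    print(type(clf).__name__, f"cross-validated AUC: {auc:.2f}")
```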


2021 ◽  
Vol 20 (5s) ◽  
pp. 1-26
Author(s):  
Yeli Feng ◽  
Daniel Jun Xian Ng ◽  
Arvind Easwaran

Uncertainties in machine learning are a significant roadblock for its application in safety-critical cyber-physical systems (CPS). One source of uncertainty arises from distribution shifts in the input data between training and test scenarios. Detecting such distribution shifts in real time is an emerging approach to addressing the challenge, and the high-dimensional input space in CPS applications involving imaging adds extra difficulty to the task. Generative learning models are widely adopted for this task, known as out-of-distribution (OoD) detection. To improve on the state of the art, we studied existing proposals from both the machine learning and CPS fields; in the latter, real-time safety monitoring for autonomous driving agents has been a focus. Exploiting the spatiotemporal correlation of motion in videos, we can robustly detect hazardous motion around autonomous driving agents. Inspired by the latest advances in Variational Autoencoder (VAE) theory and practice, we tapped into prior knowledge in the data to further boost the robustness of OoD detection. Comparison studies over the nuScenes and Synthia data sets show our methods significantly improve detection of OoD factors unique to driving scenarios, 42% better than state-of-the-art approaches. Our model also generalized near-perfectly, 97% better than the state of the art across the real-world and simulated driving data sets tested. Finally, we customized one proposed method into a twin-encoder model that can be deployed on resource-limited embedded devices for real-time OoD detection. Its execution time was reduced more than fourfold with low-precision 8-bit integer inference, while its detection capability remained comparable to that of the corresponding floating-point model.
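
The paper's twin-encoder model and motion features are not specified in this abstract; the PyTorch sketch below shows only the generic principle it builds on: train a VAE on in-distribution images and flag inputs whose ELBO-based score is unusually poor as OoD. Shapes, thresholds, and the untrained model are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    def __init__(self, dim=784, hidden=256, z=32):
        super().__init__()
        self.enc = nn.Linear(dim, hidden)
        self.mu, self.logvar = nn.Linear(hidden, z), nn.Linear(hidden, z)
        self.dec = nn.Sequential(nn.Linear(z, hidden), nn.ReLU(),
                                 nn.Linear(hidden, dim), nn.Sigmoid())

    def forward(self, x):
        h = F.relu(self.enc(x))
        mu, logvar = self.mu(h), self.logvar(h)
        zs = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        return self.dec(zs), mu, logvar

def ood_score(model, x):
    """Negative ELBO: reconstruction error + KL term. Higher = more OoD."""
    recon, mu, logvar = model(x)
    rec = F.binary_cross_entropy(recon, x, reduction="none").sum(dim=1)
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=1)
    return rec + kl

# (Training loop on in-distribution data omitted for brevity; scores from
# this untrained model only demonstrate the shape of the computation.)
model = VAE()
x_in = torch.rand(8, 784)            # stand-ins for in-distribution frames
scores = ood_score(model, x_in)
threshold = scores.mean() + 3 * scores.std()
x_new = torch.rand(1, 784)
print("OoD" if ood_score(model, x_new).item() > threshold else "in-distribution")
```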


2018 ◽  
Vol 7 (3.27) ◽  
pp. 192
Author(s):  
Pavani ◽ 
U V. Anbazhagu ◽ 
Bhavadharani ◽ 
M Latha ◽  
J Senthil

The main objective of this project is to describe strategies for automatically generating and scoring a new sentiment lexicon for sentiment analysis. Sentiment analysis is one of the major tasks in machine learning for text processing. People post their feelings and thoughts about products on e-commerce websites (for example, Amazon, Flipkart, etc.), and readers often want to know whether these posts are positive, negative or neutral. Existing word-embedding learning algorithms typically use only the contexts of words and ignore the sentiment of the texts. We apply our approach to word-level and sentence-level sentiment classification as well as to sentiment lexicons. The data used in this study are online product data sets collected from amazon.com. Experiments at both the sentence level and the word level are performed.
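
The embedding model the authors learn is not described in this abstract, so here is a self-contained toy sketch of the underlying idea only: score candidate lexicon words by their embedding similarity to positive and negative seed words. The tiny hand-written 3-d vectors are stand-ins for learned word embeddings.

```python
# Toy lexicon scoring: a word's sentiment score is its average cosine
# similarity to positive seed words minus that to negative seed words.
import numpy as np

emb = {
    "good":   np.array([0.9, 0.1, 0.0]),
    "great":  np.array([0.8, 0.2, 0.1]),
    "bad":    np.array([-0.9, 0.1, 0.0]),
    "awful":  np.array([-0.8, 0.0, 0.2]),
    "sturdy": np.array([0.6, 0.3, 0.1]),
    "flimsy": np.array([-0.7, 0.2, 0.1]),
}
POS_SEEDS, NEG_SEEDS = ["good", "great"], ["bad", "awful"]

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def lexicon_score(word):
    v = emb[word]
    pos = np.mean([cos(v, emb[s]) for s in POS_SEEDS])
    neg = np.mean([cos(v, emb[s]) for s in NEG_SEEDS])
    return pos - neg   # > 0 leans positive, < 0 leans negative

for w in ("sturdy", "flimsy"):
    print(w, round(lexicon_score(w), 3))
```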

