K′ times k-means logistic regression algorithm for imbalanced classification

Author(s):  
Yanfeng Zhang ◽  
Lichun Wang
2020 ◽  
Vol 30 (1) ◽  
pp. 192-208 ◽  
Author(s):  
Hamza Aldabbas ◽  
Abdullah Bajahzar ◽  
Meshrif Alruily ◽  
Ali Adil Qureshi ◽  
Rana M. Amir Latif ◽  
...  

Abstract Maintaining a competitive edge in the mobile application market requires evaluating the needs of a quality app. User feedback on these applications plays an essential role in the mobile application development industry. The rapid growth of web technology has given people the opportunity to review, rate, and share feedback about applications. In this paper, we scraped 506,259 user reviews and application ratings from the Google Play Store across 14 different categories. Statistical results were obtained using several common machine learning algorithms: Logistic Regression, Random Forest Classifier, and Multinomial Naïve Bayes. Accuracy, precision, recall, and F1 score were used to evaluate bigram, trigram, and n-gram features, and the results of these algorithms were compared. Each algorithm was analyzed in turn and its results evaluated. It is concluded that logistic regression is the best algorithm for review analysis of Google Play Store applications. The results have been checked scientifically, and the accuracy of the logistic regression algorithm was assessed for classifying reviews into three classes, i.e., positive, negative, and neutral.
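The n-gram features and evaluation metrics named in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the helper names (`word_ngrams`, `per_class_scores`, `accuracy`), the whitespace tokenizer, and the toy labels are all assumptions made for illustration.

```python
def word_ngrams(text, n):
    """Return the word n-grams (as tuples) of a lowercased, whitespace-split text."""
    tokens = text.lower().split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def per_class_scores(y_true, y_pred, labels):
    """Compute one-vs-rest precision, recall, and F1 for each class label."""
    scores = {}
    pairs = list(zip(y_true, y_pred))
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = {"precision": precision, "recall": recall, "f1": f1}
    return scores


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical review text and labels, not taken from the paper's data.
review = "great app but crashes often"
bigrams = word_ngrams(review, 2)
trigrams = word_ngrams(review, 3)

labels = ["positive", "negative", "neutral"]
y_true = ["positive", "negative", "neutral", "positive", "negative", "neutral"]
y_pred = ["positive", "negative", "positive", "positive", "neutral", "neutral"]
scores = per_class_scores(y_true, y_pred, labels)
acc = accuracy(y_true, y_pred)
```

In a real study the n-grams would be turned into count or TF-IDF vectors before being passed to a classifier; the sketch only shows how the features and the per-class metrics are defined.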


2021 ◽  
Vol 2083 (3) ◽  
pp. 032059
Author(s):  
Qiang Chen ◽  
Meiling Deng

Abstract Regression algorithms are commonly used in machine learning. Building on encryption and privacy-protection methods, this paper studies regression algorithms, a current key technology, together with the same encryption technology. It proposes a PPLAR-based algorithm in which the correlation between data items is obtained with the logistic regression formula. The algorithm is distributed and parallelized on the Hadoop platform to improve the computing speed of the cluster while preserving the algorithm's average absolute error.


Ethiopia has great agricultural potential because of its vast areas of fertile land, diverse climate, generally adequate rainfall, and large labor force. Given the sector's established importance to the Ethiopian economy, there is sufficient evidence that its potential can be expanded considerably by attracting investors. This study applies classification techniques to develop a predictive model that can estimate the yield of vegetable crops and the correlation of crops based on their class. Building the model involved three major steps: data collection, data preprocessing, and model building and validation. The data were collected from the Food and Agriculture Organization of the United Nations (FAO). Preprocessing consisted of data cleaning, discretization, and attribute selection. The final step, model building and validation, was performed using the selected tools and techniques. The data mining tool used in this research was Weka, in which the logistic regression algorithm was selected because it achieved higher accuracy. After successive experiments with this software, a model was obtained that classifies crop yield as high, medium, or low with an accuracy of 88.6%. Experimental results show that logistic regression is a very helpful tool for yield estimation and crop correlation. The reported findings are optimistic, making the proposed model a useful tool in the decision-making process. Ultimately, the whole research process can be a good input for further in-depth research.
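The abstract mentions discretization and a three-way yield class (high, medium, low) but does not say how the bins were chosen. As one possible illustration, the sketch below uses equal-frequency (tercile) binning; this choice, the function name `tercile_labels`, and the yield values are assumptions, not details from the study or from Weka.

```python
def tercile_labels(values):
    """Label each value low/medium/high by rank tercile (equal-frequency binning)."""
    n = len(values)
    names = ["low", "medium", "high"]
    # Indices of the values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: values[i])
    labels = [None] * n
    for rank, i in enumerate(order):
        # rank * 3 // n maps ranks to bin 0, 1, or 2; min() guards the top edge.
        labels[i] = names[min(rank * 3 // n, 2)]
    return labels


# Hypothetical yield figures (arbitrary units), for illustration only.
yields = [12.0, 30.5, 18.2, 45.1, 22.7, 9.8]
classes = tercile_labels(yields)
```

Equal-width bins or domain-defined thresholds would be equally plausible alternatives; the resulting class labels are what a classifier such as logistic regression would then be trained to predict.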


Author(s):  
Charles M. Pérez-Espinoza ◽  
Nuvia Beltran-Robayo ◽  
Teresa Samaniego-Cobos ◽  
Abel Alarcón-Salvatierra ◽  
Ana Rodriguez-Mendez ◽  
...  

Scientific knowledge and electronic devices are growing day by day. In this context, many expert systems in the healthcare industry use machine learning algorithms. Deep neural networks outperform traditional machine learning techniques and often take raw, i.e., unrefined, data as input to compute the target output. Deep learning, or feature learning, focuses on the most important features and gives a fuller understanding of the generated model. The existing methodology used a data mining technique, a rule-based classification algorithm, and a machine learning technique, a hybrid logistic regression algorithm, to preprocess data and extract meaningful insights from it. That methodology, however, relies on supervised data. The proposed work is based on unsupervised data, that is, data without labels, and deep neural techniques are deployed to obtain the target output. Machine learning algorithms are compared in terms of accuracy with the proposed deep learning techniques, implemented using TensorFlow and Keras. The deep learning methodology outperforms the existing rule-based classification and hybrid logistic regression algorithms in terms of accuracy. The designed methodology is tested on the public MIT-BIH arrhythmia database, classifying four kinds of abnormal beats. The proposed deep learning approach offered better performance, improving on the results of state-of-the-art machine learning approaches.


2019 ◽  
Vol 8 (4) ◽  
pp. 9044-9049

Diabetes mellitus is one of the chronic and deadliest diseases and is characterized by an abnormally high level of sugar (glucose) in the blood. Classification techniques help in diagnosing the symptoms at an early stage. This paper aims to predict the likelihood of diabetes in patients with highly accurate classification. The classification algorithms Naïve Bayes, Logistic Regression, and Decision Tree can be used to detect diabetes at an early stage. The performance of the algorithms is evaluated using measures such as Recall, Precision, and F-Measure. Experiments are conducted in which the time complexity of each algorithm is measured. Accuracy is also measured over correctly and incorrectly classified instances, and it is observed that the Logistic Regression algorithm performs much better than the other classifiers. The results are verified systematically using Receiver Operating Characteristic curves.
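The Receiver Operating Characteristic check mentioned in this abstract can be sketched as follows. This is an illustrative computation only: the function names (`roc_points`, `auc`) and the toy labels and scores are assumptions, and the paper does not state how its curves were produced.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) points obtained by lowering the decision threshold.

    y_true holds 0/1 labels; scores are classifier outputs where a higher
    value means "more likely positive". Tied scores share a single point.
    """
    pos = sum(y_true)
    neg = len(y_true) - pos
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    points = [(0.0, 0.0)]
    tp = fp = 0
    prev = None
    for i in order:
        # Emit a point each time the threshold passes a new distinct score.
        if prev is not None and scores[i] != prev:
            points.append((fp / neg, tp / pos))
        if y_true[i] == 1:
            tp += 1
        else:
            fp += 1
        prev = scores[i]
    points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


# Hypothetical labels (1 = diabetic) and predicted probabilities.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
points = roc_points(y_true, scores)
area = auc(points)
```

An area near 1 indicates that positive cases are consistently ranked above negative ones, while 0.5 corresponds to random ranking, which is why the curve serves as a check that is independent of any single decision threshold.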

