Enabling Accurate Indoor Localization Using a Machine Learning Algorithm

In this paper, fingerprint referencing methods based on wireless fidelity Wi-Fi received signal strength (RSS) have used for indoor positioning. More precisely, Naïve Bayes, decision tree (DT), and support vector machine (SVM) one-to-one multi-classes and error-correcting-output-codes classifier are to enable accurate indoor positioning. Then, normalization is used to reduce positioning error by reducing the fluctuation and diverse distribution of the RSS values. Different devices are used in this experiment; the training dataset is not included in the main dataset. Nonetheless, the learned model by the SVM algorithm cannot be affected by the elimination of train datasets of the test device. The efficiency of DT is lower than the other machine learning algorithms, because it performs by Boolean function, and it provides the low accuracy of prediction for dataset than the algorithms. Naïve Bayes technique based on Bayes Theorem is better than DT and close to SVM for positioning approves that 1–1.5 m positioning accuracy for indoor environments can be achieved by the proposed approach which is an excellent result than traditional protocol.

Download Full-text

Predicting Student’s Performance Using Machine Learning Algorithm

International Journal of Advanced Research in Science, Communication and Technology ◽

10.48175/ijarsct-1209 ◽

2021 ◽

pp. 53-58

Author(s):

Sheela Rani P ◽

Dhivya S ◽

Dharshini Priya M ◽

Dharmila Chowdary A

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Prediction Model ◽

Naive Bayes ◽

Learning Algorithm ◽

Naïve Bayes ◽

Machine Learning Algorithms ◽

Support Vector ◽

Learning Approaches ◽

K Nearest Neighbors

Machine learning is a new analysis discipline that uses knowledge to boost learning, optimizing the training method and developing the atmosphere within which learning happens. There square measure 2 sorts of machine learning approaches like supervised and unsupervised approach that square measure accustomed extract the knowledge that helps the decision-makers in future to require correct intervention. This paper introduces an issue that influences students' tutorial performance prediction model that uses a supervised variety of machine learning algorithms like support vector machine , KNN(k-nearest neighbors), Naïve Bayes and supplying regression and logistic regression. The results supported by various algorithms are compared and it is shown that the support vector machine and Naïve Bayes performs well by achieving improved accuracy as compared to other algorithms. The final prediction model during this paper may have fairly high prediction accuracy .The objective is not just to predict future performance of students but also provide the best technique for finding the most impactful features that influence student’s while studying.

Download Full-text

Cyber Bullying Detection for Twitter Using ML Classification Algorithms

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.38701 ◽

2021 ◽

Vol 9 (11) ◽

pp. 24-29

Author(s):

Muskan Patidar

Keyword(s):

Machine Learning ◽

Social Media ◽

Natural Language ◽

Naive Bayes ◽

Learning Algorithms ◽

Naïve Bayes ◽

Cyber Bullying ◽

Machine Learning Algorithms ◽

Support Vector ◽

Classification Algorithms

Abstract: Social networking platforms have given us incalculable opportunities than ever before, and its benefits are undeniable. Despite benefits, people may be humiliated, insulted, bullied, and harassed by anonymous users, strangers, or peers. Cyberbullying refers to the use of technology to humiliate and slander other people. It takes form of hate messages sent through social media and emails. With the exponential increase of social media users, cyberbullying has been emerged as a form of bullying through electronic messages. We have tried to propose a possible solution for the above problem, our project aims to detect cyberbullying in tweets using ML Classification algorithms like Naïve Bayes, KNN, Decision Tree, Random Forest, Support Vector etc. and also we will apply the NLTK (Natural language toolkit) which consist of bigram, trigram, n-gram and unigram on Naïve Bayes to check its accuracy. Finally, we will compare the results of proposed and baseline features with other machine learning algorithms. Findings of the comparison indicate the significance of the proposed features in cyberbullying detection. Keywords: Cyber bullying, Machine Learning Algorithms, Twitter, Natural Language Toolkit

Download Full-text

Machine Learning Readmission Risk Modeling: A Pediatric Case Study

BioMed Research International ◽

10.1155/2019/8532892 ◽

2019 ◽

Vol 2019 ◽

pp. 1-9 ◽

Cited By ~ 3

Author(s):

Patricio Wolff ◽

Manuel Graña ◽

Sebastián A. Ríos ◽

Maria Begoña Yarza

Keyword(s):

Machine Learning ◽

Multilayer Perceptron ◽

Naive Bayes ◽

Class Imbalance ◽

Predictive Performance ◽

Naïve Bayes ◽

Distribution Model ◽

Training Dataset ◽

Support Vector ◽

Pediatric Hospital

Background. Hospital readmission prediction in pediatric hospitals has received little attention. Studies have focused on the readmission frequency analysis stratified by disease and demographic/geographic characteristics but there are no predictive modeling approaches, which may be useful to identify preventable readmissions that constitute a major portion of the cost attributed to readmissions.Objective. To assess the all-cause readmission predictive performance achieved by machine learning techniques in the emergency department of a pediatric hospital in Santiago, Chile.Materials. An all-cause admissions dataset has been collected along six consecutive years in a pediatric hospital in Santiago, Chile. The variables collected are the same used for the determination of the child’s treatment administrative cost.Methods. Retrospective predictive analysis of 30-day readmission was formulated as a binary classification problem. We report classification results achieved with various model building approaches after data curation and preprocessing for correction of class imbalance. We compute repeated cross-validation (RCV) with decreasing number of folders to assess performance and sensitivity to effect of imbalance in the test set and training set size.Results. Increase in recall due to SMOTE class imbalance correction is large and statistically significant. The Naive Bayes (NB) approach achieves the best AUC (0.65); however the shallow multilayer perceptron has the best PPV and f-score (5.6 and 10.2, resp.). The NB and support vector machines (SVM) give comparable results if we consider AUC, PPV, and f-score ranking for all RCV experiments. High recall of deep multilayer perceptron is due to high false positive ratio. There is no detectable effect of the number of folds in the RCV on the predictive performance of the algorithms.Conclusions. We recommend the use of Naive Bayes (NB) with Gaussian distribution model as the most robust modeling approach for pediatric readmission prediction, achieving the best results across all training dataset sizes. The results show that the approach could be applied to detect preventable readmissions.

Download Full-text

Evaluation of Prognosis in Nasopharyngeal Cancer Using Machine Learning

Technology in Cancer Research & Treatment ◽

10.1177/1533033820909829 ◽

2020 ◽

Vol 19 ◽

pp. 153303382090982

Author(s):

Melek Akcay ◽

Durmus Etiz ◽

Ozer Celik ◽

Alaattin Ozen

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Naive Bayes ◽

Nasopharyngeal Cancer ◽

Naïve Bayes ◽

Machine Learning Algorithms ◽

Support Vector ◽

Tumor Diameter ◽

Survival Prognosis ◽

Data Set

Background and Aim: Although the prognosis of nasopharyngeal cancer largely depends on a classification based on the tumor-lymph node metastasis staging system, patients at the same stage may have different clinical outcomes. This study aimed to evaluate the survival prognosis of nasopharyngeal cancer using machine learning. Settings and Design: Original, retrospective. Materials and Methods: A total of 72 patients with a diagnosis of nasopharyngeal cancer who received radiotherapy ± chemotherapy were included in the study. The contribution of patient, tumor, and treatment characteristics to the survival prognosis was evaluated by machine learning using the following techniques: logistic regression, artificial neural network, XGBoost, support-vector clustering, random forest, and Gaussian Naive Bayes. Results: In the analysis of the data set, correlation analysis, and binary logistic regression analyses were applied. Of the 18 independent variables, 10 were found to be effective in predicting nasopharyngeal cancer-related mortality: age, weight loss, initial neutrophil/lymphocyte ratio, initial lactate dehydrogenase, initial hemoglobin, radiotherapy duration, tumor diameter, number of concurrent chemotherapy cycles, and T and N stages. Gaussian Naive Bayes was determined as the best algorithm to evaluate the prognosis of machine learning techniques (accuracy rate: 88%, area under the curve score: 0.91, confidence interval: 0.68-1, sensitivity: 75%, specificity: 100%). Conclusion: Many factors affect prognosis in cancer, and machine learning algorithms can be used to determine which factors have a greater effect on survival prognosis, which then allows further research into these factors. In the current study, Gaussian Naive Bayes was identified as the best algorithm for the evaluation of prognosis of nasopharyngeal cancer.

Download Full-text

Future Prediction of Diabetics using XG Booster Classifiers

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.c5144.029320 ◽

2020 ◽

Vol 9 (3) ◽

pp. 2128-2132

Keyword(s):

Machine Learning ◽

Decision Tree ◽

Naive Bayes ◽

Naïve Bayes ◽

The Body ◽

Machine Learning Algorithms ◽

Support Vector ◽

Common Disease ◽

Data Set ◽

Glucose Content

Diabetes is a most common disease that occurs to most of the humans now a day. The predictions for this disease are proposed through machine learning techniques. Through this method the risk factors of this disease are identified and can be prevented from increasing. Early prediction in such disease can be controlled and save human’s life. For the early predictions of this disease we collect data set having 8 attributes diabetic of 200 patients. The patients’ sugar level in the body is tested by the features of patient’s glucose content in the body and according to the age. The main Machine learning algorithms are Support vector machine (SVM), naive bayes (NB), K nearest neighbor (KNN) and Decision Tree (DT). In the exiting the Naive Bayes the accuracy levels are 66% but in the Decision tree the accuracy levels are 70 to 71%. The accuracy levels of the patients are not proper in range. But in XG boost classifiers even after the Naïve Bayes 74 Percentage and in Decision tree the accuracy levels are 89 to 90%. In the proposed system the accuracy ranges are shown properly and this is only used mostly. A dataset of 729 patients can be stored in Mongo DB and in that 129 patients repots are taken for the prediction purpose and the remaining are used for training. The training datasets are used for the prediction purposes.

Download Full-text

Effectiveness of Classification Methods on the Diabetes System

Asian Journal of Research in Computer Science ◽

10.9734/ajrcos/2021/v12i330287 ◽

2021 ◽

pp. 33-43

Author(s):

Ahmed T. Shawky ◽

Ismail M. Hagag

Keyword(s):

Machine Learning ◽

Naive Bayes ◽

Early Stage ◽

Research Paper ◽

Naïve Bayes ◽

Machine Learning Algorithms ◽

Support Vector ◽

K Nearest Neighbor ◽

Machine Learning Classification ◽

Bayes Algorithm

In today’s world using data mining and classification is considered to be one of the most important techniques, as today’s world is full of data that is generated by various sources. However, extracting useful knowledge out of this data is the real challenge, and this paper conquers this challenge by using machine learning algorithms to use data for classifiers to draw meaningful results. The aim of this research paper is to design a model to detect diabetes in patients with high accuracy. Therefore, this research paper using five different algorithms for different machine learning classification includes, Decision Tree, Support Vector Machine (SVM), Random Forest, Naive Bayes, and K- Nearest Neighbor (K-NN), the purpose of this approach is to predict diabetes at an early stage. Finally, we have compared the performance of these algorithms, concluding that K-NN algorithm is a better accuracy (81.16%), followed by the Naive Bayes algorithm (76.06%).

Download Full-text

Ammonoid Taxonomy with Supervised and Unsupervised Machine Learning Algorithms

10.31233/osf.io/ewkx9 ◽

2021 ◽

Author(s):

Floe Foxon

Keyword(s):

Machine Learning ◽

Naive Bayes ◽

Learning Algorithms ◽

Clustering Algorithms ◽

Measurement Data ◽

Naïve Bayes ◽

Machine Learning Algorithms ◽

Gradient Boosting ◽

Support Vector ◽

Unsupervised Machine Learning

Ammonoid identification is crucial to biostratigraphy, systematic palaeontology, and evolutionary biology, but may prove difficult when shell features and sutures are poorly preserved. This necessitates novel approaches to ammonoid taxonomy. This study aimed to taxonomize ammonoids by their conch geometry using supervised and unsupervised machine learning algorithms. Ammonoid measurement data (conch diameter, whorl height, whorl width, and umbilical width) were taken from the Paleobiology Database (PBDB). 11 species with ≥50 specimens each were identified providing N=781 total unique specimens. Naive Bayes, Decision Tree, Random Forest, Gradient Boosting, K-Nearest Neighbours, and Support Vector Machine classifiers were applied to the PBDB data with a 5x5 nested cross-validation approach to obtain unbiased generalization performance estimates across a grid search of algorithm parameters. All supervised classifiers achieved ≥70% accuracy in identifying ammonoid species, with Naive Bayes demonstrating the least over-fitting. The unsupervised clustering algorithms K-Means, DBSCAN, OPTICS, Mean Shift, and Affinity Propagation achieved Normalized Mutual Information scores of ≥0.6, with the centroid-based methods having most success. This presents a reasonably-accurate proof-of-concept approach to ammonoid classification which may assist identification in cases where more traditional methods are not feasible.

Download Full-text

Classification of forest fire areas using machine learning algorithm

World Journal of Advanced Engineering Technology and Sciences ◽

10.30574/wjaets.2021.3.1.0048 ◽

2021 ◽

Vol 3 (1) ◽

pp. 008-015

Author(s):

Saruni Dwiasnati ◽

Yudo Devianto

Keyword(s):

Machine Learning ◽

Forest Fire ◽

Forest Fires ◽

Naive Bayes ◽

Learning Algorithm ◽

Fire Hazard ◽

Naïve Bayes ◽

Machine Learning Algorithms ◽

Burned Area ◽

K Nearest Neighbor

Forest fires that occur will cause various kinds of problems, both in terms of health, such as smoke that can interfere with the respiratory system, in terms of the economy such as the economic wheel cannot run as usual, in terms of the environment can damage the surrounding environment and the environment that is missed by smoke, and other disasters. Forest fires can also have an impact on the costs that will be incurred to resolve the problems that arise due to forest fires, so research is needed to find out and measure the area affected by forest fires that burned in the range of 1980 - 2019 using a dataset of approximately 10,000. The target in this research is to be able to generate the best percentage scenario and find out the model of using the algorithm used to explore the algorithm in the Machine Learning method for the model for estimating the area of forest fires, namely the Siak Kampar Peninsula in Riau Province. In this study, 7 parameters were used to create a forest and land fire hazard map, namely weather temperature, Burned Area Density, hotspot density, wind speed, land cover type, rainfall, and land use. The seven parameters will be searched for accuracy results using the Classification method with Machine Learning algorithms, including Naïve Bayes, SVM, and K-Nearest Neighbor (K-NN). In this study, comparisons were made to obtain the best algorithm for estimating forest fire areas. By generating each algorithm is 71.72% for the Naïve Bayes algorithm, 75.00% for the SVM algorithm, and 64.71% for the K-NN algorithm.

Download Full-text

CLASSIFICATION OF HEAD AND NECK CANCER TYPES USING MACHINE LEARNING ALGORITHM

EPRA International Journal of Research & Development (IJRD) ◽

10.36713/epra3289 ◽

2020 ◽

pp. 198-205

Author(s):

Prof O. Olabode ◽

Prof A. O. Adetunmbi ◽

Folake Akinbohun ◽

Dr Ambrose Akinbohun

Keyword(s):

Machine Learning ◽

Head And Neck Cancer ◽

Head And Neck ◽

Neck Cancer ◽

Naive Bayes ◽

Information Gain ◽

Learning Algorithm ◽

Naïve Bayes ◽

Machine Learning Algorithms ◽

Chi Square

The worldwide incidence of head and neck cancer exceeds half a million cases annually. The morbidity and mortality of head and neck cancers considering thyroid, nasopharyngeal, sinonasal and laryngeal were reported high. The degree of facial disfigurement is unrivalled. Information Gain and Chi Square, Decision and Naïve Bayes were deployed for the study. The dataset was divided into training and test data. The results showed that the performance of Naïve Bayes outperformed Decision Trees. With the application of machine learning algorithms, head and neck cancer can be classified. KEYWORDS: Head and Neck, thyroid, Chi Square, Information Gain

Download Full-text

Author identification for Under-Resourced language (KadazanDusun)

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v17.i1.pp248-255 ◽

2020 ◽

Vol 17 (1) ◽

pp. 248 ◽

Cited By ~ 1

Author(s):

Nursyahirah Tarmizi ◽

Suhaila Saee ◽

Dayang Hanani Abang Ibrahim

Keyword(s):

Machine Learning ◽

Naive Bayes ◽

Naïve Bayes ◽

Machine Learning Algorithms ◽

Support Vector ◽

Svm Classifier ◽

Identification Task ◽

Data Set ◽

Short Text ◽

Author Identification

<span>This paper presents the task of Author Identification for KadazanDusun language by using tweets as the source of data to perform Author Identification task of short text on KadazanDusun, which is considered as one the under-resourced language in Malaysia. The aim of this paper is to demonstrate Author Identification of short text on KadazanDusun. Besides, this paper also examines the performance of two machine learning algorithms on the KadazanDusun data set by analyzing the stylometric features. Stylometric features are used to quantify the writing styles of the authors which includes character n-grams and word n-grams. The workflow of Author Identification implements the machine learning approach to solve the single-labelled multi-class problem and predict the author of a given message in KadazanDusun. Two classifiers are used to compare the accuracy including Naïve Bayes and Support Vector Machine (SVM). The results show that the combination of n-grams which is word-level unigram and {1-5}-grams with character 3-grams are the most relevant stylometric features in identifying the author of KadazanDusun message with an accuracy of 80.17%. The results also show that SVM classifier has outperformed Naive Bayes in this Author Identification task with the accuracy of 80.17%.</span>

Download Full-text