Machine Learning Assisted Cervical Cancer Detection

2018 ◽

pp. 122-139

Author(s):

Saugata Bose ◽

Ritambhra Korpal

Keyword(s):

Machine Learning ◽

Language Processing ◽

Confusion Matrix ◽

False Negative ◽

False Negative Rate ◽

Search Space ◽

Machine Learning Algorithms ◽

C4.5 Decision Tree ◽

N Gram ◽

Four Levels

This chapter proposes an approach that combines natural language processing (NLP) techniques with supervised machine learning algorithms to detect external plagiarism. The main emphasis is on constructing a framework that detects plagiarism in monolingual texts using an n-gram frequency comparison approach. The framework is based on 120 characteristics extracted during pre-processing with simple NLP techniques. Filter metrics are then applied to select the most relevant features, and a supervised classification algorithm classifies the documents into four levels of plagiarism. A confusion matrix is built to estimate the false positives and false negatives. Finally, the authors show that a C4.5 decision tree-based classifier is more suitable than naive Bayes in terms of accuracy. The framework achieved 89% accuracy with low false positive and false negative rates, and it shows higher precision and recall than the passage similarity, sentence similarity, and search space reduction methods.
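
As a rough illustration of what an n-gram frequency comparison can look like, the sketch below counts word n-grams in two texts and compares the resulting frequency profiles with cosine similarity. The tokenization, the choice of cosine similarity, and the function names `ngram_counts` and `cosine_similarity` are assumptions made for this example; the abstract does not specify how the chapter computes or compares its n-gram features.

```python
from collections import Counter
import math


def ngram_counts(text: str, n: int = 3) -> Counter:
    """Count word n-grams in lowercased, whitespace-tokenized text."""
    tokens = text.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two n-gram frequency profiles (assumed metric)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # at least one text is too short to contain any n-gram
    return dot / (norm_a * norm_b)


# Hypothetical source and suspicious passages that differ by one word.
source = "the quick brown fox jumps over the lazy dog"
suspicious = "the quick brown fox leaps over the lazy dog"
score = cosine_similarity(ngram_counts(source, 2), ngram_counts(suspicious, 2))
print(f"bigram similarity: {score:.2f}")
```

A score near 1 suggests heavy overlap in phrasing, while a score near 0 suggests little shared wording. In a framework like the one described, such scores would be only one of many features passed to the classifier.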

Diagnosis and Classification of the Diabetes Using Machine Learning Algorithms

10.21203/rs.3.rs-514771/v2 ◽

2021 ◽

Author(s):

Prasannavenkatesan Theerthagiri ◽

Usha Ruby A ◽

Vidya J

Keyword(s):

Machine Learning ◽

Multilayer Perceptron ◽

Nearest Neighbor ◽

False Positive Rate ◽

Learning Algorithms ◽

False Negative ◽

False Negative Rate ◽

Disease Diagnosis ◽

Machine Learning Algorithms ◽

K Nearest Neighbor

Diabetes mellitus is a chronic disease that may cause many complications. Machine learning algorithms are used to diagnose and predict diabetes. Learning-based algorithms play a vital role in supporting decision making in disease diagnosis and prediction. In this paper, traditional classification algorithms and neural network based machine learning are investigated on a diabetes dataset. Various performance measures are evaluated for the K-nearest neighbor, Naive Bayes, extra trees, decision tree, radial basis function, and multilayer perceptron algorithms. This work supports estimating which patients are likely to suffer from diabetes in the future. The results show that the multilayer perceptron (MLP) algorithm gives the highest prediction accuracy, with the lowest MSE of 0.19. The MLP also gives the lowest false positive rate and false negative rate, with the highest area under the curve of 86%.
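
The false positive rate, false negative rate, and MSE named above follow standard definitions, sketched below for a binary classifier. The labels and predictions are made-up illustration data rather than results from the paper, and the helper names `binary_rates` and `mean_squared_error` are hypothetical.

```python
def binary_rates(y_true, y_pred):
    """Return (false positive rate, false negative rate) for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fpr = fp / (fp + tn) if (fp + tn) else 0.0  # share of negatives flagged as positive
    fnr = fn / (fn + tp) if (fn + tp) else 0.0  # share of positives that were missed
    return fpr, fnr


def mean_squared_error(y_true, y_pred):
    """Average squared difference between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up example: 1 = diabetic, 0 = non-diabetic.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
fpr, fnr = binary_rates(y_true, y_pred)
mse = mean_squared_error(y_true, y_pred)
print(f"FPR={fpr:.2f} FNR={fnr:.2f} MSE={mse:.2f}")
```

In a diagnostic setting the false negative rate is often the more critical of the two, since it counts patients with the disease whom the model fails to flag.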

Intelligent Personalized Abnormality Detection for Remote Health Monitoring

International Journal of Intelligent Information Technologies ◽

10.4018/ijiit.2020040105 ◽

2020 ◽

Vol 16 (2) ◽

pp. 87-109 ◽

Cited By ~ 1

Author(s):

Poorani Marimuthu ◽

Varalakshmi Perumal ◽

Vaidehi Vijayakumar

Keyword(s):

Machine Learning ◽

Prediction Accuracy ◽

Prediction Models ◽

False Negative ◽

False Negative Rate ◽

Area Under The Curve ◽

Ground Truth ◽

Machine Learning Algorithms ◽

Abnormality Detection ◽

Remote Healthcare

Machine learning algorithms are extensively used in healthcare analytics to learn normal and abnormal patterns automatically. The detection and prediction accuracy of any machine learning model depends on many factors, such as ground truth instances, attribute relationships, model design, the size of the dataset, the percentage of uncertainty, and the training and testing environment. Prediction models in healthcare should generate minimal false positive and false negative rates. To achieve high classification or prediction accuracy, the screening of health status needs to be personalized rather than following general clinical practice guidelines (CPG), which fit an average population. Hence, a personalized screening model for remote healthcare (IPAD, Intelligent Personalized Abnormality Detection) is proposed that is tailored to the specific individual. The severity level of the abnormal status is derived using personalized health values, and the IPAD model obtains an area under the curve (AUC) of 0.907.
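
For readers unfamiliar with the AUC figure, the sketch below computes the area under the ROC curve through its rank interpretation: the probability that a randomly chosen abnormal instance receives a higher score than a randomly chosen normal one, with ties counted as one half. The labels and scores are invented for illustration, and `roc_auc` is a hypothetical helper, not part of the IPAD model.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positive and negative scores."""
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative instance")
    # Each (positive, negative) pair scores 1 if ranked correctly, 0.5 on a tie.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented example: 1 = abnormal reading, 0 = normal reading.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise form is quadratic in the number of instances, which is fine for small examples; larger datasets would normally use a sort-based computation instead.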

Cric searchable image database as a public platform for conventional pap smear cytology data

Scientific Data ◽

10.1038/s41597-021-00933-8 ◽

2021 ◽

Vol 8 (1) ◽

Author(s):

Mariana T. Rezende ◽

Raniere Silva ◽

Fagner de O. Bernardo ◽

Alessandra H. G. Tobias ◽

Paulo H. C. Oliveira ◽

...

Keyword(s):

Machine Learning ◽

Cervical Cancer ◽

Pap Smear ◽

False Negative ◽

Testing Machine ◽

Subsequent Treatment ◽

Machine Learning Algorithms ◽

Pap Smears ◽

Health Crisis ◽

Routine Work

Amidst the current health crisis and social distancing, telemedicine has become an important part of mainstream healthcare, and building and deploying computational tools to support screening more efficiently is an increasing medical priority. The early identification of cervical cancer precursor lesions by the Pap smear test can identify candidates for subsequent treatment. However, one of the main challenges is the accuracy of the conventional method, which is often subject to high false negative rates. While machine learning has been highlighted as a way to reduce the limitations of the test, the absence of high-quality curated datasets has prevented the development of strategies to improve cervical cancer screening. The Center for Recognition and Inspection of Cells (CRIC) platform enables the creation of the CRIC Cervix collection, currently with 400 images (1,376 × 1,020 pixels) curated from conventional Pap smears, with manual classification of 11,534 cells. This collection has the potential to advance current efforts in training and testing machine learning algorithms for the automation of tasks as part of the cytopathological analysis in the routine work of laboratories.

Reducing U2R and R2L category false negative rates with support vector machines

Serbian Journal of Electrical Engineering ◽

10.2298/sjee131007015m ◽

2014 ◽

Vol 11 (1) ◽

pp. 175-188 ◽

Cited By ~ 1

Author(s):

Nemanja Macek ◽

Milan Milosavljevic

Keyword(s):

Machine Learning ◽

Detection System ◽

False Negative ◽

False Negative Rate ◽

Machine Learning Algorithms ◽

Support Vector ◽

Negative Rate ◽

Machine Learning Model ◽

Vector Machines ◽

Feature Values

The KDD Cup '99 is a commonly used dataset for training and testing machine learning algorithms for intrusion detection systems (IDS). Some of the major downsides of the dataset are the distribution and proportions of U2R and R2L instances, which represent the most dangerous attack types, as well as the existence of R2L attack instances identical to normal traffic. This makes the minor categories harder to detect and causes problems when building a machine learning model capable of detecting these attacks with a sufficiently low false negative rate. This paper presents a new support vector machine based intrusion detection system that classifies unknown data instances according to both the feature values and weight factors that represent the importance of features to the classification. An increased detection rate and a significantly decreased false negative rate for the U2R and R2L categories, which have very few instances in the training set, have been empirically demonstrated.

Diagnosis and Classification of the Diabetes Using Machine Learning Algorithms

10.21203/rs.3.rs-514771/v1 ◽

2021 ◽

Author(s):

Prasannavenkatesan Theerthagiri ◽

Usha Ruby A ◽

Vidya J

Keyword(s):

Machine Learning ◽

Multilayer Perceptron ◽

Nearest Neighbor ◽

False Positive Rate ◽

Learning Algorithms ◽

False Negative ◽

False Negative Rate ◽

Disease Diagnosis ◽

Machine Learning Algorithms ◽

K Nearest Neighbor

Diabetes mellitus is a chronic disease that may cause many complications. Machine learning algorithms are used to diagnose and predict diabetes. Learning-based algorithms play a vital role in supporting decision making in disease diagnosis and prediction. In this paper, traditional classification algorithms and neural network based machine learning are investigated on a diabetes dataset. Various performance measures are evaluated for the K-nearest neighbor, Naive Bayes, extra trees, decision tree, radial basis function, and multilayer perceptron algorithms. This work supports estimating which patients are likely to suffer from diabetes in the future. The results show that the multilayer perceptron (MLP) algorithm gives the highest prediction accuracy, with the lowest MSE of 0.19. The MLP also gives the lowest false positive rate and false negative rate, with the highest area under the curve of 86%.

Diagnosis and Classification of the Diabetes Using Machine Learning Algorithms

10.21203/rs.3.rs-514771/v3 ◽

2021 ◽

Author(s):

Prasannavenkatesan Theerthagiri ◽

Usha Ruby A ◽

Vidya J

Keyword(s):

Machine Learning ◽

Multilayer Perceptron ◽

Nearest Neighbor ◽

False Positive Rate ◽

Learning Algorithms ◽

False Negative ◽

False Negative Rate ◽

Disease Diagnosis ◽

Machine Learning Algorithms ◽

K Nearest Neighbor

Diabetes mellitus is a chronic disease that may cause many complications. Machine learning algorithms are used to diagnose and predict diabetes. Learning-based algorithms play a vital role in supporting decision making in disease diagnosis and prediction. In this paper, traditional classification algorithms and neural network based machine learning are investigated on a diabetes dataset. Various performance measures are evaluated for the K-nearest neighbor, Naive Bayes, extra trees, decision tree, radial basis function, and multilayer perceptron algorithms. This work supports estimating which patients are likely to suffer from diabetes in the future. The results show that the multilayer perceptron (MLP) algorithm gives the highest prediction accuracy, with the lowest MSE of 0.19. The MLP also gives the lowest false positive rate and false negative rate, with the highest area under the curve of 86%.

Machine-Learning-Based External Plagiarism Detecting Methodology From Monolingual Documents

Scholarly Ethics and Publishing ◽

10.4018/978-1-5225-8057-7.ch021 ◽

2019 ◽

pp. 442-458

Author(s):

Saugata Bose ◽

Ritambhra Korpal

Keyword(s):

Machine Learning ◽

Language Processing ◽

Confusion Matrix ◽

False Negative ◽

False Negative Rate ◽

Search Space ◽

Machine Learning Algorithms ◽

C4.5 Decision Tree ◽

N Gram ◽

Four Levels

This chapter proposes an approach that combines natural language processing (NLP) techniques with supervised machine learning algorithms to detect external plagiarism. The main emphasis is on constructing a framework that detects plagiarism in monolingual texts using an n-gram frequency comparison approach. The framework is based on 120 characteristics extracted during pre-processing with simple NLP techniques. Filter metrics are then applied to select the most relevant features, and a supervised classification algorithm classifies the documents into four levels of plagiarism. A confusion matrix is built to estimate the false positives and false negatives. Finally, the authors show that a C4.5 decision tree-based classifier is more suitable than naive Bayes in terms of accuracy. The framework achieved 89% accuracy with low false positive and false negative rates, and it shows higher precision and recall than the passage similarity, sentence similarity, and search space reduction methods.

A Method of Apple Image Segmentation Based on Color-Texture Fusion Feature and Machine Learning

Agronomy ◽

10.3390/agronomy10070972 ◽

2020 ◽

Vol 10 (7) ◽

pp. 972 ◽

Cited By ~ 2

Author(s):

Chunlong Zhang ◽

Kunlin Zou ◽

Yue Pan

Keyword(s):

Machine Learning ◽

Image Segmentation ◽

False Positive Rate ◽

False Negative ◽

False Negative Rate ◽

Texture Features ◽

Apple Fruit ◽

Apple Orchard ◽

Machine Learning Algorithms ◽

Color Features

Apples are one of the most important kinds of fruit in the world, and China is the largest apple-producing country. Yield estimation, robotic harvesting, and precise spraying are important processes in precision apple planting. Image segmentation is an important step in machine vision systems for precision apple planting. In this paper, an apple fruit segmentation algorithm for use in the orchard was studied. The effect of many color features in distinguishing apple fruit pixels from other pixels was evaluated, and three color features were selected. These color features could effectively distinguish the apple fruit pixels from other pixels. The GLCM (Grey-Level Co-occurrence Matrix) was used to extract texture features, and the best distance and orientation parameters for the GLCM were found. Nine machine learning algorithms were used to develop pixel classifiers. The classifier was trained with 100 pixels and tested with 100 pixels. The accuracy of the classifier based on Random Forest reached 0.94. One hundred images of an apple orchard were manually labeled with apple fruit pixels and other pixels, and a classifier was used to segment the same images. Regression analysis was performed on the results of manual labeling and classifier classification. The average values of Af (segmentation error), FPR (false positive rate), and FNR (false negative rate) were 0.07, 0.13, and 0.15, respectively. This result showed that the algorithm could segment apple fruit in orchard images effectively. It could provide a reference for precision apple planting management.
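
To make the role of the GLCM distance and orientation parameters concrete, the sketch below builds a co-occurrence matrix for a tiny grey-level image and derives one common texture feature, contrast. The offset `(dx, dy)` encodes both distance and orientation. The 4-level image, the non-symmetric counting, and the choice of contrast are assumptions for illustration; the abstract does not state which texture statistics or parameter values the paper settled on.

```python
def glcm(image, levels, dx=1, dy=0):
    """Count how often grey level i has grey level j at offset (dx, dy)."""
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:  # skip pairs that leave the image
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix


def contrast(matrix):
    """GLCM contrast: squared grey-level difference weighted by pair probability."""
    total = sum(sum(row) for row in matrix)
    weighted = sum((i - j) ** 2 * count
                   for i, row in enumerate(matrix)
                   for j, count in enumerate(row))
    return weighted / total


# Toy 4x4 patch with grey levels 0..3 (illustrative data only).
patch = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
horizontal = glcm(patch, levels=4, dx=1, dy=0)  # distance 1, orientation 0 degrees
print(horizontal)
print(f"contrast = {contrast(horizontal):.4f}")
```

Changing `dx` and `dy` (for example `dx=0, dy=1` for vertical neighbours, or `dx=2` for a larger distance) yields different matrices, which is why a search over these parameters can matter for texture-based pixel classification.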

Prediction of Later-Age Concrete Compressive Strength Using Feedforward Neural Network

Advances in Materials Science and Engineering ◽

10.1155/2020/9682740 ◽

2020 ◽

Vol 2020 ◽

pp. 1-8

Author(s):

Thuy-Anh Nguyen ◽

Hai-Bang Ly ◽

Hai-Van Thi Mai ◽

Van Quan Tran

Keyword(s):

Neural Network ◽

Machine Learning ◽

Compressive Strength ◽

Mean Squared Error ◽

Learning Algorithm ◽

Pearson Correlation ◽

Feedforward Neural Network ◽

Machine Learning Algorithms ◽

Concrete Compressive Strength ◽

Squared Error

Accurate prediction of the concrete compressive strength is an important task that helps to avoid costly and time-consuming experiments. Notably, the determination of the later-age concrete compressive strength is more difficult due to the time required to perform experiments. Therefore, predicting the compressive strength of later-age concrete is crucial in specific applications. In this investigation, an approach using a feedforward neural network (FNN) machine learning algorithm was proposed to predict the compressive strength of later-age concrete. The proposed model was fully evaluated in terms of performance and prediction capability over statistical results of 1000 simulations under a random sampling effect. The results showed that the proposed algorithm was an excellent predictor and might help engineers avoid time-consuming experiments. The statistical performance indicators, namely the Pearson correlation coefficient (R), root-mean-squared error (RMSE), and mean absolute error (MAE), were 0.9861, 2.1501, and 1.5650 for the training part and 0.9792, 2.8510, and 2.1361 for the testing part, respectively. The results also indicated that the FNN model was superior to classical machine learning algorithms such as random forest and Gaussian process regression, as well as empirical formulations proposed in the literature.
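
The three performance indicators reported above can be computed as in the following sketch, which takes MAE in its standard sense of mean absolute error. The strength values are invented for illustration and are not taken from the paper; the function names are hypothetical.

```python
import math


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between measured and predicted values."""
    mean_t = sum(y_true) / len(y_true)
    mean_p = sum(y_pred) / len(y_pred)
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def rmse(y_true, y_pred):
    """Root-mean-squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Invented compressive strengths (same unit for both lists).
measured = [30.0, 40.0, 50.0, 60.0]
predicted = [32.0, 38.0, 53.0, 59.0]
print(f"R={pearson_r(measured, predicted):.4f} "
      f"RMSE={rmse(measured, predicted):.4f} "
      f"MAE={mae(measured, predicted):.4f}")
```

RMSE penalizes large errors more heavily than MAE, so for the same predictions RMSE is never smaller than MAE, which is consistent with the pairs of values reported for the training and testing parts.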
