Machine Learning Assisted Cervical Cancer Detection

2021 ◽  
Vol 9 ◽  
Author(s):  
Mavra Mehmood ◽  
Muhammad Rizwan ◽  
Michal Gregus ml ◽  
Sidra Abbas

Cervical cancer is the fourth most common cause of cancer death in women worldwide, and it is associated with human papillomavirus (HPV) infection. Early screening has made cervical cancer a preventable disease, which reduces its global burden. In developing countries, however, many women lack access to adequate screening programs because regular examinations are costly, awareness is scarce, and medical centers are hard to reach. As a result, the individual patient's expected risk becomes very high. Many risk factors are relevant to cervical cancer. This paper proposes an approach named CervDetect that uses machine learning algorithms to evaluate these risk factors. CervDetect pre-processes the data using the Pearson correlation among the input variables and between the input variables and the output variable. It then uses random forest (RF) feature selection to select significant features. Finally, it uses a hybrid approach that combines RF and shallow neural networks to detect cervical cancer. Results show that CervDetect accurately predicts cervical cancer and outperforms state-of-the-art studies, achieving an accuracy of 93.6%, a mean squared error (MSE) of 0.07111, a false-positive rate (FPR) of 6.4%, and a false-negative rate (FNR) of 100%.
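For readers unfamiliar with the reported metrics, the following is a minimal sketch of how accuracy, FPR, and FNR are conventionally derived from binary confusion counts. It is not the CervDetect code; the function name `confusion_rates` and the example labels are hypothetical and chosen only for illustration.

```python
from typing import Dict, Sequence


def confusion_rates(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Return accuracy, FPR and FNR for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    total = tp + tn + fp + fn
    nan = float("nan")
    return {
        "accuracy": (tp + tn) / total if total else nan,
        # FPR: share of actual negatives that were flagged as positive.
        "fpr": fp / (fp + tn) if (fp + tn) else nan,
        # FNR: share of actual positives that were missed.
        "fnr": fn / (fn + tp) if (fn + tp) else nan,
    }


# Hypothetical labels, invented purely for illustration (not study data).
y_true = [0, 0, 0, 0, 1, 1, 0, 0, 1, 0]
y_pred = [0, 0, 1, 0, 1, 0, 0, 0, 1, 0]
rates = confusion_rates(y_true, y_pred)
print(rates)
```

Because FPR is normalized by the number of actual negatives and FNR by the number of actual positives, the two rates do not need to sum to any particular value, and neither is simply the complement of accuracy.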

Author(s):  
Saugata Bose ◽  
Ritambhra Korpal

In this chapter, an initiative is proposed in which natural language processing (NLP) techniques and supervised machine learning algorithms are combined to detect external plagiarism. The main emphasis is on constructing a framework that detects plagiarism in monolingual texts using an n-gram frequency comparison approach. The framework is based on 120 characteristics extracted during the pre-processing steps with a simple NLP approach. Filter metrics are then applied to select the most relevant features, and a supervised classification algorithm is used to classify the documents into four levels of plagiarism. A confusion matrix is then built to estimate the false positives and false negatives. Finally, the authors show that a C4.5 decision tree-based classifier is more suitable than naive Bayes in terms of accuracy. The framework achieved 89% accuracy with low false positive and false negative rates, and it shows higher precision and recall than the passage similarities method, the sentence similarity method, and the search space reduction method.
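The n-gram frequency comparison mentioned above can be illustrated with a small sketch. The abstract does not specify the tokenization, the n-gram order, or the similarity measure, so word trigrams and cosine similarity are assumptions here, and the helper names `ngram_counts` and `cosine_similarity` are hypothetical.

```python
import re
from collections import Counter
from math import sqrt


def ngram_counts(text: str, n: int = 3) -> Counter:
    """Count word n-grams after lowercasing and stripping punctuation."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two n-gram frequency profiles (0.0 to 1.0)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a text shorter than n words has no n-grams to compare
    return dot / (norm_a * norm_b)


# Hypothetical documents, invented for illustration.
source = "The quick brown fox jumps over the lazy dog."
copied = "the quick brown fox jumps over the lazy dog"
unrelated = "Machine learning models need careful evaluation."

sim_copied = cosine_similarity(ngram_counts(source), ngram_counts(copied))
sim_unrelated = cosine_similarity(ngram_counts(source), ngram_counts(unrelated))
print(sim_copied, sim_unrelated)
```

A similarity score like this would be only one candidate feature; the chapter describes a much larger set of 120 characteristics that are filtered before classification.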


2021 ◽  
Author(s):  
Prasannavenkatesan Theerthagiri ◽  
Usha Ruby A ◽  
Vidya J

Diabetes mellitus is a chronic disease that may cause many complications. Machine learning algorithms are used to diagnose and predict diabetes, and learning-based algorithms play a vital role in supporting decision making in disease diagnosis and prediction. In this paper, traditional classification algorithms and neural network-based machine learning are investigated on a diabetes dataset. Various performance measures covering different aspects are evaluated for the K-nearest neighbor, naive Bayes, extra trees, decision tree, radial basis function, and multilayer perceptron (MLP) algorithms. The work supports estimating which patients may suffer from diabetes in the future. The results show that the multilayer perceptron gives the highest prediction accuracy with the lowest MSE of 0.19. The MLP also gives the lowest false positive rate and false negative rate, with the highest area under the curve of 86%.


2020 ◽  
Vol 16 (2) ◽  
pp. 87-109 ◽  
Author(s):  
Poorani Marimuthu ◽  
Varalakshmi Perumal ◽  
Vaidehi Vijayakumar

Machine learning algorithms are extensively used in healthcare analytics to learn normal and abnormal patterns automatically. The detection and prediction accuracy of any machine learning model depends on many factors, such as ground truth instances, attribute relationships, model design, the size of the dataset, the percentage of uncertainty, and the training and testing environment. Prediction models in healthcare should produce minimal false positive and false negative rates. To achieve high classification or prediction accuracy, the screening of health status needs to be personalized rather than following general clinical practice guidelines (CPG), which fit an average population. Hence, a personalized screening model for remote healthcare, IPAD (Intelligent Personalized Abnormality Detection), is proposed that is tailored to a specific individual. The severity level of the abnormal status is derived from personalized health values, and the IPAD model obtains an area under the curve (AUC) of 0.907.
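As a reference point for the AUC figure, the sketch below computes the area under the ROC curve through its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, with ties counted as one half. This is a generic illustration, not the IPAD evaluation code; the function name `roc_auc` and the example scores are hypothetical.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0   # positive ranked above negative
            elif p == q:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical abnormality scores, invented for illustration.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
print(auc)
```

The pairwise loop is quadratic in the number of cases, which is fine for a small illustration; a sort-based rank computation would be the usual choice for larger datasets.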


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Mariana T. Rezende ◽  
Raniere Silva ◽  
Fagner de O. Bernardo ◽  
Alessandra H. G. Tobias ◽  
Paulo H. C. Oliveira ◽  
...  

Amidst the current health crisis and social distancing, telemedicine has become an important part of mainstream healthcare, and building and deploying computational tools to support screening more efficiently is an increasing medical priority. The early identification of cervical cancer precursor lesions by the Pap smear test can identify candidates for subsequent treatment. However, one of the main challenges is the accuracy of the conventional method, which is often subject to high false negative rates. While machine learning has been highlighted as a way to reduce the limitations of the test, the absence of high-quality curated datasets has prevented the development of strategies to improve cervical cancer screening. The Center for Recognition and Inspection of Cells (CRIC) platform enables the creation of the CRIC Cervix collection, which currently holds 400 images (1,376 × 1,020 pixels) curated from conventional Pap smears, with manual classification of 11,534 cells. This collection has the potential to advance current efforts in training and testing machine learning algorithms for automating tasks that are part of cytopathological analysis in routine laboratory work.


2014 ◽  
Vol 11 (1) ◽  
pp. 175-188 ◽  
Author(s):  
Nemanja Macek ◽  
Milan Milosavljevic

The KDD Cup '99 is a commonly used dataset for training and testing IDS machine learning algorithms. Some of the major downsides of the dataset are the distribution and proportions of U2R and R2L instances, which represent the most dangerous attack types, as well as the existence of R2L attack instances that are identical to normal traffic. This increases the complexity of detecting minority categories and causes problems when building a machine learning model capable of detecting these attacks with a sufficiently low false negative rate. This paper presents a new support vector machine based intrusion detection system that classifies unknown data instances according to both the feature values and weight factors that represent the importance of features for the classification. An increased detection rate and a significantly decreased false negative rate for the U2R and R2L categories, which have very few instances in the training set, have been demonstrated empirically.
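The abstract does not state how the weight factors enter the support vector machine. One common way to fold feature importance into a kernel method is to scale each feature's squared difference inside an RBF kernel; the sketch below shows that idea only as an assumption, not as the authors' formulation. The function name `weighted_rbf_kernel`, the `gamma` parameter, and the example vectors are hypothetical.

```python
from math import exp
from typing import Sequence


def weighted_rbf_kernel(
    x: Sequence[float],
    z: Sequence[float],
    weights: Sequence[float],
    gamma: float = 1.0,
) -> float:
    """RBF kernel in which feature i contributes weights[i] * (x[i] - z[i])**2."""
    if not (len(x) == len(z) == len(weights)):
        raise ValueError("x, z and weights must have the same length")
    weighted_sq_dist = sum(w * (a - b) ** 2 for a, b, w in zip(x, z, weights))
    return exp(-gamma * weighted_sq_dist)


# Hypothetical two-feature connection records, invented for illustration.
a = [0.2, 5.0]
b = [0.2, 9.0]

# With equal weights, the large gap in the second feature dominates the kernel.
k_equal = weighted_rbf_kernel(a, b, weights=[1.0, 1.0], gamma=0.1)
# With the second feature weighted at zero, the records look identical.
k_ignored = weighted_rbf_kernel(a, b, weights=[1.0, 0.0], gamma=0.1)
print(k_equal, k_ignored)
```

Under this assumption, a feature with weight zero is ignored entirely, while a heavily weighted feature makes records that differ on it look dissimilar to the classifier.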


Agronomy ◽  
2020 ◽  
Vol 10 (7) ◽  
pp. 972 ◽  
Author(s):  
Chunlong Zhang ◽  
Kunlin Zou ◽  
Yue Pan

Apples are one of the most important kinds of fruit in the world, and China is the largest apple-producing country. Yield estimation, robotic harvesting, and precise spraying are important processes in precision apple planting. Image segmentation is an important step in machine vision systems for precision apple planting. In this paper, an apple fruit segmentation algorithm for use in the orchard was studied. The effect of many color features in distinguishing apple fruit pixels from other pixels was evaluated, and three color features were selected. These color features could effectively distinguish the apple fruit pixels from other pixels. The GLCM (Grey-Level Co-occurrence Matrix) was used to extract texture features, and the best distance and orientation parameters for the GLCM were found. Nine machine learning algorithms were used to develop pixel classifiers. The classifier was trained with 100 pixels and tested with 100 pixels. The accuracy of the classifier based on Random Forest reached 0.94. One hundred images of an apple orchard were manually labeled with apple fruit pixels and other pixels, and a classifier was used to segment the same images. Regression analysis was performed on the results of manual labeling and classifier segmentation. The average values of Af (segmentation error), FPR (false positive rate), and FNR (false negative rate) were 0.07, 0.13, and 0.15, respectively. This result showed that the algorithm could segment apple fruit in orchard images effectively. It could provide a reference for precise apple planting management.
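To make the role of the GLCM distance and orientation parameters concrete, here is a minimal pure-Python sketch that counts grey-level co-occurrences for a single pixel offset and derives the contrast texture feature. It is not the authors' implementation; the function names `glcm` and `glcm_contrast`, the non-symmetric counting, and the 4 × 4 example image are assumptions made for illustration.

```python
from typing import List


def glcm(image: List[List[int]], levels: int, dx: int = 1, dy: int = 0) -> List[List[int]]:
    """Count how often grey level i has grey level j at offset (dx, dy).

    dx and dy encode distance and orientation, e.g. (1, 0) is one pixel to the right.
    """
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:  # skip pairs that leave the image
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix


def glcm_contrast(matrix: List[List[int]]) -> float:
    """Contrast: sum of (i - j)^2 weighted by the normalized co-occurrence counts."""
    total = sum(sum(row) for row in matrix)
    if total == 0:
        return 0.0
    weighted = sum(
        (i - j) ** 2 * count
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
    )
    return weighted / total


# Hypothetical 4-level image patch, invented for illustration.
patch = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
horizontal = glcm(patch, levels=4, dx=1, dy=0)
contrast = glcm_contrast(horizontal)
print(horizontal, contrast)
```

Changing `dx` and `dy` changes which neighbor is compared, which is why a search over distance and orientation, as described in the abstract, can affect how well the resulting texture features separate fruit pixels from background.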


2020 ◽  
Vol 2020 ◽  
pp. 1-8
Author(s):  
Thuy-Anh Nguyen ◽  
Hai-Bang Ly ◽  
Hai-Van Thi Mai ◽  
Van Quan Tran

Accurate prediction of concrete compressive strength is an important task that helps to avoid costly and time-consuming experiments. Determining the later-age compressive strength is particularly difficult because of the time required to perform experiments, so predicting the compressive strength of later-age concrete is crucial in specific applications. In this investigation, an approach using a feedforward neural network (FNN) machine learning algorithm was proposed to predict the compressive strength of later-age concrete. The proposed model was fully evaluated in terms of performance and prediction capability over the statistical results of 1000 simulations under a random sampling effect. The results showed that the proposed algorithm was an excellent predictor and might help engineers avoid time-consuming experiments. The statistical performance indicators, namely the Pearson correlation coefficient (R), root-mean-squared error (RMSE), and mean absolute error (MAE), were 0.9861, 2.1501, and 1.5650 for the training part and 0.9792, 2.8510, and 2.1361 for the testing part, respectively. The results also indicated that the FNN model was superior to classical machine learning algorithms such as random forest and Gaussian process regression, as well as to empirical formulations proposed in the literature.
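The three indicators reported above follow standard definitions, sketched below for reference with MAE taken to be the mean absolute error. This is a generic illustration rather than the authors' evaluation script; the function name `regression_metrics` and the example strength values are hypothetical.

```python
from math import sqrt
from typing import Sequence, Tuple


def regression_metrics(
    y_true: Sequence[float], y_pred: Sequence[float]
) -> Tuple[float, float, float]:
    """Return (Pearson R, RMSE, MAE) for paired observations and predictions."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    if var_t == 0 or var_p == 0:
        raise ValueError("Pearson R is undefined for constant inputs")
    r = cov / sqrt(var_t * var_p)
    rmse = sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return r, rmse, mae


# Hypothetical compressive strengths in MPa, invented for illustration.
measured = [30.0, 40.0, 50.0, 60.0]
predicted = [32.0, 38.0, 53.0, 59.0]
r, rmse, mae = regression_metrics(measured, predicted)
print(r, rmse, mae)
```

RMSE penalizes large errors more heavily than MAE, so RMSE is always at least as large as MAE on the same data, which is consistent with the ordering of the values quoted in the abstract.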

