Prediction of Misclassification Data using Cognitive Bayes Computation Techniques (COBACO)

Missing data are a major problem for quantitative analysis of large databases. They bias the inferences drawn from computational processes, degrade the data, increase the error rate, and make imputation harder to carry out. Predicting disguised missing data in large data sets is another major problem in real-time operation. Machine learning (ML) techniques tie classification to measurement in order to improve the accuracy of predicted values, and they overcome several of the challenges posed by lost data. Recent work on predicting misclassification uses a supervised ML approach to predict an output for an unseen input from a limited number of parameters in a data set; as the number of parameters grows, accuracy falls. This article presents a new approach, COBACO, an effective supervised machine learning technique. Several strategies are described for classifying predictive techniques for missing data analysis within efficient supervised machine learning. The proposed technique, COBACO, generated more precise and accurate results than the other predictive approaches. Experimental results obtained on both real and synthetic data sets show that the proposed approach offers a valuable and promising insight into the problem of predicting missing information.
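The abstract does not describe how COBACO works internally, so the following is only a minimal sketch of the general idea of Bayes-based prediction of a missing value: a plain categorical naive Bayes classifier with Laplace smoothing, not the authors' method. The column names, the toy rows, the `MISSING` marker, and the function name `nb_impute` are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

MISSING = None  # marker for an absent value (an assumption of this sketch)

def nb_impute(rows, target, alpha=1.0):
    """Return copies of rows with missing `target` values filled by naive Bayes."""
    complete = [r for r in rows if r[target] is not MISSING]
    prior = Counter(r[target] for r in complete)
    counts = defaultdict(Counter)  # (column, class) -> counts of observed values
    levels = defaultdict(set)      # column -> distinct observed values
    for r in complete:
        for col, v in r.items():
            if col != target and v is not MISSING:
                counts[(col, r[target])][v] += 1
                levels[col].add(v)

    def predict(row):
        best, best_score = None, -math.inf
        for cls in sorted(prior):
            score = math.log(prior[cls] / len(complete))
            for col, v in row.items():
                if col == target or v is MISSING:
                    continue  # skip missing predictors instead of guessing them
                seen = counts[(col, cls)]
                num = seen[v] + alpha  # Laplace smoothing
                den = sum(seen.values()) + alpha * max(len(levels[col]), 1)
                score += math.log(num / den)
            if score > best_score:
                best, best_score = cls, score
        return best

    return [dict(r, **{target: predict(r)}) if r[target] is MISSING else dict(r)
            for r in rows]

# Hypothetical toy data: the last row lacks its "label" value.
rows = [
    {"a": "high", "b": "yes", "label": "pos"},
    {"a": "high", "b": "no", "label": "pos"},
    {"a": "low", "b": "no", "label": "neg"},
    {"a": "low", "b": "yes", "label": "neg"},
    {"a": "high", "b": "yes", "label": MISSING},
]
filled = nb_impute(rows, "label")
```

In this toy set the attribute "a" separates the two classes, so the missing label is filled with "pos"; the input rows are left unmodified.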

2021 ◽  
Vol 36 (1) ◽  
pp. 609-615
Author(s):  
Mandhapati Rajesh ◽  
Dr.K. Malathi

Aim: To predict heart disease from the medical parameters of cardiac patients with a good accuracy rate using machine learning methods such as an innovative Decision Tree (DT) algorithm. Materials and Methods: Supervised machine learning techniques, the innovative Decision Tree (N = 20) and K Nearest Neighbour (KNN) (N = 20), are run on five different datasets, one at a time, to record five samples. Results: The Decision Tree predicts heart disease from various medical conditions; the accuracy achieved is 98% for DT and 72.2% for KNN. The two algorithms, Decision Tree and KNN, are statistically insignificant (=.737) with the independent sample T-Test value (p<0.005) with a confidence level of 95%. Conclusion: Prediction and classification of heart disease appear to be significantly better with DT than with KNN.
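As a rough illustration of the independent-sample comparison mentioned above, the sketch below computes a two-sample t statistic for two sets of accuracy scores. It uses Welch's unequal-variance form, which may differ from the variant the authors ran, and the accuracy values are invented placeholders, not the study's data. Turning the statistic into a p-value needs the t distribution, which the Python standard library does not provide, so that step is left out here.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's two-sample t statistic and its approximate degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical accuracies from five runs of each classifier (placeholders only).
dt_acc = [0.97, 0.98, 0.99, 0.98, 0.98]
knn_acc = [0.70, 0.73, 0.72, 0.74, 0.72]
t_stat, dof = welch_t(dt_acc, knn_acc)
```

A large positive `t_stat` indicates that the first group's mean accuracy is well above the second's relative to the run-to-run spread.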


2021 ◽  
Vol 10 (7) ◽  
pp. 436
Author(s):  
Amerah Alghanim ◽  
Musfira Jilani ◽  
Michela Bertolotto ◽  
Gavin McArdle

Volunteered Geographic Information (VGI) is often collected by non-expert users. This raises concerns about the quality and veracity of such data. There has been much effort to understand and quantify the quality of VGI. Extrinsic measures which compare VGI to authoritative data sources such as National Mapping Agencies are common but the cost and slow update frequency of such data hinder the task. On the other hand, intrinsic measures which compare the data to heuristics or models built from the VGI data are becoming increasingly popular. Supervised machine learning techniques are particularly suitable for intrinsic measures of quality where they can infer and predict the properties of spatial data. In this article we are interested in assessing the quality of semantic information, such as the road type, associated with data in OpenStreetMap (OSM). We have developed a machine learning approach which utilises new intrinsic input features collected from the VGI dataset. Specifically, using our proposed novel approach we obtained an average classification accuracy of 84.12%. This result outperforms existing techniques on the same semantic inference task. The trustworthiness of the data used for developing and training machine learning models is important. To address this issue we have also developed a new measure for this using direct and indirect characteristics of OSM data such as its edit history along with an assessment of the users who contributed the data. An evaluation of the impact of data determined to be trustworthy within the machine learning model shows that the trusted data collected with the new approach improves the prediction accuracy of our machine learning technique. Specifically, our results demonstrate that the classification accuracy of our developed model is 87.75% when applied to a trusted dataset and 57.98% when applied to an untrusted dataset. 
Consequently, such results can be used to assess the quality of OSM and suggest improvements to the data set.


10.2196/20995 ◽  
2020 ◽  
Vol 8 (9) ◽  
pp. e20995
Author(s):  
Debbie Rankin ◽  
Michaela Black ◽  
Bronac Flanagan ◽  
Catherine F Hughes ◽  
Adrian Moore ◽  
...  

Background: Machine learning techniques, specifically classification algorithms, may be effective to help understand key health, nutritional, and environmental factors associated with cognitive function in aging populations. Objective: This study aims to use classification techniques to identify the key patient predictors that are considered most important in the classification of poorer cognitive performance, which is an early risk factor for dementia. Methods: Data were used from the Trinity-Ulster and Department of Agriculture study, which included detailed information on sociodemographic, clinical, biochemical, nutritional, and lifestyle factors in 5186 older adults recruited from the Republic of Ireland and Northern Ireland, a proportion of whom (987/5186, 19.03%) were followed up 5-7 years later for reassessment. Cognitive function at both time points was assessed using a battery of tests, including the Repeatable Battery for the Assessment of Neuropsychological Status (RBANS), with a score <70 classed as poorer cognitive performance. This study trained 3 classifiers (decision trees, Naïve Bayes, and random forests) to classify the RBANS score and to identify key health, nutritional, and environmental predictors of cognitive performance and cognitive decline over the follow-up period. It assessed their performance, taking note of the variables that were deemed important for the optimized classifiers for their computational diagnostics. Results: In the classification of a low RBANS score (<70), our models performed well (F1 score range 0.73-0.93), all highlighting the individual’s score from the Timed Up and Go (TUG) test, the age at which the participant stopped education, and whether or not the participant’s family reported memory concerns to be of key importance. The classification models performed well in classifying a greater rate of decline in the RBANS score (F1 score range 0.66-0.85), also indicating the TUG score to be of key importance, followed by blood indicators: plasma homocysteine, vitamin B6 biomarker (plasma pyridoxal-5-phosphate), and glycated hemoglobin. Conclusions: The results suggest that it may be possible for a health care professional to make an initial evaluation, with a high level of confidence, of the potential for cognitive dysfunction using only a few short, noninvasive questions, thus providing a quick, efficient, and noninvasive way to help them decide whether or not a patient requires a full cognitive evaluation. This approach has the potential benefits of making time and cost savings for health service providers and avoiding stress created through unnecessary cognitive assessments in low-risk patients.
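For readers unfamiliar with the F1 score quoted in the results, here is a small sketch of how it is computed for a binary "poorer cognitive performance" label. Only the RBANS cutoff of 70 comes from the abstract; the scores, the predictions, and the function names are hypothetical.

```python
def is_poorer(rbans_score, threshold=70):
    """Label a score as poorer cognitive performance (cutoff taken from the abstract)."""
    return rbans_score < threshold

def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    tp = sum(t and p for t, p in zip(y_true, y_pred))
    fp = sum((not t) and p for t, p in zip(y_true, y_pred))
    fn = sum(t and (not p) for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    return 2 * tp / (2 * tp + fp + fn)

# Hypothetical RBANS scores and hypothetical classifier outputs.
scores = [62, 85, 68, 91, 74, 66]
y_true = [is_poorer(s) for s in scores]
y_pred = [True, False, False, False, True, True]
f1 = f1_score(y_true, y_pred)
```

With two true positives, one false positive, and one false negative, the score works out to 4/6, about 0.67.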


2020 ◽  
Author(s):  
Cecilia Contreras ◽  
Mahdi Khodadadzadeh ◽  
Laura Tusa ◽  
Richard Gloaguen

Drilling is a key task in exploration campaigns to characterize mineral deposits at depth. Drill cores are first logged in the field by a geologist with regard to, e.g., mineral assemblages, alteration patterns, and structural features. The core-logging information is then used to locate and target the important ore accumulations and select representative samples that are further analyzed by laboratory measurements (e.g., Scanning Electron Microscopy (SEM), X-ray diffraction (XRD), X-ray Fluorescence (XRF)). However, core-logging is a laborious task and subject to the expertise of the geologist. Hyperspectral imaging is a non-invasive and non-destructive technique that is increasingly being used to support the geologist in the analysis of drill-core samples. Nonetheless, the benefit and impact of using hyperspectral data depend on the applied methods. With this in mind, machine learning techniques, which have been applied in different research fields, provide useful tools for a more advanced and more automatic analysis of the data. Lately, machine learning frameworks are also being implemented for mapping minerals in drill-core hyperspectral data. In this context, this work follows an approach to map minerals on drill-core hyperspectral data using supervised machine learning techniques, in which SEM data, integrated with the mineral liberation analysis (MLA) software, are used in training a classifier. More specifically, the high-resolution mineralogical data obtained by SEM-MLA analysis is resampled and co-registered to the hyperspectral data to generate a training set. Due to the large difference in spatial resolution between the SEM-MLA and hyperspectral images, a pre-labeling strategy is required to link these two images at the hyperspectral data spatial resolution. In this study, we use the SEM-MLA image to compute the abundances of minerals for each hyperspectral pixel in the corresponding SEM-MLA region. We then use the abundances as features in a clustering procedure to generate the training labels. In the final step, the generated training set is fed into a supervised classification technique for the mineral mapping over a large area of a drill-core. The experiments are carried out on a visible to near-infrared (VNIR) and shortwave infrared (SWIR) hyperspectral data set and based on preliminary tests the mineral mapping task improves significantly.
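The abundance step described above can be pictured with a short sketch. It assumes the SEM-MLA mineral map is already co-registered to the hyperspectral image and that each hyperspectral pixel covers an exact `factor` x `factor` block of SEM-MLA pixels; both are simplifications, and the mineral names and the function name `block_abundances` are hypothetical. The subsequent clustering of the abundances into training labels is not shown.

```python
from collections import Counter

def block_abundances(mla, factor):
    """Mineral abundance fractions for each coarse pixel of factor x factor MLA pixels."""
    rows, cols = len(mla) // factor, len(mla[0]) // factor
    out = []
    for i in range(rows):
        out_row = []
        for j in range(cols):
            # Collect all fine-resolution labels falling inside this coarse pixel.
            block = [mla[y][x]
                     for y in range(i * factor, (i + 1) * factor)
                     for x in range(j * factor, (j + 1) * factor)]
            counts = Counter(block)
            out_row.append({m: n / len(block) for m, n in counts.items()})
        out.append(out_row)
    return out

# Hypothetical 2 x 4 mineral map reduced to a 1 x 2 grid of coarse pixels.
mla_map = [
    ["quartz", "quartz", "pyrite", "pyrite"],
    ["quartz", "chlorite", "pyrite", "pyrite"],
]
abundances = block_abundances(mla_map, factor=2)
```

Each coarse pixel ends up with a dictionary of fractions that sum to one, which is the kind of feature vector a clustering step could then group into training labels.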


Symmetry ◽  
2021 ◽  
Vol 13 (3) ◽  
pp. 403
Author(s):  
Muhammad Waleed ◽  
Tai-Won Um ◽  
Tariq Kamal ◽  
Syed Muhammad Usman

In this paper, we apply multi-class supervised machine learning techniques to classify agricultural farm machinery. The classification of farm machinery is important when performing the automatic authentication of field activity in a remote setup. In the absence of a sound machine recognition system, there is every possibility of fraudulent activity taking place. To address this need, we classify the machinery using five machine learning techniques: K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF) and Gradient Boosting (GB). For training the model, we use the vibration and tilt of the machinery, recorded using accelerometer and gyroscope sensors, respectively. The machinery included the leveler, rotavator and cultivator. Preliminary analysis of the collected data revealed that the farm machinery (when in operation) showed large variations in vibration and tilt but had similar means. Additionally, the vibration-based and tilt-based classifications of farm machinery each achieve good accuracy when used alone (with vibration showing slightly better numbers than tilt). However, the accuracies improve further when tilt and vibration are used together. Furthermore, all five machine learning algorithms used for classification have an accuracy of more than 82%, with random forest performing best. Gradient boosting and random forest show slight over-fitting (about 9%), but both algorithms produce high testing accuracy. In terms of execution time, the decision tree takes the least time to train, while gradient boosting takes the most.
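To make the classification setup concrete, the sketch below shows a bare-bones K-Nearest Neighbor classifier, one of the five techniques listed, working on a two-value feature vector of vibration and tilt. The feature values are invented and deliberately well separated, and the paper's actual features and preprocessing are not given in the abstract, so this is an illustration of the technique only.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Majority vote among the k training points closest to `query`.

    train: list of ((vibration, tilt), label) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Hypothetical (vibration, tilt) readings for the three machine types.
train = [
    ((0.9, 2.0), "leveler"), ((1.0, 2.2), "leveler"), ((1.1, 1.9), "leveler"),
    ((2.9, 5.0), "rotavator"), ((3.1, 5.2), "rotavator"), ((3.0, 4.8), "rotavator"),
    ((5.0, 8.1), "cultivator"), ((5.2, 7.9), "cultivator"), ((4.9, 8.0), "cultivator"),
]
guess = knn_predict(train, (3.0, 5.1))
```

Using both features in one vector mirrors the abstract's observation that tilt and vibration together classify better than either alone; in practice the two features would usually be scaled to comparable ranges before computing distances.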

