Multiple Imputation by Chained Equations–K-Nearest Neighbors and Deep Neural Network Architecture for Kidney Disease Prediction

International Journal of Image and Graphics ◽

10.1142/s0219467823500146 ◽

2021 ◽

Author(s):

M. Dhilsath Fathima ◽

R. Hariharan ◽

S. P. Raja

Keyword(s):

Neural Network ◽

Kidney Disease ◽

Multiple Imputation ◽

Deep Neural Network ◽

Missing Values ◽

Prediction Models ◽

Model Performance ◽

Nearest Neighbors ◽

Risk Level ◽

Clinical Dataset

Chronic kidney disease (CKD) is a health concern that affects people all over the world. CKD is caused by kidney dysfunction or impaired kidney function. Machine learning-based prediction models are used to determine the risk level of CKD and to assist healthcare practitioners in delaying and preventing the disease’s progression. Researchers have proposed many prediction models for determining the CKD risk level. Although these models performed well, their precision is limited because they do not handle missing values in the clinical dataset adequately. Missing values in a clinical dataset can degrade the training outcomes, which leads to false predictions. Thus, imputing missing values improves prediction model performance. This work developed a novel imputation technique that combines Multiple Imputation by Chained Equations and K-Nearest Neighbors (MICE–KNN) to impute the missing values. The experimental results show that MICE–KNN accurately predicts the missing values, and that the Deep Neural Network (DNN) improves the prediction performance of the CKD model. Various metrics, including mean absolute error, accuracy, specificity, Matthews correlation coefficient, area under the curve, F-score, sensitivity, and precision, were used to evaluate the performance of the proposed CKD model. The performance analysis shows that MICE–KNN with deep learning outperforms other classifiers. According to our experimental study, the MICE–KNN imputation algorithm with DNN is more appropriate for predicting kidney disease.
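
The abstract does not describe how MICE and KNN are combined, so the following is only a minimal sketch of the plain KNN half of the idea, not the authors' MICE–KNN method. It assumes numeric features, marks missing entries with None, and uses a hypothetical function name (knn_impute) and a root-mean-square distance over the columns two rows share; all of these are illustrative choices.

```python
import math


def knn_impute(rows, k=2):
    """Fill each None with the mean of that column over the k nearest rows.

    Illustrative sketch only: distance is the root-mean-square difference
    over the columns observed in both rows (an assumed choice).
    """
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(rows):
                # A neighbour must differ from the row and observe column j.
                if m == i or other[j] is None:
                    continue
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue
                dist = math.sqrt(
                    sum((a - b) ** 2 for a, b in shared) / len(shared))
                candidates.append((dist, other[j]))
            candidates.sort(key=lambda c: c[0])
            nearest = [v for _, v in candidates[:k]]
            if nearest:  # leave the gap if no usable neighbour exists
                filled[i][j] = sum(nearest) / len(nearest)
    return filled


# Hypothetical toy data: the first row is missing its third feature.
toy_rows = [
    [1.0, 2.0, None],
    [1.1, 2.1, 10.0],
    [0.9, 1.9, 12.0],
    [5.0, 6.0, 50.0],
]
print(knn_impute(toy_rows, k=2)[0])  # [1.0, 2.0, 11.0]
```

The two closest rows supply the values 10.0 and 12.0, so the gap is filled with their mean. In practice, features on different scales would need to be standardised before distances are computed.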

Improving Outcome Predictions for Patients Receiving Mechanical Circulatory Support by Optimizing Imputation of Missing Values

Circulation Cardiovascular Quality and Outcomes ◽

10.1161/circoutcomes.120.007071 ◽

2021 ◽

Author(s):

Byron C. Jaeger ◽

Ryan Cantor ◽

Venkata Sthanam ◽

Rongbing Xie ◽

James K. Kirklin ◽

...

Keyword(s):

Multiple Imputation ◽

Risk Prediction ◽

Random Forests ◽

Missing Values ◽

Prediction Models ◽

Model Performance ◽

Circulatory Support ◽

Risk Prediction Models ◽

Prognostic Accuracy ◽

The Mean

Background: Risk prediction models play an important role in clinical decision making. When developing risk prediction models, practitioners often impute missing values to the mean. We evaluated the impact of applying other strategies to impute missing values on the prognostic accuracy of downstream risk prediction models, that is, models fitted to the imputed data. A secondary objective was to compare the accuracy of imputation methods based on artificially induced missing values. To complete these objectives, we used data from the Interagency Registry for Mechanically Assisted Circulatory Support. Methods: We applied 12 imputation strategies in combination with 2 different modeling strategies for mortality and transplant risk prediction following surgery to receive mechanical circulatory support. Model performance was evaluated using Monte-Carlo cross-validation and measured based on outcomes 6 months following surgery using the scaled Brier score, concordance index, and calibration error. We used Bayesian hierarchical models to compare model performance. Results: Multiple imputation with random forests emerged as a robust strategy to impute missing values, increasing model concordance by 0.0030 (25th–75th percentile: 0.0008–0.0052) compared with imputation to the mean for mortality risk prediction using a downstream proportional hazards model. The posterior probability that single and multiple imputation using random forests would improve concordance versus mean imputation was 0.464 and >0.999, respectively. Conclusions: Selecting an optimal strategy to impute missing values such as random forests and applying multiple imputation can improve the prognostic accuracy of downstream risk prediction models.
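
One of the metrics named above, the scaled Brier score, can be illustrated with a small sketch. This version assumes plain binary outcomes with no censoring, which is a simplification: the study evaluates time-to-event outcomes at 6 months, and its exact estimator is not given in the abstract. The function names are hypothetical.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted risk and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def scaled_brier_score(y_true, y_prob):
    """1 - BS / BS_ref, where the reference predicts the observed event rate.

    A value of 1 means perfect predictions and 0 means no better than the
    reference. Undefined if every outcome is identical (BS_ref is 0).
    Assumes uncensored binary outcomes, unlike the survival setting above.
    """
    base_rate = sum(y_true) / len(y_true)
    reference = brier_score(y_true, [base_rate] * len(y_true))
    return 1.0 - brier_score(y_true, y_prob) / reference


# Hypothetical outcomes and predicted risks.
outcomes = [1, 0, 1, 0]
risks = [0.9, 0.1, 0.8, 0.2]
print(scaled_brier_score(outcomes, risks))  # about 0.9
```

Scaling by the reference model makes scores comparable across outcomes with different event rates, which matters when several imputation strategies are compared on the same downstream model.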

A Deep Neural Network for Early Detection and Prediction of Chronic Kidney Disease

Diagnostics ◽

10.3390/diagnostics12010116 ◽

2022 ◽

Vol 12 (1) ◽

pp. 116

Author(s):

Vijendra Singh ◽

Vijayan K. Asari ◽

Rajkumar Rajasekaran

Keyword(s):

Neural Network ◽

Machine Learning ◽

Chronic Kidney Disease ◽

Kidney Disease ◽

Early Detection ◽

Deep Neural Network ◽

Missing Values ◽

Machine Learning Techniques ◽

Recursive Feature Elimination ◽

Support Vector

Diabetes and high blood pressure are the primary causes of Chronic Kidney Disease (CKD). Glomerular Filtration Rate (GFR) and kidney damage markers are used by researchers around the world to identify CKD as a condition that leads to reduced renal function over time. A person with CKD has a higher chance of dying young. Doctors face a difficult task in diagnosing the different diseases linked to CKD at an early stage in order to prevent the disease. This research presents a novel deep learning model for the early detection and prediction of CKD. Its objective is to create a deep neural network and compare its performance to that of other contemporary machine learning techniques. In the experiments, each missing value in the database was replaced with the average of the corresponding feature. The neural network’s optimal parameters were then fixed by running multiple trials. The most important features were selected by Recursive Feature Elimination (RFE). Hemoglobin, Specific Gravity, Serum Creatinine, Red Blood Cell Count, Albumin, Packed Cell Volume, and Hypertension were identified as key features by RFE. The selected features were passed to machine learning models for classification. The proposed deep neural model outperformed the other five classifiers (Support Vector Machine (SVM), K-Nearest Neighbor (KNN), Logistic Regression, Random Forest, and Naive Bayes) by achieving 100% accuracy. The proposed approach could be a useful tool for nephrologists in detecting CKD.
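
The mean-imputation step mentioned in this abstract can be sketched as follows. This is a minimal illustration that assumes numeric features with missing entries marked as None; the function name mean_impute and the toy values are hypothetical, and the paper's handling of categorical attributes is not stated in the abstract.

```python
def mean_impute(rows):
    """Replace each None with the mean of the observed values in its column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        # A column with no observed values cannot be imputed; keep None.
        means.append(sum(observed) / len(observed) if observed else None)
    return [[means[j] if v is None else v for j, v in enumerate(r)]
            for r in rows]


# Hypothetical two-feature records with gaps.
records = [
    [1.0, None],
    [3.0, 4.0],
    [None, 8.0],
]
print(mean_impute(records))  # [[1.0, 6.0], [3.0, 4.0], [2.0, 8.0]]
```

When a model is evaluated on held-out data, the column means would normally be computed on the training split only and then reused for the test split, to avoid leaking information.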

Integrating ensemble systems biology feature selection and bimodal deep neural network for breast cancer prognosis prediction

Scientific Reports ◽

10.1038/s41598-021-92864-y ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Li-Hsin Cheng ◽

Te-Cheng Hsu ◽

Che Lin

Keyword(s):

Breast Cancer ◽

Neural Network ◽

Feature Selection ◽

Systems Biology ◽

Ensemble Learning ◽

Microarray Data ◽

Deep Neural Network ◽

Prediction Models ◽

Biological Knowledge ◽

Prognosis Prediction

Breast cancer is a heterogeneous disease. To guide proper treatment decisions for each patient, robust prognostic biomarkers, which allow reliable prognosis prediction, are necessary. Gene feature selection based on microarray data is an approach to discover potential biomarkers systematically. However, standard pure-statistical feature selection approaches often fail to incorporate prior biological knowledge and select genes that lack biological insights. Besides, due to the high dimensionality and low sample size properties of microarray data, selecting robust gene features is an intrinsically challenging problem. We hence combined systems biology feature selection with ensemble learning in this study, aiming to select genes with biological insights and robust prognostic predictive power. Moreover, to capture breast cancer's complex molecular processes, we adopted a multi-gene approach to predict the prognosis status using deep learning classifiers. We found that all ensemble approaches could improve feature selection robustness, wherein the hybrid ensemble approach led to the most robust result. Among all prognosis prediction models, the bimodal deep neural network (DNN) achieved the highest test performance, further verified by survival analysis. In summary, this study demonstrated the potential of combining ensemble learning and bimodal DNN in guiding precision medicine.

Mood Prediction in Consideration of Certainty Factor Using Multilayer Deep Neural Network and Storage-Type Prediction Models

Sensors and Materials ◽

10.18494/sam.2016.1186 ◽

2016 ◽

Keyword(s):

Neural Network ◽

Deep Neural Network ◽

Prediction Models ◽

Certainty Factor ◽

And Storage ◽

Storage Type

A Layered Recurrent Neural Network for Imputing Air Pollutants Missing Data and Prediction of NO2, O3, PM10, and PM2.5

10.5772/intechopen.93678 ◽

2020 ◽

Author(s):

Hamza Turabieh ◽

Alaa Sheta ◽

Malik Braik ◽

Elvira Kovač-Andrić

Keyword(s):

Neural Network ◽

Air Pollution ◽

Air Quality ◽

Recurrent Neural Network ◽

Missing Values ◽

Prediction Models ◽

Air Pollutant ◽

Forecasting Models ◽

Monitoring Strategies ◽

Artificial Neural Network Ann

To meet national air quality standards, many countries have created emissions monitoring strategies. Nowadays, policymakers and air quality executives depend on scientific computation and prediction models to monitor the factors that cause air pollution, especially in industrial cities. Air pollution is considered one of the primary problems that can cause human health problems such as asthma, lung damage, and even death. In this study, we investigated the development of forecasting models for air pollutant attributes including particulate matter (PM2.5, PM10), ground-level ozone (O3), and nitrogen dioxide (NO2). The dataset was collected in the city of Dubrovnik, which is located in the south of Croatia. The collected data has missing values. Therefore, we suggested the use of a Layered Recurrent Neural Network (L-RNN) to impute the missing value(s) of air pollutant attributes and then build forecasting models. We adopted four regression models to forecast air pollutant attributes: Multiple Linear Regression (MLR), Decision Tree Regression (DTR), Artificial Neural Network (ANN), and L-RNN. The obtained results show that the proposed method enhances the overall performance of the other forecasting models.

An efficient oppositional crow search optimisation-based deep neural network classifier for chronic kidney disease identification

International Journal of Innovative Computing and Applications ◽

10.1504/ijica.2021.10038619 ◽

2021 ◽

Vol 12 (4) ◽

pp. 206

Author(s):

Eswaran Perumal ◽

Pramila Arulanthu

Keyword(s):

Neural Network ◽

Chronic Kidney Disease ◽

Kidney Disease ◽

Deep Neural Network ◽

Neural Network Classifier ◽

Disease Identification

Development of a deep neural network for predicting 6-hour average PM2.5 concentrations up to two subsequent days using various training data

10.5194/gmd-2021-356 ◽

2021 ◽

Author(s):

Jeong-Beom Lee ◽

Jae-Bum Lee ◽

Youn-Seo Koo ◽

Hee-Yong Kwon ◽

Min-Hyeok Choi ◽

...

Keyword(s):

Neural Network ◽

Air Quality ◽

Deep Neural Network ◽

Model Performance ◽

Prediction Performance ◽

Training Data ◽

Forecasting Model ◽

Observation Data ◽

Forecast Data ◽

Air Quality Forecasting

This study aims to develop a deep neural network (DNN) model as an artificial neural network (ANN) for the prediction of 6-hour average fine particulate matter (PM2.5) concentrations for a three-day period, namely the day of prediction (D+0), one day after prediction (D+1), and two days after prediction (D+2), using observation data and forecast data obtained via numerical models. The performance of the DNN model was evaluated against that of the currently operational Community Multiscale Air Quality (CMAQ) modelling system for air quality forecasting in South Korea. In addition, the effect of using different training data on the predictive performance of the DNN model was analyzed. For the D+0 forecast, the DNN model performance was superior to that of the CMAQ model, and there was no significant dependence on the training data. For the D+1 and D+2 forecasts, the DNN model that used the observation and forecast data (DNN-ALL) outperformed the CMAQ model. The root-mean-squared error (RMSE) of DNN-ALL was lower than that of the CMAQ model by 2.2 μg m−3 and 3.0 μg m−3 for the D+1 and D+2 forecasts, respectively, because the overprediction of higher concentrations was curtailed. The DNN-ALL model showed an IOA increase of 0.46 for the D+1 prediction and 0.59 for the D+2 prediction compared to the IOA of the DNN model that used only observation data (DNN-OBS). In addition, the DNN-ALL model showed an RMSE decrease of 7.2 μg m−3 for the D+1 prediction and 6.3 μg m−3 for the D+2 prediction compared to the RMSE of DNN-OBS, indicating that the inclusion of forecast data in the training data greatly affected the DNN model performance. Considering the prediction of the 6-hour average PM2.5 concentration, the 8.8 μg m−3 RMSE of the DNN-ALL model was 2.7 μg m−3 lower than that of the CMAQ model, indicating the superior prediction performance of the former.
These results suggest that the DNN model could be utilized as a better-performing air quality forecasting model than the CMAQ, and that observation data plays an important role in determining the prediction performance of the DNN model for D+0 forecasting, while prediction data does the same for D+1 and D+2 forecasting. The use of the proposed DNN model as a forecasting model may result in a reduction in the economic losses caused by pollution-mitigation policies and aid better protection of public health.
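
The two scores quoted in this abstract can be sketched as follows. RMSE is standard; for IOA, the sketch assumes the abbreviation denotes Willmott's index of agreement, which is common in air quality evaluation but is not spelled out in the abstract. The function names and sample concentrations are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root-mean-squared error between observations and predictions."""
    squared = [(p - o) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def index_of_agreement(observed, predicted):
    """Willmott's index of agreement (assumed meaning of IOA); 1 is perfect.

    Undefined when all observations and predictions equal the observed mean.
    """
    mean_obs = sum(observed) / len(observed)
    numerator = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    denominator = sum((abs(p - mean_obs) + abs(o - mean_obs)) ** 2
                      for o, p in zip(observed, predicted))
    return 1.0 - numerator / denominator


# Hypothetical 6-hour average PM2.5 values in micrograms per cubic metre.
obs = [10.0, 20.0, 30.0]
pred = [12.0, 18.0, 33.0]
print(rmse(obs, pred))
print(index_of_agreement(obs, pred))
```

RMSE is expressed in the units of the concentration, whereas the index of agreement is dimensionless, which is why the abstract reports the former in μg m−3 and the latter as a plain number.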

Monitoring Population Phenology of Asian Citrus Psyllid Using Deep Learning

Complexity ◽

10.1155/2021/4644213 ◽

2021 ◽

Vol 2021 ◽

pp. 1-10

Author(s):

Maria Bibi ◽

Muhammad Kashif Hanif ◽

Muhammad Umer Sarwar ◽

Muhammad Irfan Khan ◽

Shouket Zaman Khan ◽

...

Keyword(s):

Neural Network ◽

Deep Neural Network ◽

Prediction Models ◽

Mean Squared Error ◽

Maximum Temperature ◽

Economic Losses ◽

Asian Citrus Psyllid ◽

Average Maximum ◽

Average Minimum Temperature ◽

Network Approaches

Asian citrus psyllid, Diaphorina citri Kuwayama (Liviidae: Hemiptera), is a menacing and notorious pest of citrus plants. It vectors a phloem vessel-dwelling bacterium, Candidatus Liberibacter asiaticus, which is the causative pathogen of the serious citrus disease known as Huanglongbing. Huanglongbing is a major bottleneck in the export of citrus fruits from Pakistan and is responsible for huge economic losses in citrus globally. In the current study, several prediction models based on machine learning regression algorithms were developed to monitor different phenological stages of Asian citrus psyllid and to predict its population in relation to different abiotic variables (average maximum temperature, average minimum temperature, average weekly temperature, average weekly relative humidity, and average weekly rainfall) and a biotic variable (host plant phenological patterns) in citrus-growing regions of Pakistan. The pest prediction models can help ensure that pesticides are applied only when needed, reducing their environmental and cost impacts. Pearson’s correlation analysis was performed to find the relationship between the predictor (abiotic and biotic) variables and the pest infestation rate on citrus plants. Multiple linear regression, random forest regressor, and deep neural network approaches were compared for predicting the population dynamics of Asian citrus psyllid. In comparison with the other regression techniques, the deep neural network-based prediction model yielded the lowest root mean squared error values when predicting egg, nymph, and adult populations.
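
The Pearson correlation step can be illustrated with a short sketch. The function name pearson_r and the sample values are hypothetical and are not data from the study; the code only shows how the coefficient between one predictor and a pest count would be computed.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences.

    Undefined (division by zero) if either sequence is constant.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / math.sqrt(spread_x * spread_y)


# Hypothetical weekly readings: maximum temperature and adult psyllid counts.
weekly_max_temp = [28.0, 30.5, 33.0, 35.5]
adult_counts = [12, 18, 25, 31]
print(pearson_r(weekly_max_temp, adult_counts))
```

A coefficient near 1 or -1 indicates a strong linear relationship, while a value near 0 indicates little linear association; it does not capture nonlinear dependence, which is one reason to also fit random forest or neural network regressors.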

Rock Strain Prediction Using Deep Neural Network and Hybrid Models of ANFIS and Meta-Heuristic Optimization Algorithms

Infrastructures ◽

10.3390/infrastructures6090129 ◽

2021 ◽

Vol 6 (9) ◽

pp. 129

Author(s):

T. Pradeep ◽

Abidhan Bardhan ◽

Avijit Burman ◽

Pijush Samui

Keyword(s):

Neural Network ◽

Deep Neural Network ◽

Prediction Models ◽

Optimization Algorithms ◽

Grey Wolf Optimizer ◽

Statistical Parameters ◽

Unconfined Compression Test ◽

Rock Stratum ◽

Inference System ◽

Prediction Approach

The majority of natural ground vibrations are caused by the release of strain energy accumulated in the rock strata. The strain reacts to the formation of crack patterns and rock stratum failure. Rock strain prediction is an important task in assessing the failure of rock material. The purpose of this paper is to investigate the development of a new strain prediction approach for rock samples utilizing deep neural network (DNN) and hybrid ANFIS (adaptive neuro-fuzzy inference system) models. Four optimization algorithms, namely particle swarm optimization (PSO), firefly algorithm (FF), genetic algorithm (GA), and grey wolf optimizer (GWO), were used to optimize the learning parameters of ANFIS, and the ANFIS-PSO, ANFIS-FF, ANFIS-GA, and ANFIS-GWO models were constructed. For this purpose, the necessary datasets were obtained from an experimental setup of an unconfined compression test of rocks in lateral and longitudinal directions. Various statistical parameters were used to investigate the accuracy of the proposed prediction models. In addition, rank analysis was performed to select the most robust model for accurate rock sample prediction. Based on the experimental results, the constructed DNN has strong potential as a new alternative to assist engineers in estimating rock strain in the design phase of many engineering projects.

An efficient oppositional crow search optimisation-based deep neural network classifier for chronic kidney disease identification

International Journal of Innovative Computing and Applications ◽

10.1504/ijica.2021.116671 ◽

2021 ◽

Vol 12 (4) ◽

pp. 206

Author(s):

Pramila Arulanthu ◽

Eswaran Perumal

Keyword(s):

Neural Network ◽

Chronic Kidney Disease ◽

Kidney Disease ◽

Deep Neural Network ◽

Neural Network Classifier ◽

Disease Identification
