Heart Disease Prediction System Using Linear Regression, Smoreg, and Rep Trees Algorithms

Data mining spans most fields of research, and its role in medical diagnosis has attracted particular interest from researchers. It makes it easier for medical practitioners to analyze and treat patients' diseases at an early stage. The proposed work deals with predicting heart disease in patients at an early stage. The method is organized in three stages: data collection, data preprocessing, and data classification. The dataset was collected from the UCI repository. The collected sample was first preprocessed to remove unwanted information. Classification was then performed on the preprocessed data using three techniques: a linear regression model, SMOreg, and REP trees. The results of the three methods were compared based on root mean squared error and absolute error, and are tabulated.


Implementation of Multiple Linear Regression Methods as Prediction of Village Spending on Village Financial Management System

Kursor ◽

10.21107/kursor.v10i2.216 ◽

2020 ◽

Vol 10 (2) ◽

Author(s):

Nisa Hanum Harani ◽

Hanna Theresia Siregar ◽

Cahyo Prianto

Keyword(s):

Linear Regression ◽

Multiple Linear Regression ◽

Financial Management ◽

Mean Squared Error ◽

Sampling Technique ◽

Absolute Error ◽

Budget Process ◽

Squared Error ◽

Financial Management System ◽

The Village

The realization of village welfare and the improvement of village development can start from the financial management of the village. The village government has authority ranging from planning and implementation to reporting and accountability. Two variables are central to the financial aspects: village income and village expenditure. The village budget process is a plan that is compiled systematically. Planning is associated with prediction: a plan indicates what is supposed to happen, while a prediction concerns what will happen. Good village budget planning therefore requires a budget prediction feature. This feature is built with data mining, modeled using the multiple linear regression algorithm. The sample was selected using a purposive sampling technique and comprises 29 villages. The dependent variable is village expenditure (Y), and the independent variables are village funds (X1) and village funding allocation (X2). The best validation values were obtained in the 3rd fold, with a correlation coefficient of 0.8907, a Mean Absolute Error of 87209395.37, a Root Mean Squared Error of 114867675.6, a Relative Absolute Error (RAE) of 42%, and a Root Relative Squared Error of 44%.
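The error measures quoted in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function name regression_metrics is hypothetical, and the relative measures assume the usual convention of comparing against a baseline that always predicts the mean of the actual values.

```python
import math

def regression_metrics(actual, predicted):
    """Return MAE, RMSE, RAE and RRSE for two equal-length sequences.

    Assumption: RAE and RRSE use the mean of `actual` as the baseline
    predictor, so a value below 1.0 means the model beats that baseline.
    """
    n = len(actual)
    mean_actual = sum(actual) / n
    errors = [p - a for a, p in zip(actual, predicted)]

    abs_err = sum(abs(e) for e in errors)
    sq_err = sum(e * e for e in errors)
    # Total error of the "always predict the mean" baseline.
    base_abs = sum(abs(a - mean_actual) for a in actual)
    base_sq = sum((a - mean_actual) ** 2 for a in actual)

    return {
        "mae": abs_err / n,
        "rmse": math.sqrt(sq_err / n),
        "rae": abs_err / base_abs,
        "rrse": math.sqrt(sq_err / base_sq),
    }

metrics = regression_metrics([2.0, 4.0, 6.0], [3.0, 4.0, 5.0])
print(metrics)
```

Multiplying rae and rrse by 100 gives percentages of the kind reported above (42% and 44%).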


Modified Liu estimators in the linear regression model: An application to Tobacco data

PLoS ONE ◽

10.1371/journal.pone.0259991 ◽

2021 ◽

Vol 16 (11) ◽

pp. e0259991

Author(s):

Iqra Babar ◽

Hamdi Ayed ◽

Sohail Chand ◽

Muhammad Suhail ◽

Yousaf Ali Khan ◽

...

Keyword(s):

Linear Regression ◽

Mean Squared Error ◽

Prediction Interval ◽

Evaluation Criteria ◽

Absolute Error ◽

Vital Role ◽

Predictor Variables ◽

Linear Regression Models ◽

Squared Error ◽

Optimal Value

Background: The problem of multicollinearity in multiple linear regression models arises when the predictor variables are correlated with each other. The variance of the ordinary least squares estimator becomes unstable in such situations. To mitigate the problem of multicollinearity, Liu regression is widely used as a biased method of estimation with shrinkage parameter ‘d’. The optimal value of the shrinkage parameter plays a vital role in the bias-variance trade-off. Limitation: Several estimators of the shrinkage parameter are available in the literature, but the existing estimators do not achieve a small mean squared error when multicollinearity is high or severe. Methodology: In this paper, some new estimators for the shrinkage parameter are proposed. The proposed estimators form a class of estimators based on quantiles of the regression coefficients. The performance of the new estimators is compared with that of the existing estimators through Monte Carlo simulation. Mean squared error and mean absolute error are used as evaluation criteria for the estimators. A tobacco dataset is used as an application to illustrate the benefits of the new estimators and support the simulation results. Findings: The new estimators outperform the existing estimators in most of the considered scenarios, including high and severe cases of multicollinearity. The 95% mean prediction interval of all the estimators is also computed for the tobacco data. The new estimators give the best mean prediction interval among all the estimators. Implications of the findings: We recommend the use of the new estimators to practitioners when high to severe multicollinearity exists among the predictor variables.
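For context, the role of the shrinkage parameter d can be seen from the standard textbook form of the Liu estimator, sketched below. This form is not quoted from the abstract; the symbols X (predictor matrix), y (response vector), and I (identity matrix) are notation assumed here.

```latex
\hat{\beta}_{\mathrm{OLS}} = (X^{\top} X)^{-1} X^{\top} y,
\qquad
\hat{\beta}_{d} = (X^{\top} X + I)^{-1}\bigl(X^{\top} y + d\,\hat{\beta}_{\mathrm{OLS}}\bigr),
\quad 0 < d < 1 .
```

Setting d = 1 recovers the ordinary least squares estimator, since X^T y = X^T X \hat{\beta}_{OLS}; smaller values of d shrink the estimate, trading added bias for reduced variance.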


A Novel Computational Intelligence Approach for Coal Consumption Forecasting in Iran

Sustainability ◽

10.3390/su13147612 ◽

2021 ◽

Vol 13 (14) ◽

pp. 7612

Author(s):

Mahdis sadat Jalaee ◽

Alireza Shakibaei ◽

Amin GhasemiNejad ◽

Sayyed Abdolmajid Jalaee ◽

Reza Derakhshani

Keyword(s):

Hybrid Method ◽

Computational Intelligence ◽

Energy Demand ◽

Mean Squared Error ◽

Absolute Error ◽

Coal Consumption ◽

Squared Error ◽

Artificial Neural ◽

Socio Economic Variables ◽

Intelligence Approach

Coal, a fossil and non-renewable fuel, is one of the most valuable energy minerals in the world, with the largest reserves by volume. Artificial neural networks (ANNs), despite being one of the major breakthroughs in the field of computational intelligence, have some significant disadvantages, such as slow training, susceptibility to getting trapped in local optima, sensitivity to initial weights, and bias. To overcome these shortcomings, this study presents an improved ANN structure that is optimized by a proposed hybrid method. The aim of this study is to propose a novel hybrid method for predicting coal consumption in Iran based on socio-economic variables, using the bat and grey wolf optimization algorithms with an artificial neural network (BGWAN). For this purpose, data from 1981 to 2019 have been used for modelling and testing the method. The available data are partly used to find the optimal or near-optimal values of the weighting parameters (1980–2014) and partly to test the model (2015–2019). The performance of the BGWAN is evaluated by mean squared error (MSE), mean absolute error (MAE), root mean squared error (RMSE), standard deviation error (STD), and correlation coefficient (R^2) between the output of the method and the actual dataset. The results showed that BGWAN performed excellently and proved its efficiency as a useful and reliable tool for monitoring coal consumption or energy demand in Iran.


Estimating Physical Activity Energy Expenditure Using an Ensemble Model-Based Patch-Type Sensor Module

Electronics ◽

10.3390/electronics10070861 ◽

2021 ◽

Vol 10 (7) ◽

pp. 861

Author(s):

Kyeung Ho Kang ◽

Mingu Kang ◽

Siho Shin ◽

Jaehyo Jung ◽

Meina Li

Keyword(s):

Physical Activity ◽

Energy Expenditure ◽

Mean Squared Error ◽

Absolute Error ◽

Ensemble Model ◽

Direct Calorimetry ◽

Patch Type ◽

Mortality And Morbidity ◽

Squared Error ◽

Sensor Module

Chronic diseases, such as coronary artery disease and diabetes, are caused by inadequate physical activity and are the leading cause of increasing mortality and morbidity rates. Direct calorimetry by calorie production and indirect calorimetry by energy expenditure (EE) have been regarded as the best methods for estimating physical activity and EE. However, this approach is inconvenient, owing to the use of an oxygen respiration measurement mask. In this study, we propose a model that estimates physical activity EE using an ensemble model that combines artificial neural networks and genetic algorithms, using data acquired from patch-type sensors. The proposed ensemble model achieved an accuracy of more than 92% (Root Mean Squared Error (RMSE) = 0.1893, R2 = 0.91, Mean Squared Error (MSE) = 0.014213, Mean Absolute Error (MAE) = 0.14020) after testing various structures through repeated experiments.


A hybrid recommender system based-on link prediction for movie baskets analysis

Journal Of Big Data ◽

10.1186/s40537-021-00422-0 ◽

2021 ◽

Vol 8 (1) ◽

Author(s):

Mohammadsadegh Vahidi Farashah ◽

Akbar Etebarian ◽

Reza Azmi ◽

Reza Ebrahimzadeh Dastjerdi

Keyword(s):

Recommender System ◽

Link Prediction ◽

Mean Squared Error ◽

Spatial Clustering ◽

Randomized Algorithm ◽

Similarity Criterion ◽

Absolute Error ◽

Similarity Criteria ◽

Online Systems ◽

Squared Error

Over the past decade, recommendation systems have been one of the most sought-after research topics. Analyzing the baskets of online systems’ customers and recommending attractive products (movies) to them is very important. Providing an attractive, favorite movie to the customer increases the sales rate and ultimately improves the system. Various methods have been proposed to analyze customer baskets and offer entertaining movies, but each has challenges, such as low accuracy and high recommendation error. In this paper, a link prediction-based method is used to address the challenges of other methods. The proposed method consists of four phases: (1) Running the CBRS: in this phase, all users are clustered using the density-based spatial clustering of applications with noise (DBSCAN) algorithm, and new users are classified using a Deep Neural Network (DNN) algorithm. (2) A Collaborative Recommender System (CRS) based on a hybrid similarity criterion, through which similarities between the new user and the users in the selected category are calculated based on a threshold (lambda). Similarity criteria are determined based on age, gender, and occupation. The collaborative recommender system extracts the users who are most similar to the new user. Then, the higher-rated movie services are suggested to the new user based on the adjacency matrix. (3) Running the improved Friendlink algorithm on the dataset to calculate the similarity between users who are connected through a link. (4) Combining the output of the collaborative recommender system with that of the improved Friendlink algorithm. The results show that the Mean Squared Error (MSE) of the proposed model decreased by 8.59%, 8.67%, 8.45% and 8.15%, respectively, compared to basic models such as Naive Bayes, multi-attribute decision tree and randomized algorithm. In addition, the Mean Absolute Error (MAE) of the proposed method decreased by 4.5% compared to SVD and approximately 4.4% compared to ApproSVD, and the Root Mean Squared Error (RMSE) of the proposed method decreased by 6.05% compared to SVD and approximately 6.02% compared to ApproSVD.


Day-Ahead Forecasting of Hourly Photovoltaic Power Based on Robust Multilayer Perception

Sustainability ◽

10.3390/su10124863 ◽

2018 ◽

Vol 10 (12) ◽

pp. 4863 ◽

Cited By ~ 6

Author(s):

Chao Huang ◽

Longpeng Cao ◽

Nanxin Peng ◽

Sijia Li ◽

Jing Zhang ◽

...

Keyword(s):

Power Plants ◽

Mean Squared Error ◽

Absolute Error ◽

Multilayer Perception ◽

Squared Error ◽

The Mean ◽

Effectiveness And Efficiency ◽

Mlp Network ◽

Grid Operation ◽

Better Than

Photovoltaic (PV) modules convert renewable and sustainable solar energy into electricity. However, the uncertainty of PV power production brings challenges for grid operation. To facilitate the management and scheduling of PV power plants, forecasting is an essential technique. In this paper, a robust multilayer perception (MLP) neural network was developed for day-ahead forecasting of hourly PV power. A generic MLP is usually trained by minimizing the mean squared loss. The mean squared error is sensitive to a few particularly large errors, which can lead to a poor estimator. To tackle this problem, the pseudo-Huber loss function, which combines the best properties of squared loss and absolute loss, was adopted in this paper. The effectiveness and efficiency of the proposed method were verified by benchmarking against a generic MLP network with real PV data. Numerical experiments illustrated that the proposed method performed better than the generic MLP network in terms of root mean squared error (RMSE) and mean absolute error (MAE).
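The way the pseudo-Huber loss blends squared and absolute loss can be sketched as follows. This is an illustration of the commonly used definition, not the paper's implementation: the function names and the default delta = 1.0 are assumptions, and the abstract does not state which delta the authors chose.

```python
import math

def pseudo_huber(residual, delta=1.0):
    """Pseudo-Huber loss for one residual (prediction minus target).

    Behaves like residual**2 / 2 near zero and like
    delta * (abs(residual) - delta) for large residuals.
    `delta` sets where the transition happens (assumed value).
    """
    return delta ** 2 * (math.sqrt(1.0 + (residual / delta) ** 2) - 1.0)

def mean_pseudo_huber(actual, predicted, delta=1.0):
    """Average pseudo-Huber loss over paired observations."""
    losses = [pseudo_huber(p - a, delta) for a, p in zip(actual, predicted)]
    return sum(losses) / len(losses)

# A single large error dominates the squared loss far more than the
# pseudo-Huber loss, which grows only linearly for large residuals.
for r in (0.1, 1.0, 10.0):
    print(r, 0.5 * r * r, pseudo_huber(r))
```

For a residual of 10 with delta = 1, half the squared error is 50 while the pseudo-Huber loss stays near 9, which is why a few outlying hours affect training less.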


Comparison of ANFIS and FAHP-FGP methods for supplier selection

Kybernetes ◽

10.1108/k-09-2014-0195 ◽

2016 ◽

Vol 45 (3) ◽

pp. 474-489 ◽

Cited By ~ 6

Author(s):

Moloud sadat Asgari ◽

Abbas Abbasi ◽

Moslem Alimohamadlou

Keyword(s):

Performance Measures ◽

Supplier Selection ◽

Mean Squared Error ◽

Selection Process ◽

Evaluation Criteria ◽

Absolute Error ◽

Global Market ◽

Content Type ◽

Squared Error ◽

Fuzzy Delphi

Purpose – In the contemporary global market, supplier selection represents a crucial process for enhancing firms’ competitiveness. It is a multi-criteria decision-making problem and therefore requires reliable methods to select the best suppliers. The purpose of this paper is to examine and propose an appropriate method for selecting suppliers. Design/methodology/approach – ANFIS and fuzzy analytic hierarchy process-fuzzy goal programming (FAHP-FGP) are new methods for evaluating and selecting the best suppliers. These methods are used in this study to evaluate suppliers in the dairy industry, and the results obtained from the methods are compared using performance measures such as Mean Squared Error, Root Mean Squared Error, Normalized Root Mean Squared Error, Mean Absolute Error, Minimum Absolute Error and R2. Findings – The results indicate that the ANFIS method provides better performance than the FAHP-FGP method, with the selected suppliers scoring higher in all the performance measures. Practical implications – The proposed method could help companies select the best supplier by avoiding the influence of personal judgment. Originality/value – This study uses the well-structured fuzzy Delphi method to determine the supplier evaluation criteria, as well as the most recent ANFIS and FAHP-FGP methods for supplier selection. In addition, unlike most other studies, it performs the selection process among all available suppliers.


Software-based simulation for preprocedural assessment of braided stent sizing: a validation study

Journal of Neurosurgery ◽

10.3171/2018.5.jns18976 ◽

2019 ◽

Vol 131 (5) ◽

pp. 1423-1429 ◽

Cited By ~ 2

Author(s):

Krishna Chaitanya Joshi ◽

Ignacio Larrabide ◽

Ahmed Saied ◽

Nada Elsaid ◽

Hector Fernandez ◽

...

Keyword(s):

Intracranial Aneurysms ◽

Mean Squared Error ◽

Absolute Error ◽

Study Data ◽

Optimal Placement ◽

Squared Error ◽

Single Center Study ◽

Measured Length ◽

The Right ◽

Ruptured Intracranial Aneurysms

OBJECTIVE: The authors sought to validate the use of a software-based simulation for preassessment of braided self-expanding stents in the treatment of wide-necked intracranial aneurysms. METHODS: This was a retrospective, observational, single-center study of 13 unruptured and ruptured intracranial aneurysms treated with braided self-expanding stents. Pre- and postprocedural angiographic studies were analyzed. ANKYRAS software was used to compare the following 3 variables: the manufacturer-given nominal length (NL), software-calculated simulated length (SL), and the actual measured length (ML) of the stent. Appropriate statistical methods were used to draw correlations among the 3 lengths. RESULTS: In this study, data obtained in 13 patients treated with braided self-expanding stents were analyzed. Data for the 3 lengths were collected for all patients. Error discrepancy was calculated by mean squared error (NL to ML −22.2; SL to ML −6.14, p < 0.05), mean absolute error (NL to ML 3.88; SL to ML −1.84, p < 0.05), and mean error (NL to ML −3.81; SL to ML −1.22, p < 0.05). CONCLUSIONS: The ML was usually less than the NL given by the manufacturer, indicating significant change in length in most cases. Computational software-based simulation for preassessment of the braided self-expanding stents is a safe and effective way for accurately calculating the change in length to aid in choosing the right-sized stent for optimal placement in complex intracranial vasculature.


Prediction of the Indeks Harga Saham Gabungan (IHSG) Using a Neural Network Algorithm

Jurnal Edukasi dan Penelitian Informatika (JEPIN) ◽

10.26418/jp.v4i1.25384 ◽

2018 ◽

Vol 4 (1) ◽

pp. 24

Author(s):

Imam Halimi ◽

Wahyu Andhyka Kusuma

Keyword(s):

Neural Network ◽

Data Mining ◽

Linear Regression ◽

Mean Squared Error ◽

Composite Index ◽

T Test ◽

Sliding Windows ◽

Root Mean Squared Error ◽

Squared Error

Stock investment is familiar both as a topic and as a practice. There are many kinds of stocks in Indonesia, one of which is the Indeks Harga Saham Gabungan (IHSG), known in English as the Indonesia Composite Index, ICI, or IDX Composite. The IHSG is an important parameter to consider when investing, given that it is a composite of stocks. This study aims to predict IHSG movements with data mining techniques using a neural network algorithm, compared against a linear regression algorithm, so that the results can serve as a reference for investors who are about to invest. The results of this study are Root Mean Squared Error (RMSE) values, together with additional labels holding the predicted figures, obtained after validation using sliding windows validation. The best results were reported as 37.786 for the neural network algorithm with windowing and 13.597 without windowing, and 35.026 for the linear regression algorithm with windowing and 12.657 without windowing. A T-Test showed that the difference between the neural network and linear regression was not significant; the T-Test value was the same for the tests with and without windowing, namely 1.000.


Multiscale Aggregate Networks with Dense Connections for Crowd Counting

Computational Intelligence and Neuroscience ◽

10.1155/2021/9996232 ◽

2021 ◽

Vol 2021 ◽

pp. 1-12

Author(s):

Pengfei Li ◽

Min Zhang ◽

Jian Wan ◽

Ming Jiang

Keyword(s):

Mean Squared Error ◽

Absolute Error ◽

Image Features ◽

Convolutional Network ◽

Crowd Counting ◽

Squared Error ◽

Crowd Density ◽

Density Maps ◽

Density Map ◽

Map Decoder

The most advanced method for crowd counting uses a fully convolutional network that extracts image features and then generates a crowd density map. However, this process often encounters multiscale and contextual loss problems. To address these problems, we propose a multiscale aggregation network (MANet) that includes a feature extraction encoder (FEE) and a density map decoder (DMD). The FEE uses a cascaded scale pyramid network to extract multiscale features and obtains contextual features through dense connections. The DMD uses deconvolution and fusion operations to generate features containing detailed information. These features can be further converted into high-quality density maps to accurately calculate the number of people in a crowd. An empirical comparison using four mainstream datasets (ShanghaiTech, WorldExpo’10, UCF_CC_50, and SmartCity) shows that the proposed method is more effective in terms of the mean absolute error and mean squared error. The source code is available at https://github.com/lpfworld/MANet.
