Machine learning meets pKa

We present a small-molecule pKa prediction tool written entirely in Python. It predicts the macroscopic pKa value and is trained on a literature compilation of monoprotic compounds. Several machine learning models were tested, and random forest performed best in five-fold cross-validation (mean absolute error = 0.682, root mean squared error = 1.032, correlation coefficient r2 = 0.82). We test our model on two external validation sets, where it performs comparably to Marvin and better than a recently published open-source model. Our Python tool and all data are freely available at https://github.com/czodrowskilab/Machine-learning-meets-pKa.
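
The three figures reported in this abstract can be reproduced for any set of predictions with a few lines of Python. The sketch below is illustrative only: the pKa values are made up, the function names are hypothetical, and r2 is computed as the coefficient of determination, which is an assumption (the abstract calls it a correlation coefficient, so the authors may instead mean the squared Pearson correlation).

```python
import math

def mean_absolute_error(y_true, y_pred):
    # Average magnitude of the errors, in the units of the target (pKa units here).
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def root_mean_squared_error(y_true, y_pred):
    # Square root of the mean squared error; penalises large errors more than MAE.
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r_squared(y_true, y_pred):
    # Coefficient of determination: 1 - residual sum of squares / total sum of squares.
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical experimental and predicted pKa values (not from the paper).
experimental = [4.8, 9.2, 3.1, 10.5]
predicted = [5.0, 8.7, 3.6, 10.1]

if __name__ == "__main__":
    print("MAE :", mean_absolute_error(experimental, predicted))
    print("RMSE:", root_mean_squared_error(experimental, predicted))
    print("R2  :", r_squared(experimental, predicted))
```

In a cross-validation setting these metrics would be computed on the held-out fold of each split and then averaged.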

Machine learning and Grad-Cam based vascular aging assessment using photoplethysmogram (Preprint)

10.2196/preprints.31709 ◽

2021 ◽

Author(s):

Hangsik Shin

Keyword(s):

Machine Learning ◽

Correlation Coefficient ◽

Age Estimation ◽

Mean Squared Error ◽

Mean Absolute Error ◽

Absolute Error ◽

Coefficient Of Determination ◽

Vascular Aging ◽

Squared Error ◽

Vascular Age

BACKGROUND: Arterial stiffness due to vascular aging is a major indicator for evaluating cardiovascular risk. OBJECTIVE: In this study, we propose a method of estimating age by applying machine learning to the photoplethysmogram for non-invasive vascular age assessment. METHODS: A machine learning-based age estimation model, consisting of three convolutional layers and two fully connected layers, was developed using photoplethysmograms segmented by pulse from a total of 752 adults aged 19–87 years. The performance of the model was quantitatively evaluated using mean absolute error, root mean squared error, Pearson’s correlation coefficient, and coefficient of determination. Grad-Cam was used to explain the contribution of photoplethysmogram waveform characteristics to vascular age estimation. RESULTS: Ten-fold cross-validation gave a mean absolute error of 8.03, a root mean squared error of 9.96, a correlation coefficient of 0.62, and a coefficient of determination of 0.38. Grad-Cam, used to determine the weight that the input signal contributes to the result, confirmed that the contribution of the photoplethysmogram segment to age estimation was high around the systolic peak. CONCLUSIONS: The machine learning-based vascular aging analysis method using the PPG waveform showed comparable or superior performance to previous studies in evaluating vascular aging, without complex feature detection. CLINICALTRIAL: 2015-0104

Machine Learning Algorithmic Study of the Naira Exchange Rate

European Journal of Engineering Research and Science ◽

10.24018/ejers.2020.5.2.1739 ◽

2020 ◽

Vol 5 (2) ◽

pp. 183-186

Author(s):

Ledisi Giok Kabari ◽

Marcus B. Chigoziri ◽

Joseph Eneotu

Keyword(s):

Machine Learning ◽

Exchange Rate ◽

Exchange Rates ◽

Mean Squared Error ◽

Mean Absolute Error ◽

Absolute Error ◽

Machine Learning Algorithms ◽

Coefficient Of Determination ◽

Rate Data ◽

Squared Error

In this study, we discuss various machine learning algorithms and architectures suitable for the Nigerian Naira exchange rate forecast. Our analyses were focused on the exchange rates of the British Pounds, US Dollars and the Euro against the Naira. The exchange rate data was sourced from the Central Bank of Nigeria. The performances of the algorithms were evaluated using Mean Squared Error, Root Mean Squared Error, Mean Absolute Error and the coefficient of determination (R-Squared score). Finally, we compared the performances of these algorithms in forecasting the exchange rates.

PENERAPAN ALGORITMA RANDOM FOREST UNTUK IDENTIFIKASI DEHIDRASI BERBASIS CITRA URINE

Jurnal Informatika Polinema ◽

10.33795/jip.v6i3.348 ◽

2020 ◽

Vol 6 (3) ◽

pp. 49-54

Author(s):

Niyalatul Muna ◽

Faisal Lutfi Afriansyah ◽

Ameng Bagus Suprayogy

Keyword(s):

Random Forest ◽

Cross Validation ◽

Mean Squared Error ◽

Mean Absolute Error ◽

Absolute Error ◽

Root Mean Squared Error ◽

Squared Error

The level of dehydration is not only felt directly but can also be observed visually. One visible symptom of dehydration is urine color. This symptom usually receives little attention and is regarded as normal, even though hypohydration or dehydration is a harmful consequence of inadequate water intake that affects the color of the urine produced. People have difficulty distinguishing the symptoms of dehydration by eye, and differences in urine color are often interpreted inconsistently because the colors are so similar. Several studies have shown that camera technology combined with intelligent systems can help overcome these limitations of the human senses. This study uses urine images taken from adult samples, grouped according to the urine color categories of earlier research. YCbCr color values are extracted as features from each urine image. The color model obtained from each sample is then identified with the Random Forest algorithm using cross-validation. The experiments show an accuracy of 90% on the 30 datasets tested, with a precision of 90.2%, a recall of 90%, a mean absolute error of 0.2473, and a root mean squared error of 0.3208.

Day-Ahead Forecasting of Hourly Photovoltaic Power Based on Robust Multilayer Perception

Sustainability ◽

10.3390/su10124863 ◽

2018 ◽

Vol 10 (12) ◽

pp. 4863 ◽

Cited By ~ 6

Author(s):

Chao Huang ◽

Longpeng Cao ◽

Nanxin Peng ◽

Sijia Li ◽

Jing Zhang ◽

...

Keyword(s):

Power Plants ◽

Mean Squared Error ◽

Absolute Error ◽

Multilayer Perception ◽

Squared Error ◽

The Mean ◽

Effectiveness And Efficiency ◽

Mlp Network ◽

Grid Operation ◽

Better Than

Photovoltaic (PV) modules convert renewable and sustainable solar energy into electricity. However, the uncertainty of PV power production brings challenges for grid operation. Forecasting is an essential technique for facilitating the management and scheduling of PV power plants. In this paper, a robust multilayer perception (MLP) neural network was developed for day-ahead forecasting of hourly PV power. A generic MLP is usually trained by minimizing the mean squared loss. The mean squared error is sensitive to a few particularly large errors, which can lead to a poor estimator. To tackle this problem, the pseudo-Huber loss function, which combines the best properties of squared loss and absolute loss, was adopted in this paper. The effectiveness and efficiency of the proposed method were verified by benchmarking against a generic MLP network with real PV data. Numerical experiments showed that the proposed method performed better than the generic MLP network in terms of root mean squared error (RMSE) and mean absolute error (MAE).
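
To make the contrast between squared loss and pseudo-Huber loss concrete, here is a minimal sketch using the standard pseudo-Huber definition, delta^2 * (sqrt(1 + (r/delta)^2) - 1). The function names and the value delta = 1.0 are assumptions for illustration; the abstract does not state which delta or implementation the authors used.

```python
import math

def pseudo_huber(residual, delta=1.0):
    # Smooth loss: close to 0.5 * r**2 for small residuals,
    # close to delta * |r| (linear growth) for large residuals.
    return delta ** 2 * (math.sqrt(1.0 + (residual / delta) ** 2) - 1.0)

def squared_loss(residual):
    # Conventional squared loss with the 1/2 factor, for comparison.
    return 0.5 * residual ** 2

if __name__ == "__main__":
    # Compare the two losses on a small, a moderate and a very large residual.
    for r in (0.01, 1.0, 100.0):
        print(f"r={r:>6}: squared={squared_loss(r):.6f}  pseudo-Huber={pseudo_huber(r):.6f}")
```

Because the loss grows only linearly for large residuals, a single outlying hour of PV output contributes far less to the training objective than it would under squared loss.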

Designing accurate emulators for scientific processes using calibration-driven deep models

Nature Communications ◽

10.1038/s41467-020-19448-8 ◽

2020 ◽

Vol 11 (1) ◽

Author(s):

Jayaraman J. Thiagarajan ◽

Bindya Venkatesh ◽

Rushil Anirudh ◽

Peer-Timo Bremer ◽

Jim Gaffney ◽

...

Keyword(s):

Mean Squared Error ◽

Mean Absolute Error ◽

Absolute Error ◽

Heterogeneous Data ◽

Small Data ◽

Machine Learning Methods ◽

Squared Error ◽

Noise Structure ◽

The Mean ◽

Modern Machine

Predictive models that accurately emulate complex scientific processes can achieve speed-ups over numerical simulators or experiments and at the same time provide surrogates for improving the subsequent analysis. Consequently, there is a recent surge in utilizing modern machine learning methods to build data-driven emulators. In this work, we study an often overlooked, yet important, problem of choosing loss functions while designing such emulators. Popular choices such as the mean squared error or the mean absolute error are based on a symmetric noise assumption and can be unsuitable for heterogeneous data or asymmetric noise distributions. We propose Learn-by-Calibrating, a novel deep learning approach based on interval calibration for designing emulators that can effectively recover the inherent noise structure without any explicit priors. Using a large suite of use-cases, we demonstrate the efficacy of our approach in providing high-quality emulators, when compared to widely-adopted loss function choices, even in small-data regimes.

Electricity Consumption Analysis Using Demographic Variables Case Study – Nakhonratchasima, Thailand

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.734-737.1679 ◽

2013 ◽

Vol 734-737 ◽

pp. 1679-1682

Author(s):

Sureeporn Meehom ◽

Nopphadon Khodpun

Keyword(s):

High Speed ◽

Mean Squared Error ◽

Mean Absolute Error ◽

Electricity Consumption ◽

Absolute Error ◽

Demographic Variables ◽

Resources Management ◽

Squared Error ◽

High Speed Diesel

Electrical energy is vital to a nation's social and economic development. Electricity consumption analysis plays an important role in sustainable energy and electricity resource management. In this paper, the influence of demographic variables on annual electricity consumption in Nakhonratchasima is investigated by multiple regression analysis. Electricity consumption was found to correlate with four demographic variables: the number of electricity consumers, the amount of high speed diesel used, the number of industrial factories, and the size of the employed labor force. Historical electricity consumption and all variables for the period 2002-2010 were analyzed in 8 models to predict electricity consumption in 2011. The most effective model was selected by comparing the adjusted R2, mean absolute error (MAE), and root mean squared error (RMSE) of the proposed models. Model 8 is acceptable, with adjusted R2, RMSE, and MAE equal to 0.9980, 0.7540%, and 0.6095%, respectively. The results indicate that the model using all four variables has a strong ability to predict future annual electricity consumption, giving 4,195,837,877 kWh for 2011.
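
Adjusted R2 matters in a comparison like this because plain R2 never decreases when a predictor is added. The sketch below uses the standard definition, 1 - (1 - R2)(n - 1)/(n - p - 1). The function name and the example numbers are hypothetical, and the abstract does not say how the authors computed their value.

```python
def adjusted_r2(r2, n_samples, n_predictors):
    # Penalise R2 for the number of predictors relative to the sample size.
    dof = n_samples - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more samples than predictors + 1")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / dof

if __name__ == "__main__":
    # Illustrative only: a fit with R2 = 0.90 on nine observations.
    for p in (1, 2, 4):
        print(f"{p} predictor(s): adjusted R2 = {adjusted_r2(0.90, 9, p):.4f}")
```

With few annual observations the penalty is substantial: the same R2 of 0.90 drops to 0.80 once four predictors are used on nine data points.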

On the mean squared error, the mean absolute error and the like

Communication in Statistics- Theory and Methods ◽

10.1080/03610929908832390 ◽

1999 ◽

Vol 28 (8) ◽

pp. 1813-1822 ◽

Cited By ~ 6

Author(s):

Shaul K. Bar-Lev ◽

Benzion Boukai ◽

Peter Enis

Keyword(s):

Mean Squared Error ◽

Mean Absolute Error ◽

Absolute Error ◽

Squared Error ◽

The Mean

Visualization & Prediction of COVID-19 Future Outbreak by Using Machine Learning

International Journal of Information Technology and Computer Science ◽

10.5815/ijitcs.2021.03.02 ◽

2021 ◽

Vol 13 (3) ◽

pp. 16-32

Author(s):

Ahmed Hassan Mohammed Hassan ◽

Arfan Ali Mohammed Qasem ◽

Walaa Faisal Mohammed Abdalla ◽

Omer H. Elhassan

Keyword(s):

Machine Learning ◽

Polynomial Regression ◽

Mean Squared Error ◽

Absolute Error ◽

Future Perspective ◽

Support Vector ◽

Squared Error ◽

Vector Machines ◽

The World ◽

Negative Factors

Day by day, the cumulative incidence of COVID-19 is rapidly increasing. After the epidemic spread and killed more than a million people around the world, scientists and researchers turned to modern technologies such as machine learning to help the world overcome the Coronavirus (COVID-19) epidemic. Machine Learning (ML) can be deployed very effectively to track and predict the disease. ML techniques have been applied in areas that need to identify dangerous negative factors and define their priorities. The purpose of the proposed system is to predict the number of people infected with COVID-19 using ML. Four standard models are used for COVID-19 prediction: Neural Network (NN), Support Vector Machines (SVM), Bayesian Network (BN) and Polynomial Regression (PR). The data used to test these models consist of the numbers of deaths, newly infected cases, and recoveries in the next 20 days. Five measures were used to evaluate the performance of each model: root mean squared error (RMSE), mean squared error (MSE), mean absolute error (MAE), explained variance score, and R2 score (R2). The value of the proposed system lies in applying these models to the current scenario of the COVID-19 epidemic. The results showed that NN outperformed the other models, while SVM performed poorly in all predictions on the available dataset. Our results suggest that infections will increase slightly in the coming days. We also find that the results give cause for hope because of the low death rate. Going forward, case explanation and data amalgamation must be maintained continuously.

Peramalan Penjualan dengan Metode Exponential Smoothing (Studi Kasus : Penjualan Bakso Kemasaan/Kiloan Rumah Bakso Bang Ipul)

Journal of Mathematics, Computations, and Statistics ◽

10.35580/jmathcos.v1i1.9168 ◽

2019 ◽

Vol 1 (1) ◽

pp. 1

Author(s):

Hisyam Ihsan ◽

Rahmat Syam ◽

Fahrul Ahmad

Keyword(s):

Mean Squared Error ◽

Mean Absolute Error ◽

Absolute Error ◽

Exponential Smoothing ◽

Trial And Error ◽

Squared Error ◽

Forecasting Method ◽

Operational Activities ◽

Exponential Smoothing Method ◽

Single Exponential

Sales forecasting enables a company to choose an optimal policy, make appropriate decisions, and maintain the efficiency of its operational activities. Rumah Bakso Bang Ipul is a business that sells packaged (per-kilogram) meatballs. It therefore needs sales forecasting to increase profit and avoid an excess or shortage of packaged meatball stock. In this study, forecasting was carried out with the exponential smoothing method. The parameter a used to forecast sales takes the values a = 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, and 0.9. Single exponential smoothing compares these values, searching by trial and error for the a with the minimum error as measured by the mean absolute error (MAE) and the mean squared error (MSE). The value a = 0.1 was selected, with MAE = 6.23 and MSE = 58.32. Based on these results, single exponential smoothing with a = 0.1 gives a sales forecast for Rumah Bakso Bang Ipul of 48 kilograms for June 2018 (the authors' English abstract states July 2018). Keywords: Forecasting, Method of exponential smoothing, Method of single exponential smoothing.
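
The trial-and-error search over a described in this abstract can be sketched as a small grid search. Everything below is illustrative: the sales figures are made up, the function names are hypothetical, and initialising the first forecast to the first observation is a common convention assumed here, not something the abstract specifies.

```python
def ses_forecasts(series, a):
    # One-step-ahead forecasts F[t]; F[0] is set to the first observation (assumption).
    forecasts = [series[0]]
    for actual in series[:-1]:
        forecasts.append(a * actual + (1 - a) * forecasts[-1])
    return forecasts

def evaluate(series, a):
    # MAE and MSE of the one-step-ahead forecasts, skipping the initial period.
    forecasts = ses_forecasts(series, a)
    errors = [y - f for y, f in zip(series[1:], forecasts[1:])]
    mae = sum(abs(e) for e in errors) / len(errors)
    mse = sum(e ** 2 for e in errors) / len(errors)
    return mae, mse

def best_a(series, grid):
    # Pick the smoothing parameter with the lowest MSE over the candidate grid.
    return min(grid, key=lambda a: evaluate(series, a)[1])

def next_forecast(series, a):
    # Forecast for the period after the last observation.
    return a * series[-1] + (1 - a) * ses_forecasts(series, a)[-1]

if __name__ == "__main__":
    sales_kg = [52, 47, 55, 44, 50, 46, 49, 53]  # hypothetical monthly sales
    grid = [round(0.1 * k, 1) for k in range(1, 10)]  # 0.1, 0.2, ..., 0.9
    a = best_a(sales_kg, grid)
    print("best a:", a, "MAE/MSE:", evaluate(sales_kg, a))
    print("next-period forecast:", next_forecast(sales_kg, a))
```

Selecting by MSE rather than MAE is a design choice in this sketch. The two criteria can disagree, in which case the MAE-optimal value can be found by changing the index in `best_a`.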