Multilayer Perceptron untuk Prediksi Sessions pada Sebuah Website Journal Elektronik

Forecasting sessions on a journal website supports decision-making aimed at improving the quality and accreditation score of the journal website. The sessions data were analyzed according to the movement of their time-series pattern using the multilayer perceptron method. The multilayer perceptron has several characteristic advantages: it determines weight values better than other methods, it can be used without prior knowledge, its algorithm is easy to implement, and it can solve both linear and nonlinear problems, which leads to better forecasts. The study used several train/test percentages. The best-performing split was 80% training data and 20% test data, with a learning rate of 0.4 and a 2-1-1 architecture. Model evaluation gave an MSE and RMSE of 0.015357 and 0.123999 for the training set, and 0.018996 and 0.137826 for the test set. The execution time needed for forecasting was 580.0651 seconds, or 9.667751 minutes.
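As a rough illustration of the evaluation setup described in this abstract, the sketch below builds lagged input windows, splits them chronologically 80/20, and computes MSE and RMSE. It assumes that the "2" in the 2-1-1 architecture means two lagged inputs; the session counts and all function names are hypothetical and not taken from the paper.

```python
import math

def make_windows(series, n_lags=2):
    """Turn a series into (inputs, target) pairs using the previous n_lags values."""
    return [(series[i - n_lags:i], series[i]) for i in range(n_lags, len(series))]

def chronological_split(samples, train_fraction=0.8):
    """Split without shuffling so the test set lies after the training set in time."""
    cut = int(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]

def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error, the square root of the MSE."""
    return math.sqrt(mse(actual, predicted))

# Hypothetical session counts, for illustration only.
sessions = [120, 135, 128, 150, 160, 155, 170, 180, 175, 190, 200, 210]
samples = make_windows(sessions, n_lags=2)
train, test = chronological_split(samples, train_fraction=0.8)
```

Because RMSE is the square root of MSE, the two reported values for a given set can be cross-checked against each other.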

Implementasi Multilayer Perceptron Pada Jaringan Saraf Tiruan Untuk Memprediksi Nilai Valuta Asing

INTEGER: Journal of Information Technology ◽

10.31284/j.integer.2020.v5i1.909 ◽

2020 ◽

Vol 5 (1) ◽

Author(s):

Tommy Ferdian Hadimarta ◽

Rani Rotul Muhima ◽

Muchamad Kurniawan

Keyword(s):

Time Series ◽

Foreign Exchange ◽

Multilayer Perceptron ◽

Supply And Demand ◽

Learning Rate ◽

Process Data ◽

Nerve Network ◽

Price Patterns ◽

Available Resources

Abstract. In FOREX investment, currency fluctuation is common, and price movement is strongly influenced by supply and demand. If demand is higher, the price rises; conversely, if supply is higher, the price falls. There is a principle that price patterns repeat at random, which makes FOREX movement unpredictable. These patterns of currency fluctuation have misled many investors and caused losses and even loss of capital. Foreign exchange values are time-series data, and the Multilayer Perceptron is well suited to processing time-series data, as it is often used for prediction. This research therefore aimed to implement a Multilayer Perceptron artificial neural network to predict foreign exchange values from the available resources using the attributes open, high, low, and close. To process the data, the attributes were first initialized as X1 (open), X2 (high), and X3 (low) for the inputs and Y (close) for the target, and then normalized so that the sigmoid could be calculated. Increasing the number of epochs does not guarantee smaller errors; on the contrary, the error may increase. The best training result occurred at epoch 200 with a learning rate of 3, giving the smallest values: MSE 281.02518, MAD 13.168, and standard deviation 10.294.
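The normalization and sigmoid steps mentioned in this abstract can be sketched as follows. This is a minimal illustration that assumes min-max scaling to [0, 1]; the abstract does not say which normalization was used, and the prices, weights, and function names are hypothetical.

```python
import math

def min_max_normalize(values):
    """Scale values linearly to the range [0, 1] (assumed normalization scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot normalize a constant series")
    return [(v - lo) / (hi - lo) for v in values]

def sigmoid(x):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-x))

def neuron_output(inputs, weights, bias):
    """Output of one sigmoid neuron: the activation of the weighted input sum."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)

# Hypothetical opening prices (X1), for illustration only.
open_prices = [1.10, 1.12, 1.11, 1.15]
x1 = min_max_normalize(open_prices)
```

Scaling the inputs keeps the weighted sum in the region where the sigmoid is not saturated, which is a common reason for normalizing before training.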

Comparison of ARIMA model and XGBoost model for prediction of human brucellosis in mainland China: a time-series study

BMJ Open ◽

10.1136/bmjopen-2020-039676 ◽

2020 ◽

Vol 10 (12) ◽

pp. e039676

Author(s):

Mirxat Alim ◽

Guo-Hua Ye ◽

Peng Guan ◽

De-Sheng Huang ◽

Bao-Sen Zhou ◽

...

Keyword(s):

Time Series ◽

Arima Model ◽

Mainland China ◽

Human Brucellosis ◽

Training Set ◽

Test Set ◽

Time Series Study ◽

Extreme Gradient Boosting ◽

Series Study ◽

Monthly Incidence

Objectives: Human brucellosis is a public health problem endangering health and property in China. Predicting the trend and the seasonality of human brucellosis is of great significance for its prevention. In this study, a comparison between the autoregressive integrated moving average (ARIMA) model and the eXtreme Gradient Boosting (XGBoost) model was conducted to determine which was more suitable for predicting the occurrence of brucellosis in mainland China. Design: Time-series study. Setting: Mainland China. Methods: Data on human brucellosis in mainland China were provided by the National Health and Family Planning Commission of China. The data were divided into a training set and a test set. The training set was composed of the monthly incidence of human brucellosis in mainland China from January 2008 to June 2018, and the test set was composed of the monthly incidence from July 2018 to June 2019. The mean absolute error (MAE), root mean square error (RMSE) and mean absolute percentage error (MAPE) were used to evaluate the effects of model fitting and prediction. Results: The number of human brucellosis patients in mainland China increased from 30 002 in 2008 to 40 328 in 2018. There was an increasing trend and obvious seasonal distribution in the original time series. For the training set, the MAE, RMSE and MAPE of the ARIMA(0,1,1)×(0,1,1)₁₂ model were 338.867, 450.223 and 10.323, respectively, and the MAE, RMSE and MAPE of the XGBoost model were 189.332, 262.458 and 4.475, respectively. For the test set, the MAE, RMSE and MAPE of the ARIMA(0,1,1)×(0,1,1)₁₂ model were 529.406, 586.059 and 17.676, respectively, and the MAE, RMSE and MAPE of the XGBoost model were 249.307, 280.645 and 7.643, respectively. Conclusions: The performance of the XGBoost model was better than that of the ARIMA model. The XGBoost model is more suitable for predicting cases of human brucellosis in mainland China.
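The three error measures named in this abstract can be written out as a short sketch. MAPE is expressed in percent here, which is an assumption about the paper's convention; the sample values and function names are hypothetical.

```python
import math

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent; actual values must be nonzero."""
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical monthly case counts and forecasts, for illustration only.
actual = [100, 200, 400]
predicted = [110, 180, 400]

mae_value = mae(actual, predicted)
rmse_value = rmse(actual, predicted)
mape_value = mape(actual, predicted)
```

MAE and RMSE are in the units of the series (cases per month), while MAPE is relative, which is why it is the easiest of the three to compare across series of different scale.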

Sunspot Time Series Forecasting using Deep Learning

International Journal of Computer and Information Technology(2279-0764) ◽

10.24203/ijcit.v9i2.8 ◽

2020 ◽

Vol 9 (2) ◽

Author(s):

Mahmoud Elgamal

Keyword(s):

Time Series ◽

Deep Learning ◽

Solar Cycle ◽

Short Term Memory ◽

Time Series Forecasting ◽

Training Set ◽

Short Term ◽

Test Set ◽

Term Memory ◽

Long Short Term Memory

To forecast solar cycle 25, sunspot numbers (SSN) from 1700 to 2018 were used as a time series to predict the next eleven years. A deep long short-term memory (LSTM) network was used for the forecast. First, the dataset was split into a training set (80%) and a test set (20%); the accuracy achieved led us to forecast the next eleven years. The result shows that the cycle will run from 2019 to 2029, with a peak in 2024.

PREDIKSI KUALITAS AIR SUNGAI CILIWUNG DENGAN MENGGUNAKAN ALGORITMA POHON KEPUTUSAN

Jurnal Air Indonesia ◽

10.29122/jai.v12i2.4364 ◽

2021 ◽

Vol 12 (2) ◽

Author(s):

Mohammad Haekal ◽

Henki Bayu Seta ◽

Mayanda Mega Santoni

Keyword(s):

Data Mining ◽

Decision Tree ◽

Cross Validation ◽

Online Monitoring ◽

Training Set ◽

Microsoft Excel ◽

Test Set

To predict the water quality of the Ciliwung River, data from online monitoring were processed using data mining. In this method, the monitoring data were first tabulated in Microsoft Excel and then processed into a decision tree (the Decision Tree algorithm) using the WEKA application. The decision tree method was chosen because it is simpler, easy to understand, and has a very high level of accuracy. A total of 5,476 Ciliwung River water quality monitoring records were processed. Classification with the decision tree indicated that 1,059 of these 5,476 records (19.3242%) were Not Polluted and 4,417 records (80.6758%) were Polluted. The monitoring data were then evaluated using four test options: Use Training Set, Supplied Test Set, Cross-Validation with 10 folds, and Percentage Split 66%. All four test options showed very high accuracy, above 99%. From these results it can be predicted that the Ciliwung River is a polluted river with reference to Government Regulation of the Republic of Indonesia Number 82 of 2001, and that using the WEKA application with the Decision Tree algorithm to process monitoring data on three parameters (pH, DO, and nitrate) is highly accurate and appropriate. Keywords: river water quality, Data Mining, Decision Tree algorithm, WEKA application.
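Two of the test options listed in this abstract, 10-fold cross-validation and a 66% percentage split, can be sketched in plain Python as index partitions. This is not WEKA's implementation, which may additionally shuffle or stratify the data; the function names are hypothetical, and only the record count of 5,476 comes from the abstract.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs; every sample is tested exactly once."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size

def percentage_split(n_samples, train_percent=66):
    """Use the first train_percent of the samples for training, the rest for testing."""
    cut = round(n_samples * train_percent / 100)
    return list(range(cut)), list(range(cut, n_samples))

folds = list(k_fold_indices(5476, k=10))
split_train, split_test = percentage_split(5476, train_percent=66)
```

Evaluating on the training set itself (the "Use Training Set" option) tends to give optimistic accuracy, which is why the held-out options are usually the more informative ones.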

QSPR modelling of the octanol/water partition coefficient of organometallic substances by optimal SMILES-based descriptors

Open Chemistry ◽

10.2478/s11532-009-0095-y ◽

2009 ◽

Vol 7 (4) ◽

pp. 846-856 ◽

Cited By ~ 6

Author(s):

Andrey Toropov ◽

Alla Toropova ◽

Emilio Benfenati

Keyword(s):

Partition Coefficient ◽

Organometallic Compounds ◽

Applicability Domain ◽

Training Set ◽

Input Line ◽

Test Set ◽

Water Partition Coefficient ◽

Definition Of

Abstract. QSPR is not usually used to model organometallic compounds. We have modeled the octanol/water partition coefficient for organometallic compounds of Na, K, Ca, Cu, Fe, Zn, Ni, As, and Hg by optimal descriptors calculated with simplified molecular input line entry system (SMILES) notations. The best model is characterized by the following statistics: n = 54, r² = 0.9807, s = 0.677, F = 2636 (training set); n = 26, r² = 0.9693, s = 0.969, F = 759 (test set). Empirical criteria for the definition of the applicability domain for these models are discussed.

Feature-Weighted Sampling for Proper Evaluation of Classification Models

Applied Sciences ◽

10.3390/app11052039 ◽

2021 ◽

Vol 11 (5) ◽

pp. 2039

Author(s):

Hyunseok Shin ◽

Sejong Oh

Keyword(s):

Random Sampling ◽

Sampling Method ◽

Classification Model ◽

Training Set ◽

Test Set ◽

Feature Importance ◽

Proper Training ◽

Machine Learning Applications ◽

Test Sets ◽

The Given

In machine learning applications, classification schemes have been widely used for prediction tasks. Typically, to develop a prediction model, the given dataset is divided into training and test sets; the training set is used to build the model and the test set is used to evaluate the model. Furthermore, random sampling is traditionally used to divide datasets. The problem, however, is that the performance of the model is evaluated differently depending on how we divide the training and test sets. Therefore, in this study, we proposed an improved sampling method for the accurate evaluation of a classification model. We first generated numerous candidate cases of train/test sets using the R-value-based sampling method. We evaluated the similarity of distributions of the candidate cases with the whole dataset, and the case with the smallest distribution–difference was selected as the final train/test set. Histograms and feature importance were used to evaluate the similarity of distributions. The proposed method produces more proper training and test sets than previous sampling methods, including random and non-random sampling.
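The selection idea in this abstract, generating many candidate splits and keeping the one whose distribution is closest to the whole dataset, can be sketched as below. This is a simplified stand-in under stated assumptions: candidates are drawn by plain random shuffling rather than the paper's R-value-based sampling, only a single numeric feature is compared, and no feature-importance weighting is applied. All names and data are hypothetical.

```python
import random

def histogram(values, lo, hi, n_bins=5):
    """Relative frequency of values in n_bins equal-width bins over [lo, hi]."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        idx = min(int((v - lo) / width), n_bins - 1)  # clamp the maximum into the last bin
        counts[idx] += 1
    return [c / len(values) for c in counts]

def distribution_difference(subset, whole, n_bins=5):
    """Sum of absolute differences between the subset and whole-dataset histograms."""
    lo, hi = min(whole), max(whole)
    h_sub = histogram(subset, lo, hi, n_bins)
    h_all = histogram(whole, lo, hi, n_bins)
    return sum(abs(a - b) for a, b in zip(h_sub, h_all))

def best_split(values, test_fraction=0.2, n_candidates=50, seed=0):
    """Return (score, train, test) for the candidate split closest to the whole dataset."""
    rng = random.Random(seed)
    n_test = int(len(values) * test_fraction)
    best = None
    for _ in range(n_candidates):
        shuffled = values[:]
        rng.shuffle(shuffled)
        test, train = shuffled[:n_test], shuffled[n_test:]
        score = distribution_difference(train, values) + distribution_difference(test, values)
        if best is None or score < best[0]:
            best = (score, train, test)
    return best

# Hypothetical single-feature dataset, for illustration only.
data = [float(i % 17) for i in range(100)]
score, train, test = best_split(data)
```

Fixing the seed makes the chosen split reproducible, which matters when the point of the method is to make evaluation less dependent on one arbitrary division.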

Weakly supervised deep learning for determining the prognostic value of 18F-FDG PET/CT in extranodal natural killer/T cell lymphoma, nasal type

European Journal of Nuclear Medicine and Molecular Imaging ◽

10.1007/s00259-021-05232-3 ◽

2021 ◽

Author(s):

Rui Guo ◽

Xiaobin Hu ◽

Haoming Song ◽

Pengpeng Xu ◽

Haoping Xu ◽

...

Keyword(s):

Deep Learning ◽

Fdg Pet ◽

Cell Lymphoma ◽

Training Set ◽

Test Set ◽

Natural Killer T Cell ◽

Pet Ct ◽

Weakly Supervised ◽

Fdg Pet Ct ◽

Killer T Cell

Abstract. Purpose: To develop a weakly supervised deep learning (WSDL) method that could utilize incomplete/missing survival data to predict the prognosis of extranodal natural killer/T cell lymphoma, nasal type (ENKTL) based on pretreatment 18F-FDG PET/CT results. Methods: One hundred and sixty-seven patients with ENKTL who underwent pretreatment 18F-FDG PET/CT were retrospectively collected. Eighty-four patients were followed up for at least 2 years (training set = 64, test set = 20). A WSDL method was developed to enable the integration of the remaining 83 patients with incomplete/missing follow-up information in the training set. To test generalization, these data were derived from three types of scanners. Prediction similarity index (PSI) was derived from deep learning features of images. Its discriminative ability was calculated and compared with that of a conventional deep learning (CDL) method. Univariate and multivariate analyses helped explore the significance of PSI and clinical features. Results: PSI achieved area under the curve scores of 0.9858 and 0.9946 (training set) and 0.8750 and 0.7344 (test set) in the prediction of progression-free survival (PFS) with the WSDL and CDL methods, respectively. PSI threshold of 1.0 could significantly differentiate the prognosis. In the test set, WSDL and CDL achieved prediction sensitivity, specificity, and accuracy of 87.50% and 62.50%, 83.33% and 83.33%, and 85.00% and 75.00%, respectively. Multivariate analysis confirmed PSI to be an independent significant predictor of PFS in both the methods. Conclusion: The WSDL-based framework was more effective for extracting 18F-FDG PET/CT features and predicting the prognosis of ENKTL than the CDL method.
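The sensitivity, specificity, and accuracy figures in this abstract follow from a confusion matrix, as the sketch below shows. The counts are illustrative values chosen to be consistent with the reported WSDL percentages on a 20-patient test set; the abstract does not give the actual confusion matrix, so they should be read as an assumption.

```python
def sensitivity(tp, fn):
    """True positive rate: share of actual positives that were predicted positive."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """True negative rate: share of actual negatives that were predicted negative."""
    return tn / (tn + fp)

def accuracy(tp, tn, fp, fn):
    """Share of all cases that were predicted correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

# Hypothetical counts for a 20-patient test set (not taken from the paper).
tp, fn, tn, fp = 7, 1, 10, 2

sens = sensitivity(tp, fn)        # 7 / 8
spec = specificity(tn, fp)        # 10 / 12
acc = accuracy(tp, tn, fp, fn)    # 17 / 20
```

With a test set this small, a single reclassified patient shifts sensitivity by 12.5 percentage points, so differences between methods should be read with that granularity in mind.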

Liquid State Machine to Generate the Movement Profiles for the Gait Cycle of a 6 DOF Bipedal Robot in a Sagittal Plane

Volume 5B: 42nd Mechanisms and Robotics Conference ◽

10.1115/detc2018-86206 ◽

2018 ◽

Author(s):

Jesús Franco-Robles ◽

Alejandro De Lucio-Rangel ◽

Karla A. Camarillo-Gómez ◽

Gerardo I. Pérez-Soto ◽

Jesús Rivera-Guillén

Keyword(s):

Time Series ◽

Multilayer Perceptron ◽

Liquid State ◽

Gait Cycle ◽

Sagittal Plane ◽

Experimental Result ◽

Forward Kinematics ◽

Neuronal System ◽

Bipedal Robot ◽

Input Time

This paper presents a neuronal system able to generate motion profiles and ZMP profiles for a 6DoF bipedal robot in the sagittal plane. The input time series for LSM training are movement profiles of the oscillating foot trajectory, obtained by forward kinematics performed by a previously trained ANN multilayer perceptron. The target movement profiles for training are acquired from the analysis of the human walk. Based on a previous simulation of the bipedal robot, a target ZMP profile is generated for the y-axis and another for the z-axis to characterize its behavior during the training walk. As an experimental result, the LSM generates new motion and ZMP profiles when given a trajectory different from the one with which it was trained. With the LSM it will be possible to propose new trajectories of the oscillating foot and to know, from the ZMP, whether a trajectory will be stable and what movement profile each articulation requires to follow it.

Prediction of the Toxicity of Binary Mixtures by QSAR Approach Using the Hypothetical Descriptors

International Journal of Molecular Sciences ◽

10.3390/ijms19113423 ◽

2018 ◽

Vol 19 (11) ◽

pp. 3423 ◽

Cited By ~ 12

Author(s):

Ting Wang ◽

Lili Tang ◽

Feng Luan ◽

M. Natália D. S. Cordeiro

Keyword(s):

Correlation Coefficient ◽

Binary Mixtures ◽

Quantitative Structure Activity Relationship ◽

Training Set ◽

Statistical Parameters ◽

Test Set ◽

Qsar Models ◽

Forward Stepwise ◽

Leave One Out ◽

External Test

Organic compounds are often exposed to the environment, and they adversely affect the environment and human health as mixtures rather than as single chemicals. In this paper, we aim to establish reliable classical quantitative structure–activity relationship (QSAR) models to evaluate the toxicity of 99 binary mixtures. The QSAR models were built by forward stepwise multiple linear regression (MLR) and by nonlinear radial basis function neural networks (RBFNNs), both using the hypothetical descriptors. The statistical parameters of the MLR model were N (number of compounds in training set) = 79, R² (the correlation coefficient between the predicted and observed activities) = 0.869, LOOq² (leave-one-out correlation coefficient) = 0.864, F (Fisher's test) = 165.494, and RMS (root mean square) = 0.599 for the training set, and Next (number of compounds in external test set) = 20, R² = 0.853, qext² (leave-one-out correlation coefficient for test set) = 0.825, F = 30.861, and RMS = 0.691 for the external test set. The RBFNN model gave N = 79, R² = 0.925, LOOq² = 0.924, F = 950.686, and RMS = 0.447 for the training set, and Next = 20, R² = 0.896, qext² = 0.890, F = 155.424, and RMS = 0.547 for the external test set. Both the MLR and RBFNN models were evaluated with several statistical parameters and methods. The results confirm that the models are acceptable and can be used to predict the toxicity of the binary mixtures.

Identification of Multi-omics Biomarkers and Construction of the Novel Prognostic Model for Hepatocellular Carcinoma

10.21203/rs.3.rs-452644/v1 ◽

2021 ◽

Author(s):

Xiaokai Yan ◽

Chiying Xiao ◽

Kunyan Yue ◽

Min Chen ◽

Hang Zhou

Keyword(s):

Hepatocellular Carcinoma ◽

Survival Analysis ◽

Prognostic Model ◽

Prognostic Models ◽

Prognostic Indicators ◽

Omics Data ◽

Training Set ◽

Test Set ◽

Model Based ◽

Cox Analysis

Abstract. Background: Changes in the genome play a crucial role in carcinogenesis, and many biomarkers can be used as effective prognostic indicators in diverse tumors. Although many studies have constructed predictive models for hepatocellular carcinoma (HCC) based on molecular signatures, their performance is unsatisfactory. To address this shortcoming, we aim to construct a novel and accurate prognostic model with multi-omics data to guide prognostic assessments of HCC. Methods: The TCGA training set was used to identify crucial biomarkers and construct single-omic prognostic models through difference analysis, univariate Cox, and LASSO/stepwise Cox analysis. The performance of the single-omic models was then evaluated and validated through survival analysis, Harrell's concordance index (C-index), and receiver operating characteristic (ROC) curves in the TCGA test set and external cohorts. In addition, a comprehensive model based on multi-omics data was constructed via multiple Cox analysis, and its performance was evaluated in the TCGA training set and TCGA test set. Results: We identified 16 key mRNAs, 20 key lncRNAs, 5 key miRNAs, 5 key CNV genes, and 7 key SNPs that were significantly associated with the prognosis of HCC, and constructed 5 single-omic models that showed relatively good performance in prognostic prediction, with C-index ranging from 0.63 to 0.75 in the TCGA training set and test set. We also validated the mRNA model and the SNP model in two independent external datasets respectively, and good discriminating abilities were observed through survival analysis (P < 0.05). Moreover, the multi-omics model based on mRNA, lncRNA, miRNA, CNV, and SNP information showed strong predictive ability, with C-index over 0.80 and all AUC values at 1, 3, and 5 years above 0.84. Conclusion: In this study, we identified many biomarkers that may help study underlying carcinogenesis mechanisms in HCC, and constructed five single-omic models and an integrated multi-omics model that may provide effective and reliable guides for prognosis assessment and treatment decision-making.
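Harrell's concordance index, used in this abstract to score the prognostic models, can be sketched in a few lines. This is a minimal version that skips pairs with tied survival times and counts tied risk scores as half concordant; the patient data and names are hypothetical, and the paper's own software may handle ties differently.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index: share of comparable pairs ranked correctly by risk score.

    A pair (i, j) is comparable when patient i has the shorter time and
    experienced the event (events[i] is truthy). It is concordant when the
    patient with the shorter time also has the higher predicted risk.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Hypothetical survival times (months), event flags (1 = event, 0 = censored),
# and model risk scores, for illustration only.
times = [1, 2, 3, 4]
events = [1, 1, 0, 1]
risks = [4.0, 3.0, 2.0, 1.0]

c_perfect = concordance_index(times, events, risks)
c_reversed = concordance_index(times, events, risks[::-1])
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for the 0.63 to 0.75 range of the single-omic models and the value above 0.80 for the multi-omics model.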
