IOL formula constants – strategies for optimization and defining standards for presenting data

2021 ◽  
Author(s):  
Achim Langenbucher ◽  
Nóra Szentmáry ◽  
Alan Cayless ◽  
Michael Müller ◽  
Timo Eppig ◽  
...  

Purpose: To present strategies for optimizing lens power formula constants and to show options for presenting the results adequately. Methods: A dataset of N=1601 preoperative biometric values, lens power data and postoperative refraction data was split into a training set and a test set using a random sequence. Based on the training set we calculated the formula constants for established lens calculation formulae with different methods. Based on the test set we derived the formula prediction error as the difference between the achieved refraction and the formula predicted refraction. Results: For formulae with 1 constant it is possible to back-calculate the individual constant for each case using formula inversion. However, this is not possible for formulae with more than 1 constant. In these cases, more advanced concepts such as nonlinear optimization strategies are necessary to derive the formula constants. During cross-validation, measures such as the mean absolute or the root mean squared prediction error or the ratio of cases within mean absolute prediction error limits could be used as quality measures. Conclusions: Different constant optimization concepts yield different results. To test the performance of optimized formula constants a cross-validation strategy is mandatory. We recommend performance curves, where the ratio of cases within absolute prediction error limits is plotted against the mean absolute prediction error.
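
As a rough illustration of the quality measure mentioned above, the following Python sketch tabulates the ratio of cases whose absolute prediction error lies within a series of limits. The function names, the choice of limits, and the sample errors (in dioptres) are hypothetical and are not taken from the paper; it is one possible reading of a "performance curve", not the authors' implementation.

```python
def fraction_within(errors, limit):
    """Ratio of cases whose absolute prediction error is <= limit."""
    return sum(1 for e in errors if abs(e) <= limit) / len(errors)


def performance_curve(errors, limits):
    """Pair each absolute-error limit with the ratio of cases inside it."""
    return [(limit, fraction_within(errors, limit)) for limit in limits]


# Hypothetical test-set prediction errors (achieved minus predicted refraction).
errors = [-0.75, -0.25, 0.0, 0.25, 0.5, 1.0]

for limit, ratio in performance_curve(errors, [0.25, 0.5, 1.0]):
    print(f"within {limit:.2f} dpt: {ratio:.1%}")
```

Plotting the resulting pairs gives a curve that rises toward 1 as the limit grows; a formula with better constants would be expected to reach a given ratio at a smaller limit.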

2021 ◽  
Vol 12 (2) ◽  
Author(s):  
Mohammad Haekal ◽  
Henki Bayu Seta ◽  
Mayanda Mega Santoni

To predict the water quality of the Ciliwung River, data from online monitoring were processed using data mining methods. The monitoring data were first compiled into a Microsoft Excel table and then used to build a decision tree with the Decision Tree algorithm in the WEKA application. The decision tree method was chosen because it is simple, easy to understand, and highly accurate. A total of 5,476 Ciliwung River water quality monitoring records were processed. According to the decision tree classification, 1,059 of these 5,476 records (19.3242%) indicated that the Ciliwung River was Not Polluted, and 4,417 records (80.6758%) indicated that it was Polluted. The monitoring data were then evaluated using four test options: Use Training Set, Supplied Test Set, Cross-Validation with 10 folds, and Percentage Split at 66%. All four test options showed very high accuracy, above 99%. From these results it can be predicted that the Ciliwung River is a polluted river with reference to Government Regulation of the Republic of Indonesia No. 82 of 2001, and it was also found that using the WEKA application with the Decision Tree algorithm to process monitoring data on three parameters (pH, DO, and nitrate) is highly accurate and appropriate. Keywords: river water quality, data mining, Decision Tree algorithm, WEKA application.


2018 ◽  
Vol 6 (1) ◽  
pp. 1
Author(s):  
Qomariyatul Hasanah ◽  
Anang Andrianto ◽  
Muhammad Arief Hidayat

An information system for posyandu (integrated community health posts) serving pregnant women can manage maternal health data related to pregnancy risk factors. Midwives use the pregnancy risk factors defined by the Poedji Rochyati Score Card (Kartu Skor Poedji Rochyati, KSPR) to determine pregnancy risk by assigning a score to each parameter. A weakness of the KSPR is that it cannot score parameters whose status is uncertain, so anything not yet known with certainty is treated as not having occurred. Reading data patterns, a concept adopted from data mining, with the naïve Bayes classification method can serve as an alternative that addresses this weakness by classifying pregnancy risk. The naïve Bayes method computes the probability of given parameters from data of a previous period designated as training data; from these calculations, pregnancy risk can be determined accurately according to the parameters that are known. The naïve Bayes method was chosen because its accuracy is fairly high compared with other classification methods. The information system is web-based so that it can be accessed easily by posyandu in different locations. The system was developed following the Waterfall model. It was designed and built with three (3) access roles, namely admin, midwife, and cadre, each with features that make the system easier to use. The result of this research is a posyandu information system for pregnant women that applies pregnancy risk classification using the naïve Bayes method, with an accuracy of 53.913% using 17 attributes, 54.348% using 19 attributes, 54.783% using 21 attributes, and 56.957% using 22 attributes. Classification accuracy was obtained with Ten-Fold Cross Validation, in which the training set is divided into 10 groups; when group 1 serves as the test set, groups 2 to 10 serve as the training set.
Keywords: Posyandu, Pregnancy Risk, Waterfall, Data Mining, Classification, Naïve Bayes
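
The ten-fold procedure described in the abstract can be sketched in Python as follows. This is a minimal illustration using only the standard library; the function name `k_fold_indices`, the shuffling, and the seed are assumptions for the example and are not details reported by the authors.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists; each fold is the test set exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle so folds are not ordered by record
    folds = [indices[i::k] for i in range(k)]  # k near-equal groups
    for i in range(k):
        test = folds[i]
        # All remaining groups together form the training set for this round.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Usage: 23 hypothetical records split into 10 folds.
for train, test in k_fold_indices(23, k=10):
    print(len(train), len(test))
```

The reported accuracy would then be the average of the per-fold accuracies, with the classifier retrained on each training set.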


2019 ◽  
Vol 26 (3) ◽  
pp. 543-548
Author(s):  
Toshihisa Nakashima ◽  
Takayuki Ohno ◽  
Keiichi Koido ◽  
Hironobu Hashimoto ◽  
Hiroyuki Terakado

Background In cancer patients treated with vancomycin, therapeutic drug monitoring is currently performed by the Bayesian method that involves estimating individual pharmacokinetics from population pharmacokinetic parameters and trough concentrations rather than the Sawchuk–Zaske method using peak and trough concentrations. Although the presence of malignancy influences the pharmacokinetic parameters of vancomycin, it is unclear whether cancer patients were included in the Japanese patient populations employed to estimate population pharmacokinetic parameters for this drug. The difference of predictive accuracy between the Sawchuk–Zaske and Bayesian methods in Japanese cancer patients is not completely understood. Objective To retrospectively compare the accuracy of predicting vancomycin concentrations between the Sawchuk–Zaske method and the Bayesian method in Japanese cancer patients. Methods Using data from 48 patients with various malignancies, the predictive accuracy (bias) and precision of the two methods were assessed by calculating the mean prediction error, the mean absolute prediction error, and the root mean squared prediction error. Results Prediction of the trough and peak vancomycin concentrations by the Sawchuk–Zaske method and the peak concentration by the Bayesian method showed a bias toward low values according to the mean prediction error. However, there were no significant differences between the two methods with regard to the changes of the mean prediction error, mean absolute prediction error, and root mean squared prediction error. Conclusion The Sawchuk–Zaske method and Bayesian method showed similar accuracy for predicting vancomycin concentrations in Japanese cancer patients.
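
The three measures named in the Methods can be computed as in the sketch below. The sign convention (predicted minus observed, so a negative mean prediction error means predictions run low) is an assumption, as are the function name and the sample concentrations; the abstract does not state these details.

```python
import math


def prediction_error_metrics(predicted, observed):
    """Return mean, mean absolute, and root mean squared prediction error."""
    errors = [p - o for p, o in zip(predicted, observed)]  # assumed sign convention
    n = len(errors)
    return {
        "me": sum(errors) / n,                             # bias
        "mae": sum(abs(e) for e in errors) / n,            # precision (absolute)
        "rmse": math.sqrt(sum(e * e for e in errors) / n), # precision (squared)
    }


# Hypothetical predicted and measured concentrations (same units).
metrics = prediction_error_metrics([10.0, 12.0, 18.0], [12.0, 12.0, 20.0])
print(metrics)
```

A mean prediction error near zero with a large mean absolute error would indicate an unbiased but imprecise method, which is why the measures are reported together.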


Molecules ◽  
2018 ◽  
Vol 23 (12) ◽  
pp. 3271 ◽  
Author(s):  
Imane Naboulsi ◽  
Aziz Aboulmouhajir ◽  
Lamfeddal Kouisni ◽  
Faouzi Bekkaoui ◽  
Abdelaziz Yasri

Lyn kinase, a member of the Src family of protein tyrosine kinases, is mainly expressed by various hematopoietic cells and by neural and adipose tissues. Abnormal Lyn kinase regulation causes various diseases such as cancers. Thus, Lyn represents a potential target for developing new antitumor drugs. In the present study, using 176 molecules (123 training set molecules and 53 test set molecules) known for their inhibitory activities (IC50) against Lyn kinase, we constructed predictive models by linking their physico-chemical parameters (descriptors) to their biological activity. The models were derived using two different methods: the generalized linear model (GLM) and the artificial neural network (ANN). The ANN model provided the best prediction precision, with a squared correlation coefficient R2 = 0.92 and a root mean square error RMSE = 0.29. It was able to extrapolate to the test set successfully (R2 = 0.91 and RMSE = 0.33). In a second step, we analyzed the descriptors used within the models as well as the structural features of the molecules in the training set. This analysis resulted in a transparent and informative SAR map that can be very useful for medicinal chemists designing new Lyn kinase inhibitors.


2013 ◽  
Vol 655-657 ◽  
pp. 963-968
Author(s):  
Yan Feng Zhang ◽  
Ting Ting Li

C4.5, Bayesian network and Sequential Minimal Optimization (SMO) are three typical classification algorithms in data mining. Using 10-fold cross-validation, we obtain experimental analysis and calculation results for the three classification algorithms on the same training set and test set. The main metrics include accuracy, precision, speed, robustness, scalability and comprehensibility, which we illustrate using margin curves. The results provide a theoretical and experimental basis for users to select a proper classification algorithm for training sets that differ in quality and size.


2017 ◽  
Author(s):  
Shayan Tabe-Bordbar ◽  
Amin Emad ◽  
Sihai Dave Zhao ◽  
Saurabh Sinha

Abstract Cross-validation (CV) is a technique to assess the generalizability of a model to unseen data. This technique relies on assumptions that may not be satisfied when studying genomics datasets. For example, random CV (RCV) assumes that a randomly selected set of samples, the test set, well represents unseen data. This assumption does not hold true where samples are obtained from different experimental conditions, and the goal is to learn regulatory relationships among the genes that generalize beyond the observed conditions. In this study, we investigated how the CV procedure affects the assessment of methods used to learn gene regulatory networks. We compared the performance of a regression-based method for gene expression prediction, estimated using RCV, with that estimated using a clustering-based CV (CCV) procedure. Our analysis illustrates that RCV can produce over-optimistic estimates of generalizability of the model compared to CCV. Next, we defined the ‘distinctness’ of a test set from a training set and showed that this measure is predictive of the performance of the regression method. Finally, we introduced a simulated annealing method to construct partitions with gradually increasing distinctness and showed that performance of different gene expression prediction methods can be better evaluated using this method.
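
To make the contrast with random CV concrete, the sketch below builds folds in which whole clusters of samples are held out together, so no cluster contributes to both the training and the test set. It assumes cluster labels have already been computed by some clustering step; the function name and the greedy fold-balancing rule are illustrative choices, not the procedure used in the paper.

```python
from collections import defaultdict


def clustered_cv_splits(cluster_labels, n_folds):
    """Yield (train, test) index lists where each cluster stays in one fold."""
    members = defaultdict(list)
    for idx, label in enumerate(cluster_labels):
        members[label].append(idx)

    # Greedy balancing: place the largest clusters first into the smallest fold.
    folds = [[] for _ in range(n_folds)]
    for label in sorted(members, key=lambda c: (-len(members[c]), str(c))):
        smallest = min(range(n_folds), key=lambda f: len(folds[f]))
        folds[smallest].extend(members[label])

    for i in range(n_folds):
        test = sorted(folds[i])
        train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        yield train, test


# Usage with hypothetical cluster labels for eight samples.
labels = ["a", "a", "b", "b", "b", "c", "d", "d"]
for train, test in clustered_cv_splits(labels, n_folds=2):
    print(train, test)
```

If `n_folds` exceeds the number of clusters, some folds will be empty, so the caller would need to choose the fold count accordingly.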


Author(s):  
Ahmet Alptekin ◽  
Olcay Kursun

Leave-one-out (LOO) and its generalization, K-Fold, are among the most well-known cross-validation methods. They divide the sample into many folds, each of which is, in turn, left out for testing, while the other parts are used for training. In this study, as an extension of this idea, we propose a new cross-validation approach, which we call miss-one-out (MOO), that mislabels the example(s) in each fold and keeps this fold in the training set as well, rather than leaving it out as LOO does. Then, MOO tests whether the trained classifier can correct the erroneous label of the training sample. In principle, having only one fold deliberately labeled incorrectly should have only a small effect on the classifier that uses this bad fold along with K - 1 good folds, and can be utilized as a generalization measure of the classifier. Experimental results on a number of benchmark datasets and three real bioinformatics datasets show that MOO can better estimate the test set accuracy of the classifier.
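
The idea can be sketched for folds of size one and binary labels as below. The nearest-centroid classifier, the 0/1 label encoding, and all function names are assumptions chosen to keep the example short; the abstract does not say which classifiers or fold sizes the authors used.

```python
def nearest_centroid_fit(X, y):
    """Compute one mean vector per class label."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for d, value in enumerate(x):
            acc[d] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def nearest_centroid_predict(centroids, x):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(centroids[label], x)),
    )


def miss_one_out_score(X, y):
    """Fraction of samples whose flipped label the retrained classifier corrects."""
    corrected = 0
    for i in range(len(X)):
        y_noisy = list(y)
        y_noisy[i] = 1 - y[i]  # mislabel sample i but keep it in the training set
        centroids = nearest_centroid_fit(X, y_noisy)
        if nearest_centroid_predict(centroids, X[i]) == y[i]:
            corrected += 1
    return corrected / len(X)


# Hypothetical one-dimensional data with two well-separated classes.
X = [[0.0], [0.2], [0.4], [3.0], [3.2], [3.4]]
y = [0, 0, 0, 1, 1, 1]
print(miss_one_out_score(X, y))
```

A classifier that memorizes its training points, such as 1-nearest-neighbour, would always reproduce the flipped label, which is why a model that averages over samples is used here.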


2021 ◽  
Author(s):  
Zhilong Yi ◽  
Siqi Hu ◽  
Xiaofeng Lin ◽  
Qiong Zou ◽  
MinHong Zou ◽  
...  

Abstract Purpose 68Ga-PSMA PET/CT has high specificity and sensitivity for the detection of both intraprostatic tumor focal lesions and metastasis. However, approximately 10% of primary prostate cancers are invisible on PSMA-PET (exhibit no or minimal uptake). In this work, we investigated whether machine learning-based radiomics models derived from PSMA-PET images could predict invisible intraprostatic lesions on 68Ga-PSMA-11 PET in patients with primary prostate cancer. Methods In this retrospective study, patients with or without prostate cancer who underwent 68Ga-PSMA PET/CT and had negative PSMA-PET images at either of two different institutions were included: institution 1 (between 2017 and 2020) for the training set and institution 2 (between 2019 and 2020) for the external test set. Three random forest (RF) models were built using selected features extracted from standard PET images, delayed PET images, and both standard and delayed PET images. Then, 10-fold cross-validation was performed. In the test phase, the three RF models and PSA density (PSAD, cut-off value: 0.15 ng/ml/ml) were tested with the external test set. The area under the receiver operating characteristic curve (AUC) was calculated for the models and PSAD. The AUCs of the radiomics models and PSAD were compared. Results A total of 64 patients (39 with prostate cancer and 25 with benign prostate disease) were in the training set, and 36 (21 with prostate cancer and 15 with benign prostate disease) were in the test set. The average AUCs of the three RF models from 10-fold cross-validation were 0.87 (95% CI: 0.72, 1.00), 0.86 (95% CI: 0.63, 1.00) and 0.91 (95% CI: 0.69, 1.00), respectively. In the test set, the AUCs of the three trained RF models and PSAD were 0.903 (95% CI: 0.830, 0.975), 0.856 (95% CI: 0.748, 0.964), 0.925 (95% CI: 0.838, 1.00), and 0.662 (95% CI: 0.510, 0.813). The AUCs of the three radiomics models were higher than that of PSAD (0.903, 0.856 and 0.925 vs 0.662, respectively; P = .007, P = .045 and P = .005, respectively). Conclusion Random forest models developed from 68Ga-PSMA-11 PET-based radiomics features proved useful for accurate prediction of invisible intraprostatic lesions on 68Ga-PSMA-11 PET in patients with primary prostate cancer and showed better diagnostic performance than PSAD.
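
For readers unfamiliar with the AUC used to compare the models above, the sketch below computes it as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, counting ties as one half. The function name and the sample scores are hypothetical, and the sketch does not reproduce the confidence intervals or significance tests reported in the abstract.

```python
def roc_auc(scores, labels):
    """AUC via pairwise comparison of positive (1) and negative (0) cases."""
    positives = [s for s, label in zip(scores, labels) if label == 1]
    negatives = [s for s, label in zip(scores, labels) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


# Hypothetical model scores for three cancer and three benign cases.
scores = [0.9, 0.8, 0.35, 0.6, 0.4, 0.1]
labels = [1, 1, 1, 0, 0, 0]
print(round(roc_auc(scores, labels), 3))
```

The pairwise form is quadratic in the number of cases, which is acceptable for cohorts of the size described here; a rank-based computation would be preferable for large datasets.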


PLoS ONE ◽  
2021 ◽  
Vol 16 (6) ◽  
pp. e0252102
Author(s):  
Achim Langenbucher ◽  
Nóra Szentmáry ◽  
Alan Cayless ◽  
Johannes Weisensee ◽  
Ekkehard Fabian ◽  
...  

Background To explain the concept of the Castrop lens power calculation formula and show the application and results from a large dataset compared to classical formulae. Methods The Castrop vergence formula is based on a pseudophakic model eye with 4 refractive surfaces. This was compared against the SRKT, Hoffer-Q, Holladay1, simplified Haigis with 1 optimized constant and Haigis formula with 3 optimized constants. A large dataset of preoperative biometric values, lens power data and postoperative refraction data was split into training and test sets. The training data were used for formula constant optimization, and the test data for cross-validation. Constant optimization was performed for all formulae using nonlinear optimization, minimising root mean squared prediction error. Results The constants for all formulae were derived with the Levenberg-Marquardt algorithm. Applying these constants to the test data, the Castrop formula showed a slightly better performance compared to the classical formulae in terms of prediction error and absolute prediction error. Using the Castrop formula, the standard deviation of the prediction error was lowest at 0.45 dpt, and 95% of all eyes in the test data were within the limit of 0.9 dpt of prediction error. Conclusion The calculation concept of the Castrop formula and one potential option for optimization of the 3 Castrop formula constants (C, H, and R) are presented. In a large dataset of 1452 data points the performance of the Castrop formula was slightly superior to the respective results of the classical formulae such as SRKT, Hoffer-Q, Holladay1 or Haigis.

