The effect of imperfect data on default prediction validation tests

2012 ◽  
Vol 6 (1) ◽  
pp. 77-96 ◽  
Author(s):  
Heather Russell ◽  
Douglas Dwyer ◽  
Qing Kang Tang
Author(s):  
Jiapeng Liu ◽  
Ting Hei Wan ◽  
Francesco Ciucci

Electrochemical impedance spectroscopy (EIS) is one of the most widely used experimental tools in electrochemistry and has applications ranging from energy storage and power generation to medicine. Considering the broad applicability of the EIS technique, it is critical to validate the EIS data against the Hilbert transform (HT) or, equivalently, the Kramers–Kronig relations. These mathematical relations allow one to assess the self-consistency of obtained spectra. However, the use of validation tests is still uncommon. In the present article, we aim at bridging this gap by reformulating the HT under a Bayesian framework. In particular, we developed the Bayesian Hilbert transform (BHT) method, which interprets the HT probabilistically. Leveraging the BHT, we proposed several scores that provide quick metrics for the evaluation of EIS data quality.


2018 ◽  
Vol 69 (7) ◽  
pp. 1830-1837
Author(s):  
Cristian Nicolescu ◽  
Alaxendru Pop ◽  
Alin Mihu ◽  
Luminita Pilat ◽  
Ovidiu Bedreag ◽  
...  

This article presents an observational randomized prospective study of 65 patients who underwent major surgical interventions in orthopedic surgery (total hip replacement) or general surgery (total colectomy). The level of albuminemia in these cases was determined before the surgical intervention, 6 h after the intervention and 24 h after the intervention. The plasma concentrations of the pro-inflammatory cytokines tumor necrosis factor-alpha (TNF-alpha) and interleukin 6 (IL-6) were measured at the same time as the plasma levels of albumin. Values of hemoglobin and hematocrit were determined 24 h after the surgical procedure in order to exclude hemodilution, which could lead to a drop in plasma albumin levels. After data collection, the statistical analysis consisted of descriptive statistics, correlation and comparison tests, and statistical validation tests. The results indicate that IL-6 plays a major role, compared with TNF-alpha, in the decrease of the plasma albumin level, and that the primary cause of hypoalbuminemia is therefore a hepatic acute-phase reaction. Increased permeability of the capillary wall under the action of TNF-alpha has a secondary role, but could lead to a faster decrease in plasma albumin in the first hours after the surgical procedure.


2020 ◽  
Vol 65 (7-8) ◽  
pp. 37-41
Author(s):  
E. N. Semenova ◽  
S. I. Kuleshova ◽  
E. I. Sakanyan

A turbidimetric method for the quantitative determination of streptomycin sulfate in medicines has been developed and validated. Based on the results of the experiments, it was found that the metrological characteristics of such validation parameters of the method as linearity, precision, and correctness do not exceed the validation criteria. Linearity was noted in the range of streptomycin concentrations from 3.75 to 8.43 μg/ml. The results of validation tests of the method for the quantitative determination of streptomycin indicate the prospects and feasibility of introducing the turbidimetric method into the domestic system for standardization and quality assessment of aminoglycoside antibiotics.


2019 ◽  
Vol 24 (34) ◽  
pp. 4013-4022 ◽  
Author(s):  
Xiang Cheng ◽  
Xuan Xiao ◽  
Kuo-Chen Chou

Knowledge of protein subcellular localization is vitally important for both basic research and drug development. With the avalanche of protein sequences emerging in the post-genomic age, it is highly desirable to develop computational tools for timely and effective identification of their subcellular localization based on the sequence information alone. Recently, a predictor called “pLoc-mPlant” was developed for identifying the subcellular localization of plant proteins. Its performance is overwhelmingly better than that of the other predictors for the same purpose, particularly in dealing with multi-label systems in which some proteins, called “multiplex proteins”, may simultaneously occur in two or more subcellular locations. Although it is indeed a very powerful predictor, more efforts are definitely needed to further improve it. This is because pLoc-mPlant was trained on an extremely skewed dataset in which some subsets (i.e., the protein numbers for some subcellular locations) were more than 10 times larger than the others. Accordingly, it cannot avoid the bias caused by such an uneven training dataset. To overcome this bias, we have developed a new and bias-free predictor called pLoc_bal-mPlant by balancing the training dataset. Cross-validation tests on exactly the same experiment-confirmed dataset have indicated that the proposed new predictor is remarkably superior to pLoc-mPlant, the existing state-of-the-art predictor in identifying the subcellular localization of plant proteins. To maximize the convenience for the majority of experimental scientists, a user-friendly web-server for the new predictor has been established at http://www.jci-bioinfo.cn/pLoc_bal-mPlant/, by which users can easily get their desired results without the need to go through the detailed mathematics.


2019 ◽  
Vol 15 (5) ◽  
pp. 472-485 ◽  
Author(s):  
Kuo-Chen Chou ◽  
Xiang Cheng ◽  
Xuan Xiao

Background/Objective: Information on protein subcellular localization is crucially important for both basic research and drug development. With the explosive growth of protein sequences discovered in the post-genomic age, there is high demand for powerful bioinformatics tools for timely and effective identification of their subcellular localization based on the sequence information alone. Recently, a predictor called “pLoc-mEuk” was developed for identifying the subcellular localization of eukaryotic proteins. Its performance is overwhelmingly better than that of the other predictors for the same purpose, particularly in dealing with multi-label systems where many proteins, called “multiplex proteins”, may simultaneously occur in two or more subcellular locations. Although it is indeed a very powerful predictor, more efforts are definitely needed to further improve it. This is because pLoc-mEuk was trained on an extremely skewed dataset where some subset was about 200 times the size of the other subsets. Accordingly, it cannot avoid the bias caused by such an uneven training dataset.
Methods: To alleviate such bias, we have developed a new predictor called pLoc_bal-mEuk by quasi-balancing the training dataset. Cross-validation tests on exactly the same experiment-confirmed dataset have indicated that the proposed new predictor is remarkably superior to pLoc-mEuk, the existing state-of-the-art predictor in identifying the subcellular localization of eukaryotic proteins. It has not escaped our notice that the quasi-balancing treatment can also be used to deal with many other biological systems.
Results: To maximize the convenience for most experimental scientists, a user-friendly web-server for the new predictor has been established at http://www.jci-bioinfo.cn/pLoc_bal-mEuk/.
Conclusion: It is anticipated that the pLoc_bal-mEuk predictor holds very high potential to become a useful high-throughput tool in identifying the subcellular localization of eukaryotic proteins, particularly for finding multi-target drugs, which is currently a very hot trend in drug development.


2021 ◽  
pp. 1-13
Author(s):  
Kai Zhuang ◽  
Sen Wu ◽  
Xiaonan Gao

To deal with the systematic risk of financial institutions and the rapid increase in loan applications, it is becoming extremely important to automatically predict the default probability of a loan. However, this task is non-trivial due to insufficient default samples, hard decision boundaries and numerous heterogeneous features. To the best of our knowledge, existing studies fail to handle these three difficulties simultaneously. In this paper, we propose a weakly supervised loan default prediction model, WEAKLOAN, that systematically addresses all these challenges based on deep metric learning. WEAKLOAN is composed of three key modules, which are used for encoding loan features, learning evaluation metrics and calculating default risk scores. By doing so, WEAKLOAN can not only extract the features of a loan itself, but also model the hidden relationships in loan pairs. Extensive experiments on real-life datasets show that WEAKLOAN significantly outperforms all compared baselines even though the default loans for training are limited.


Materials ◽  
2021 ◽  
Vol 14 (7) ◽  
pp. 1795
Author(s):  
Norshahira Roslan ◽  
Shayfull Zamree Abd Rahim ◽  
Abdellah El-hadj Abdellah ◽  
Mohd Mustafa Al Bakri Abdullah ◽  
Katarzyna Błoch ◽  
...  

Achieving good product quality in plastic injection moulding processes is very challenging, since the process involves many influencing parameters. Common defects such as warpage are hard to avoid, and the defective parts will eventually go to waste, leading to unnecessary costs to the manufacturer. The use of recycled material from postindustrial waste has been studied by a few researchers. However, the application of an optimisation method to optimise processing parameters for moulding parts from recycled materials remains lacking. In this study, Response Surface Methodology (RSM) and Particle Swarm Optimisation (PSO) methods were applied to thick plate parts moulded using virgin and recycled low-density polyethylene (LDPE) materials (100:0, 70:30, 60:40 and 50:50; virgin-to-recycled material ratios) to find the optimal input parameters for each of the material ratios. Shrinkage in the x and y directions increased in correlation with the recycled ratio, compared to virgin material. Meanwhile, the tensile strength of the thick plate part continued to decrease when the recycled ratio increased. R30 (70:30) had the optimum shrinkage in the x direction with respect to R0 (100:0) material where the shrinkage increased by 24.49% (RSM) and 33.20% (PSO). On the other hand, the shrinkage in the y direction for R30 material increased by 4.48% (RSM) and decreased by 2.67% (PSO), while the tensile strength of R30 (70:30) material decreased by 0.51% (RSM) and 2.68% (PSO) as compared to R0 (100:0) material. Validation tests indicated that the optimal setting of processing parameter suggested by PSO and RSM for R0 (100:0), R30 (70:30), R40 (60:40) and R50 (50:50) was less than 10%.

