validation data
Recently Published Documents





Eileen O. Dareng ◽  
Jonathan P. Tyrer ◽  
Daniel R. Barnes ◽  
Michelle R. Jones ◽  
Xin Yang ◽  

Abstract Polygenic risk scores (PRS) for epithelial ovarian cancer (EOC) have the potential to improve risk stratification. Joint estimation of Single Nucleotide Polymorphism (SNP) effects in models could improve predictive performance over standard approaches of PRS construction. Here, we applied computationally efficient penalized logistic regression models (lasso, elastic net, stepwise) to individual-level genotype data and a Bayesian framework with continuous shrinkage, “select and shrink for summary statistics” (S4), to summary-level data for epithelial non-mucinous ovarian cancer risk prediction. We developed the models in a dataset consisting of 23,564 non-mucinous EOC cases and 40,138 controls participating in the Ovarian Cancer Association Consortium (OCAC) and validated the best models in three populations of different ancestries: prospective data from 198,101 women of European ancestries; 7,669 women of East Asian ancestries; 1,072 women of African ancestries, and in 18,915 BRCA1 and 12,337 BRCA2 pathogenic variant carriers of European ancestries. In the external validation data, the model with the strongest association for non-mucinous EOC risk derived from the OCAC model development data was the S4 model (27,240 SNPs) with odds ratios (OR) of 1.38 (95% CI: 1.28–1.48, AUC: 0.588) per unit standard deviation, in women of European ancestries; 1.14 (95% CI: 1.08–1.19, AUC: 0.538) in women of East Asian ancestries; 1.38 (95% CI: 1.21–1.58, AUC: 0.593) in women of African ancestries; hazard ratios of 1.36 (95% CI: 1.29–1.43, AUC: 0.592) in BRCA1 pathogenic variant carriers and 1.49 (95% CI: 1.35–1.64, AUC: 0.624) in BRCA2 pathogenic variant carriers. Incorporation of the S4 PRS in risk prediction models for ovarian cancer may have clinical utility in ovarian cancer prevention programs.
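As context for how any of the PRS variants above is applied, a polygenic risk score is a weighted sum of allele dosages, standardized so that odds ratios can be reported per standard deviation. A minimal sketch with toy effect sizes and genotypes (not the paper's 27,240-SNP weights):

```python
# Minimal sketch: applying a trained PRS as a weighted sum of allele dosages.
# Effect sizes (betas) and genotypes are illustrative toy values, not from the paper.
from statistics import mean, stdev

def prs(genotypes, betas):
    """Raw polygenic risk score: sum of per-SNP dosage * effect size."""
    return sum(g * b for g, b in zip(genotypes, betas))

def standardize(scores):
    """Scale scores to unit SD so associations are reported per SD, as in the abstract."""
    m, s = mean(scores), stdev(scores)
    return [(x - m) / s for x in scores]

betas = [0.12, -0.08, 0.05]                 # toy per-allele log-odds weights
cohort = [[0, 1, 2], [2, 2, 0], [1, 0, 1]]  # dosages (0/1/2 copies of effect allele)
raw = [prs(g, betas) for g in cohort]
z = standardize(raw)
```

The per-SD odds ratio then comes from fitting case/control status against `z` in a logistic regression, which is omitted here.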

Hallie C Prescott ◽  
Rajendra P Kadel ◽  
Julie R Eyman ◽  
Ron Freyberg ◽  
Matthew Quarrick ◽  

Abstract Background The US Veterans Affairs (VA) healthcare system began reporting risk-adjusted mortality for intensive care (ICU) admissions in 2005. However, while the VA’s mortality model has been updated and adapted for risk-adjustment of all inpatient hospitalizations, recent model performance has not been published. We sought to assess the current performance of VA’s 4 standardized mortality models: acute care 30-day mortality (acute care SMR-30); ICU 30-day mortality (ICU SMR-30); acute care in-hospital mortality (acute care SMR); and ICU in-hospital mortality (ICU SMR). Methods Retrospective cohort study with split derivation and validation samples. Standardized mortality models were fit using derivation data, with coefficients applied to the validation sample. Nationwide VA hospitalizations that met model inclusion criteria during fiscal years 2017–2018 (derivation) and 2019 (validation) were included. Model performance was evaluated using c-statistics to assess discrimination and comparison of observed versus predicted deaths to assess calibration. Results Among 1,143,351 hospitalizations eligible for the acute care SMR-30 during 2017–2019, in-hospital mortality was 1.8%, and 30-day mortality was 4.3%. C-statistics for the SMR models in validation data were 0.870 (acute care SMR-30); 0.864 (ICU SMR-30); 0.914 (acute care SMR); and 0.887 (ICU SMR). There were 16,036 deaths (4.29% mortality) in the SMR-30 validation cohort versus 17,458 predicted deaths (4.67%), reflecting 0.38% over-prediction. Across deciles of predicted risk, the absolute difference in observed versus predicted percent mortality was a mean of 0.38%, with a maximum error of 1.81% seen in the highest-risk decile. Conclusions and Relevance The VA’s SMR models, which incorporate patient physiology on presentation, are highly predictive and demonstrate good calibration both overall and across risk deciles. The current SMR models perform similarly to the initial ICU SMR model, indicating appropriate adaptation and re-calibration.
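The two evaluation measures described above, calibration (observed versus predicted deaths) and discrimination (the c-statistic), can be sketched on toy data; the probabilities and outcomes below are illustrative, not VA records:

```python
# Minimal sketch of the evaluation in the abstract: a calibration gap
# (over-prediction > 0) and a pairwise c-statistic, on toy data.

def calibration_gap(predicted_probs, outcomes):
    """Predicted minus observed mortality rate; positive means over-prediction."""
    n = len(outcomes)
    return sum(predicted_probs) / n - sum(outcomes) / n

def c_statistic(predicted_probs, outcomes):
    """Concordance: fraction of (death, survivor) pairs where the death got the higher risk."""
    pairs = concordant = 0
    for pi, yi in zip(predicted_probs, outcomes):
        for pj, yj in zip(predicted_probs, outcomes):
            if yi == 1 and yj == 0:
                pairs += 1
                if pi > pj:
                    concordant += 1
                elif pi == pj:
                    concordant += 0.5
    return concordant / pairs

probs = [0.9, 0.7, 0.4, 0.2, 0.1]  # toy predicted death probabilities
died  = [1,   1,   0,   0,   0]    # toy observed outcomes
```

The decile analysis in the abstract repeats the calibration-gap calculation within each tenth of the predicted-risk distribution.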

S. El Kohli ◽  
Y. Jannaj ◽  
M. Maanan ◽  
H. Rhinane

Abstract. Cheating in exams is a worldwide phenomenon that hinders efforts to assess students' skills and growth. With scientific and technological progress, it has become possible to develop detection systems, in particular systems that monitor candidates' movements and gestures during an exam, whether individually or collectively. Deep learning (DL) concepts are widely used in image processing and machine learning applications. Our system builds on advances in artificial intelligence, particularly 3D Convolutional Neural Networks (3D CNN), object detection methods, OpenCV, and especially Google TensorFlow, to provide real-time, optimized computer vision. In the proposed approach, we provide a detection system able to predict fraud during exams, using a 3D CNN to build a model from 7,638 selected images and an object detector to identify prohibited items. These experimental studies show a detection performance of 95% accuracy, with close agreement between the training and validation data sets.
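As a hypothetical sketch of one preprocessing step such a pipeline implies, consecutive video frames can be stacked into fixed-depth clips before being fed to a 3D CNN; the window depth and stride here are arbitrary choices, and integers stand in for 2D frame arrays:

```python
# Hypothetical sketch: sliding a fixed-depth window over a frame sequence to
# form the (depth, height, width) clips a 3D CNN consumes. Frame contents are
# dummies; a real system would hold image arrays at each position.

def make_clips(frames, depth, stride):
    """Return overlapping windows of `depth` consecutive frames, advancing by `stride`."""
    return [frames[i:i + depth] for i in range(0, len(frames) - depth + 1, stride)]

frames = list(range(10))  # stand-ins for 2D image arrays
clips = make_clips(frames, depth=4, stride=2)
```

Each clip carries short-term motion context, which is what lets a 3D CNN distinguish a gesture from a single-frame pose.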

2022 ◽  
Vol 2 (1) ◽  
pp. 26-31
Hendra Rohman

Background: Analysis of the accuracy and validity of diagnosis codes in medical record documents is important because codes that do not conform to ICD-10 reduce the quality of health center services and lower the validity of the resulting data. Accurate codes matter for the health center's indexing process and statistical reports, serve as the basis for outpatient morbidity reports and top-ten disease reports, and influence the policies adopted by primary health center management. This study aims to analyze the accuracy and validity of disease diagnosis codes based on ICD-10 for the fourth quarter of 2020 at the Imogiri I Health Center, Bantul. Methods: Descriptive qualitative approach with a case study design. Subjects were a doctor, a nurse, the head of medical records, and staff. The object was outpatient medical record documents at the Imogiri I Health Center, Bantul, with a total sample of 99 medical record files. Data were obtained through interviews and observations. Results: 60 diagnosis codes (60.6%) were completely accurate, 26 (26.3%) were incompletely accurate, and 13 (13.1%) were inaccurate. Inaccuracies included errors in determining the code, errors in the 4th character of the ICD-10 code, omission of the 4th and 5th characters, omission of external cause codes, and multiple diseases. Conclusions: Factors behind the inaccuracy were insufficient competence of medical record staff, incomplete diagnosis writing, absence of training, absence of coding evaluation or audits, and a standard operating procedure that had not been socialized.
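The reported proportions can be checked with a short arithmetic sketch; the category counts are taken directly from the abstract:

```python
# Arithmetic check of the reported proportions over the 99 sampled records.
# Counts come straight from the abstract; only the percentages are derived.

counts = {"complete_accurate": 60, "incomplete_accurate": 26, "inaccurate": 13}
total = sum(counts.values())
shares = {k: round(100 * v / total, 1) for k, v in counts.items()}
```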

Sensors ◽  
2022 ◽  
Vol 22 (2) ◽  
pp. 454
German Sternharz ◽  
Jonas Skackauskas ◽  
Ayman Elhalwagy ◽  
Anthony J. Grichnik ◽  
Tatiana Kalganova ◽  

This paper introduces a procedure to compare the functional behaviour of individual units of electronic hardware of the same type. The primary use case for this method is to estimate the functional integrity of an unknown device unit based on the behaviour of a known and proven reference unit. This method is based on the so-called virtual sensor network (VSN) approach, where the output quantity of a physical sensor measurement is replicated by a virtual model output. In the present study, this approach is extended to model the functional behaviour of electronic hardware by a neural network (NN) with Long Short-Term Memory (LSTM) layers to encapsulate potential time-dependence of the signals. The proposed method is illustrated and validated on measurements from a remote-controlled drone, which is operated with two variants of controller hardware: a reference controller unit and a malfunctioning counterpart. It is demonstrated that the presented approach successfully identifies and describes the unexpected behaviour of the test device. In the presented case study, the model outputs a signal sample prediction in 0.14 ms and achieves a reconstruction accuracy of the validation data with a root mean square error (RMSE) below 0.04 relative to the data range. In addition, three self-protection features (multidimensional boundary-check, Mahalanobis distance, auxiliary autoencoder NN) are introduced to gauge the certainty of the VSN model output.
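The reported accuracy figure (RMSE below 0.04 relative to the data range) corresponds to a range-normalized RMSE, which can be sketched on toy signal values:

```python
# Minimal sketch of the reported metric: RMSE of a reconstruction divided by
# the span of the actual signal ("relative to the data range"). Signal values
# here are toy numbers, not drone telemetry.
from math import sqrt

def range_normalized_rmse(actual, predicted):
    """RMSE over paired samples, normalized by the range of the actual signal."""
    n = len(actual)
    rmse = sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)
    return rmse / (max(actual) - min(actual))

actual    = [0.0, 1.0, 2.0, 3.0]
predicted = [0.1, 0.9, 2.1, 2.9]
```

Normalizing by the range makes the threshold comparable across signals with different physical units, which is presumably why the paper reports it this way.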

2022 ◽  
pp. 1-17
Saleh Albahli ◽  
Ghulam Nabi Ahmad Hassan Yar

Diabetic retinopathy is an eye disease that affects the retina in patients with diabetes mellitus; it is caused by high blood sugar levels and may eventually lead to macular edema. The objective of this study is to design and compare several deep learning models that detect the severity of diabetic retinopathy, determine the risk of progression to macular edema, and segment different types of disease patterns using retina images. The Indian Diabetic Retinopathy Image Dataset (IDRiD) was used for disease grading and segmentation. Since images of the dataset have different brightness and contrast, we employed three techniques for generating processed images from the originals: brightness, color, and contrast (BCC) enhancement; color jitter (CJ); and contrast limited adaptive histogram equalization (CLAHE). After image preprocessing, we used pre-trained ResNet50, VGG16, and VGG19 models on these differently preprocessed images, both to determine the severity of the retinopathy and to assess the chances of macular edema. UNet was also applied to segment different types of diseases. To train and test these models, the image dataset was divided into training, testing, and validation data at 70%, 20%, and 10% ratios, respectively. During model training, data augmentation was also applied to increase the number of training images. Study results show that for detecting the severity of retinopathy and macular edema, ResNet50 showed the best accuracy using BCC and original images, at 60.2% and 82.5%, respectively, on the validation dataset. In segmenting different types of diseases, UNet yielded the highest testing accuracies of 65.22% and 91.09% for microaneurysms and hard exudates using BCC images, 84.83% for the optic disc using CJ images, and 59.35% and 89.69% for hemorrhages and soft exudates using CLAHE images, respectively. Thus, image preprocessing can play an important role in improving the efficacy and performance of deep learning models.
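The 70/20/10 split described above can be sketched as follows; this operates on index lists rather than IDRiD images, and the shuffle seed is an arbitrary choice:

```python
# Minimal sketch of a 70/20/10 train/test/validation split on item indices.
# A real pipeline would map these indices back to image files.
import random

def split_70_20_10(items, seed=0):
    """Shuffle deterministically, then cut into 70% train, 20% test, 10% validation."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n = len(items)
    n_train, n_test = int(0.7 * n), int(0.2 * n)
    return (items[:n_train],
            items[n_train:n_train + n_test],
            items[n_train + n_test:])

train, test, val = split_70_20_10(range(100))
```

Fixing the seed keeps the split reproducible across the compared models, so accuracy differences reflect the models and preprocessing rather than the partition.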

2022 ◽  
Sepideh Etemadi ◽  
Mehdi Khashei

Abstract Modeling and forecasting are among the most powerful and widely used tools in decision support systems. Fuzzy Linear Regression (FLR) is the most fundamental method in the fuzzy modeling area, in which the uncertain relationship between the target and explanatory variables is estimated; it has been used efficaciously in a broad range of real-world applications. The operating logic of this method is to minimize the vagueness of the model, defined as the sum of the individual spreads of the fuzzy coefficients. Although this process is coherent and can obtain the narrowest α-cut interval, and often the most accurate results, on the training data sets, it cannot guarantee the desired level of generalization. Meanwhile, the quality of managerial decisions made in modeling-based fields depends on the generalization ability of the method used, and the generalizability of a method depends simultaneously on both the precision and the reliability of its results. In this paper, a novel methodology is presented for fuzzy linear regression modeling in which, in contrast to conventional methods, the reliability of the constructed models is maximized instead of the vagueness being minimized. In the proposed model, fuzzy parameters are estimated such that the variation of the model's ambiguity is minimized across different data conditions. In other words, the weighted variance of the ambiguities across validation data situations is minimized in order to estimate the unknown fuzzy parameters. To comprehensively assess the proposed method's performance, 74 benchmark datasets from the UCI repository are considered. Empirical outcomes show that in 64.86% of the case studies, the proposed method has better generalizability, i.e., a narrower α-cut interval as well as more accurate interval and point estimates, than the classic versions. This clearly demonstrates the importance of the outcomes' reliability, in addition to their precision, which is not considered in traditional FLR modeling processes. Hence, the presented EFLR method can be considered a suitable alternative in fuzzy modeling fields, especially when greater generalization is desirable.
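The vagueness objective that classic FLR minimizes (the sum of individual coefficient spreads over the data, in the Tanaka style) can be sketched as follows; the coefficient spreads and data are illustrative toy values, and this shows only the conventional objective, not the paper's reliability-based estimator:

```python
# Minimal sketch, assuming triangular fuzzy coefficients (center c_j, spread s_j):
# classic Tanaka-style FLR minimizes total output spread over the observations,
# i.e. sum_i sum_j s_j * |x_ij|, subject to inclusion constraints (omitted here).

def vagueness(spreads, X):
    """Total spread of the fuzzy model output over all observations."""
    return sum(sum(s * abs(x) for s, x in zip(spreads, row)) for row in X)

X = [[1.0, 2.0], [1.0, 3.0]]  # two observations, two explanatory variables
spreads = [0.1, 0.2]          # spreads of the two fuzzy coefficients
```

The proposed method replaces this objective with the weighted variance of the model's ambiguity across validation-data situations, which this sketch does not attempt to reproduce.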

2022 ◽  
Vol 1212 (1) ◽  
pp. 012008
A Rahman

Abstract The climate and environment are among the factors that shape architectural products. The city, as a gathering place where people interact for particular purposes, has influenced urban form and appearance, and the density of buildings in a city affects the urban microclimate: urban areas become hotter than rural areas. Urban planners therefore need to attend to several aspects of design for the humid tropics, such as greening the city evenly and thoroughly so that thermal loads do not rise high enough to affect comfort. This research used Rhinoceros 5, Grasshopper, and Ladybug for simulation, with validation of wet-bulb temperature data on a psychrometric chart and the CBE Thermal Comfort Tool against the ASHRAE-55 standard. The purpose of using simulation in this study was to ease the prediction of the thermal conditions of buildings and their environment; such applications are also used by researchers and architectural designers. Based on the simulation, the indoor maximum effective temperature and standard effective temperature always fall in uncomfortable zones. Buildings of wooden construction show higher thermal comfort than concrete buildings, and wooden construction reaches its maximum value faster than concrete construction.

Leonardo Valderrama ◽  
Bogdan Demczuk Jr. ◽  
Patrícia Valderrama ◽  
Eduardo Carasek

A potential eco-friendly method without organic solvents is presented by integrating a chromatographic fingerprint and multivariate control chart based on Q residuals to differentiate grape juices from different farming practices. The sample preparation was only water dilution, and the mobile phase was water acidified with sulfuric acid, which can be readily neutralized before its disposal. The proposed method is shown to be a simple way to distinguish between organic and non-organic grape juices in a non-target way, successfully evaluating an external validation data set, where organic and non-organic samples were correctly assigned. Through the chromatographic profile, it is possible to suggest that one of the species responsible for this distinction may be from the anthocyanins class.
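The Q-residual statistic behind such a multivariate control chart can be sketched as follows; the loading vectors and fingerprint values are toy stand-ins, since a real chart would derive loadings from PCA on in-control (organic) chromatograms and flag samples whose Q exceeds a control limit:

```python
# Hypothetical sketch of one Q-residual chart point: project a sample onto the
# retained principal-component loadings and take the squared leftover as Q.
# The loadings here are a toy orthonormal set, not a fitted PCA model.

def q_residual(x, loadings):
    """Q = ||x - reconstruction||^2 after projecting x onto the loading vectors."""
    recon = [0.0] * len(x)
    for p in loadings:
        score = sum(xi * pi for xi, pi in zip(x, p))
        recon = [r + score * pi for r, pi in zip(recon, p)]
    return sum((xi - ri) ** 2 for xi, ri in zip(x, recon))

loadings = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # two retained components
sample = [2.0, 1.0, 0.5]                       # a fingerprint lying partly off-plane
```

A sample whose chromatographic fingerprint deviates from the in-control subspace gets a large Q, which is how the chart separates non-organic from organic juices without targeting specific analytes.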

2021 ◽  
Vol 14 (6) ◽  
pp. 3225
Juarez Antonio da Silva Júnior ◽  
Ubiratan Joaquim da Silva Júnior ◽  
Admilson Da Penha Pacheco

The free availability of remote sensing data for areas affected by forest fires on a global scale offers the opportunity to systematically generate medium-spatial-resolution land products, but their known accuracy limitations are a subject of study worldwide. This article aims to analyze the accuracy of burned-area detection using the Random Forest (RF) classifier on a scene from the Visible Infrared Imaging Radiometer Suite (VIIRS) sensor (1 km) at four sites in the Brazilian savanna. The results were validated against the spatial reference burned-area products Aq30m, Fire_cci, and MCD64A1 through a stratified approach that allowed the data to be sampled in space and time. The evaluated RF models, with their input parameters, included 400 trees and one attribute, yielding an error rate below 4%. The results showed that the mapping validated with the Aq30m product presented substantial Sorensen-Dice coefficient estimates, while among the global models MCD64A1 proved the most accurate (>50%), mainly for burned-area features of large extent (>200 km²). In particular, the analysis suggests that validation of burned-area products should always account for the minimum time lag of the validation data and the size of the area affected by fire. The results show that this approach is very useful for determining burned forest areas.

Accuracy analysis for mapping burnt areas using a 1 km VIIRS scene and Random Forest classification

Abstract: The availability of remote sensing data with medium spatial resolution has offered several mapping possibilities for areas affected by forest fires on the Earth's surface. In this context, the analysis of sensor spatial accuracy limitations has been the subject of global research.
The objective of this study was to analyze the mapping accuracy of the VIIRS sensor on board the NOAA satellite, using the Random Forest (RF) classifier for the detection of burned areas, at four sites in the Chapada dos Veadeiros National Park, Goiás, within the Brazilian savanna. The methodology consisted of validating the classification with the Sorensen-Dice coefficient (SD) in a stratified approach, using the products Aq30m, Fire_cci, and MCD64A1 as references. As a result, the RF models included 400 trees and one attribute, with an error of less than 4%. Among the global models, MCD64A1 presented significant accuracy, greater than 50%, especially for burned-area features larger than 200 km². Thus, the data suggest that the accuracy of the validation process for burned-area mapping products is associated with the minimum time interval of availability of the validation data and the size of the area affected by fire. Based on this, the results show the effectiveness of the RF algorithm on medium-spatial-resolution images for fire detection in seasonally dry forests such as the Cerrado. Keywords: Cerrado, fires, Random Forest.
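The Sorensen-Dice coefficient used for validation above can be sketched on toy binary burned-area masks:

```python
# Minimal sketch of the Sorensen-Dice coefficient between a detected burned-area
# mask and a reference product mask, both given as flat 0/1 lists (toy values).

def dice(mask_a, mask_b):
    """2*|A ∩ B| / (|A| + |B|) for binary masks; 1.0 means perfect overlap."""
    inter = sum(a and b for a, b in zip(mask_a, mask_b))
    return 2 * inter / (sum(mask_a) + sum(mask_b))

detected  = [1, 1, 0, 0, 1]  # classifier output per pixel
reference = [1, 0, 0, 1, 1]  # e.g. the MCD64A1 label per pixel
```

Because the denominator counts burned pixels in both masks, the coefficient is insensitive to the large unburned background, which suits sparse fire scars better than plain pixel accuracy.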
