Better Practices in the Development and Validation of Recidivism Risk Assessments: The Minnesota Sex Offender Screening Tool–4

2017 ◽  
Vol 30 (4) ◽  
pp. 538-564 ◽  
Author(s):  
Grant Duwe

This study examines the development and validation of the Minnesota Sex Offender Screening Tool–4 (MnSOST-4) on a dataset consisting of 5,745 sex offenders released from Minnesota prisons between 2003 and 2012. Bootstrap resampling was used to select predictors, and k-fold and split-sample methods were used to internally validate the MnSOST-4. Using sex offense reconviction within 4 years of release from prison as the failure criterion, the data showed that 130 (2.3%) offenders in the overall sample were recidivists. Multiple classification methods and performance metrics were used to develop the MnSOST-4 and evaluate its predictive performance on the test set. The results from the regularized logistic regression algorithm showed that the MnSOST-4 performed well in predicting sexual recidivism in the test set, achieving an area under the curve (AUC) of 0.835. Additional analyses on the test set revealed that the MnSOST-4 outperformed the Minnesota Sex Offender Screening Tool–3 (MnSOST-3), Minnesota Sex Offender Screening Tool–Revised (MnSOST-R), and Static-99 in predicting sexual reoffending.
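
As an illustrative aside, the area under the curve (AUC) reported above can be read as the probability that a randomly chosen recidivist receives a higher risk score than a randomly chosen non-recidivist. The sketch below computes it directly from that definition. The function name `auc_from_scores` and the toy scores are assumptions made for illustration only; they are not the MnSOST-4 data or the author's code.

```python
def auc_from_scores(scores, labels):
    """Pairwise (Mann-Whitney) estimate of the AUC.

    scores: risk scores, higher meaning higher predicted risk.
    labels: 1 for a recidivist, 0 for a non-recidivist.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one case of each class")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # the recidivist is ranked above the non-recidivist
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: four offenders, two of whom reoffended.
toy_scores = [0.9, 0.8, 0.3, 0.1]
toy_labels = [1, 0, 1, 0]
toy_auc = auc_from_scores(toy_scores, toy_labels)  # 3 of 4 pairs ranked correctly: 0.75
```

The pairwise loop is quadratic in sample size, so it is suited to explanation rather than to large datasets.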

2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Minh Thanh Vo ◽  
Anh H. Vo ◽  
Tuong Le

Purpose: Medical images are increasingly common, so deep-learning-based analysis of these images to help diagnose diseases is becoming more and more essential. Recently, the shoulder implant X-ray image classification (SIXIC) dataset, which includes X-ray images of implanted shoulder prostheses produced by four manufacturers, was released. Detecting the implant model helps to select the correct equipment and procedures for the upcoming surgery. Design/methodology/approach: This study proposes a robust model named X-Net to improve predictive performance for shoulder implant X-ray image classification on the SIXIC dataset. The X-Net model integrates the Squeeze and Excitation (SE) block into the Residual Network (ResNet) module. The SE module weighs each feature map extracted by ResNet, which helps improve performance. Feature extraction in X-Net is performed by both modules, ResNet and SE. The final feature is obtained by combining the features extracted in these steps, which captures more of the important characteristics of the X-ray images in the input dataset. Next, X-Net uses this fine-grained feature to classify the input images into the four classes (Cofield, Depuy, Zimmer and Tornier) of the SIXIC dataset. Findings: Experiments are conducted to show the proposed approach's effectiveness compared with other state-of-the-art methods for SIXIC. The experimental results indicate that the approach outperforms the compared methods on several performance metrics. In addition, the proposed approach achieves new state-of-the-art results on all performance metrics, such as accuracy, precision, recall, F1-score and area under the curve (AUC), for the experimental dataset. Originality/value: The proposed method, with its high predictive performance, can be used to assist in the treatment of injured shoulder joints.


2017 ◽  
Vol 44 (9) ◽  
pp. 1125-1140 ◽  
Author(s):  
Seung C. Lee ◽  
R. Karl Hanson

Although considerable research has found overall moderate predictive validity of Static-99R, a sex offender risk prediction tool, relatively little research has addressed its potential for cultural bias. This prospective study evaluated the predictive validity of Static-99R across the three major ethnic groups (White, n = 789; Black, n = 466; Hispanic, n = 719) in the state of California. Static-99R was able to discriminate recidivists from nonrecidivists among White, Black, and Hispanic sex offenders (all area under the curve [AUC] values >.70; odds ratios >1.39). Base rates (at a Static-99R score of 2) with a fixed 5-year follow-up across ethnic groups were very similar (2.4%-3.0%) but were significantly lower than the norms (5.6%). The current findings support the use of Static-99R in risk assessment procedures for sex offenders of White, Black, and Hispanic heritage, but it should be used with caution in estimating absolute sexual recidivism rates, particularly for Hispanic sex offenders.


Author(s):  
Roberto Porto ◽  
Jose M. Molina ◽  
Antonio Berlanga ◽  
Miguel A. Patricio

Learning systems have largely focused on creating models that obtain the best results on error metrics. Recently, the focus has shifted toward interpreting and explaining their results. The need for interpretation is greater when these models are used to support decision making. In some areas, such as medicine, this becomes an indispensable requirement. This paper focuses on the prediction of cardiovascular disease by analyzing the well-known Statlog (Heart) Data Set from the UCI Machine Learning Repository. The study analyzes the cost in accuracy of making predictions easier to interpret by reducing the number of features that explain the classification of health status. The analysis covers a large set of classification techniques and performance metrics, demonstrating that it is possible to build explainable and reliable models that offer a good trade-off with predictive performance.


2019 ◽  
Vol 100 (2) ◽  
pp. 173-200
Author(s):  
Grant Duwe

This study presents the results from the development and validation of a fully automated, gender-specific risk assessment system designed to predict severe and frequent prison misconduct on a recurring, semiannual basis. K-fold and split-population methods were applied to train and test the predictive models. Regularized logistic regression was the classifier used on the training and test sets that contained 35,506 males and 3,849 females who were released from Minnesota prisons between 2006 and 2011. Using multiple metrics, the results showed the models achieved a relatively high level of predictive performance. For example, the average area under the curve (AUC) was 0.832 for the female prisoner models and 0.836 for the male prisoner models. The findings provide support for the notion that better predictive performance can be obtained by developing assessments that are customized to the population on which they will be used.


2021 ◽  
Vol 11 (3) ◽  
pp. 1285
Author(s):  
Roberto Porto ◽  
José M. Molina ◽  
Antonio Berlanga ◽  
Miguel A. Patricio

Learning systems have been focused on creating models capable of obtaining the best results in error metrics. Recently, the focus has shifted to improvement in the interpretation and explanation of the results. The need for interpretation is greater when these models are used to support decision making. In some areas, such as medicine, this becomes an indispensable requirement. The goal of this study was to define a simple process to construct a system that could be easily interpreted, based on two principles: (1) reducing the number of attributes without degrading the performance of the prediction systems and (2) selecting a technique to interpret the final prediction system. To describe this process, we selected a problem, predicting cardiovascular disease, by analyzing the well-known Statlog (Heart) data set from the University of California, Irvine (UCI) Machine Learning Repository. We analyzed the cost in accuracy of making predictions easier to interpret by reducing the number of features that explain the classification of health status. We performed an analysis on a large set of classification techniques and performance metrics, demonstrating that it is possible to construct explainable and reliable models that provide high-quality predictive performance.


2020 ◽  
Vol 7 (4) ◽  
pp. 131
Author(s):  
José Miguel Calderón ◽  
Julio Álvarez-Pitti ◽  
Irene Cuenca ◽  
Francisco Ponce ◽  
Pau Redon

Obstructive sleep apnea syndrome is a reduction of the airflow during sleep which not only produces a reduction in sleep quality but also has major health consequences. The prevalence in the obese pediatric population can surpass 50%, and polysomnography is the current gold standard method for its diagnosis. Unfortunately, it is expensive, disturbing for the patient and time-consuming for experienced professionals. The objective is to develop a patient-friendly screening tool for the obese pediatric population to identify those children at higher risk of suffering from this syndrome. Three supervised learning classifier algorithms (i.e., logistic regression, support vector machine and AdaBoost) common in the field of machine learning were trained and tested on two very different datasets where oxygen saturation raw signal was recorded. The first dataset was the Childhood Adenotonsillectomy Trial (CHAT) consisting of 453 individuals, with ages between 5 and 9 years old and one-third of the patients being obese. Cross-validation was performed on the second dataset from an obesity assessment consult at the Pediatric Department of the Hospital General Universitario of Valencia. A total of 27 patients were recruited between 5 and 17 years old; 42% were girls and 63% were obese. The performance of each algorithm was evaluated based on key performance indicators (e.g., area under the curve, accuracy, recall, specificity and positive predictive value). The logistic regression algorithm outperformed (accuracy = 0.79, specificity = 0.96, area under the curve = 0.9, recall = 0.62 and positive predictive value = 0.94) the support vector machine and the AdaBoost algorithm when trained with the CHAT dataset. Cross-validation tests, using the Hospital General de Valencia (HG) dataset, confirmed the higher performance of the logistic regression algorithm in comparison with the others. In addition, only a minor loss of performance (accuracy = 0.75, specificity = 0.88, area under the curve = 0.85, recall = 0.62 and positive predictive value = 0.83) was observed despite the differences between the datasets. The proposed minimally invasive screening tool has shown promising performance when it comes to identifying children at risk of suffering obstructive sleep apnea syndrome. Moreover, it is well suited to implementation in an outpatient consult in primary and secondary care.
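
For readers unfamiliar with the key performance indicators named in this abstract, the following minimal sketch shows how accuracy, recall, specificity and positive predictive value follow from the four cells of a confusion matrix. The function name `screening_metrics` and the toy labels are hypothetical; they do not reproduce the CHAT or HG data or the authors' implementation.

```python
import math


def screening_metrics(y_true, y_pred):
    """Compute basic screening metrics from binary labels (1 = has the syndrome)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def ratio(numerator, denominator):
        # Return NaN instead of raising when a class or prediction is absent.
        return numerator / denominator if denominator else math.nan

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "recall": ratio(tp, tp + fn),        # also called sensitivity
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),           # positive predictive value
    }


# Hypothetical toy example: eight children, four of whom have the syndrome.
toy_true = [1, 1, 1, 0, 0, 0, 0, 1]
toy_pred = [1, 1, 0, 0, 0, 0, 1, 0]
toy_metrics = screening_metrics(toy_true, toy_pred)
```

A screening tool with high specificity and positive predictive value but moderate recall, as reported above, rarely raises false alarms but misses some affected children.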


2015 ◽  
Author(s):  
Nikolaus Kriegeskorte

Crossvalidation is a method for estimating predictive performance and adjudicating between multiple models. On each of k folds of the process, k-1 of k independent subsets of the data (training set) are used to fit the parameters of each model and the left-out subset (test set) is used to estimate predictive performance. The method is statistically efficient, because training data are reused for testing and performance estimates combined across folds. The method requires no assumptions, provides nearly unbiased (slightly conservative) estimates of predictive performance, and is generally applicable because it amounts to a direct empirical test of each model.
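
The fold structure described in this abstract can be sketched in a few lines. The example below is only an illustration under simple assumptions (random assignment of samples to folds, no stratification); the function name `k_fold_splits` and its parameters are hypothetical and are not taken from the paper.

```python
import random


def k_fold_splits(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) for each of k folds.

    Every sample appears in exactly one test set and in the
    training sets of the other k - 1 folds.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")

    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)          # random assignment to folds
    folds = [indices[i::k] for i in range(k)]     # k roughly equal subsets

    for i in range(k):
        test = folds[i]                           # left-out subset
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Usage: fit each model on `train`, score it on `test`,
# then combine the k performance estimates (for example, by averaging).
for train, test in k_fold_splits(n_samples=10, k=5):
    assert not set(train) & set(test)             # no leakage between the two sets
```

Because every observation is used for testing exactly once, the combined estimate draws on all of the data while each individual estimate still comes from samples the model did not see during fitting.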


2021 ◽  
pp. jim-2021-002037
Author(s):  
Adrian Soto-Mota ◽  
Braulio Alejandro Marfil-Garza ◽  
Santiago Castiello-de Obeso ◽  
Erick Jose Martinez Rodriguez ◽  
Daniel Alberto Carrillo Vazquez ◽  
...  

Most COVID-19 mortality scores were developed at the beginning of the pandemic and clinicians now have more experience and evidence-based interventions. Therefore, we hypothesized that the predictive performance of COVID-19 mortality scores is now lower than originally reported. We aimed to prospectively evaluate the current predictive accuracy of six COVID-19 scores and compare it with the accuracy of clinical gestalt predictions. A total of 200 patients with COVID-19 were enrolled in a tertiary hospital in Mexico City between September and December 2020. The area under the curve (AUC) of the LOW-HARM, qSOFA, MSL-COVID-19, NUTRI-CoV, and NEWS2 scores and the AUC of clinical gestalt predictions of death (as a percentage) were determined. In total, 166 patients (106 men and 60 women aged 56±9 years) with confirmed COVID-19 were included in the analysis. The AUC of all scores was significantly lower than originally reported: LOW-HARM 0.76 (95% CI 0.69 to 0.84) vs 0.96 (95% CI 0.94 to 0.98), qSOFA 0.61 (95% CI 0.53 to 0.69) vs 0.74 (95% CI 0.65 to 0.81), MSL-COVID-19 0.64 (95% CI 0.55 to 0.73) vs 0.72 (95% CI 0.69 to 0.75), NUTRI-CoV 0.60 (95% CI 0.51 to 0.69) vs 0.79 (95% CI 0.76 to 0.82), NEWS2 0.65 (95% CI 0.56 to 0.75) vs 0.84 (95% CI 0.79 to 0.90), and neutrophil to lymphocyte ratio 0.65 (95% CI 0.57 to 0.73) vs 0.74 (95% CI 0.62 to 0.85). Clinical gestalt predictions were non-inferior to mortality scores, with an AUC of 0.68 (95% CI 0.59 to 0.77). Adjusting scores with locally derived likelihood ratios did not improve their performance; however, some scores outperformed clinical gestalt predictions when clinicians’ confidence of prediction was <80%. Despite its subjective nature, clinical gestalt has relevant advantages in predicting COVID-19 clinical outcomes. The need for, and performance of, most COVID-19 mortality scores should be evaluated regularly.


Author(s):  
Douglas L. Epperson ◽  
James D. Kaul ◽  
Stephen Huot ◽  
Robin Goldman ◽  
Will Alexander