What can data mining tell us about patient safety? Using linear discriminant analysis to identify characteristics associated with positive safety rating in London NHS organisations.

Author(s):  
Roberto Fernandez Crespo ◽  
Ana Luisa Neves ◽  
Mohammed Abdulhadi Alagha ◽  
Melanie Leis ◽  
Kelsey Flott ◽  
...  

Objective: To identify key characteristics associated with positive and negative CQC safety ratings across London NHS organisations. Design: Advanced data analytics and linear discriminant analysis. Data sources: CQC data linked with patient safety variables sourced from 10 publicly available datasets. Methods: Iterative cycles of data extraction, insight generation, and analysis refinement were carried out, with regular meetings between the NHS London Patient Safety Leadership Forum and the analytic team to optimise academic robustness alongside translational impact. Ten datasets were selected based on data availability, usability, and relevance, and included data from April 2018 to December 2019. Data pre-processing was conducted in R. Missing values were imputed using the median value, while empty variables were removed. London NHS organisations were categorised into two groups based on their safety rating: those rated as "inadequate" or "requires improvement" (RI) and those rated as "good" or "outstanding" (Good). Variable filtering reduced the number of variables from 1104 to 207. The top ten variables with the largest effect sizes associated with Good and RI organisations were selected for inspection. A Linear Discriminant Analysis (LDA) was trained using the 207 variables. Effect sizes and confidence intervals for each variable were calculated. Dunn's and Kruskal-Wallis tests were used to identify significant differences between RI and Good organisations. Results: Ten variables for Good and RI NHS organisations were identified.
Key variables for Good organisations included: organisation response to address own concerns (answered by nurse/midwife) (Good organisation = 0.691, RI organisation = 0.618, P<.001); fair career progression (answered by medical/dental staff) (Good organisation = 0.905, RI organisation = 0.843, P<.001); existence of annual work appraisal (answered by medical/dental staff) (Good organisation = 0.922, RI organisation = 0.873, P<.001); organisation's response to patients' concerns (Good organisation = 0.791, RI organisation = 0.717, P<.001); harassment, bullying or abuse from staff (answered by AHPHSSP) (Good organisation = 0.527, RI organisation = 0.454, P<.001); adequate materials, supplies and equipment (answered by "other" staff) (Good organisation = 0.663, RI organisation = 0.544, P<.001); organisation response to address own concerns (answered by medical/dental staff) (Good organisation = 0.634, RI organisation = 0.537, P<.001); staff engagement (answered by medical/dental staff) (Good organisation = 0.468, RI organisation = 0.376, P<.001); provision of clear feedback (answered by "other" staff) (Good organisation = 0.719, RI organisation = 0.650, P<.001); and collection of patient feedback (answered by wider healthcare team) (Good organisation = 0.888, RI organisation = 0.804, P<.001). Conclusions: Our study shows that healthcare providers that received positive safety inspections from regulators differ significantly, in terms of staff perceptions of safety, from providers rated as inadequate or requiring improvement. In particular, organisations rated as good or outstanding are associated with higher levels of organisational safety, staff engagement, and capacity to collect and listen to patient experience feedback. This work exemplifies how a partnership between applied healthcare and academic research organisations can be used to address practical considerations in patient safety, resulting in a translational piece of work.
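
The pre-processing and discriminant steps described in the Methods can be illustrated with a minimal Python sketch. This is not the authors' pipeline (they report working in R with 207 variables): the function names, the two-feature restriction, the equal-prior midpoint threshold, and the six rows of data below are all assumptions invented for illustration.

```python
from statistics import mean, median

def drop_empty_and_impute(rows):
    """Drop columns that are entirely missing, then fill remaining gaps
    with the column median (missing values are encoded as None)."""
    n_cols = len(rows[0])
    keep = [j for j in range(n_cols) if any(r[j] is not None for r in rows)]
    med = {j: median([r[j] for r in rows if r[j] is not None]) for j in keep}
    return [[r[j] if r[j] is not None else med[j] for j in keep] for r in rows]

def fisher_lda_2d(X, y):
    """Two-class Fisher discriminant for exactly two features.
    Returns (w, threshold); predict class 1 when w.x > threshold.
    The threshold is the projected midpoint of the class means (equal priors)."""
    groups = {c: [x for x, label in zip(X, y) if label == c] for c in (0, 1)}
    mu = {c: [mean(col) for col in zip(*g)] for c, g in groups.items()}
    # Pooled within-class scatter matrix (2x2).
    s = [[0.0, 0.0], [0.0, 0.0]]
    for c, g in groups.items():
        for x in g:
            d = [x[0] - mu[c][0], x[1] - mu[c][1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det],
           [-s[1][0] / det, s[0][0] / det]]
    diff = [mu[1][0] - mu[0][0], mu[1][1] - mu[0][1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    mid = [(mu[0][i] + mu[1][i]) / 2 for i in range(2)]
    return w, w[0] * mid[0] + w[1] * mid[1]

def score(w, x):
    """Project a two-feature observation onto the discriminant direction."""
    return w[0] * x[0] + w[1] * x[1]

# Hypothetical data: three survey-style proportions per organisation;
# the middle column is entirely empty and is removed.
raw = [
    [0.70, None, 0.90],
    [0.68, None, 0.92],
    [None, None, 0.89],
    [0.60, None, 0.84],
    [0.62, None, None],
    [0.63, None, 0.85],
]
labels = [1, 1, 1, 0, 0, 0]  # 1 = "Good", 0 = "RI" (invented labels)

X = drop_empty_and_impute(raw)
w, threshold = fisher_lda_2d(X, labels)
predictions = [1 if score(w, x) > threshold else 0 for x in X]
```

An observation whose projected score exceeds `threshold` is assigned to the "Good" group. With many more variables than this toy case, a library implementation would normally replace the hand-written 2x2 inverse.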

1987 ◽  
Vol 17 (9) ◽  
pp. 1150-1152
Author(s):  
David L. Verbyla

Classification trees are discriminant models structured as dichotomous keys. A simple classification tree is presented and contrasted with a linear discriminant function. Classification trees have several advantages when compared with linear discriminant analysis. The method is robust with respect to outlier cases. It is nonparametric and can use nominal, ordinal, interval, and ratio scaled predictor variables. Cross-validation is used during tree development to prevent overfitting the tree with too many predictor variables. Missing values are handled by using surrogate splits based on nonmissing predictor variables. Classification trees, like linear discriminant analysis, have potential prediction bias and therefore should be validated before being accepted.
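
The robustness claim can be made concrete with a small Python sketch, assuming a single predictor, a one-split tree chosen by Gini impurity, and a one-dimensional linear discriminant with equal priors (whose cut point is the midpoint of the class means). The data and function names are hypothetical and are not taken from the paper.

```python
from statistics import mean

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best split 'x <= threshold'."""
    pairs = sorted(zip(values, labels))
    best = (None, float("inf"))
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no cut point between identical values
        thr = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if impurity < best[1]:
            best = (thr, impurity)
    return best

def lda_cut_1d(values, labels):
    """Equal-prior 1-D linear discriminant: midpoint of the two class means."""
    m0 = mean(v for v, lab in zip(values, labels) if lab == 0)
    m1 = mean(v for v, lab in zip(values, labels) if lab == 1)
    return (m0 + m1) / 2

labels = [0, 0, 0, 1, 1, 1]
clean = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
with_outlier = [1.0, 2.0, 3.0, 6.0, 7.0, 800.0]  # one extreme case

tree_clean = best_split(clean, labels)
tree_outlier = best_split(with_outlier, labels)
lda_clean = lda_cut_1d(clean, labels)
lda_outlier = lda_cut_1d(with_outlier, labels)
```

The tree split depends only on the ordering of the cases, so the outlier leaves its threshold unchanged, whereas the mean-based discriminant cut is pulled far past the other class 1 cases.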


2020 ◽  
Vol 16 (8) ◽  
pp. 1079-1087
Author(s):  
Jorgelina Z. Heredia ◽  
Carlos A. Moldes ◽  
Raúl A. Gil ◽  
José M. Camiña

Background: The elemental composition of maize grains depends on the soil, land and environmental characteristics where the crop grows. These effects are important when evaluating the availability of nutrients with complex dynamics, such as the concentration of macro- and micronutrients in soils, which can vary with topography. Little information is available on how the topographic characteristics (upland and lowland) of the land where a crop is grown influence the mineral composition of crop products, in this case maize seeds. Moreover, the study of the topographic effect on crops using multivariate analysis tools has not been reported. Objective: This paper assesses the effect of topographic conditions on plants by analyzing the mineral profiles of maize seeds obtained under two land conditions: uplands and lowlands. Materials and Methods: The mineral profile was studied by microwave plasma atomic emission spectrometry. Samples were collected from lowlands and uplands of cultivable land in the north-east of La Pampa province, Argentina. Results: Differentiation of maize seeds collected from both topographical areas was achieved by principal components analysis (PCA), cluster analysis (CA) and linear discriminant analysis (LDA). The PCA model based on the mineral profile made it possible to differentiate seeds from uplands and lowlands through the influence of the Cr and Mg variables. A significant accumulation of Cr and Mg in seeds from lowlands was observed. Cluster analysis confirmed this grouping, and linear discriminant analysis correctly classified both crops, showing the effect of topography on the elemental profile. Conclusions: Multi-elemental analysis combined with chemometric tools proved useful for assessing the effect of topographic characteristics on crops.


2020 ◽  
Vol 15 ◽  
Author(s):  
Mohanad Mohammed ◽  
Henry Mwambi ◽  
Bernard Omolo

Background: Colorectal cancer (CRC) is the third most common cancer among women and men in the USA, and recent studies have shown an increasing incidence in less developed regions, including Sub-Saharan Africa (SSA). We developed a hybrid (DNA mutation and RNA expression) signature and assessed its predictive properties for the mutation status and survival of CRC patients. Methods: Publicly available microarray and RNASeq data from 54 matched formalin-fixed paraffin-embedded (FFPE) samples from the Affymetrix GeneChip and RNASeq platforms were used to obtain differentially expressed genes between mutant and wild-type samples. We applied the support-vector machines, artificial neural networks, random forests, k-nearest neighbor, naïve Bayes, negative binomial linear discriminant analysis, and Poisson linear discriminant analysis algorithms for classification. A Cox proportional hazards model was used for survival analysis. Results: Compared to the genelist from each of the individual platforms, the hybrid genelist had the highest accuracy, sensitivity, specificity, and AUC for mutation status across all the classifiers, and is prognostic for survival in patients with CRC. The NBLDA method was the best performer on the RNASeq data, while the SVM method was the most suitable classifier for CRC across the two data types. Nine genes were found to be predictive of survival. Conclusion: This signature could be useful in clinical practice, especially for colorectal cancer diagnosis and therapy. Future studies should determine the effectiveness of integration in cancer survival analysis and the application to unbalanced data, where the classes are of different sizes, as well as to data with multiple classes.
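
The four measures used to compare classifiers can be computed as in the following Python sketch. It assumes binary labels (1 = mutant, 0 = wild-type), a 0.5 decision cut-off, and the pairwise (Mann-Whitney) definition of AUC; the labels and scores are invented and do not come from the study.

```python
def confusion_metrics(y_true, y_pred):
    """Accuracy, sensitivity (true positive rate) and specificity
    (true negative rate) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical classifier output for eight samples.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 cut-off

metrics = confusion_metrics(y_true, y_pred)
area = auc(y_true, scores)
```

Sensitivity and specificity are reported separately because, on unbalanced data such as that mentioned under future work, accuracy alone can look high for a classifier that favours the majority class.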

