stratification method
Recently Published Documents


TOTAL DOCUMENTS

101
(FIVE YEARS 56)

H-INDEX

9
(FIVE YEARS 2)

2022 ◽  
Vol 23 (1) ◽  
Author(s):  
Mehrdad Mansouri ◽  
Sahand Khakabimamaghani ◽  
Leonid Chindelevitch ◽  
Martin Ester

Abstract Background: There has been a simultaneous increase in demand and accessibility across genomics, transcriptomics, proteomics and metabolomics data, known as omics data. This has encouraged widespread application of omics data in life sciences, from personalized medicine to the discovery of underlying pathophysiology of diseases. Causal analysis of omics data may provide important insight into the underlying biological mechanisms. Existing causal analysis methods yield promising results when identifying potential general causes of an observed outcome based on omics data. However, they may fail to discover the causes specific to a particular stratum of individuals and missing from others. Methods: To fill this gap, we introduce the problem of stratified causal discovery and propose a method, Aristotle, for solving it. Aristotle addresses the two challenges intrinsic to omics data: high dimensionality and hidden stratification. It employs existing biological knowledge and a state-of-the-art patient stratification method to tackle the above challenges and applies a quasi-experimental design method to each stratum to find stratum-specific potential causes. Results: Evaluation based on synthetic data shows better performance for Aristotle in discovering true causes under different conditions compared to existing causal discovery methods. Experiments on a real dataset on Anthracycline Cardiotoxicity indicate that Aristotle’s predictions are consistent with the existing literature. Moreover, Aristotle makes additional predictions that suggest further investigations.


2021 ◽  
pp. 1-44
Author(s):  
Konstantinos N Fountoulakis ◽  
Maria K. Apostolidou ◽  
Marina B. Atsiova ◽  
Anna K. Filippidou ◽  
Angeliki K. Florou ◽  
...  

Abstract Introduction: The aim of the study was to investigate mental health and conspiracy theory beliefs concerning COVID-19 among Health Care Professionals (HCPs). Material and Methods: During lockdown, an online questionnaire gathered data from 507 HCPs (432 females aged 33.86±8.63 and 75 males aged 39.09±9.54). Statistical Analysis: A post-stratification method was used to transform the study sample; descriptive statistics were calculated. Results: Anxiety and probable depression were increased 1.5-2-fold and were higher in females and nurses. Previous history of depression was the main risk factor. The rates of belief in conspiracy theories concerning COVID-19 were alarming, with the majority of individuals (especially females) following some theory to at least some extent. Conclusions: The current paper reports high rates of depression, distress and suicidal thoughts in HCPs during the lockdown, with a high prevalence of beliefs in conspiracy theories. Female gender and previous history of depression acted as risk factors, while belief in conspiracy theories might act as a protective factor. The results should be considered with caution due to the nature of the data (online survey on a self-selected but stratified sample).
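
The abstract above does not say how its post-stratification was carried out. As a minimal sketch of the general technique only, the Python below reweights a sample so that each stratum counts in proportion to an assumed population share. The toy data, the `population_shares` values and both function names are hypothetical and are not taken from the study.

```python
from collections import Counter


def poststratification_weights(sample_strata, population_shares):
    """Weight per stratum = population share / sample share."""
    n = len(sample_strata)
    counts = Counter(sample_strata)
    return {s: population_shares[s] / (counts[s] / n) for s in counts}


def weighted_mean(values, strata, weights):
    """Mean of `values` where each observation carries its stratum weight."""
    total_weight = sum(weights[s] for s in strata)
    return sum(v * weights[s] for v, s in zip(values, strata)) / total_weight


# Hypothetical toy sample: females are over-represented (8 of 10 respondents).
strata = ["F"] * 8 + ["M"] * 2
outcome = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]  # 1 = condition present

# Assumed population shares, for illustration only.
population_shares = {"F": 0.5, "M": 0.5}

weights = poststratification_weights(strata, population_shares)
crude = sum(outcome) / len(outcome)
adjusted = weighted_mean(outcome, strata, weights)
print(f"crude={crude:.2f} adjusted={adjusted:.2f}")
```

In this toy case the crude prevalence is pulled toward the over-represented stratum, and the weighted estimate corrects for that under the assumed shares.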


2021 ◽  
Vol 8 (12) ◽  
pp. 301
Author(s):  
Jamin Koo ◽  
Kyucheol Choi ◽  
Peter Lee ◽  
Amanda Polley ◽  
Raghavendra Sumanth Pudupakam ◽  
...  

First-line treatments of cancer do not always work, and even when they do, they cure the disease at unequal rates mostly owing to biological and clinical heterogeneity across patients. Accurate prediction of clinical outcome and survival following the treatment can support and expedite the process of comparing alternative treatments. We describe the methodology to dynamically determine remission probabilities for individual patients, as well as their prospects of progression-free survival (PFS). The proposed methodology utilizes the ex vivo drug sensitivity of cancer cells, their immunophenotyping results, and patient information, such as age and breed, in training machine learning (ML) models, as well as the Cox proportional hazards model, to predict the probability of clinical remission (CR) or relapse across time for a given patient. We applied the methodology using the three types of data obtained from 242 canine lymphoma patients treated by (L)-CHOP chemotherapy. The results demonstrate substantial enhancement in the predictive accuracy of the ML models by utilizing features from all the three types of data. They also highlight superior performance and utility in predicting survival compared to the conventional stratification method. We believe that the proposed methodology can contribute to improving and personalizing the care of cancer patients.


2021 ◽  
Vol 11 (23) ◽  
pp. 11380
Author(s):  
Jianxiang Wei ◽  
Lu Cheng ◽  
Pu Han ◽  
Yunxia Zhu ◽  
Weidong Huang

Data masking is an inherent defect of disproportionality measures in adverse drug reaction signal detection. Some improved methods that used gender and age for data stratification considered only patient-related confounding factors and ignored drug-related influencing factors. Given the large number of reports and the high proportion of antibiotics in the Chinese spontaneous reporting database, this paper proposes a decision tree-stratification method that minimizes the masking effect by integrating the relevant factors of patients and drugs. The adverse drug reaction monitoring reports of Jiangsu Province in China from 2011 to 2018 were selected for this study. First, the age division interval was determined based on the statistical analysis of antibiotic-related data. Secondly, correlation analysis was conducted between the patient’s gender and age, respectively, and the drug category attributes. Thirdly, a decision tree based on age and gender was constructed with the J48 algorithm, using whether a drug was an antibiotic as the classification label. Fourthly, performance evaluation indicators were defined using drug package insert data as a standard signal library: recall, precision, and F (the harmonic mean of recall and precision). Finally, four experiments were carried out by means of the proportional reporting ratio method: non-stratification (total data), gender-stratification, age-stratification and decision tree-stratification, and the performance of the signal detection results was compared. The experimental results showed that the decision tree-stratification was superior to the other three methods. Therefore, the data-masking effect can be further minimized by comprehensively considering the patient and drug-related confounding factors.
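
The two quantities named in the abstract above, the proportional reporting ratio (PRR) and the precision/recall/F indicators, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the 2x2 counts, the stratum names and the drug-reaction pairs are invented toy values, and the function names are hypothetical. It uses the standard definition PRR = (a / (a + b)) / (c / (c + d)), where a and b count reports for the drug of interest with and without the reaction, and c and d count the same for all other drugs.

```python
def prr(a, b, c, d):
    """Proportional reporting ratio from a 2x2 table of report counts."""
    return (a / (a + b)) / (c / (c + d))


def precision_recall_f(detected, reference):
    """Score detected signals against a reference signal library (sets)."""
    true_positives = len(detected & reference)
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f


# Hypothetical counts (a, b, c, d) for one drug-reaction pair in two strata.
strata = {"younger": (20, 80, 10, 190), "older": (5, 95, 60, 140)}

# Pooling the strata sums each cell and can hide a stratum-specific signal.
pooled = tuple(sum(cells) for cells in zip(*strata.values()))
pooled_prr = prr(*pooled)
stratum_prr = {name: prr(*cells) for name, cells in strata.items()}

# Hypothetical detected signals versus a hypothetical reference library.
detected = {("drugA", "rash"), ("drugB", "nausea"), ("drugC", "fever")}
reference = {("drugA", "rash"), ("drugB", "nausea"),
             ("drugD", "cough"), ("drugE", "headache")}
precision, recall, f = precision_recall_f(detected, reference)
```

With these toy counts the pooled PRR falls below 1 while the "younger" stratum has a PRR of about 4, which is the masking effect that stratification is meant to reduce.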


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Farideh Jalali-najafabadi ◽  
Michael Stadler ◽  
Nick Dand ◽  
Deepak Jadon ◽  
Mehreen Soomro ◽  
...  

Abstract In view of the growth of clinical risk prediction models using genetic data, there is an increasing need for studies that use appropriate methods to select the optimum number of features from a large number of genetic variants with a high degree of redundancy between features due to linkage disequilibrium (LD). Filter feature selection methods based on information theoretic criteria are well suited to this challenge and will identify a subset of the original variables that should result in more accurate prediction. However, data collected from cohort studies are often high-dimensional genetic data with potential confounders, presenting challenges to feature selection and risk prediction machine learning models. Patients with psoriasis are at high risk of developing a chronic arthritis known as psoriatic arthritis (PsA). The prevalence of PsA in this patient group can be up to 30%, and the identification of high risk patients is an important clinical research goal that would allow early intervention and a reduction of disability. This also provides us with an ideal scenario for the development of clinical risk prediction models and an opportunity to explore the application of information theoretic criteria methods. In this study, we developed feature selection and PsA risk prediction models that were applied to a cross-sectional genetic dataset of 1462 PsA cases and 1132 cutaneous-only psoriasis (PsC) cases using 2-digit HLA alleles imputed using the SNP2HLA algorithm. We also developed a stratification method to mitigate the impact of potential confounder features and illustrate that confounding features impact the feature selection. The mitigated dataset was used to train seven supervised machine learning algorithms: 80% of the data was randomly selected for training using stratified nested cross validation, and 20% was randomly selected as a holdout set for internal validation. The risk prediction models were then further validated in the UK Biobank dataset containing data on 1187 participants and a set of features overlapping with the training dataset. Performance of these methods has been evaluated using the area under the curve (AUC), accuracy, precision, recall, F1 score and decision curve analysis (net benefit). The best model is selected based on three criteria: the ‘lowest number of feature subset’ with the ‘maximal average AUC over the nested cross validation’ and good generalisability to the UK Biobank dataset. In the original dataset, with over 100 different bootstraps and seven feature selection (FS) methods, HLA_C_*06 was selected as the most informative genetic variant. When the dataset is mitigated, the single most important genetic feature based on rank was identified as HLA_B_*27 by the seven different feature selection methods, consistent with previous analyses of this data using regression based methods. However, the predictive accuracy of these single features post mitigation was found to be moderate (AUC = 0.54 (internal cross validation), AUC = 0.53 (internal holdout set), AUC = 0.55 (external dataset)). Sequentially adding additional HLA features based on rank improved the performance of the Random Forest classification model, where 20 2-digit features selected by Interaction Capping (ICAP) achieved AUC = 0.61 (internal cross validation), AUC = 0.57 (internal holdout set) and AUC = 0.58 (external dataset). The stratification method for mitigation of confounding features and filter information theoretic feature selection can be applied to a high dimensional dataset with potential confounders.
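
The 80%/20% split described in the abstract above can be illustrated with a short standard-library Python sketch. It assumes the split is stratified by class label so that the PsA/PsC ratio is preserved in both parts. The abstract says only that the holdout set was selected randomly, so the class stratification, the seed and the function name `stratified_holdout` are assumptions made here. The sketch does not reproduce the authors' confounder mitigation or their nested cross validation.

```python
import random
from collections import defaultdict


def stratified_holdout(labels, holdout_fraction=0.2, seed=0):
    """Split indices into train/holdout, keeping class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, holdout = [], []
    for indices in by_class.values():
        rng.shuffle(indices)  # random selection within each class
        n_holdout = round(len(indices) * holdout_fraction)
        holdout.extend(indices[:n_holdout])
        train.extend(indices[n_holdout:])
    return sorted(train), sorted(holdout)


# Class sizes taken from the abstract: 1462 PsA cases and 1132 PsC cases.
labels = ["PsA"] * 1462 + ["PsC"] * 1132
train_idx, holdout_idx = stratified_holdout(labels)

holdout_psa = sum(1 for i in holdout_idx if labels[i] == "PsA")
print(len(train_idx), len(holdout_idx), holdout_psa)
```

Splitting within each class keeps the case mix of the holdout set close to that of the training set, which matters when the outcome classes are unequal in size.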


2021 ◽  
Author(s):  
Corneliu A Bodea ◽  
Michael Macoritto ◽  
Yingchun Liu ◽  
Wenliang Zhang ◽  
Jozsef Karman ◽  
...  

Crohn's disease (CD) and ulcerative colitis (UC) are inflammatory bowel diseases (IBD) with a strong genetic component. Genome-wide association studies (GWAS) have successfully identified over 240 genetic loci that are statistically associated with risk of developing IBD, and these associations provide valuable insights into disease pathobiology. Building on GWAS findings, conventional polygenic risk scores (cPRS) aim to quantify the aggregated disease risk based on DNA variation, and these scores can identify individuals at high risk. While stratifying individuals based on cPRS has the potential to inform clinical care, the development of novel therapeutics requires deep insight into how aggregated genetic risk leads to disruption of specific biological pathways. Here, we developed a pathway-specific PRS (pPRS) methodology to assess IBD common variant genetic risk burden across 31 manually curated pathways. We first prioritized 206 genes based on comprehensive fine-mapping and eQTL colocalization analyses of genome-wide significant IBD GWAS loci and 58 highly penetrant genes based on their involvement in early onset IBD or autoimmunity-related colitis. These 264 genes were assigned to at least one of the 31 pathways based on Gene Ontology annotations and manual curation. Finally, we integrated these inputs into a novel pPRS model and performed an extensive investigation of IBD disease risk, severity, complications, and anti-TNF treatment response by applying our pPRS approach to three complementary datasets encompassing IBD cases and controls. Our analysis identified multiple promising pathways that can inform drug target discovery and provides a patient stratification method that offers insights into the biology of treatment response.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Biaobin Jiang ◽  
Quanhua Mu ◽  
Fufang Qiu ◽  
Xuefeng Li ◽  
Weiqi Xu ◽  
...  

Abstract Metastatic cancer is associated with poor patient prognosis, but its spatiotemporal behavior remains unpredictable at an early stage. Here we develop MetaNet, a computational framework that integrates clinical and sequencing data from 32,176 primary and metastatic cancer cases, to assess metastatic risks of primary tumors. MetaNet achieves high accuracy in distinguishing the metastasis from the primary in breast and prostate cancers. From the prediction, we identify Metastasis-Featuring Primary (MFP) tumors, a subset of primary tumors with genomic features enriched in metastasis, and demonstrate their higher metastatic risk and shorter disease-free survival. In addition, we identify genomic alterations associated with organ-specific metastases and employ them to stratify patients into various risk groups with propensities toward different metastatic organs. This organotropic stratification method achieves better prognostic value than the standard histological grading system in prostate cancer, especially in the identification of Bone-MFP and Liver-MFP subtypes, with potential in informing organ-specific examinations in follow-ups.


2021 ◽  
Vol 4 ◽  
Author(s):  
Aida Santaolalla ◽  
Tim Hulsen ◽  
Jenson Davis ◽  
Hashim U. Ahmed ◽  
Caroline M. Moore ◽  
...  

Introduction. Prostate cancer (PCa) is the most frequent cancer diagnosis in men worldwide. Our ability to identify those men whose cancer will decrease their lifespan and/or quality of life remains poor. The ReIMAGINE Consortium has been established to improve PCa diagnosis. Materials and methods. MRI will likely become the future cornerstone of the risk-stratification process for men at risk of early prostate cancer. We will, for the first time, be able to combine the underlying molecular changes in PCa with the state-of-the-art imaging. ReIMAGINE Screening invites men for MRI and PSA evaluation. ReIMAGINE Risk includes men at risk of prostate cancer based on MRI, and includes biomarker testing. Results. Baseline clinical information, genomics, blood, urine, fresh prostate tissue samples, digital pathology and radiomics data will be analysed. Data will be de-identified, stored with correlated mpMRI disease endotypes and linked with long term follow-up outcomes in an instance of the Philips Clinical Data Lake, consisting of cloud-based software. The ReIMAGINE platform includes application programming interfaces and a user interface that allows users to browse data, select cohorts, manage users and access rights, query data, and more. Connection to analytics tools such as Python allows statistical and stratification method pipelines to run profiling regression analyses. Discussion. The ReIMAGINE Multimodal Warehouse comprises a unique data source for PCa research, to improve risk stratification for PCa and inform clinical practice. The de-identified dataset characterized by clinical, imaging, genomics and digital pathology PCa patient phenotypes will be a valuable resource for the scientific and medical community.


2021 ◽  
Author(s):  
Mohammad Hossein Nikoo ◽  
Alireza Sadeghi ◽  
Niloofar Dehdari Ebrahimi ◽  
Alireza Estedlal ◽  
Amirhossein Maktabi ◽  
...  

Abstract Background: More recently, a growing body of literature on COVID-19 has investigated the electrophysiological issues presenting as a disease manifestation of COVID-19 and highlighted the spectrum of arrhythmias observed in patients with COVID-19 infection. This study discusses the prevalence of arrhythmias and conduction system disease in patients with COVID-19. Method: Electrocardiographic data and comorbidity data of 432 deceased COVID-19 patients admitted to Faghihi Hospital of Shiraz University of Medical Sciences from August 1st until December 1st were reviewed. Results: Atrioventricular block (AVB) was found in 40 (9.3%) patients. 28 (6.5%) of the patients suffered from 1st degree AVB, and 12 (2.8%) suffered from complete heart block (CHB). Changes in the ST-T wave compatible with myocardial infarction or localized myocarditis appeared in 189 (59.0%) patients. Findings compatible with myocardial injury, such as fragmented QRS and prolonged QTc, were found with prevalences of 21.1% (91 patients) and 6.5% (28 patients), respectively. In victims of COVID-19, conduction disease was not related to any underlying medical condition. Fragmented QRS, axis deviation, presence of S1Q3T3 and poor R wave progression were significantly related to conduction system disease in victims of COVID-19 (P value > 0.05, Table 3). Conclusion: Our findings can serve future studies that aim to develop a risk stratification method for susceptible COVID-19 patients. Myocardial injury appears to play a significant role in COVID-19 morbidity and mortality. Consequently, we recommend that health policymakers consider separate catheterization laboratories that provide service only to COVID-19 patients.

