Regularization method for predicting an ordinal response using longitudinal high-dimensional genomic data

Abstract: An ordinal scale is commonly used to measure health status and disease-related outcomes in hospital settings as well as in translational medical research. In addition, repeated measurements are common in clinical practice for tracking and monitoring the progression of complex diseases. Classical methodology based on statistical inference, in particular ordinal modeling, has contributed to the analysis of data in which the response categories are ordered and the number of covariates (

A Feature Sampling Strategy for Analysis of High Dimensional Genomic Data

IEEE/ACM Transactions on Computational Biology and Bioinformatics ◽

10.1109/tcbb.2017.2779492 ◽

2019 ◽

Vol 16 (2) ◽

pp. 434-441 ◽

Cited By ~ 2

Author(s):

Jie Zhang ◽

Zhigen Zhao ◽

Kai Zhang ◽

Zhi Wei

Keyword(s):

Genomic Data ◽

Sampling Strategy ◽

High Dimensional

PATH-11. TRANSLATING GENOMIC DATA OF GLIOBLASTOMA INTO CLINICAL PRACTICE: A CASE STUDY

Neuro-Oncology ◽

10.1093/neuonc/noy148.667 ◽

2018 ◽

Vol 20 (suppl_6) ◽

pp. vi160-vi160

Author(s):

Toni Rose Jue ◽

Julia Yin ◽

Anna Siddell ◽

Victor Lu ◽

Robert Rapkins ◽

...

Keyword(s):

Clinical Practice ◽

Genomic Data

A novel Cox proportional hazards model for high-dimensional genomic data in cancer prognosis

IEEE/ACM Transactions on Computational Biology and Bioinformatics ◽

10.1109/tcbb.2019.2961667 ◽

2019 ◽

pp. 1-1

Author(s):

HaiHui Huang ◽

Yong Liang

Keyword(s):

Proportional Hazards ◽

Proportional Hazards Model ◽

Genomic Data ◽

Cox Proportional Hazards ◽

Cox Proportional Hazards Model ◽

Cancer Prognosis ◽

High Dimensional ◽

Hazards Model

Bayesian penalized cumulative logit model for high‐dimensional data with an ordinal response

Statistics in Medicine ◽

10.1002/sim.8851 ◽

2020 ◽

Author(s):

Yiran Zhang ◽

Kellie J. Archer

Keyword(s):

Logit Model ◽

High Dimensional Data ◽

High Dimensional ◽

Ordinal Response ◽

Cumulative Logit Model ◽

Cumulative Logit

Bayesian clinical classification from high-dimensional data: Signatures versus variability

Statistical Methods in Medical Research ◽

10.1177/0962280216628901 ◽

2016 ◽

Vol 27 (2) ◽

pp. 336-351 ◽

Cited By ~ 2

Author(s):

Akram Shalabi ◽

Masato Inoue ◽

Johnathan Watkins ◽

Emanuele De Rinaldis ◽

Anthony CC Coolen

Keyword(s):

Clinical Outcome ◽

Bayesian Methods ◽

Outcome Prediction ◽

High Dimensional Data ◽

Genomic Data ◽

Bayesian Prediction ◽

High Dimensional ◽

Clinical Classification ◽

Numerical Integrations ◽

Clinical Outcome Prediction

When data exhibit imbalance between a large number d of covariates and a small number n of samples, clinical outcome prediction is impaired by overfitting and prohibitive computation demands. Here we study two simple Bayesian prediction protocols that can be applied to data of any dimension and any number of outcome classes. Calculating Bayesian integrals and optimal hyperparameters analytically leaves only a small number of numerical integrations, and CPU demands scale as O(nd). We compare their performance on synthetic and genomic data to the mclustDA method of Fraley and Raftery. For small d they perform as well as mclustDA or better. For d = 10,000 or more mclustDA breaks down computationally, while the Bayesian methods remain efficient. This allows us to explore phenomena typical of classification in high-dimensional spaces, such as overfitting and the reduced discriminative effectiveness of signatures compared to intra-class variability.

Comments on: Augmenting the bootstrap to analyze high dimensional genomic data

Test ◽

10.1007/s11749-008-0101-2 ◽

2008 ◽

Vol 17 (1) ◽

pp. 25-27 ◽

Cited By ~ 8

Author(s):

Korbinian Strimmer

Keyword(s):

Genomic Data ◽

High Dimensional

FIFS: A data mining method for informative marker selection in high dimensional population genomic data

Computers in Biology and Medicine ◽

10.1016/j.compbiomed.2017.09.020 ◽

2017 ◽

Vol 90 ◽

pp. 146-154 ◽

Cited By ~ 7

Author(s):

Ioannis Kavakiotis ◽

Patroklos Samaras ◽

Alexandros Triantafyllidis ◽

Ioannis Vlahavas

Keyword(s):

Data Mining ◽

Genomic Data ◽

High Dimensional ◽

Mining Method ◽

Informative Marker ◽

Data Mining Method ◽

Marker Selection ◽

Population Genomic

High-dimensional genomic data bias correction and data integration using MANCIE

Nature Communications ◽

10.1038/ncomms11305 ◽

2016 ◽

Vol 7 (1) ◽

Cited By ~ 20

Author(s):

Chongzhi Zang ◽

Tao Wang ◽

Ke Deng ◽

Bo Li ◽

Sheng’en Hu ◽

...

Keyword(s):

Data Integration ◽

Bias Correction ◽

Genomic Data ◽

High Dimensional

Next-Generation Sequencing–Based Cancer Panel Data Conversion Using International Standards to Implement a Clinical Next-Generation Sequencing Research System: Single-Institution Study

JMIR Medical Informatics ◽

10.2196/14710 ◽

2020 ◽

Vol 8 (4) ◽

pp. e14710 ◽

Cited By ~ 1

Author(s):

Phillip Park ◽

Soo-Yong Shin ◽

Seog Yun Park ◽

Jeonghee Yun ◽

Chulmin Shin ◽

...

Keyword(s):

Clinical Practice ◽

Next Generation Sequencing ◽

Clinical Data ◽

Genomic Data ◽

International Standards ◽

Research System ◽

Next Generation ◽

Sequencing Data ◽

Clinical Sequencing ◽

Generation Sequencing

Background: The analytical capacity and speed of next-generation sequencing (NGS) technology have improved, and many genetic variants associated with various diseases have been discovered using NGS. Applying NGS to clinical practice therefore enables precision or personalized medicine. However, because clinical sequencing reports in electronic health records (EHRs) are not structured according to recommended standards, clinical decision support systems have not been fully utilized. In addition, integrating genomic data with clinical data for translational research remains a great challenge. Objective: To apply international standards to clinical sequencing reports and to develop a clinical research information system that integrates standardized genomic data with clinical data. Methods: We applied the recently published ISO/TS 20428 standard to 367 clinical sequencing reports generated by panel (91 genes) sequencing in EHRs and implemented a clinical NGS research system by extending the clinical data warehouse to integrate the necessary clinical data for each patient. We also developed a user interface with a clinical research portal and an NGS result viewer. Results: A single clinical sequencing report with 28 items was restructured into four database tables and 49 entities. As a result, the clinical sequencing data of 367 patients were connected with clinical data in EHRs, such as diagnosis, surgery, and death information. This system can also support the development of cohort or case-control datasets. Conclusions: The standardized clinical sequencing data are useful not only for clinical practice but could also be applied to translational research.

Deep learning for predicting disease status using genomic data

10.7287/peerj.preprints.27123 ◽

2018 ◽

Cited By ~ 1

Author(s):

Qianfan Wu ◽

Adel Boueiz ◽

Alican Bozkurt ◽

Arya Masoomi ◽

Allan Wang ◽

...

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Rapid Development ◽

Learning Algorithms ◽

Genomic Data ◽

Disease Status ◽

Machine Learning Algorithms ◽

High Dimensional ◽

Learning Approach ◽

Low Dimensional

Predicting disease status for a complex human disease using genomic data is an important, yet challenging, step in personalized medicine. Among many challenges, the so-called curse of dimensionality results in unsatisfactory performance of many state-of-the-art machine learning algorithms. A major recent advance in machine learning is the rapid development of deep learning algorithms that can efficiently extract meaningful features from high-dimensional and complex datasets through a stacked and hierarchical learning process. Deep learning has shown breakthrough performance in several areas including image recognition, natural language processing, and speech recognition. However, the performance of deep learning in predicting disease status using genomic datasets is still not well studied. In this article, we review the four relevant articles identified through a thorough literature search. All four articles used auto-encoders to project high-dimensional genomic data to a low-dimensional space and then applied state-of-the-art machine learning algorithms to predict disease status based on the low-dimensional representations. This deep learning approach outperformed existing prediction approaches, such as prediction based on probe-wise screening and prediction based on principal component analysis. The limitations of the current deep learning approach and possible improvements are also discussed.
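The "project to a low-dimensional space, then classify" pattern described in this abstract can be illustrated with a minimal sketch. Note the assumptions: the sketch uses principal component analysis (the baseline the abstract mentions), not an auto-encoder; the data are synthetic; and every name here (`make_data`, `pca_project`, the nearest-centroid classifier, the sizes and effect strength) is hypothetical and not taken from any of the reviewed articles.

```python
import numpy as np


def make_data(n_per_class, d, rng):
    """Synthetic two-class data: class 1 is shifted on the first 100 features (assumed signal)."""
    y = np.repeat([0, 1], n_per_class)
    X = rng.normal(size=(2 * n_per_class, d))
    X[y == 1, :100] += 2.0
    return X, y


def pca_project(X_train, X_test, k):
    """Project both sets onto the top-k principal axes estimated from the training set only."""
    mean = X_train.mean(axis=0)
    Xc = X_train - mean
    # Rows of Vt are the principal axes of the centered training data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:k].T  # shape (d, k)
    return Xc @ W, (X_test - mean) @ W


def nearest_centroid_fit(Z, y):
    """Store one centroid per class in the low-dimensional space."""
    classes = np.unique(y)
    centroids = np.stack([Z[y == c].mean(axis=0) for c in classes])
    return classes, centroids


def nearest_centroid_predict(Z, classes, centroids):
    """Assign each sample to the class with the closest centroid (squared Euclidean distance)."""
    d2 = ((Z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[d2.argmin(axis=1)]


rng = np.random.default_rng(0)
d, k = 2000, 5  # many more features than samples, as in genomic data
X_train, y_train = make_data(30, d, rng)
X_test, y_test = make_data(20, d, rng)

Z_train, Z_test = pca_project(X_train, X_test, k)
classes, centroids = nearest_centroid_fit(Z_train, y_train)
pred = nearest_centroid_predict(Z_test, classes, centroids)
accuracy = float((pred == y_test).mean())
print(f"test accuracy on {k}-dimensional representation: {accuracy:.2f}")
```

Replacing `pca_project` with a trained auto-encoder's encoder would give the structure the abstract attributes to the reviewed articles; the classifier step would stay the same.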