Regularization method for predicting an ordinal response using longitudinal high-dimensional genomic data

Author(s): Jiayi Hou, Kellie J. Archer

Abstract: An ordinal scale is commonly used to measure health status and disease-related outcomes in hospital settings as well as in translational medical research. In addition, repeated measurements are common in clinical practice for tracking and monitoring the progression of complex diseases. Classical methodology based on statistical inference, in particular ordinal modeling, has contributed to the analysis of data in which the response categories are ordered and the number of covariates (

2018, Vol 20 (suppl_6), pp. vi160-vi160
Author(s): Toni Rose Jue, Julia Yin, Anna Siddell, Victor Lu, Robert Rapkins, ...

2016, Vol 27 (2), pp. 336-351
Author(s): Akram Shalabi, Masato Inoue, Johnathan Watkins, Emanuele De Rinaldis, Anthony CC Coolen

When data exhibit imbalance between a large number d of covariates and a small number n of samples, clinical outcome prediction is impaired by overfitting and prohibitive computation demands. Here we study two simple Bayesian prediction protocols that can be applied to data of any dimension and any number of outcome classes. Calculating Bayesian integrals and optimal hyperparameters analytically leaves only a small number of numerical integrations, and CPU demands scale as O(nd). We compare their performance on synthetic and genomic data to the mclustDA method of Fraley and Raftery. For small d they perform as well as mclustDA or better. For d = 10,000 or more mclustDA breaks down computationally, while the Bayesian methods remain efficient. This allows us to explore phenomena typical of classification in high-dimensional spaces, such as overfitting and the reduced discriminative effectiveness of signatures compared to intra-class variability.
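The overfitting that this abstract attributes to having far more covariates than samples can be illustrated with a small experiment. The sketch below is not the Bayesian protocols or mclustDA discussed above; it swaps in a plain perceptron on synthetic pure-noise data, and every name and setting in it (make_noise_data, train_perceptron, n = 20, d = 200) is an assumption chosen for illustration. Because d exceeds n, the noise is linearly separable, so the classifier fits the training labels perfectly while performing at roughly chance level on fresh data.

```python
import random

def make_noise_data(n, d, rng):
    """Pure-noise features with coin-flip labels: there is no signal to learn."""
    X = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    y = [rng.choice([-1, 1]) for _ in range(n)]
    return X, y

def score(w, x):
    """Linear score w . x (no bias term, for simplicity)."""
    return sum(wi * xi for wi, xi in zip(w, x))

def train_perceptron(X, y, max_epochs=1000):
    """Classic perceptron updates; stops after an epoch with no mistakes."""
    w = [0.0] * len(X[0])
    for _ in range(max_epochs):
        mistakes = 0
        for x, t in zip(X, y):
            if t * score(w, x) <= 0:
                w = [wi + t * xi for wi, xi in zip(w, x)]
                mistakes += 1
        if mistakes == 0:
            break
    return w

def accuracy(w, X, y):
    """Fraction of samples whose score has the same sign as the label."""
    hits = sum(1 for x, t in zip(X, y) if t * score(w, x) > 0)
    return hits / len(X)

rng = random.Random(0)
n, d = 20, 200  # hypothetical sizes with d much larger than n
X_train, y_train = make_noise_data(n, d, rng)
X_test, y_test = make_noise_data(1000, d, rng)

w = train_perceptron(X_train, y_train)
train_acc = accuracy(w, X_train, y_train)
test_acc = accuracy(w, X_test, y_test)
print("train accuracy:", train_acc)
print("test accuracy:", test_acc)
```

The gap between the two printed accuracies is the point of the example: a perfect fit on the training set says nothing about predictive value when d is much larger than n.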


2017, Vol 90, pp. 146-154
Author(s): Ioannis Kavakiotis, Patroklos Samaras, Alexandros Triantafyllidis, Ioannis Vlahavas

2016, Vol 7 (1)
Author(s): Chongzhi Zang, Tao Wang, Ke Deng, Bo Li, Sheng’en Hu, ...

doi:10.2196/14710, 2020, Vol 8 (4), pp. e14710
Author(s): Phillip Park, Soo-Yong Shin, Seog Yun Park, Jeonghee Yun, Chulmin Shin, ...

Background: The analytical capacity and speed of next-generation sequencing (NGS) technology have improved, and many genetic variants associated with various diseases have been discovered using NGS. Applying NGS to clinical practice therefore enables precision or personalized medicine. However, because clinical sequencing reports in electronic health records (EHRs) are not structured according to recommended standards, clinical decision support systems have not been fully utilized. In addition, integrating genomic data with clinical data for translational research remains a great challenge.

Objective: To apply international standards to clinical sequencing reports and to develop a clinical research information system that integrates standardized genomic data with clinical data.

Methods: We applied the recently published ISO/TS 20428 standard to 367 clinical sequencing reports generated by panel (91 genes) sequencing in EHRs and implemented a clinical NGS research system by extending the clinical data warehouse to integrate the necessary clinical data for each patient. We also developed a user interface with a clinical research portal and an NGS result viewer.

Results: A single clinical sequencing report with 28 items was restructured into four database tables and 49 entities. As a result, 367 patients’ clinical sequencing data were connected with clinical data in EHRs, such as diagnosis, surgery, and death information. This system can also support the development of cohort or case-control datasets.

Conclusions: The standardized clinical sequencing data are useful not only for clinical practice but can also be applied to translational research.


Author(s): Qianfan Wu, Adel Boueiz, Alican Bozkurt, Arya Masoomi, Allan Wang, ...

Predicting disease status for a complex human disease using genomic data is an important, yet challenging, step in personalized medicine. Among many challenges, the so-called curse of dimensionality results in unsatisfactory performance of many state-of-the-art machine learning algorithms. A major recent advance in machine learning is the rapid development of deep learning algorithms that can efficiently extract meaningful features from high-dimensional and complex datasets through a stacked and hierarchical learning process. Deep learning has shown breakthrough performance in several areas, including image recognition, natural language processing, and speech recognition. However, the performance of deep learning in predicting disease status using genomic datasets is still not well studied. In this article, we review the four relevant articles that we identified through a thorough literature search. All four articles used auto-encoders to project high-dimensional genomic data to a low-dimensional space and then applied state-of-the-art machine learning algorithms to predict disease status based on the low-dimensional representations. This deep learning approach outperformed existing prediction approaches, such as prediction based on probe-wise screening and prediction based on principal component analysis. The limitations of the current deep learning approach and possible improvements are also discussed.
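The two-stage pipeline described in this abstract (compress with an auto-encoder, then classify on the low-dimensional codes) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the architecture of any of the four reviewed articles: the data are synthetic, the auto-encoder has a single tanh hidden layer trained by full-batch gradient descent, and a nearest-centroid rule stands in for the downstream classifier. All function names and settings are hypothetical.

```python
import math
import random

def make_data(n, d, rng):
    """Synthetic two-class data; only the first 5 features carry a class shift."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        shift = 1.0 if label == 1 else -1.0
        X.append([rng.gauss(0.0, 1.0) + (shift if j < 5 else 0.0) for j in range(d)])
        y.append(label)
    return X, y

def encode(W, x):
    """Encoder: z = tanh(W x), mapping d features to k latent units."""
    return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in W]

def decode(V, z):
    """Linear decoder: x_hat = V z."""
    return [sum(v * c for v, c in zip(row, z)) for row in V]

def mean_loss(W, V, X):
    """Average of 0.5 * squared reconstruction error over samples."""
    total = 0.0
    for x in X:
        x_hat = decode(V, encode(W, x))
        total += 0.5 * sum((a - b) ** 2 for a, b in zip(x_hat, x))
    return total / len(X)

def train_autoencoder(X, k, epochs=200, lr=0.01, seed=0):
    """Full-batch gradient descent on the reconstruction loss."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    W = [[rng.gauss(0.0, 0.1) for _ in range(d)] for _ in range(k)]  # k x d
    V = [[rng.gauss(0.0, 0.1) for _ in range(k)] for _ in range(d)]  # d x k
    for _ in range(epochs):
        gW = [[0.0] * d for _ in range(k)]
        gV = [[0.0] * k for _ in range(d)]
        for x in X:
            z = encode(W, x)
            err = [a - b for a, b in zip(decode(V, z), x)]
            for i in range(d):
                for j in range(k):
                    gV[i][j] += err[i] * z[j]
            for j in range(k):
                # Backpropagate through the decoder, then through tanh.
                dz = sum(V[i][j] * err[i] for i in range(d))
                dpre = dz * (1.0 - z[j] ** 2)
                for i in range(d):
                    gW[j][i] += dpre * x[i]
        for j in range(k):
            for i in range(d):
                W[j][i] -= lr * gW[j][i] / n
                V[i][j] -= lr * gV[i][j] / n
    return W, V

def fit_centroids(Z, y):
    """Per-class mean of the latent codes."""
    centroids = {}
    for label in set(y):
        rows = [z for z, t in zip(Z, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def predict(centroids, z):
    """Assign the class whose centroid is closest in squared distance."""
    return min(centroids, key=lambda c: sum((a - b) ** 2 for a, b in zip(z, centroids[c])))

X, y = make_data(40, 20, random.Random(1))
W0, V0 = train_autoencoder(X, k=2, epochs=0)  # untrained weights, same seed
W, V = train_autoencoder(X, k=2)
loss_before = mean_loss(W0, V0, X)
loss_after = mean_loss(W, V, X)

codes = [encode(W, x) for x in X]  # 2-dimensional representation of each sample
centroids = fit_centroids(codes, y)
train_acc = sum(predict(centroids, z) == t for z, t in zip(codes, y)) / len(y)
print("reconstruction loss:", loss_before, "->", loss_after)
print("training accuracy on codes:", train_acc)
```

The classifier never sees the original 20 features, only the 2-dimensional codes, which mirrors the structure of the reviewed approach. In a real study the accuracy would be estimated on held-out samples rather than on the training set used here.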

