Supervised dimension reduction for large-scale “omics” data with censored survival outcomes under possible non-proportional hazards

2019 ◽  
Author(s):  
Lauren Spirko-Burns ◽  
Karthik Devarajan

The past two decades have witnessed significant advances in high-throughput “omics” technologies such as genomics, proteomics, metabolomics, transcriptomics and radiomics. These technologies have enabled simultaneous measurement of the expression levels of tens of thousands of features from individual patient samples and have generated enormous amounts of data that require analysis and interpretation. One specific area of interest has been in studying the relationship between these features and patient outcomes, such as overall and recurrence-free survival, with the goal of developing a predictive “omics” profile. Large-scale studies often suffer from the presence of a large fraction of censored observations and potential time-varying effects of features, and methods for handling them have been lacking. In this paper, we propose supervised methods for feature selection and survival prediction that simultaneously deal with both issues. Our approach utilizes continuum power regression (CPR), a framework that includes a variety of regression methods, in conjunction with the parametric or semi-parametric accelerated failure time (AFT) model. Both CPR and AFT fall within the linear models framework and, unlike black-box models, the proposed prognostic index has a simple yet useful interpretation. We demonstrate the utility of our methods using simulated and publicly available cancer genomics data.
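For orientation, the textbook form of the AFT model referred to above can be written as follows. This is a sketch in assumed notation (T_i, x_i, β, σ, ε_i and PI_i are illustrative symbols, not taken from the paper), and it does not reproduce how CPR constructs the components that enter the model.

```latex
% Standard AFT model: log survival time is linear in the features.
% T_i: survival time, x_i: feature vector, \varepsilon_i: error term
% (a specified distribution in the parametric case, unspecified in the
% semi-parametric case). All symbols are assumed notation.
\log T_i = \beta_0 + \mathbf{x}_i^{\top}\boldsymbol{\beta} + \sigma\,\varepsilon_i ,
\qquad
\mathrm{PI}_i = \mathbf{x}_i^{\top}\hat{\boldsymbol{\beta}}
```

Under this sign convention, a larger linear predictor PI_i corresponds to a longer expected log survival time, which is one way a linear prognostic index can be read directly.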

Author(s):  
P. B. Basham ◽  
H. L. Tsai

The use of transmission electron microscopy (TEM) to support process development of advanced microelectronic devices is often challenged by the large number of samples submitted from wafer fabrication areas and by the need for specific-spot analysis. Improving TEM sample preparation techniques to achieve a fast turnaround time is critical in order to provide timely support for customers and improve the utilization of the TEM. For specific-area sample preparation, a technique that requires the least amount of effort is preferred. For these reasons, we have developed several techniques that have greatly facilitated TEM sample preparation.

For specific-area analysis, the use of a copper grid with a small hole is found to be very useful. With this small-hole grid technique, TEM sample preparation can proceed by well-established conventional methods. The sample is first polished to the area of interest, which is then carefully positioned inside the hole. This polished side is attached to the grid with epoxy. Fig. 1 is an optical image of a TEM cross-section after dimpling to light transmission.


Author(s):  
Ron Avi Astor ◽  
Rami Benbenishty

Since 2005, the bullying, school violence, and school safety literatures have expanded dramatically in content, disciplines, and empirical studies. However, with this massive expansion of research, there is also a surprising lack of theoretical and empirical direction to guide efforts on how to advance our basic science and practical applications of this growing scientific area of interest. Parallel to this surge in interest, cultural norms, media coverage, and policies to address school safety and bullying have evolved at a remarkably quick pace over the past 13 years. For example, behaviors and populations that just a decade ago were not included in the school violence, bullying, and school safety discourse are now accepted areas of inquiry. These include, for instance, cyberbullying, sexting, social media shaming, teacher–student and student–teacher bullying, sexual harassment and assault, homicide, and suicide. Populations in schools not previously explored, such as lesbian, gay, bisexual, transgender, and queer students and educators and military- and veteran-connected students, have become the foci of new research, policies, and programs. As a result, all US states and most industrialized countries now have a complex quilt of new school safety and bullying legislation and policies. Large-scale research and intervention funding programs are often linked to these policies. This book suggests an empirically driven unifying model that brings together these previously distinct literatures. It presents an ecological model of school violence, bullying, and safety in evolving contexts that integrates all we have learned in the past 13 years, and suggests ways to move forward.


2021 ◽  
Vol 10 (6) ◽  
pp. 1211
Author(s):  
Li-Te Lin ◽  
Kuan-Hao Tsui

The relationship between serum dehydroepiandrosterone sulphate (DHEA-S) and anti-Mullerian hormone (AMH) levels has not been fully established. Therefore, we performed a large-scale cross-sectional study to investigate the association between serum DHEA-S and AMH levels. The study included a total of 2155 infertile women aged 20 to 46 years who were divided into four quartile groups (Q1 to Q4) based on serum DHEA-S levels. We found that there was a weak positive association between serum DHEA-S and AMH levels in infertile women (r = 0.190, p < 0.001). After adjusting for potential confounders, serum DHEA-S levels positively correlated with serum AMH levels in infertile women (β = 0.103, p < 0.001). Infertile women in the highest DHEA-S quartile category (Q4) showed significantly higher serum AMH levels (p < 0.001) compared with women in the lowest DHEA-S quartile category (Q1). The serum AMH levels significantly increased across increasing DHEA-S quartile categories in infertile women (p = 0.014) using generalized linear models after adjustment for potential confounders. Our data show that serum DHEA-S levels are positively associated with serum AMH levels.
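The quartile grouping described above (Q1 to Q4 by serum DHEA-S level) can be illustrated with a minimal sketch. The function name `quartile_groups`, the choice of inclusive sample quartiles, and the toy input are assumptions for illustration; the abstract does not state how the study computed its cut points.

```python
import statistics


def quartile_groups(values):
    """Label each value Q1..Q4 using the sample quartile cut points.

    Assumption: cut points come from statistics.quantiles with the
    "inclusive" method; values equal to a cut point go to the lower group.
    """
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    labels = []
    for v in values:
        if v <= q1:
            labels.append("Q1")
        elif v <= q2:
            labels.append("Q2")
        elif v <= q3:
            labels.append("Q3")
        else:
            labels.append("Q4")
    return labels


# Toy data only, not study measurements.
toy_levels = [1, 2, 3, 4, 5, 6, 7, 8]
toy_labels = quartile_groups(toy_levels)
```

Once participants are labelled this way, an outcome such as AMH can be compared between groups, for example Q4 against Q1.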


2021 ◽  
Vol 11 (2) ◽  
pp. 214
Author(s):  
Anna Kaiser ◽  
Pascal-M. Aggensteiner ◽  
Martin Holtmann ◽  
Andreas Fallgatter ◽  
Marcel Romanos ◽  
...  

Electroencephalography (EEG) represents a widely established method for assessing altered and typically developing brain function. However, systematic studies on EEG data quality, its correlates, and consequences are scarce. To address this research gap, the current study focused on the percentage of artifact-free segments after standard EEG pre-processing as a data quality index. We analyzed participant-related and methodological influences, and assessed validity by replicating landmark EEG effects. Further, effects of data quality on spectral power analyses beyond participant-related characteristics were explored. EEG data from a multicenter ADHD cohort (age range 6 to 45 years) and a non-ADHD school-age control group were analyzed (n_total = 305). Resting-state data during eyes-open and eyes-closed conditions and task-related data during a cued Continuous Performance Task (CPT) were collected. After pre-processing, general linear models and stepwise regression models were fitted to the data. We found that EEG data quality was strongly related to demographic characteristics, but not to methodological factors. We were able to replicate maturational, task, and ADHD effects reported in the EEG literature, establishing a link with EEG-landmark effects. Furthermore, we showed that poor data quality significantly increases spectral power beyond effects of maturation and symptom severity. Taken together, the current results indicate that with a careful design and systematic quality control, informative large-scale multicenter trials characterizing neurophysiological mechanisms in neurodevelopmental disorders across the lifespan are feasible. Nevertheless, results are restricted to the limitations reported. Future work will clarify predictive value.


2021 ◽  
pp. 096228022110092
Author(s):  
Mingyue Du ◽  
Hui Zhao ◽  
Jianguo Sun

Cox’s proportional hazards model is the most commonly used model for regression analysis of failure time data, and some methods have been developed for its variable selection under different situations. In this paper, we consider a general type of failure time data, case K interval-censored data, that includes all of the other types discussed as special cases, and propose a unified penalized variable selection procedure. In addition to its generality, another significant feature of the proposed approach is that, unlike all of the existing variable selection methods for failure time data, it allows dependent censoring, which can occur quite often and could lead to biased or misleading conclusions if not taken into account. For the implementation, a coordinate descent algorithm is developed and the oracle property of the proposed method is established. The numerical studies indicate that the proposed approach works well in practical situations, and it is applied to a set of real data arising from the Alzheimer’s Disease Neuroimaging Initiative study that motivated this work.
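As background for the abstract above, the proportional hazards model and a generic penalized estimation criterion can be sketched as follows. The notation is assumed; in particular, \ell_n stands for whatever log-likelihood is appropriate for the data at hand, and the paper's specific likelihood for case K interval-censored data with dependent censoring is not reproduced here.

```latex
% Cox proportional hazards model: covariates act multiplicatively on an
% unspecified baseline hazard h_0(t).
h(t \mid \mathbf{x}) = h_0(t)\,\exp\!\left(\mathbf{x}^{\top}\boldsymbol{\beta}\right)

% Generic penalized variable selection (assumed form): maximize a
% log-likelihood \ell_n minus a penalty p_\lambda on each coefficient.
\hat{\boldsymbol{\beta}}
  = \arg\max_{\boldsymbol{\beta}}
    \left\{ \ell_n(\boldsymbol{\beta})
            - n \sum_{j=1}^{p} p_{\lambda}\!\left(|\beta_j|\right) \right\}
```

Coefficients shrunk exactly to zero by the penalty correspond to covariates dropped from the model, and a coordinate descent algorithm updates one \beta_j at a time while holding the others fixed.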


2021 ◽  
pp. 153537022199201
Author(s):  
Runmin Li ◽  
Guosheng Wang ◽  
ZhouJie Wu ◽  
HuaGuang Lu ◽  
Gen Li ◽  
...  

High-throughput multi-omics sequencing data have laid a solid foundation for identifying genes associated with cancer prognosis, and multi-omics studies can reveal several aspects of how cancer arises and develops. The prognosis of osteosarcoma remains poor, so a genetic marker is needed to predict clinically relevant overall survival. First, RNA-Seq data, copy number variation data, and clinical follow-up data were obtained from the Office of Cancer Genomics (OCG Target). Prognosis-related genes and genes exhibiting copy number differences were screened in the training group, and these genes were integrated for feature selection with the least absolute shrinkage and selection operator (Lasso) to screen for effective biomarkers. Finally, a gene-based prognosis model was built and validated on the test set and a Gene Expression Omnibus validation set. In total, 512 prognosis-related genes (P < 0.01), 336 genes with copy number amplification (P < 0.05), and 36 genes with copy number deletion (P < 0.05) were obtained, and the genes carrying these genomic variants are closely associated with the mechanisms of tumor occurrence and development. Integrating the genomic variant genes with the prognosis-related genes yielded 10 candidate genes. Six representative genes (i.e. MYC, CHIC2, CCDC152, LYL1, GPR142, and MMP27) were obtained by Lasso feature selection and stepwise multivariate regression, many of which have been reported to be related to tumor progression. Cox regression was used to build a 6-gene signature, which serves as a single prognostic factor for patients with osteosarcoma. In addition, samples could be risk-stratified in the training group, the test set, and the external validation set. The AUC for five-year survival in the training group and validation set exceeded 0.85, a better predictive performance than that of existing studies. The 6-gene signature is thus proposed as a new prognostic marker for assessing the survival of patients with osteosarcoma.


2017 ◽  
Vol 26 (01) ◽  
pp. 188-192 ◽  
Author(s):  
H. Dauchel ◽  
T. Lecroq

Summary. Objective: To summarize excellent current research and propose a selection of best papers published in 2016 in the field of Bioinformatics and Translational Informatics with applications in the health domain and clinical care. Methods: We provide a synopsis of the articles selected for the IMIA Yearbook 2017, from which we attempt to derive a synthetic overview of current and future activities in the field. As in 2016, a first step of selection was performed by querying MEDLINE with a list of MeSH descriptors complemented by a list of terms adapted to the section coverage. Each section editor separately evaluated the set of 951 articles returned, and the evaluation results were merged to retain 15 candidate best papers for peer review. Results: The selection and evaluation process of papers published in the Bioinformatics and Translational Informatics field yielded four excellent articles focusing this year on the secondary use and massive integration of multi-omics data for cancer genomics and non-cancer complex diseases. The papers present methods to study the functional impact of genetic variations, either at the level of transcription or at the levels of pathway and network. Conclusions: Current research activities in Bioinformatics and Translational Informatics with applications in the health domain continue to explore new algorithms and statistical models to manage, integrate, and interpret large-scale genomic datasets. As addressed by some of the selected papers, future trends would include the question of the international collaborative sharing of clinical and omics data, and the implementation of intelligent systems to enhance routine medical genomics.


2021 ◽  
Vol 13 (13) ◽  
pp. 2564
Author(s):  
Mauro Martini ◽  
Vittorio Mazzia ◽  
Aleem Khaliq ◽  
Marcello Chiaberge

The increasing availability of large-scale labeled remote sensing data has prompted researchers to develop increasingly precise and accurate data-driven models for land cover and crop classification (LC&CC). Moreover, with the introduction of self-attention and introspection mechanisms, deep learning approaches have shown promising results in processing long temporal sequences in the multi-spectral domain at a limited computational cost. Nevertheless, most practical applications cannot rely on labeled data, and field surveys are a time-consuming solution that imposes strict limits on the number of collected samples. Moreover, atmospheric conditions and specific geographical region characteristics constitute a relevant domain gap that prevents a model trained on the available dataset from being applied directly to the area of interest. In this paper, we investigate adversarial training of deep neural networks to bridge the domain discrepancy between distinct geographical zones. In particular, we perform a thorough analysis of domain adaptation applied to challenging multi-spectral, multi-temporal data, highlighting the advantages of adapting state-of-the-art self-attention-based models for LC&CC to different target zones where labeled data are not available. Extensive experimentation demonstrated significant performance and generalization gains from applying domain-adversarial training to source and target regions with marked dissimilarities between the distributions of extracted features.


PLoS Medicine ◽  
2016 ◽  
Vol 13 (12) ◽  
pp. e1002209 ◽  
Author(s):  
Elaine R. Mardis ◽  
Marc Ladanyi