Feature screening with large scale and high dimensional survival data

Biometrics ◽  
2021 ◽  
Author(s):  
Grace Y. Yi ◽  
Wenqing He ◽  
Raymond J. Carroll
2018 ◽  
Vol 46 (6) ◽  
pp. 979-994 ◽  
Author(s):  
Meiling Hao ◽  
Yuanyuan Lin ◽  
Xianhui Liu ◽  
Wenlu Tang

2018 ◽  
Vol 61 (9) ◽  
pp. 1617-1636 ◽  
Author(s):  
Yuanyuan Lin ◽  
Xianhui Liu ◽  
Meiling Hao

2009 ◽  
Vol 35 (7) ◽  
pp. 859-866
Author(s):  
Ming LIU ◽  
Xiao-Long WANG ◽  
Yuan-Chao LIU

2021 ◽  
Vol 11 (2) ◽  
pp. 472
Author(s):  
Hyeongmin Cho ◽  
Sangkyun Lee

Machine learning has been proven to be effective in various application areas, such as object and speech recognition on mobile systems. Since a critical key to machine learning success is the availability of large training data, many datasets are being disclosed and published online. From a data consumer or manager point of view, measuring data quality is an important first step in the learning process. We need to determine which datasets to use, update, and maintain. However, not many practical ways to measure data quality are available today, especially when it comes to large-scale high-dimensional data, such as images and videos. This paper proposes two data quality measures that can compute class separability and in-class variability, the two important aspects of data quality, for a given dataset. Classical data quality measures tend to focus only on class separability; however, we suggest that in-class variability is another important data quality factor. We provide efficient algorithms to compute our quality measures based on random projections and bootstrapping with statistical benefits on large-scale high-dimensional data. In experiments, we show that our measures are compatible with classical measures on small-scale data and can be computed much more efficiently on large-scale high-dimensional datasets.
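The idea of scoring class separability through random projections can be illustrated with a minimal sketch. This is not the authors' measure: it assumes a two-class dataset, uses a one-dimensional Fisher-style ratio on each projection, and omits the bootstrapping step entirely. All function names and the toy data are hypothetical.

```python
import random
import statistics


def random_direction(dim, rng):
    """Draw a random unit vector in `dim` dimensions."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = sum(c * c for c in v) ** 0.5
    return [c / norm for c in v]


def fisher_ratio_1d(a, b):
    """Squared gap between class means divided by the summed class variances."""
    gap = (statistics.fmean(a) - statistics.fmean(b)) ** 2
    spread = statistics.pvariance(a) + statistics.pvariance(b)
    return gap / spread if spread > 0 else float("inf")


def projected_separability(class_a, class_b, n_projections=50, seed=0):
    """Average the 1-D Fisher-style ratio over random projections."""
    rng = random.Random(seed)
    dim = len(class_a[0])
    scores = []
    for _ in range(n_projections):
        w = random_direction(dim, rng)
        proj_a = [sum(wi * xi for wi, xi in zip(w, x)) for x in class_a]
        proj_b = [sum(wi * xi for wi, xi in zip(w, x)) for x in class_b]
        scores.append(fisher_ratio_1d(proj_a, proj_b))
    return statistics.fmean(scores)


def make_class(center, n, dim, rng):
    """Hypothetical toy data: isotropic Gaussian cloud around `center`."""
    return [[rng.gauss(center, 1.0) for _ in range(dim)] for _ in range(n)]


data_rng = random.Random(1)
base = make_class(0.0, 100, 10, data_rng)
far = make_class(5.0, 100, 10, data_rng)    # well separated from `base`
near = make_class(0.0, 100, 10, data_rng)   # same distribution as `base`

score_far = projected_separability(base, far)
score_near = projected_separability(base, near)
```

Each projection costs only one pass over the data, which is why projection-based scores stay cheap as the dimension grows. In this toy setup `score_far` should come out much larger than `score_near`.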


Algorithms ◽  
2021 ◽  
Vol 14 (5) ◽  
pp. 146
Author(s):  
Aleksei Vakhnin ◽  
Evgenii Sopov

Modern real-valued optimization problems are complex and high-dimensional, and they are known as “large-scale global optimization (LSGO)” problems. Classic evolutionary algorithms (EAs) perform poorly on this class of problems because of the curse of dimensionality. Cooperative Coevolution (CC) is a high-performing framework that decomposes large-scale problems into smaller and easier subproblems by grouping objective variables. The efficiency of CC strongly depends on the size of the groups and the grouping approach. In this study, an improved CC (iCC) approach for solving LSGO problems is proposed and investigated. iCC changes the number of variables in subcomponents dynamically during the optimization process. The SHADE algorithm is used as a subcomponent optimizer. We have investigated the performance of iCC-SHADE and CC-SHADE on fifteen problems from the LSGO CEC’13 benchmark set provided by the IEEE Congress on Evolutionary Computation. The results of numerical experiments have shown that iCC-SHADE outperforms, on average, CC-SHADE with a fixed number of subcomponents. We have also compared iCC-SHADE with some state-of-the-art LSGO metaheuristics. The experimental results have shown that the proposed algorithm is competitive with other efficient metaheuristics.
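The decomposition idea behind Cooperative Coevolution can be sketched as follows. This is a simplified illustration under stated assumptions, not iCC-SHADE: the groups are fixed rather than resized dynamically, a greedy random perturbation stands in for the SHADE subcomponent optimizer, and the sphere function stands in for the CEC'13 benchmark problems. All names are hypothetical.

```python
import random


def sphere(x):
    """Toy objective (assumption): sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)


def cooperative_coevolution(f, dim, n_groups, cycles=20, tries=30, seed=0):
    """Optimize `f` by improving one group of variables at a time.

    Variables outside the active group are held at the values of the
    current best solution (the "context vector").
    """
    rng = random.Random(seed)
    best = [rng.uniform(-5.0, 5.0) for _ in range(dim)]
    best_f = f(best)
    start_f = best_f

    # Random static grouping: shuffle the indices, then deal them out.
    indices = list(range(dim))
    rng.shuffle(indices)
    groups = [indices[g::n_groups] for g in range(n_groups)]

    for _ in range(cycles):
        for group in groups:
            for _ in range(tries):
                trial = best[:]
                for i in group:
                    trial[i] += rng.gauss(0.0, 0.5)  # perturb this group only
                trial_f = f(trial)
                if trial_f < best_f:  # greedy acceptance
                    best, best_f = trial, trial_f
    return best, best_f, start_f


best, best_f, start_f = cooperative_coevolution(sphere, dim=20, n_groups=4)
```

Because a trial is accepted only when it improves the full objective, `best_f` can never exceed `start_f`. The choice of `n_groups` is exactly the kind of parameter the abstract says strongly affects efficiency.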


2012 ◽  
Vol 2012 ◽  
pp. 1-8 ◽  
Author(s):  
Catalina Alvarado-Rojas ◽  
Michel Le Van Quyen

Little is known about the long-term dynamics of widely interacting cortical and subcortical networks during the wake-sleep cycle. Using large-scale intracranial recordings of epileptic patients during seizure-free periods, we investigated local- and long-range synchronization between multiple brain regions over several days. For such high-dimensional data, summary information is required for understanding and modelling the underlying dynamics. Here, we suggest that a compact yet useful representation is given by a state space based on the first principal components. Using this representation, we report, with a remarkable similarity across the patients with different locations of electrode placement, that the seemingly complex patterns of brain synchrony during the wake-sleep cycle can be represented by a small number of characteristic dynamic modes. In this space, transitions between behavioral states occur through specific trajectories from one mode to another. These findings suggest that, at a coarse level of temporal resolution, the different brain states are correlated with several dominant synchrony patterns which are successively activated across wake-sleep states.
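A state space built from leading principal components can be illustrated with a small sketch. It assumes a matrix of observations (rows) by channels (columns) and extracts only the first component by power iteration; the toy signal, the function name, and the parameters are hypothetical and unrelated to the recordings in the study.

```python
import random


def first_principal_component(rows, iterations=200, seed=0):
    """Return (unit direction, per-row scores) of the first principal component."""
    n, dim = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dim)]
    centered = [[r[j] - means[j] for j in range(dim)] for r in rows]

    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    for _ in range(iterations):
        # Apply the covariance matrix without forming it: C v = X^T (X v) / n
        scores = [sum(c[j] * v[j] for j in range(dim)) for c in centered]
        v = [sum(s * c[j] for s, c in zip(scores, centered)) / n
             for j in range(dim)]
        norm = sum(x * x for x in v) ** 0.5
        v = [x / norm for x in v]

    scores = [sum(c[j] * v[j] for j in range(dim)) for c in centered]
    return v, scores


# Hypothetical toy data: channels 0 and 1 share one latent signal,
# channel 2 carries only small noise.
toy_rng = random.Random(2)
toy_rows = []
for _ in range(300):
    latent = toy_rng.gauss(0.0, 1.0)
    toy_rows.append([
        latent + toy_rng.gauss(0.0, 0.05),
        latent + toy_rng.gauss(0.0, 0.05),
        toy_rng.gauss(0.0, 0.05),
    ])

direction, scores = first_principal_component(toy_rows)
```

The `scores` list is the one-dimensional trajectory of the data along the dominant mode. Repeating the procedure on the residual would give further components for a higher-dimensional state space.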


2015 ◽  
Vol 2015 ◽  
pp. 1-12 ◽  
Author(s):  
Sai Kiranmayee Samudrala ◽  
Jaroslaw Zola ◽  
Srinivas Aluru ◽  
Baskar Ganapathysubramanian

Dimensionality reduction refers to a set of mathematical techniques used to reduce complexity of the original high-dimensional data, while preserving its selected properties. Improvements in simulation strategies and experimental data collection methods are resulting in a deluge of heterogeneous and high-dimensional data, which often makes dimensionality reduction the only viable way to gain qualitative and quantitative understanding of the data. However, existing dimensionality reduction software often does not scale to datasets arising in real-life applications, which may consist of thousands of points with millions of dimensions. In this paper, we propose a parallel framework for dimensionality reduction of large-scale data. We identify key components underlying the spectral dimensionality reduction techniques, and propose their efficient parallel implementation. We show that the resulting framework can be used to process datasets consisting of millions of points when executed on a 16,000-core cluster, which is beyond the reach of currently available methods. To further demonstrate applicability of our framework we perform dimensionality reduction of 75,000 images representing morphology evolution during manufacturing of organic solar cells in order to identify how processing parameters affect morphology evolution.


2016 ◽  
Vol 24 (4) ◽  
pp. 644-651 ◽  
Author(s):  
Ziya L. Gokaslan ◽  
Patricia L. Zadnik ◽  
Daniel M. Sciubba ◽  
Niccole Germscheid ◽  
C. Rory Goodwin ◽  
...  

OBJECT A chordoma is an indolent primary spinal tumor that has devastating effects on the patient's life. These lesions are chemoresistant, resistant to conventional radiotherapy, and moderately sensitive to proton therapy; however, en bloc resection remains the preferred treatment for optimizing patient outcomes. While multiple small and largely retrospective studies have investigated the outcomes following en bloc resection of chordomas in the sacrum, there have been few large-scale studies on patients with chordomas of the mobile spine. The goal of this study was to review the outcomes of surgically treated patients with mobile spine chordomas at multiple international centers with respect to local recurrence and survival. This multiinstitutional retrospective study collected data between 1988 and 2012 about prognosis-predicting factors, including various clinical characteristics and surgical techniques for mobile spine chordoma. Tumors were classified according to the Enneking principles and analyzed in 2 treatment cohorts: Enneking-appropriate (EA) and Enneking-inappropriate (EI) cohorts. Patients were categorized as EA when the final pathological assessment of the margin matched the Enneking recommendation; otherwise, they were categorized as EI.
METHODS Descriptive statistics were used to summarize the data (Student t-test, chi-square, and Fisher exact tests). Recurrence and survival data were analyzed using Kaplan-Meier survival curves, log-rank tests, and multivariate Cox proportional hazard modeling.
RESULTS A total of 166 patients (55 female and 111 male patients) with mobile spine chordoma were included. The median patient follow-up was 2.6 years (range 1 day to 22.5 years). Fifty-eight (41%) patients were EA and 84 (59%) patients were EI. The type of biopsy (p < 0.001), spinal location (p = 0.018), and whether the patient received adjuvant therapy (p < 0.001) were significantly different between the 2 cohorts. Overall, 58 (35%) patients developed local recurrence and 57 (34%) patients died. Median survival was 7.0 years postoperative: 8.4 years postoperative for EA patients and 6.4 years postoperative for EI patients (p = 0.023). The multivariate analysis showed that the EI cohort was significantly associated with an increased risk of local recurrence in comparison with the EA cohort (HR 7.02; 95% CI 2.96–16.6; p < 0.001), although no significant difference in survival was observed.
CONCLUSIONS EA resection plays a major role in decreasing the risk for local recurrence in patients with chordoma of the mobile spine.
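The Kaplan-Meier survival curves named in the abstract come from the product-limit estimator, which a short sketch can make concrete. The follow-up times and event flags below are hypothetical toy values, not data from the study, and the function name is an assumption for illustration.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with an observed event.

    times  : follow-up time per patient
    events : 1 if the event (e.g. death) was observed, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)  # every patient leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving past t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]  # remove events and censored cases alike
    return curve


# Hypothetical toy data (years of follow-up, event indicator).
toy_times = [1, 2, 2, 3, 4, 5]
toy_events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(toy_times, toy_events)
```

Censored patients never lower the curve directly; they only shrink the risk set for later event times. For real analyses, including the log-rank tests and Cox models mentioned above, an established survival-analysis library would be the safer choice.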

