Mining with Inference: Data-Adaptive Target Parameters

Summary. Objectives: In oncological studies, the hazard rate can be used to differentiate subgroups of the study population according to their patterns of survival risk over time. Nonparametric curve estimation has been suggested as an exploratory means of revealing such patterns. The choice of smoothing parameter is critical for performance in practice. In this paper, we study data-adaptive smoothing. Methods: A decade ago, the nearest-neighbor bandwidth was introduced for censored data in survival analysis. It is specified by a single parameter, the number of nearest neighbors. Bandwidth selection in this setting has rarely been investigated, although its heuristic advantages over the frequently studied fixed bandwidth are evident. The asymptotic relationship between the fixed and the nearest-neighbor bandwidth can be used to generate novel approaches. Results: We develop a new selection algorithm, termed double-smoothing, for the nearest-neighbor bandwidth in hazard rate estimation. Our approach uses a finite-sample approximation of the asymptotic relationship between the fixed and the nearest-neighbor bandwidth. In doing so, we identify the nearest-neighbor bandwidth as an additional smoothing step and achieve further data adaptation after fixed-bandwidth smoothing. We illustrate the application of the new algorithm in a clinical study and compare the outcome with the traditional fixed-bandwidth result, thus demonstrating the practical performance of the technique. Conclusion: The double-smoothing approach enlarges the methodological repertoire for selecting smoothing parameters in nonparametric hazard rate estimation. The slight increase in computational effort is rewarded with a substantial gain in estimation stability, which demonstrates the benefit of the technique for biostatistical applications.
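To make the contrast between the two bandwidth types concrete, the following is a minimal sketch of a kernel hazard rate estimate for right-censored data, once with a fixed bandwidth and once with a nearest-neighbor bandwidth. It is not the double-smoothing selection algorithm from the abstract, which is not specified here. The Epanechnikov kernel, the smoothing of Nelson-Aalen increments, the definition of the nearest-neighbor bandwidth as the distance to the k-th nearest uncensored time, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel, supported on [-1, 1] (assumed kernel choice)."""
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

def nelson_aalen_increments(times, events):
    """Sort by time; return sorted times and increments d_i / (number at risk)."""
    order = np.argsort(times, kind="stable")
    t = np.asarray(times, dtype=float)[order]
    d = np.asarray(events, dtype=float)[order]
    at_risk = len(t) - np.arange(len(t))  # n, n-1, ..., 1 (ties not handled)
    return t, d / at_risk

def hazard_fixed(grid, times, events, b):
    """Kernel hazard estimate with one fixed bandwidth b for all grid points."""
    grid = np.asarray(grid, dtype=float)
    t, inc = nelson_aalen_increments(times, events)
    u = (grid[:, None] - t[None, :]) / b
    return (epanechnikov(u) * inc[None, :]).sum(axis=1) / b

def nn_bandwidth(grid, times, events, k):
    """Distance from each grid point to its k-th nearest uncensored time."""
    grid = np.asarray(grid, dtype=float)
    event_times = np.asarray(times, dtype=float)[np.asarray(events) == 1]
    dist = np.abs(grid[:, None] - event_times[None, :])
    return np.sort(dist, axis=1)[:, k - 1]

def hazard_nn(grid, times, events, k):
    """Kernel hazard estimate whose bandwidth adapts to the local data density.

    Assumes k is large enough that every local bandwidth is positive.
    """
    grid = np.asarray(grid, dtype=float)
    t, inc = nelson_aalen_increments(times, events)
    b = nn_bandwidth(grid, times, events, k)
    u = (grid[:, None] - t[None, :]) / b[:, None]
    return (epanechnikov(u) * inc[None, :]).sum(axis=1) / b
```

In this sketch the nearest-neighbor bandwidth widens where uncensored observations are sparse (typically in the right tail) and narrows where they are dense, which is the heuristic advantage over a single fixed bandwidth that the abstract alludes to.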


Kernel Based Data-Adaptive Support Vector Machines for Multi-Class Classification

Mathematics ◽

10.3390/math9090936 ◽

2021 ◽

Vol 9 (9) ◽

pp. 936

Author(s):

Jianli Shao ◽

Xin Liu ◽

Wenqing He

Keyword(s):

Machine Learning ◽

Spatial Association ◽

Class Imbalance ◽

Imbalanced Data ◽

Real Data ◽

Kernel Functions ◽

Support Vector ◽

Classification Problems ◽

Rare Class ◽

Data Adaptive

Imbalanced data arise in many classification problems, and classifying them poses considerable challenges in machine learning. Among the available classifiers, the support vector machine (SVM) and its variants are widely used thanks to their flexibility and interpretability. However, the performance of SVMs degrades when the data are imbalanced, which is a typical data structure in the multi-category classification problem. In this paper, we employ the data-adaptive SVM with scaled kernel functions to classify instances for a multi-class population. We propose a multi-class data-dependent kernel function for the SVM that accounts for class imbalance and the spatial association among instances, so that the classification accuracy is enhanced. Simulation studies demonstrate the superb performance of the proposed method, and a real multi-class prostate cancer image dataset is employed as an illustration. The proposed method not only outperforms the competitor methods in terms of commonly used accuracy measures such as the F-score and G-means, but also successfully detects more than 60% of instances from the rare class in the real data, while the competitors detect less than 20% of the rare class instances. The proposed method will benefit other scientific research fields, such as multiple region boundary detection.
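The abstract evaluates classifiers with the F-score and G-means rather than plain accuracy. A minimal sketch of why this matters for imbalanced data is given below. It assumes the common multi-class definitions (G-mean as the geometric mean of per-class recalls, and the macro-averaged F1 score), which may differ from the exact variants used in the paper; the function names are illustrative.

```python
import numpy as np

def per_class_recall_precision(y_true, y_pred, labels):
    """Return arrays of recall and precision, one entry per class label."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    recalls, precisions = [], []
    for c in labels:
        tp = np.sum((y_true == c) & (y_pred == c))
        actual = np.sum(y_true == c)      # instances truly in class c
        predicted = np.sum(y_pred == c)   # instances assigned to class c
        recalls.append(tp / actual if actual else 0.0)
        precisions.append(tp / predicted if predicted else 0.0)
    return np.array(recalls), np.array(precisions)

def g_mean(y_true, y_pred, labels):
    """Geometric mean of per-class recalls (assumed multi-class G-mean)."""
    r, _ = per_class_recall_precision(y_true, y_pred, labels)
    return float(np.prod(r) ** (1.0 / len(r)))

def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 scores."""
    r, p = per_class_recall_precision(y_true, y_pred, labels)
    denom = r + p
    f1 = np.where(denom > 0, 2 * r * p / np.where(denom > 0, denom, 1.0), 0.0)
    return float(f1.mean())

# Toy data: 8 majority-class instances, 2 rare-class instances,
# and a classifier that always predicts the majority class.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 10
accuracy = float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))
```

On this toy data the accuracy is 0.8 even though no rare-class instance is detected, whereas the G-mean is 0 because the rare-class recall is 0. This is the kind of failure that imbalance-aware measures are designed to expose.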


A data-adaptive robust unit commitment model considering high penetration of wind power generation and its enhanced uncertainty set

International Journal of Electrical Power & Energy Systems ◽

10.1016/j.ijepes.2021.106797 ◽

2021 ◽

Vol 129 ◽

pp. 106797

Author(s):

Zhenjia Lin ◽

Haoyong Chen ◽

Qiuwei Wu ◽

Jianping Huang ◽

Mengshi Li ◽

...

Keyword(s):

Power Generation ◽

Wind Power ◽

Unit Commitment ◽

Wind Power Generation ◽

High Penetration ◽

Uncertainty Set ◽

Data Adaptive


Improved power by collapsing rare and common variants based on a data-adaptive forward selection strategy

BMC Proceedings ◽

10.1186/1753-6561-5-s9-s114 ◽

2011 ◽

Vol 5 (S9) ◽

Cited By ~ 1

Author(s):

Yilin Dai ◽

Ling Guo ◽

Jianping Dong ◽

Renfang Jiang

Keyword(s):

Selection Strategy ◽

Common Variants ◽

Forward Selection ◽

Data Adaptive


Boolean representation based data-adaptive correlation analysis over time series streams

Proceedings of the sixteenth ACM Conference on Information and Knowledge Management - CIKM '07 ◽

10.1145/1321440.1321471 ◽

2007 ◽

Cited By ~ 3

Author(s):

Tiancheng Zhang ◽

Dejun Yue ◽

Yu Gu ◽

Ge Yu

Keyword(s):

Time Series ◽

Correlation Analysis ◽

Data Adaptive ◽

Over Time


A novel data adaptive detection scheme for distributed fiber optic acoustic sensing

10.1117/12.2224151 ◽

2016 ◽

Author(s):

İbrahim Ölçer ◽

Ahmet Öncü

Keyword(s):

Fiber Optic ◽

Adaptive Detection ◽

Detection Scheme ◽

Acoustic Sensing ◽

Data Adaptive


A novel approach for decomposition of biomedical signals in different applications based on data-adaptive Gaussian average filtering

Biomedical Signal Processing and Control ◽

10.1016/j.bspc.2021.103104 ◽

2022 ◽

Vol 71 ◽

pp. 103104

Author(s):

Yue-Der Lin ◽

Yong Kok Tan ◽

Baofeng Tian

Keyword(s):

Biomedical Signals ◽

Novel Approach ◽

Data Adaptive


A low-complexity data-adaptive approach for premature ventricular contraction recognition

Signal Image and Video Processing ◽

10.1007/s11760-013-0478-6 ◽

2013 ◽

Vol 8 (1) ◽

pp. 111-120 ◽

Cited By ~ 36

Author(s):

Peng Li ◽

Chengyu Liu ◽

Xinpei Wang ◽

Dingchang Zheng ◽

Yuanyang Li ◽

...

Keyword(s):

Premature Ventricular Contraction ◽

Low Complexity ◽

Ventricular Contraction ◽

Adaptive Approach ◽

Data Adaptive


DAT: A Data Adaptive Transmission Mechanism for Clustering-Based Wireless Sensor Networks

2008 International Conference on MultiMedia and Information Technology ◽

10.1109/mmit.2008.181 ◽

2008 ◽

Author(s):

Yan Shen ◽

Zi-wei Zeng

Keyword(s):

Wireless Sensor Networks ◽

Sensor Networks ◽

Adaptive Transmission ◽

Transmission Mechanism ◽

Wireless Sensor ◽

Data Adaptive


Discussion of Identification, Estimation and Approximation of Risk under Interventions that Depend on the Natural Value of Treatment Using Observational Data, by Jessica Young, Miguel Hernán, and James Robins

Epidemiologic Methods ◽

10.1515/em-2014-0012 ◽

2014 ◽

Vol 3 (1) ◽

Cited By ~ 1

Author(s):

Mark J. van der Laan ◽

Alexander R. Luedtke ◽

Iván Díaz

Keyword(s):

Statistical Model ◽

Observational Data ◽

Robust Estimation ◽

Statistical Estimation ◽

Estimation Problem ◽

Target Parameter ◽

Adaptive Parameters ◽

The Mean ◽

Data Adaptive ◽

Statistical Estimation Problem

Abstract: Young, Hernán, and Robins consider the mean outcome under a dynamic intervention that may rely on the natural value of treatment. They first identify this value with a statistical target parameter, and then show that this statistical target parameter can also be identified with a causal parameter which gives the mean outcome under a stochastic intervention. The authors then describe estimation strategies for these quantities. Here we augment the authors’ insightful discussion by sharing our experiences in situations where two causal questions lead to the same statistical estimand, or the newer problem that arises in the study of data adaptive parameters, where two statistical estimands can lead to the same estimation problem. Given a statistical estimation problem, we encourage others to always use a robust estimation framework where the data generating distribution truly belongs to the statistical model. We close with a discussion of a framework which has these properties.
