The asymptotic distribution of the Net Benefit estimator in presence of right-censoring

2021 ◽  
pp. 096228022110370
Author(s):  
Brice Ozenne ◽  
Esben Budtz-Jørgensen ◽  
Julien Péron

The benefit–risk balance is a critical piece of information when evaluating a new treatment. The Net Benefit has been proposed as a metric for benefit–risk assessment, and has been applied in oncology to simultaneously consider gains in survival and possible side effects of chemotherapies. With complete data, one can construct a U-statistic estimator for the Net Benefit and obtain its asymptotic distribution using standard results of U-statistic theory. However, real data are often subject to right-censoring, e.g. patient drop-out in clinical trials. It is then possible to estimate the Net Benefit using a modified U-statistic, which involves the survival time. The latter can be seen as a nuisance parameter affecting the asymptotic distribution of the Net Benefit estimator. We present here how existing asymptotic results on U-statistics can be applied to estimate the distribution of the Net Benefit estimator, and assess their validity in finite samples. The methodology generalizes to other statistics obtained using generalized pairwise comparisons, such as the win ratio. It is implemented in the R package BuyseTest (version 2.3.0 and later) available on the Comprehensive R Archive Network.
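To make the complete-data case concrete, the U-statistic estimator scores every treatment/control pair as favorable, unfavorable, or neutral and averages the difference. A minimal sketch in plain Python (illustrative only, not the BuyseTest implementation; the `threshold` argument plays the role of a clinically relevant difference):

```python
import itertools

def net_benefit(treatment, control, threshold=0.0):
    """Complete-data U-statistic estimator of the Net Benefit.

    A pair is 'favorable' when the treatment outcome exceeds the control
    outcome by more than `threshold`, 'unfavorable' when the reverse
    holds, and neutral otherwise. Returns P(favorable) - P(unfavorable).
    """
    favorable = unfavorable = 0
    for t, c in itertools.product(treatment, control):
        if t - c > threshold:
            favorable += 1
        elif c - t > threshold:
            unfavorable += 1
    n_pairs = len(treatment) * len(control)
    return (favorable - unfavorable) / n_pairs
```

With right-censored data, the pairwise scores are no longer directly observable and must be replaced by survival-based estimates, which is the source of the nuisance parameter discussed above.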

2020 ◽  
Vol 499 (3) ◽  
pp. 4054-4067
Author(s):  
Steven Cunnington ◽  
Stefano Camera ◽  
Alkistis Pourtsidou

ABSTRACT Potential evidence for primordial non-Gaussianity (PNG) is expected to lie in the largest scales mapped by cosmological surveys. Forthcoming 21 cm intensity mapping experiments will aim to probe these scales by surveying neutral hydrogen (H i) within galaxies. However, foreground signals dominate the 21 cm emission, meaning foreground cleaning is required to recover the cosmological signal. This cleaning damps the H i power spectrum on the largest scales, especially along the line of sight. Whilst there is agreement that this contamination is potentially problematic for probing PNG, it is yet to be fully explored and quantified. In this work, we carry out the first forecasts on fNL that incorporate simulated foreground maps removed using techniques employed on real data. Using a Markov chain Monte Carlo analysis on an SKA1-MID-like survey, we demonstrate that foreground-cleaned data recover biased values [$f_{\rm NL}= -102.1_{-7.96}^{+8.39}$ (68 per cent CL)] on our fNL = 0 fiducial input. Introducing a model with fixed parameters for the foreground contamination allows us to recover unbiased results ($f_{\rm NL}= -2.94_{-11.9}^{+11.4}$). However, it is not clear that we will have sufficient understanding of foreground contamination to allow for such rigid models. Treating the main parameter $k_\parallel ^\text{FG}$ in our foreground model as a nuisance parameter and marginalizing over it still recovers unbiased results, but at the expense of larger errors ($f_{\rm NL}= 0.75^{+40.2}_{-44.5}$), which can only be reduced by imposing the Planck 2018 prior. Our results show that significant progress on understanding and controlling foreground removal effects is necessary for studying PNG with H i intensity mapping.


Biometrika ◽  
2021 ◽  
Author(s):  
Juhyun Park ◽  
Jeongyoun Ahn ◽  
Yongho Jeon

Abstract Functional linear discriminant analysis offers a simple yet efficient method for classification, with the possibility of achieving perfect classification. Several methods proposed in the literature mostly address the dimensionality of the problem. On the other hand, there is a growing interest in the interpretability of the analysis, which favors a simple and sparse solution. In this work, we propose a new approach that incorporates a type of sparsity that identifies nonzero sub-domains in the functional setting, offering a solution that is easier to interpret without compromising performance. With the need to embed additional constraints in the solution, we reformulate functional linear discriminant analysis as a regularization problem with an appropriate penalty. Inspired by the success of ℓ1-type regularization at inducing zero coefficients for scalar variables, we develop a new regularization method for functional linear discriminant analysis that incorporates an ℓ1-type penalty, ∫ |f|, to induce zero regions. We demonstrate that our formulation has a well-defined solution that contains zero regions, achieving functional sparsity in the sense of domain selection. In addition, the misclassification probability of the regularized solution is shown to converge to the Bayes error if the data are Gaussian. Our method does not presume that the underlying function has zero regions in the domain, but produces a sparse estimator that consistently estimates the true function whether or not the latter is sparse. Numerical comparisons with existing methods demonstrate this property in finite samples with both simulated and real data examples.
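The mechanism by which an ∫ |f| penalty creates zero regions can be seen in a discretized toy computation: the proximal operator of an ℓ1 penalty is coordinatewise soft-thresholding, which maps small values exactly to zero. A plain-Python sketch, assuming f is represented by its values on a grid (illustrative only, not the paper's estimator):

```python
def soft_threshold(values, threshold):
    """Proximal operator of a discretized ell-1 penalty.

    Shrinks each grid value of f toward zero by `threshold` and sets
    values inside [-threshold, threshold] exactly to zero -- the
    mechanism that produces zero regions (functional sparsity).
    """
    out = []
    for v in values:
        if v > threshold:
            out.append(v - threshold)
        elif v < -threshold:
            out.append(v + threshold)
        else:
            out.append(0.0)
    return out
```

Contiguous runs of small grid values are thus mapped to exact zeros, which is the discrete analogue of the zero sub-domains selected by the functional penalty.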


2019 ◽  
Vol 36 (7) ◽  
pp. 2017-2024
Author(s):  
Weiwei Zhang ◽  
Ziyi Li ◽  
Nana Wei ◽  
Hua-Jun Wu ◽  
Xiaoqi Zheng

Abstract Motivation Inference of differentially methylated (DM) CpG sites between two groups of tumor samples with different genotypes or phenotypes is a critical step to uncover the epigenetic mechanism of tumorigenesis and identify biomarkers for cancer subtyping. However, as a major confounding factor, uneven distributions of tumor purity between two groups of tumor samples will lead to biased discovery of DM sites if not properly accounted for. Results We here propose InfiniumDM, a generalized least squares model to adjust for the tumor purity effect in differential methylation analysis. Our method is applicable to a variety of experimental designs, including those with or without normal controls and with different sources of normal tissue contamination. We compared our method with conventional methods including minfi, limma, and limma corrected by tumor purity using simulated datasets. Our method shows significantly better performance at different levels of differential methylation thresholds, sample sizes, mean purity deviations, and so on. We also applied the proposed method to breast cancer samples from the TCGA database to further evaluate its performance. Overall, both simulation and real data analyses demonstrate favorable performance over existing methods serving a similar purpose. Availability and implementation InfiniumDM is a part of the R package InfiniumPurify, which is freely available from GitHub (https://github.com/Xiaoqizheng/InfiniumPurify). Supplementary information Supplementary data are available at Bioinformatics online.
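The idea of adjusting for tumor purity can be sketched with a simplified stand-in: ordinary rather than generalized least squares (so no covariance weighting), with purity included as a covariate next to the group indicator, so the group coefficient is estimated net of purity differences. Function names here are hypothetical, not the InfiniumDM API:

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def purity_adjusted_group_effect(methylation, group, purity):
    """OLS fit of methylation ~ intercept + group + purity.

    Returns the group coefficient, i.e. the methylation difference
    between groups after adjusting for tumor purity.
    """
    X = [[1.0, float(g), float(p)] for g, p in zip(group, purity)]
    n = len(X)
    XtX = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(3)]
           for a in range(3)]
    Xty = [sum(X[i][a] * methylation[i] for i in range(n)) for a in range(3)]
    return solve_linear(XtX, Xty)[1]
```

If purity differs systematically between groups, omitting the purity column would fold that difference into the group coefficient, which is the bias the paper targets.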


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Bing Song ◽  
August E. Woerner ◽  
John Planz

Abstract Background Multi-locus genotype data are widely used in population genetics and disease studies. In evaluating the utility of multi-locus data, the independence of markers is commonly considered in many genomic assessments. Generally, pairwise non-random associations are tested by linkage disequilibrium; however, dependencies within a panel may also involve triplets, quartets, or higher-order combinations of markers. Therefore, compatible and user-friendly software is necessary for testing and assessing global linkage disequilibrium in mixed genetic data. Results This study describes a software package for testing the mutual independence of mixed genetic datasets. Mutual independence is defined as the absence of non-random associations among all subsets of the tested panel. The new R package “mixIndependR” calculates basic genetic parameters such as allele frequency, genotype frequency, heterozygosity, Hardy–Weinberg equilibrium, and linkage disequilibrium (LD) from population data, regardless of the type of markers, such as single nucleotide polymorphisms, short tandem repeats, insertions and deletions, and any other genetic markers. A novel method of assessing the dependence of mixed genetic panels is developed in this study and implemented in the software package. The overall independence is tested by comparing the observed distributions of two common summary statistics (the number of heterozygous loci [K] and the number of shared alleles [X]) with their expected distributions under the assumption of mutual independence. Conclusion The package “mixIndependR” is compatible with all categories of genetic markers and detects overall non-random associations. Compared to pairwise disequilibrium tests, the approach described herein tends to have higher power, especially when the number of markers is large. With this package, more versatile and statistically powerful genetic panels can be developed, such as mixed panels combining different kinds of markers.
In population genetics, the package “mixIndependR” makes it possible to learn more about population admixture, natural selection, genetic drift, and population demographics, as a more powerful method of detecting LD. Moreover, this new approach can optimize variant selection in disease studies and contribute to combining panels for treatments in multimorbidity. Application of this approach to real data is expected in the future, and it may bring a leap forward in genetic technology. Availability The R package mixIndependR is available on the Comprehensive R Archive Network (CRAN) at: https://cran.r-project.org/web/packages/mixIndependR/index.html.
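The test based on K, the number of heterozygous loci, relies on the fact that under mutual independence K follows a Poisson-binomial distribution determined by the per-locus heterozygosities. A short dynamic program computes this expected distribution (plain-Python sketch, not the mixIndependR implementation):

```python
def heterozygosity_count_pmf(het_probs):
    """PMF of K, the number of heterozygous loci, under independence.

    With mutually independent loci, K is Poisson-binomial with per-locus
    heterozygosity probabilities `het_probs`; computed by dynamic
    programming over loci. Entry k of the result is P(K = k).
    """
    pmf = [1.0]
    for h in het_probs:
        new = [0.0] * (len(pmf) + 1)
        for k, p in enumerate(pmf):
            new[k] += p * (1.0 - h)   # locus homozygous
            new[k + 1] += p * h       # locus heterozygous
        pmf = new
    return pmf
```

The observed distribution of K across individuals can then be compared with this expected PMF; a systematic discrepancy signals dependence somewhere in the panel, without specifying which subset of markers is responsible.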


2021 ◽  
Author(s):  
Lajos Horváth ◽  
Zhenya Liu ◽  
Gregory Rice ◽  
Yuqian Zhao

Abstract The problem of detecting change points in the mean of high-dimensional panel data with potentially strong cross-sectional dependence is considered. Under the assumption that the cross-sectional dependence is captured by an unknown number of common factors, a new CUSUM-type statistic is proposed. We derive its asymptotic properties under three scenarios, depending on the extent to which the common factors are asymptotically dominant. With panel data consisting of N cross-sectional time series of length T, the asymptotic results hold under the mild assumption that min{N, T} → ∞, with an otherwise arbitrary relationship between N and T, allowing the results to apply to most panel data examples. Bootstrap procedures are proposed to approximate the sampling distribution of the test statistics. A Monte Carlo simulation study shows that our test outperforms several existing tests in finite samples in a number of cases, particularly when N is much larger than T. The practical application of the proposed results is demonstrated with real data applications to detecting and estimating change points in the high-dimensional FRED-MD macroeconomic data set.
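The univariate building block of such tests can be sketched directly: the classical CUSUM statistic compares partial sums of the series with their expectation under a constant mean. The plain-Python sketch below omits the paper's factor adjustment and panel aggregation:

```python
import math

def cusum_statistic(x):
    """Classical CUSUM statistic for a change in the mean of series x.

    Returns max_k |S_k - (k/T) * S_T| / sqrt(T), where S_k is the
    partial sum of the first k observations. Large values suggest the
    mean changed somewhere in the series.
    """
    T = len(x)
    total = sum(x)
    s = 0.0
    best = 0.0
    for k in range(1, T):
        s += x[k - 1]
        best = max(best, abs(s - (k / T) * total))
    return best / math.sqrt(T)
```

For a constant-mean series the process stays near zero, while a level shift makes the partial sums drift away from the line (k/T)·S_T, peaking near the change point.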


Mathematics ◽  
2021 ◽  
Vol 9 (23) ◽  
pp. 3074
Author(s):  
Cristian Preda ◽  
Quentin Grimonprez ◽  
Vincent Vandewalle

Categorical functional data represented by paths of a stochastic jump process with continuous time and a finite set of states are considered. As an extension of the multiple correspondence analysis to an infinite set of variables, optimal encodings of states over time are approximated using an arbitrary finite basis of functions. This allows dimension reduction, optimal representation, and visualisation of data in lower dimensional spaces. The methodology is implemented in the cfda R package and is illustrated using a real data set in the clustering framework.


2021 ◽  
Vol 6 (12) ◽  
pp. 13488-13502
Author(s):  
Qingsong Shan ◽  
Qianning Liu

In this paper, we propose a beta kernel estimator of the measure of functional dependence (MFD). The MFD not only measures the strength of linear or monotonic relationships, but is also suitable for more complicated functional dependence. We derive the asymptotic distribution of the proposed estimator and then use several simulated examples to compare our estimator with traditional measures. Our simulation results demonstrate that the beta kernel provides high accuracy in estimation. A real data example is also given to illustrate one possible application of the new estimator.
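The distinguishing ingredient is the beta kernel itself: each observation on [0, 1] is smoothed with a beta density whose shape adapts to the evaluation point, avoiding boundary bias. A density-estimation sketch using this kernel (plain Python; the MFD estimator builds on such kernels but is not reproduced here):

```python
import math

def beta_pdf(u, a, b):
    """Density of a Beta(a, b) distribution at u in (0, 1)."""
    B = math.gamma(a) * math.gamma(b) / math.gamma(a + b)
    return u ** (a - 1) * (1 - u) ** (b - 1) / B

def beta_kernel_density(x, sample, bandwidth=0.05):
    """Beta kernel density estimate at x for data supported on [0, 1].

    Each observation contributes a Beta(x/b + 1, (1-x)/b + 1) density
    (Chen-style kernel): near the boundaries the kernel becomes
    one-sided automatically instead of spilling probability mass
    outside [0, 1].
    """
    a = x / bandwidth + 1.0
    b = (1.0 - x) / bandwidth + 1.0
    return sum(beta_pdf(u, a, b) for u in sample) / len(sample)
```

Unlike a fixed Gaussian kernel, the effective smoothing window here varies with the evaluation point, which is what makes the estimator well behaved at 0 and 1.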


Biometrika ◽  
2020 ◽  
Author(s):  
X Guo ◽  
C Y Tang

Summary We consider testing the covariance structure in statistical models. We focus on developing such tests when the random vectors of interest are not directly observable and have to be derived via estimated models. Additionally, the covariance specification may involve extra nuisance parameters that also need to be estimated. In a generic additive model setting, we develop and investigate test statistics based on the maximum discrepancy measure calculated from the residuals. To approximate the distributions of the test statistics under the null hypothesis, new multiplier bootstrap procedures with dedicated adjustments that incorporate the model and nuisance parameter estimation errors are proposed. Our theoretical development elucidates the impact of the estimation errors with high-dimensional data and demonstrates the validity of our tests. Simulations and real data examples confirm our theory and demonstrate the performance of the proposed tests.
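The multiplier bootstrap idea can be sketched in a scalar toy version: each residual is perturbed by an independent standard normal multiplier and the statistic recomputed, approximating its null distribution. The sketch below omits the paper's adjustments for model and nuisance-parameter estimation error:

```python
import random

def multiplier_bootstrap_pvalue(residuals, statistic, n_boot=500, rng=None):
    """Gaussian-multiplier bootstrap p-value for a residual statistic.

    Each replicate multiplies residual i by an independent N(0, 1)
    draw; the p-value is the fraction of replicates whose statistic is
    at least as extreme as the observed one.
    """
    rng = rng or random.Random()
    observed = statistic(residuals)
    exceed = 0
    for _ in range(n_boot):
        perturbed = [r * rng.gauss(0.0, 1.0) for r in residuals]
        if statistic(perturbed) >= observed:
            exceed += 1
    return exceed / n_boot
```

Because the multipliers are mean-zero, the perturbed replicates mimic fluctuations under the null; residuals that carry a genuine signal produce an observed statistic that the replicates rarely exceed.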


2020 ◽  
Vol 2020 ◽  
pp. 1-7
Author(s):  
Kyeongjun Lee ◽  
Jung-In Seo

This paper provides an estimation method for the unknown parameters of the Gompertz distribution, with shape and scale parameters, under the progressive Type-II censoring scheme, extending weighted least-squares and pivot-based methods; the approach yields a consistent estimator and an unbiased estimator of the scale parameter. In addition, a way to deal with a nuisance parameter is provided in the pivot-based approach. For evaluation and comparison, Monte Carlo simulations are conducted, and real data are analyzed.
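Evaluating such estimators by Monte Carlo requires generating progressively Type-II censored samples. A plain-Python sketch using the standard Balakrishnan–Sandhu uniform-transformation algorithm and one common Gompertz parameterization, F(t) = 1 − exp(−(θ/α)(e^{αt} − 1)) with shape α and scale θ (the paper's exact parameterization may differ):

```python
import math
import random

def gompertz_inv_cdf(u, shape, scale):
    """Quantile function for F(t) = 1 - exp(-(scale/shape)*(exp(shape*t) - 1))."""
    return math.log(1.0 - (shape / scale) * math.log(1.0 - u)) / shape

def progressive_type2_sample(shape, scale, removals, rng=None):
    """Simulate a progressively Type-II censored Gompertz sample.

    removals[i] surviving units are withdrawn at the (i+1)-th observed
    failure, so m = len(removals) failure times are observed in total.
    Uses the Balakrishnan-Sandhu uniform-transformation algorithm.
    """
    rng = rng or random.Random()
    m = len(removals)
    # V_i = W_i^(1/gamma_i) with gamma_i = i + R_m + ... + R_{m-i+1}
    v = []
    for i in range(1, m + 1):
        gamma = i + sum(removals[m - i:])
        v.append(rng.random() ** (1.0 / gamma))
    # U_i = 1 - V_m * V_{m-1} * ... * V_{m-i+1}: increasing uniforms
    u, prod = [], 1.0
    for i in range(1, m + 1):
        prod *= v[m - i]
        u.append(1.0 - prod)
    return [gompertz_inv_cdf(ui, shape, scale) for ui in u]
```

The generated failure times are ordered by construction, matching how a progressively censored experiment records them.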


Circulation ◽  
2020 ◽  
Vol 142 (20) ◽  
pp. 1974-1988
Author(s):  
Sanjay Kaul ◽  
Norman Stockbridge ◽  
Javed Butler

Balancing benefits and risks is a complex task that poses a major challenge, both to the approval of new medicines and devices by regulatory authorities and in therapeutic decision-making in practice. Several analysis methods and visualization tools have been developed to help evaluate and communicate whether the benefit–risk profile is favorable or unfavorable. In this White Paper, we describe approaches to benefit–risk assessment using qualitative approaches such as the Benefit Risk Action Team framework developed by the Pharmaceutical Research and Manufacturers of America, and the Benefit–Risk Framework developed by the United States Food and Drug Administration; and quantitative approaches such as the numbers needed to treat for benefit and harm, the benefit–risk ratio, and Incremental Net Benefit. We give illustrative examples of benefit–risk evaluations using 4 treatment interventions including sodium glucose cotransporter 2 inhibitors in patients with type 2 diabetes; a direct antithrombin agent, dabigatran, for reducing stroke and systemic embolism in patients with nonvalvular atrial fibrillation; transcatheter aortic valve replacement in patients with symptomatic severe aortic valve stenosis; and antiplatelet agents vorapaxar and prasugrel for reducing cardiovascular events in patients at high cardiovascular risk. Regular applications of structured benefit–risk assessment, whether qualitative, quantitative, or both, enabled by easy-to-understand graphical presentations that capture uncertainties around the benefit–risk metric, may aid shared decision-making and enhance transparency of those decisions.
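The numbers-needed-to-treat arithmetic mentioned above is simple: NNT is the reciprocal of the absolute risk reduction for the benefit, and NNH the reciprocal of the absolute risk increase for a harm. A worked sketch with hypothetical event rates (plain Python):

```python
def number_needed_to_treat(control_event_rate, treatment_event_rate):
    """NNT = 1 / absolute risk reduction (ARR)."""
    return 1.0 / (control_event_rate - treatment_event_rate)

def number_needed_to_harm(treatment_harm_rate, control_harm_rate):
    """NNH = 1 / absolute risk increase (ARI) for an adverse event."""
    return 1.0 / (treatment_harm_rate - control_harm_rate)

# Hypothetical trial: treatment lowers the primary event rate from 20%
# to 15% (ARR = 5%, NNT = 20) but raises a harm from 2% to 4%
# (ARI = 2%, NNH = 50), i.e. 2.5 patients benefit per one harmed.
```

An NNH/NNT ratio above 1 is sometimes read as favorable, although a formal benefit–risk assessment must also weigh the clinical severity of each outcome and the uncertainty around the estimates.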

