Kendall Transformation: A Robust Representation of Continuous Data for Information Theory

Author(s):  
Miron Bartosz Kursa

Kendall transformation converts an ordered feature into a vector of pairwise order relations between individual values. It thereby preserves the ranking of observations and represents it in categorical form. The transformation generalises methods that require strictly categorical input, especially when the number of observations is small and discretisation becomes problematic. In particular, many information-theoretic approaches can be applied directly to Kendall-transformed continuous data without relying on differential entropy or any additional parameters. Moreover, by restricting the information to that contained in the ranking, Kendall transformation improves robustness at the reasonable cost of dropping sophisticated interactions, which are unlikely to be correctly estimated anyway. In bivariate analysis, Kendall transformation can be related to popular non-parametric methods, which shows the soundness of the approach. The paper also demonstrates its efficiency in multivariate problems and provides an example analysis of real-world data.
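The pairwise encoding described in this abstract can be illustrated with a minimal Python sketch. It assumes that every ordered pair of distinct observations is encoded and that the three relations are written as '<', '>' and '='; the paper's exact pair enumeration and symbols may differ. The helper names `kendall_transform` and `mutual_information` are hypothetical, and the plug-in estimator is only one of several ways to apply information theory to the resulting categorical vectors.

```python
from collections import Counter
from itertools import permutations
from math import log


def kendall_transform(values):
    """Encode each ordered pair (i, j), i != j, as '<', '>' or '='.

    Assumption: all n*(n-1) ordered pairs are used; this is an
    illustrative convention, not necessarily the paper's definition.
    """
    encoded = []
    for i, j in permutations(range(len(values)), 2):
        a, b = values[i], values[j]
        encoded.append('<' if a < b else '>' if a > b else '=')
    return encoded


def mutual_information(xs, ys):
    """Plug-in mutual information (in nats) of two categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    return sum(
        c / n * log(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


# Two features with the same ranking yield identical encodings,
# so the transformation is insensitive to monotone rescaling.
kx = kendall_transform([1.0, 3.0, 2.0])
ky = kendall_transform([10.0, 30.0, 20.0])
mi = mutual_information(kx, ky)
```

Because only order relations are kept, no bin width or kernel bandwidth has to be chosen, which is the parameter-free property the abstract emphasises.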

Author(s):  
Andrew Hamilton-Wright ◽  
Daniel W. Stashuk

A great deal of interesting real-world data is encountered through the analysis of continuous variables. However, many of the robust tools for rule discovery and data characterization depend upon the underlying data existing in an ordinal, enumerable or discrete data domain. Tools in this category include much of the current work in fuzzy logic and rough sets, as well as all forms of event-based pattern discovery tools based on probabilistic inference. Discretization techniques make continuous data accessible to the strong tools of discrete-valued data mining. The most common approach to discretization is quantization, in which the range of observed continuous-valued data is divided into a fixed number of quanta, each covering a particular portion of the range bounded by the most extreme points observed in the continuous domain. This chapter explores the effects such quantization may have, and the techniques available to ameliorate its negative effects, notably fuzzy systems and rough sets.
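The quantization scheme described in this abstract can be sketched in a few lines of Python. This is an illustrative equal-width binning under the assumption that the quanta split the observed range evenly; the chapter itself may consider other bin layouts, and the function name `quantize` is hypothetical.

```python
def quantize(values, n_bins):
    """Assign each value to one of n_bins equal-width quanta.

    The bins span the observed minimum and maximum, as in the
    description above; equal widths are an assumption of this sketch.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # Degenerate range: every observation falls in the same quantum.
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


even = quantize([0.0, 2.5, 5.0, 7.5, 10.0], 4)
# Two nearly identical observations end up in different quanta.
split = quantize([0.0, 4.99, 5.01, 10.0], 2)
```

The second call shows one of the negative effects the abstract alludes to: values that sit just either side of a bin boundary are treated as different symbols even though they are almost equal.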


Entropy ◽  
2019 ◽  
Vol 21 (8) ◽  
pp. 720 ◽  
Author(s):  
Sergio Verdú

We give a brief survey of the literature on the empirical estimation of entropy, differential entropy, relative entropy, mutual information and related information measures. While those quantities are of central importance in information theory, universal algorithms for their estimation are increasingly important in data science, machine learning, biology, neuroscience, economics, language, and other experimental sciences.


2021 ◽  
Vol 9 (2) ◽  
pp. e002092
Author(s):  
Xue Bai ◽  
Michelle Kim ◽  
Gyulnara Kasumova ◽  
Lu Si ◽  
Bixia Tang ◽  
...  

Background: Although the Society for Immunotherapy of Cancer (SITC) Immunotherapy Resistance Taskforce recently defined primary and secondary resistance to anti-programmed cell death protein 1 (anti-PD-1) therapy, there is a lack of real-world data regarding differences in these resistance subtypes with respect to radiological dynamics and clinical manifestations. Methods: We performed single-blind re-evaluations of radiological images by independent radiologists on a retrospectively assembled cohort of patients with advanced melanoma (n=254, median follow-up 31 months) receiving anti-PD-1 monotherapy at Massachusetts General Hospital and Peking University Cancer Hospital. Radiological characteristics and timing at multiple crucial time points were analyzed and correlated with each other and with survival. Primary and secondary resistance were defined as per the SITC Immunotherapy Resistance Taskforce definitions. Results: The most significant change in target lesion measurement took place within the first 3 months after anti-PD-1 initiation. Patients with stable disease with versus without tumor shrinkage at the initial evaluation followed distinct disease trajectories: the rate of further upgrade to a partial or complete remission (PR/CR) was 44% and 0%, respectively. Eleven per cent of PR patients ultimately achieved a CR. In multivariate analyses, deeper response depth was independently associated with a more limited progression pattern, fewer involved organs, lower tumor burden, slower growth rate at disease progression (PD) (all p≤0.001), and longer post-progression survival (PPS) (bivariate analysis, p=0.005). Compared with primary resistance, secondary resistance was associated with a less widespread PD pattern, lower tumor burden and slower tumor growth (all p≤0.001). Patients with secondary resistance were less likely to receive further systemic therapy (28% vs 57%, p<0.001) yet had significantly better PPS (HR 0.503, 95% CI 0.288 to 0.879, p=0.02). Conclusions: Radiological dynamics were variable, yet significantly correlated with survival outcomes. SITC-defined primary and secondary resistance are distinct clinical manifestations in patients with melanoma, suggesting the possibility of resistance-type-based therapeutic decision-making and clinical trial design, once further validated by future prospective studies.


2015 ◽  
Vol 30 (1) ◽  
pp. 125-140 ◽  
Author(s):  
Nayereh Bagheri Khoolenjani ◽  
Mohammad Hossein Alamatsaz

De Bruijn's identity shows a link between two fundamental concepts in information theory: entropy and Fisher information. In the literature, De Bruijn's identity has been stated under the assumption of independence between the input signal and an additive noise. However, in the real world, the noise could be highly dependent on the main signal. The main aim of this paper is, firstly, to extend De Bruijn's identity to signal-dependent noise channels and, secondly, to study how the Stein and heat identities are related to De Bruijn's identity. Thus, new versions of De Bruijn's identity are introduced for the case when the input signal and additive noise are dependent and jointly distributed according to Archimedean and Gaussian copulas. It is shown that in this generalized model, the derivatives of the differential entropy can be expressed in terms of a function of Fisher information. Our findings recover the conventional De Bruijn's identity, as noted in some remarks. Then, the equivalence among the new De Bruijn-type identity, Stein's identity and the heat equation identity is established. The paper concludes with an application of the developed results in information theory.
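For orientation, the conventional identity that this abstract generalizes can be written as follows. This is the standard textbook form for independent Gaussian noise, not the paper's copula-based extension; the symbols $h$ (differential entropy), $J$ (Fisher information), $X$, $Z$ and $t$ are the usual notation and are assumptions here, since the abstract does not fix any.

```latex
% Classical De Bruijn identity: X has a finite-variance density,
% Z ~ N(0, 1) is independent of X, and t > 0.
\frac{\partial}{\partial t}\, h\!\left(X + \sqrt{t}\, Z\right)
  = \frac{1}{2}\, J\!\left(X + \sqrt{t}\, Z\right)
```

The extension described in the abstract drops the independence between $X$ and $Z$, so the right-hand side becomes a more general function of Fisher information.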


2008 ◽  
Vol 2008 ◽  
pp. 1-11 ◽  
Author(s):  
Nicholas Holden ◽  
Alex A. Freitas

We have previously proposed a hybrid particle swarm optimisation/ant colony optimisation (PSO/ACO) algorithm for the discovery of classification rules. Unlike a conventional PSO algorithm, this hybrid algorithm can directly cope with nominal attributes, without converting nominal values into binary numbers in a preprocessing phase. PSO/ACO2 also directly deals with both continuous and nominal attribute values, a feature that current PSO and ACO rule induction algorithms lack. We evaluate the new version of the PSO/ACO algorithm (PSO/ACO2) on 27 public-domain, real-world data sets often used to benchmark the performance of classification algorithms. We compare the PSO/ACO2 algorithm to an industry-standard algorithm, PART, and compare a reduced version of our PSO/ACO2 algorithm, coping only with continuous data, to our new classification algorithm for continuous data based on differential evolution. The results show that PSO/ACO2 is very competitive with PART in terms of accuracy and that PSO/ACO2 produces significantly simpler (smaller) rule sets, a desirable result in data mining, where the goal is to discover knowledge that is not only accurate but also comprehensible to the user. The results also show that the reduced PSO version for continuous attributes provides a slight increase in accuracy when compared to the differential evolution variant.


2016 ◽  
Vol 22 ◽  
pp. 219
Author(s):  
Roberto Salvatori ◽  
Olga Gambetti ◽  
Whitney Woodmansee ◽  
David Cox ◽  
Beloo Mirakhur ◽  
...  
