ClassiMap: A New Dimension Reduction Technique for Exploratory Data Analysis of Labeled Data

Author(s):  
Sylvain Lespinats ◽  
Michaël Aupetit ◽  
Anke Meyer-Baese

Multidimensional scaling techniques are unsupervised Dimension Reduction (DR) techniques that use pairwise similarities between multidimensional data points to represent the data in a plane, enabling visual exploratory analysis. With labeled data, DR techniques face two objectives with potentially different priorities: accounting for the similarities between data points, and accounting for the structure of the data classes. Unsupervised DR techniques attempt to preserve the original data similarities but do not consider class labels, so they can map originally separated classes as overlapping ones. Conversely, state-of-the-art supervised DR techniques handle labeled data naturally, but they do so in a predictive modeling framework in which they attempt to separate the classes to improve a classification accuracy measure in the low-dimensional space, so they can map even originally overlapping classes as separated. We propose ClassiMap, a DR technique that optimizes a new objective function enabling Exploratory Data Analysis (EDA) of labeled data. Mapping distortions known as tears and false neighborhoods cannot be avoided in general because the data dimension is reduced. ClassiMap primarily aims to preserve data similarities, but it tends to distribute unavoidable tears preferentially among different-label data and unavoidable false neighbors among same-label data. Standard quality measures for unsupervised mappings say nothing about the preservation of within-class or between-class structures, while the classification accuracy used to evaluate supervised mappings is relevant only to predictive modeling. We propose two measures better suited to evaluating DR of labeled data in an EDA framework. We use these two label-aware indices and four other standard unsupervised indices to compare ClassiMap with other state-of-the-art supervised and unsupervised DR techniques on synthetic and real datasets. According to these new measures, ClassiMap appears to provide a better tradeoff between preserving pairwise similarities and preserving class structure.
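
To make the notions of tears and false neighbors concrete, the following is a minimal sketch that counts both kinds of distortion in a mapping and splits each count by whether the two points share a label. It is an illustration only: it is not the ClassiMap objective function, nor either of the two measures proposed in the paper. The k-nearest-neighbor definition of a distortion and the names `knn_sets` and `label_aware_distortions` are assumptions made for this example.

```python
import numpy as np

def knn_sets(X, k):
    """Return, for each row of X, the set of indices of its k nearest neighbors."""
    X = np.asarray(X, dtype=float)
    # Pairwise Euclidean distances (fine for small illustrative datasets).
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)  # a point is not its own neighbor
    order = np.argsort(d, axis=1, kind="stable")[:, :k]
    return [set(row.tolist()) for row in order]

def label_aware_distortions(X_high, X_low, labels, k):
    """Count tears and false neighbors, split by same/different label.

    Assumed definitions (hypothetical, for illustration only):
      tear           = j is a k-neighbor of i in the original space but not in the map
      false neighbor = j is a k-neighbor of i in the map but not in the original space
    """
    hi = knn_sets(X_high, k)
    lo = knn_sets(X_low, k)
    counts = {"tears_same": 0, "tears_diff": 0, "false_same": 0, "false_diff": 0}
    for i in range(len(labels)):
        for j in hi[i] - lo[i]:  # neighbors lost in the map
            key = "tears_same" if labels[i] == labels[j] else "tears_diff"
            counts[key] += 1
        for j in lo[i] - hi[i]:  # neighbors created by the map
            key = "false_same" if labels[i] == labels[j] else "false_diff"
            counts[key] += 1
    return counts
```

Under these assumed definitions, a mapping in the spirit described in the abstract would keep `tears_same` and `false_diff` low, accepting unavoidable distortions mainly as `tears_diff` and `false_same`.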

2013 ◽  
Author(s):  
Stephen J. Tueller ◽  
Richard A. Van Dorn ◽  
Georgiy Bobashev ◽  
Barry Eggleston

Author(s):  
Jayesh S

The COVID-19 outbreak was first reported in Wuhan, China. The deadly virus spread not just disease but also fear around the globe. In January 2020, the WHO declared COVID-19 a Public Health Emergency of International Concern (PHEIC). The first case of COVID-19 in India was reported on January 30, 2020. By that time, India was prepared to fight the virus. India has taken various measures to tackle the situation. In this paper, an exploratory data analysis of COVID-19 cases in India is carried out. Data on the number of cases, testing done, case fatality ratio, number of deaths, change in visits, stringency index, and measures taken by the government are used for modelling and visual exploratory data analysis.


Molecules ◽  
2021 ◽  
Vol 26 (5) ◽  
pp. 1393
Author(s):  
Ralitsa Robeva ◽  
Miroslava Nedyalkova ◽  
Georgi Kirilov ◽  
Atanaska Elenkova ◽  
Sabina Zacharieva ◽  
...  

Catecholamines are physiological regulators of carbohydrate and lipid metabolism during stress, but their chronic influence on metabolic changes in obese patients is still not clarified. The present study aimed to establish the associations between catecholamine metabolites and metabolic syndrome (MS) components in obese women, and to reveal possible hidden subgroups of patients through hierarchical cluster analysis and principal component analysis. The 24-h urine excretion of metanephrine and normetanephrine was investigated in 150 obese women (54 non-diabetic without MS, 70 non-diabetic with MS, and 26 with type 2 diabetes). The interrelations between carbohydrate disturbances, metabolic syndrome components, and stress response hormones were studied. Exploratory data analysis was used to determine different patterns of similarity among the patients. Normetanephrine concentrations were significantly increased in postmenopausal patients and in women with morbid obesity, type 2 diabetes, and hypertension, but not in those with prediabetes. Both metanephrine and normetanephrine levels were positively associated with glucose concentrations one hour after glucose load, irrespective of insulin levels. The exploratory data analysis showed different risk subgroups among the investigated obese women. The development of predictive tools that include not only traditional metabolic risk factors but also markers of stress response systems might help with specific risk estimation in patients with obesity.

