Symbolic Data Analysis

Standard data mining techniques no longer adequately represent the complexity of the world, so a new paradigm is necessary. Symbolic Data Analysis (SDA), developed by Diday (2003), is a type of data analysis that represents the complexity of reality while maintaining its internal variation and structure. The paradigm is based on the concept of the symbolic object, which is a mathematical model of a concept. In this article the authors present the fundamentals of the symbolic data analysis paradigm and the symbolic object concept. Theoretical aspects and examples help the reader understand the SDA paradigm as a tool for mining complex data.

Principles on Symbolic Data Analysis

Handbook of Research on Innovations in Database Technologies and Applications ◽

10.4018/978-1-60566-242-8.ch009 ◽

2009 ◽

pp. 74-81

Author(s):

Héctor Oscar Nigro ◽

Sandra Elizabeth González Císaro

Keyword(s):

Data Analysis ◽

Missing Values ◽

Symbolic Data Analysis ◽

Human Errors ◽

Symbolic Data ◽

Analysis Process ◽

New Type ◽

Null Value ◽

Internal Variation ◽

Different Sources

Today’s technology allows vast quantities of information to be stored from sources that differ in nature. This information has missing values, nulls, internal variation, taxonomies, and rules. We need a new type of data analysis that allows us to represent the complexity of reality while maintaining its internal variation and structure (Diday, 2003). In the data analysis process, or data mining, it is necessary to know the nature of null values, that is, whether a case is an absent value, a null value, or a default value. Some imprecision is also possible and valid, due to differing semantics of a concept, diverse sources, linguistic imprecision, elements summarized in a database, human errors, etc. (Chavent, 1997). So we need conceptual support to handle these types of situations. Symbolic Data Analysis (SDA) is a new approach based on a strong conceptual model called the Symbolic Object (SO). An SO is defined by its “intent”, which contains a way to find its “extent”. For instance, the description of the inhabitants of a region, together with the way of allocating an individual to this region, is called the “intent”; the set of individuals that satisfy this intent is called the “extent” (Diday, 2003). For this type of analysis, different experts are needed, each contributing their own concepts.
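The intent/extent distinction described above can be illustrated with a minimal Python sketch. It is not taken from the chapter: the class `SymbolicObject`, the function `lives_in_region`, and the sample individuals are all hypothetical names chosen for illustration, and the description (a set of regions plus an age interval) is an assumed simplification of what a symbolic description may contain.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Individual = Dict[str, Any]
Description = Dict[str, Any]


@dataclass
class SymbolicObject:
    """Hypothetical minimal symbolic object.

    The intent is the pair (description, membership rule);
    the extent is computed from it for a given set of individuals.
    """
    name: str
    description: Description
    membership: Callable[[Individual, Description], bool]

    def extent(self, individuals: List[Individual]) -> List[Individual]:
        # Keep the individuals that satisfy the intent.
        return [ind for ind in individuals
                if self.membership(ind, self.description)]


def lives_in_region(individual: Individual, description: Description) -> bool:
    # Assumed allocation rule: region must be in the allowed set
    # and age must fall inside the described interval.
    low, high = description["age"]
    return (individual["region"] in description["region"]
            and low <= individual["age"] <= high)


# Illustrative data only.
people = [
    {"name": "Ana", "region": "North", "age": 34},
    {"name": "Luis", "region": "South", "age": 41},
    {"name": "Marta", "region": "North", "age": 72},
]

northern_adults = SymbolicObject(
    name="northern adults",
    description={"region": {"North"}, "age": (18, 65)},
    membership=lives_in_region,
)

members = [p["name"] for p in northern_adults.extent(people)]
print(members)
```

Keeping the membership rule as a separate callable reflects the idea that the intent carries its own way of finding the extent, so the same description could be paired with a stricter or looser allocation rule.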

Symbolic Objects and Symbolic Data Analysis

Encyclopedia of Database Technologies and Applications ◽

10.4018/978-1-59140-560-3.ch109 ◽

2005 ◽

pp. 665-670 ◽

Cited By ~ 1

Author(s):

Héctor Oscar Nigro ◽

Sandra Elizabeth González Císaro

Keyword(s):

Data Analysis ◽

Missing Values ◽

Symbolic Data Analysis ◽

Symbolic Data ◽

New Type ◽

Internal Variation ◽

Different Sources

Symbolic Data Clustering

Encyclopedia of Data Warehousing and Mining ◽

10.4018/978-1-59140-557-3.ch204 ◽

2011 ◽

pp. 1087-1091 ◽

Cited By ~ 3

Author(s):

Edwin Diday ◽

M. Narasimha Murthy

Keyword(s):

Data Mining ◽

Decision Making ◽

Data Analysis ◽

Internal Structure ◽

Data Clustering ◽

Large Datasets ◽

Complex Data ◽

Symbolic Data Analysis ◽

New Techniques ◽

Symbolic Data

In data mining, we generate class/cluster models from large datasets. Symbolic Data Analysis (SDA) is a powerful tool for dealing with complex data (Diday, 1988), in which a combination of variables and the logical and hierarchical relationships among them are used. Such a view lets us deal with data at a conceptual level, and as a consequence SDA is well suited to data mining. Symbolic data have their own internal structure, which necessitates new techniques that generally differ from the ones used on conventional data (Billard & Diday, 2003). Clustering generates abstractions that can be used in a variety of decision-making applications (Jain, Murty, & Flynn, 1999). In this article, we deal with the application of clustering to SDA.
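The abstract does not say which dissimilarity the chapter uses. As a rough sketch of why symbolic data call for different techniques, the Python below assumes interval-valued descriptions and compares them with the Hausdorff distance between intervals, one common choice for such data. The function names `hausdorff`, `description_distance`, and `nearest_prototype` and the sample values are hypothetical.

```python
from typing import Dict, Tuple

Interval = Tuple[float, float]
Description = Dict[str, Interval]


def hausdorff(a: Interval, b: Interval) -> float:
    # Hausdorff distance between two closed intervals [a0, a1] and [b0, b1].
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))


def description_distance(x: Description, y: Description) -> float:
    # Assumed aggregation: sum the per-variable interval distances.
    return sum(hausdorff(x[var], y[var]) for var in x)


def nearest_prototype(obj: Description,
                      prototypes: Dict[str, Description]) -> str:
    # Assignment step of a prototype-based clustering:
    # pick the cluster whose prototype is closest to the object.
    return min(prototypes,
               key=lambda name: description_distance(obj, prototypes[name]))


# Illustrative values only.
a = {"age": (20.0, 30.0), "income": (1000.0, 2000.0)}
b = {"age": (25.0, 40.0), "income": (1500.0, 1800.0)}
dist_ab = description_distance(a, b)

prototypes = {"young": {"age": (18.0, 30.0)}, "senior": {"age": (60.0, 80.0)}}
assigned = nearest_prototype({"age": (20.0, 35.0)}, prototypes)
print(dist_ab, assigned)
```

Unlike a distance between points, this compares both bounds of each interval, so two descriptions with the same midpoint but different spread are not treated as identical.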

Fraud Detection in Healthcare System using Symbolic Data Analysis

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.h9269.0710921 ◽

2021 ◽

Vol 10 (9) ◽

pp. 1-7

Author(s):

Sahana Munavalli ◽

Sanjeevakumar M. Hatture

Keyword(s):

Data Mining ◽

Health Insurance ◽

Data Analysis ◽

Claim Data ◽

Symbolic Data Analysis ◽

Standard Data ◽

Symbolic Data ◽

Large Sets ◽

Strenuous Work ◽

Insurance Claim Data

In the era of digitization, fraud is found in all categories of health insurance. It is committed through deliberate deception or misrepresentation to obtain an undue benefit in the form of health expenditures. Big data analysis can be used to recognize fraud in large sets of insurance claim data. Based on a few cases that are known or suspected to be fraudulent, the anomaly detection technique estimates how likely each record is to be fraudulent by examining previous insurance claims. Investigators can then take a closer look at the cases flagged by the data mining software. One of the issues is the abuse of medical insurance systems, and manual detection of fraud in the healthcare industry is strenuous work. Fraud and abuse in the healthcare system have become a significant concern within health insurance organizations over the last few years because of increasing revenue losses. Handling medical claims has become an exhausting manual task, carried out by a few medical experts who are responsible for approving, modifying, or rejecting the amounts requested within a limited period after receiving them. Standard data mining techniques no longer adequately address the complexity of the world. Symbolic Data Analysis is therefore used as a new type of data analysis that represents the complexity of reality and helps detect fraud in the dataset.

Mixture decomposition of distributions by copulas in the symbolic data analysis framework

Discrete Applied Mathematics ◽

10.1016/j.dam.2004.06.018 ◽

2005 ◽

Vol 147 (1) ◽

pp. 27-41 ◽

Cited By ~ 13

Author(s):

E. Diday ◽

M. Vrac

Keyword(s):

Data Analysis ◽

Analysis Framework ◽

Symbolic Data Analysis ◽

Symbolic Data ◽

Mixture Decomposition ◽

Decomposition Of Distributions

Symbolic Data Analysis for the Development of Object Oriented Data Model for Sensor Data Repository

Proceedings of the International Conference on Frontiers of Intelligent Computing: Theory and Applications (FICTA) 2013 - Advances in Intelligent Systems and Computing ◽

10.1007/978-3-319-02931-3_49 ◽

2014 ◽

pp. 435-442

Author(s):

Doreswamy ◽

Srinivas Narasegouda

Keyword(s):

Data Analysis ◽

Data Model ◽

Object Oriented ◽

Sensor Data ◽

Data Repository ◽

Symbolic Data Analysis ◽

Symbolic Data

Face Recognition Using Symbolic KPCA Plus Symbolic LDA in the Framework of Symbolic Data Analysis: Symbolic Kernel Fisher Discriminant Method

Advanced Concepts for Intelligent Vision Systems - Lecture Notes in Computer Science ◽

10.1007/978-3-540-88458-3_89 ◽

2008 ◽

pp. 982-993 ◽

Cited By ~ 3

Author(s):

P. S. Hiremath ◽

C. J. Prabhakar

Keyword(s):

Face Recognition ◽

Data Analysis ◽

Symbolic Data Analysis ◽

Symbolic Data ◽

Fisher Discriminant

Use of Pyramids in Symbolic Data Analysis

New Approaches in Classification and Data Analysis - Studies in Classification, Data Analysis, and Knowledge Organization ◽

10.1007/978-3-642-51175-2_43 ◽

1994 ◽

pp. 378-386 ◽

Cited By ~ 6

Author(s):

P. Brito

Keyword(s):

Data Analysis ◽

Symbolic Data Analysis ◽

Symbolic Data

Recursive Partition and Symbolic Data Analysis

New Approaches in Classification and Data Analysis - Studies in Classification, Data Analysis, and Knowledge Organization ◽

10.1007/978-3-642-51175-2_32 ◽

1994 ◽

pp. 277-284 ◽

Cited By ~ 1

Author(s):

A. Ciampi ◽

E. Diday ◽

J. Lebbe ◽

R. Vignes

Keyword(s):

Data Analysis ◽

Symbolic Data Analysis ◽

Symbolic Data ◽

Recursive Partition

Kernel Generative Topographic Mapping of Protein Sequences

Medical Applications of Intelligent Data Analysis - Advances in Medical Technologies and Clinical Practice ◽

10.4018/978-1-4666-1803-9.ch013 ◽

2012 ◽

pp. 195-208

Author(s):

M.I. Cardenas ◽

A. Vellido ◽

I. Olier ◽

X. Rovira ◽

J. Giraldo

Keyword(s):

Data Analysis ◽

Kernel Method ◽

Protein Sequences ◽

Topographic Mapping ◽

Complex Data ◽

Generative Topographic Mapping ◽

Protein Amino Acid ◽

Genomics And Proteomics ◽

The World ◽

Symbolic Sequences

The world of pharmacology is becoming increasingly dependent on advances in the fields of genomics and proteomics. The -omics sciences raise the challenge of how to deal, from an intelligent data analysis perspective, with the large amounts of complex data they generate. In this chapter, the authors focus on the analysis of a specific type of protein, the G protein-coupled receptors, which are the target for over 15% of current drugs. They describe a kernel method of the manifold learning family for the analysis of protein amino acid symbolic sequences. This method sheds light on the structure of protein subfamilies while providing an intuitive visualization of that structure.