Data Normalization | ScienceGate

Jaya Ant Lion Optimization-Driven Deep Recurrent Neural Network for Cancer Classification Using Gene Expression Data (Preprint)

10.2196/preprints.25962

2020

Author(s): Ramachandro Majji

Keyword(s): Neural Network, Recurrent Neural Network, Early Stage, Data Transformation, Cancer Classification, Data Normalization, Ant Lion Optimization, Deep Recurrent Neural Network, Novel Strategy, Feature Dimension

BACKGROUND: Cancer is one of the deadliest diseases worldwide, and patients can often be saved only when the cancer is detected at a very early stage. Early detection is essential because the chance of survival in the final stage is limited. The symptoms of cancer can be severe, so all symptoms should be studied carefully before diagnosis. OBJECTIVE: To propose an automatic prediction system for classifying cancer as malignant or benign. METHODS: This paper introduces a novel strategy for cancer classification based on a Jaya Ant Lion Optimization-based deep recurrent neural network (JayaALO-based DeepRNN). The developed model follows four steps: data normalization, data transformation, feature dimension detection, and classification. The first step is data normalization, whose goal is to eliminate data redundancy and to avoid storing the same information in several places in a relational database. Next, the data are transformed with a log transformation, which makes patterns more interpretable, helps satisfy modeling assumptions, and reduces skew. Non-negative matrix factorization is then employed to reduce the feature dimension. Finally, the proposed JayaALO-based DeepRNN classifies cancer based on the reduced-dimension features. RESULTS: The proposed JayaALO-based DeepRNN showed improved results, with a maximal accuracy of 95.97%, a maximal sensitivity of 95.95%, and a maximal specificity of 96.96%. CONCLUSIONS: The resulting output of the proposed JayaALO-based DeepRNN is used for cancer classification.
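The first two preprocessing steps named in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract gives no formulas, so per-gene min-max scaling stands in for "data normalization" and log(1 + x) stands in for the "log transformation". The function names, the `skewness` helper, and the toy expression matrix are all hypothetical. Non-negative matrix factorization and the JayaALO-based DeepRNN classifier are omitted.

```python
import math


def min_max_normalize(matrix):
    """Scale each column (gene) of a samples-by-genes matrix to [0, 1].

    Assumption: min-max scaling is used here only as one common choice;
    the abstract does not say which normalization the authors applied.
    """
    scaled_columns = []
    for col in zip(*matrix):
        lo, hi = min(col), max(col)
        span = hi - lo
        # A constant column carries no information; map it to 0.0.
        scaled_columns.append([(v - lo) / span if span else 0.0 for v in col])
    return [list(row) for row in zip(*scaled_columns)]


def log_transform(matrix):
    """Apply log(1 + x) element-wise (assumed form of the log transformation)."""
    return [[math.log1p(v) for v in row] for row in matrix]


def skewness(values):
    """Population skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical toy data: 4 samples (rows) by 2 genes (columns).
expression = [
    [1.0, 200.0],
    [2.0, 400.0],
    [4.0, 800.0],
    [64.0, 12800.0],
]

# Steps in the order the abstract lists them: normalize, then transform.
normalized = min_max_normalize(expression)
transformed = log_transform(normalized)

# Illustrate the skew-reduction claim on one raw, right-skewed gene.
raw_gene = [row[0] for row in expression]
skew_before = skewness(raw_gene)
skew_after = skewness([math.log1p(v) for v in raw_gene])
```

On this toy gene, `skew_after` is smaller than `skew_before`, which is the effect the abstract attributes to the log transformation. How strongly skew is reduced depends on the data and on whether the transform is applied before or after scaling.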


Active sonar reverberation suppression based on beam space data normalization

2017 IEEE International Conference on Signal Processing, Communications and Computing (ICSPCC)

10.1109/icspcc.2017.8242471

2017

Author(s): Jun Wang, Chao Wang, Tao Cheng

Keyword(s): Data Normalization, Active Sonar, Space Data


Leveraging the UMLS As a Data Standard for Rare Disease Data Normalization and Harmonization

Methods of Information in Medicine

10.1055/s-0040-1718940

2020

Author(s): Qian Zhu, Dac-Trung Nguyen, Eric Sid, Anne Pariser

Keyword(s): Rare Disease, Rare Diseases, Mendelian Inheritance, Data Normalization, Community Settings, Unified Medical Language System, Data Standard, Medical Language, Disease Concepts, High Level

Abstract. Objective: In this study, we aimed to evaluate the capability of the Unified Medical Language System (UMLS) as a data standard to support data normalization and harmonization of datasets that have been developed for rare diseases. Through analysis of data mappings between multiple rare disease resources and the UMLS, we propose extensions of the UMLS that will enable its adoption as a global standard in rare disease. Methods: We analyzed data mappings between the UMLS and existing datasets on over 7,000 rare diseases that were retrieved from four publicly accessible resources: Genetic and Rare Diseases Information Center (GARD), Orphanet, Online Mendelian Inheritance in Man (OMIM), and the Monarch Disease Ontology (MONDO). Two types of disease mappings were assessed: (1) curated mappings extracted from those four resources and (2) established mappings generated by querying the rare disease-based integrative knowledge graph developed in a previous study. Results: We found that 100% of OMIM concepts and over 50% of concepts from GARD, MONDO, and Orphanet were normalized by the UMLS and accurately categorized into the appropriate UMLS semantic groups. We analyzed 58,636 UMLS mappings, which resulted in 3,876 UMLS concepts across these resources. Manual evaluation of a random set of 500 UMLS mappings demonstrated a high level of accuracy (99%); the accurate mappings consisted of 414 synonyms (82.8%), 76 subtypes (15.2%), and five siblings (1%). Conclusion: The mapping results in this study illustrate that the UMLS was able to accurately represent rare disease concepts and their associated information, such as genes and phenotypes, and can effectively be used to support data harmonization across existing resources that collect rare disease data. We recommend the adoption of the UMLS as a data standard for rare disease to enable the existing rare disease datasets to support future applications in clinical and community settings.
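The percentages reported for the manual evaluation can be checked with simple arithmetic. The sketch below assumes that the 99% accuracy figure is the combined share of synonym, subtype, and sibling mappings among the 500 sampled mappings; that reading is an interpretation, not something the abstract states explicitly. The variable and function names are illustrative only.

```python
# Counts reported in the abstract for the 500 manually evaluated mappings.
sample_size = 500
counts = {"synonym": 414, "subtype": 76, "sibling": 5}


def percentage(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


# Share of each mapping type within the evaluated sample.
shares = {kind: percentage(n, sample_size) for kind, n in counts.items()}

# Assumed reading: every synonym, subtype, or sibling mapping counts as accurate.
accurate = sum(counts.values())
accuracy = percentage(accurate, sample_size)
```

Under that reading, 495 of the 500 mappings are accurate, which gives 99.0% and matches the reported figure; the individual shares come out to 82.8%, 15.2%, and 1.0%.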
