Detecting Differential Variable microRNAs via Model-Based Clustering

Mapping Intimacies ◽

10.1101/296947 ◽

2018 ◽

Author(s):

Xuan Li ◽

Yuejiao Fu ◽

Xiaogang Wang ◽

Dawn L. DeMeo ◽

Kelan Tantisira ◽

...

Keyword(s):

Dna Methylation ◽

Clustering Algorithm ◽

F Test ◽

Biologically Relevant ◽

Model Based Clustering ◽

Model Based ◽

Equal Variance ◽

Genomic Probes ◽

The Relationship ◽

Genomic Risk

Identifying genomic probes (e.g., DNA methylation marks) is becoming a new approach to detect novel genomic risk factors for complex human diseases. The F test is the standard equal-variance test in statistics. For high-throughput genomic data, the probe-wise F test has been successfully used to detect biologically relevant DNA methylation marks that have different variances between two groups of subjects (e.g., cases vs. controls). In addition to DNA methylation, microRNA is another mechanism of epigenetics. However, to the best of our knowledge, no studies have identified differentially variable (DV) microRNAs. In this article, we proposed a novel model-based clustering method to improve the power of the probe-wise F test to detect DV microRNAs. We imposed special structures on covariance matrices for each cluster of microRNAs based on the prior information about the relationship between variance in cases and variance in controls and about the independence among cases and controls. To the best of our knowledge, the proposed method is the first clustering algorithm that aims to detect DV genomic probes. Simulation studies showed that the proposed method outperformed the probe-wise F test and had certain robustness to the violation of the normality assumption. Based on two real datasets about human hepatocellular carcinoma (HCC), we identified 7 DV-only microRNAs (hsa-miR-1826, hsa-miR-191, hsa-miR-194-star, hsa-miR-222, hsa-miR-502-3p, hsa-miR-93, and hsa-miR-99b) using the proposed method, one (hsa-miR-1826) of which has not yet been reported to relate to HCC in the literature.

Detecting Differentially Variable MicroRNAs via Model-Based Clustering

International Journal of Genomics ◽

10.1155/2018/6591634 ◽

2018 ◽

Vol 2018 ◽

pp. 1-9

Author(s):

Xuan Li ◽

Yuejiao Fu ◽

Xiaogang Wang ◽

Dawn L. DeMeo ◽

Kelan Tantisira ◽

...

Keyword(s):

Dna Methylation ◽

New Approach ◽

Biologically Relevant ◽

Model Based Clustering ◽

Model Based ◽

Equal Variance ◽

Genomic Probes ◽

The Relationship ◽

Genomic Risk ◽

Novel Model

Identifying differentially variable (DV) genomic probes is becoming a new approach to detect novel genomic risk factors for complex human diseases. The F test is the standard equal-variance test in statistics. For high-throughput genomic data, the probe-wise F test has been successfully used to detect biologically relevant DNA methylation marks that have different variances between two groups of subjects (e.g., cases versus controls). In addition to DNA methylation, microRNA (miRNA) is another important mechanism of epigenetics. However, to the best of our knowledge, no studies have identified DV miRNAs. In this article, we proposed a novel model-based clustering method to improve the power of the probe-wise F test to detect DV miRNAs. We imposed special structures on covariance matrices for each cluster of miRNAs based on the prior information about the relationship between variances in cases and controls and about the independence among them. Simulation studies showed that the proposed method seems promising in detecting DV probes. Based on two real datasets about human hepatocellular carcinoma (HCC), we identified 7 DV-only miRNAs (hsa-miR-1826, hsa-miR-191, hsa-miR-194-star, hsa-miR-222, hsa-miR-502-3p, hsa-miR-93, and hsa-miR-99b) using the proposed method, one (hsa-miR-1826) of which has not yet been reported to be related to HCC in the literature.
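As a rough illustration of the probe-wise F test mentioned in the abstract above, the sketch below computes, for each probe, the ratio of the sample variance in cases to the sample variance in controls, together with the two degrees of freedom. It is only the baseline test, not the proposed model-based clustering method. The function name `probewise_f_statistics`, the dictionary layout, and the toy measurements are assumptions made for this example.

```python
from statistics import variance


def probewise_f_statistics(cases, controls):
    """Return {probe: (F, df_cases, df_controls)} for every probe.

    cases / controls: dict mapping a probe name to a list of measurements
    (hypothetical layout chosen for this sketch).
    F is the ratio of sample variances, cases over controls.
    """
    results = {}
    for probe in cases:
        x, y = cases[probe], controls[probe]
        f_stat = variance(x) / variance(y)  # unbiased (n - 1) sample variances
        results[probe] = (f_stat, len(x) - 1, len(y) - 1)
    return results


# Toy data: probe_A is far more variable in cases, probe_B is not.
cases = {"probe_A": [1.0, 5.0, 9.0, 13.0], "probe_B": [2.0, 3.0, 4.0, 5.0]}
controls = {"probe_A": [6.0, 7.0, 8.0, 9.0], "probe_B": [2.5, 3.5, 4.5, 5.5]}

f_stats = probewise_f_statistics(cases, controls)
for probe, (f_stat, df1, df2) in f_stats.items():
    print(probe, round(f_stat, 3), df1, df2)
```

Under normality and equal variances, each statistic follows an F distribution with the two listed degrees of freedom, so a two-sided p-value could be obtained from that distribution (for example with `scipy.stats.f`, which is not used here so that the sketch stays dependency-free).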

Scaling-Up Model-Based Clustering Algorithm by Working on Clustering Features

Intelligent Data Engineering and Automated Learning — IDEAL 2002 - Lecture Notes in Computer Science ◽

10.1007/3-540-45675-9_86 ◽

2002 ◽

pp. 569-575

Author(s):

Huidong Jin ◽

Kwong-Sak Leung ◽

Man-Leung Wong

Keyword(s):

Clustering Algorithm ◽

Scaling Up ◽

Model Based Clustering ◽

Model Based

Model-Based Clustering of DNA Methylation Array Data

Translational Bioinformatics - Computational and Statistical Epigenomics ◽

10.1007/978-94-017-9927-0_5 ◽

2015 ◽

pp. 91-123

Author(s):

Devin C. Koestler ◽

E. Andrés Houseman

Keyword(s):

Dna Methylation ◽

Methylation Array ◽

Array Data ◽

Model Based Clustering ◽

Model Based ◽

Dna Methylation Array

Data-driven scheduling for smart shop floor via reinforcement learning with model-based clustering algorithm

2021 IEEE 4th Advanced Information Management, Communicates, Electronic and Automation Control Conference (IMCEC) ◽

10.1109/imcec51613.2021.9482089 ◽

2021 ◽

Author(s):

Yuxin Li ◽

Wenbin Gu ◽

Xianliang Wang ◽

Zeyu Chen

Keyword(s):

Reinforcement Learning ◽

Clustering Algorithm ◽

Data Driven ◽

Shop Floor ◽

Model Based Clustering ◽

Model Based

On Hierarchical Linguistic-Based Clustering

Journal of Advanced Computational Intelligence and Intelligent Informatics ◽

10.20965/jaciii.2015.p0900 ◽

2015 ◽

Vol 19 (6) ◽

pp. 900-906 ◽

Cited By ~ 1

Author(s):

Naohiko Kinoshita ◽

Yasunori Endo ◽

Akira Sugawara ◽

...

Keyword(s):

Data Structure ◽

Data Structures ◽

Clustering Algorithm ◽

Clustering Algorithms ◽

Fuzzy Reasoning ◽

Unsupervised Classification ◽

Clustering Techniques ◽

Model Based Clustering ◽

Model Based ◽

Soft Computing Techniques

Clustering is a representative form of unsupervised classification. Many researchers have proposed clustering algorithms based on mathematical models, methods we call model-based clustering. Clustering techniques are very useful for determining data structures, but model-based clustering is difficult to apply correctly because a suitable method cannot be selected unless the data structure is at least partially known. The new clustering algorithm we propose introduces soft computing techniques such as fuzzy reasoning in what we call linguistic-based clustering, whose features do not depend on the data structure. We verify the method’s effectiveness through numerical examples.

An Ultra-Fast Method for Clustering of Big Genomic Data

International Journal of Applied Metaheuristic Computing ◽

10.4018/ijamc.2020010104 ◽

2020 ◽

Vol 11 (1) ◽

pp. 45-60

Author(s):

Billel Kenidra ◽

Mohamed Benmohammed

Keyword(s):

Dna Methylation ◽

Large Scale ◽

Clustering Algorithm ◽

Clustering Algorithms ◽

Computational Time ◽

Fast Method ◽

Running Time ◽

Cancer Subtypes ◽

Biologically Relevant ◽

High Dimensional Datasets

Clustering is used to identify cancer subtypes from gene expression and DNA methylation datasets. Cancer subtype information is critically important for understanding tumor heterogeneity: detecting previously unknown clusters of biological samples, which are usually associated with unknown types of cancer, in turn makes it possible to prescribe more effective treatments for patients, because cancer subtypes often respond differently to the same treatment. Because DNA methylation datasets are extremely large, running time remains a major challenge. Traditional clustering algorithms are too slow to handle high-dimensional biological datasets and usually require large amounts of computational time. The proposed clustering algorithm outperforms the others in terms of running time: it rapidly identifies a set of biologically relevant clusters in large-scale DNA methylation datasets, and its superiority has been demonstrated in terms of relative speed.

Rotation invariant face detection using a model-based clustering algorithm

2000 IEEE International Conference on Multimedia and Expo. ICME2000. Proceedings. Latest Advances in the Fast Changing World of Multimedia (Cat. No.00TH8532) ◽

10.1109/icme.2000.871564 ◽

2002 ◽

Cited By ~ 3

Author(s):

Byeong Hwan Jeon ◽

Sang Uk Lee ◽

Kyung Mu Lee

Keyword(s):

Face Detection ◽

Clustering Algorithm ◽

Rotation Invariant ◽

Model Based Clustering ◽

Model Based

A robust Hidden Markov Model based clustering algorithm

2011 6th IEEE Joint International Information Technology and Artificial Intelligence Conference ◽

10.1109/itaic.2011.6030325 ◽

2011 ◽

Cited By ~ 1

Author(s):

Shitong Yao

Keyword(s):

Markov Model ◽

Hidden Markov Model ◽

Clustering Algorithm ◽

Hidden Markov ◽

Model Based Clustering ◽

Model Based

A Gaussian Mixture Model-based clustering algorithm for image segmentation using dependable spatial constraints

2010 3rd International Congress on Image and Signal Processing ◽

10.1109/cisp.2010.5647653 ◽

2010 ◽

Cited By ~ 4

Author(s):

Weiling Cai ◽

Lei Lei ◽

Ming Yang

Keyword(s):

Image Segmentation ◽

Gaussian Mixture Model ◽

Mixture Model ◽

Clustering Algorithm ◽

Gaussian Mixture ◽

Spatial Constraints ◽

Model Based Clustering ◽

Model Based

MBMM: Moment Estimating Beta Mixture Model-Based Clustering Algorithm for m6A Co-methylation Module Mining

Current Bioinformatics ◽

10.2174/1574893616666210629143411 ◽

2021 ◽

Vol 16 ◽

Author(s):

Zhaoyang Liu ◽

Hongsheng Yin ◽

Shutao Chen ◽

Hui Liu ◽

Jia Meng ◽

...

Keyword(s):

Mixture Model ◽

Clustering Algorithm ◽

Biological Significance ◽

Real Data ◽

Small Sample ◽

Simulation Research ◽

Model Based Clustering ◽

Model Based ◽

Specific Analysis ◽

Beta Mixture Model

Background: m6A methylation is a ubiquitous post-transcriptional modification in mammals. MeRIP-seq technology makes it possible to acquire m6A data across the whole transcriptome under different conditions. The specific regulation by enzymes presents as co-methylation modules in m6A methylation level data. Mining these co-methylation modules can therefore help to unveil the mechanism of m6A methylation modification and its role in the occurrence and development of complex diseases such as cancer. Objective: To develop a clustering algorithm that can effectively mine m6A co-methylation modules. Method: In this study, a novel beta mixture model-based clustering algorithm named MBMM was proposed, which is based on the EM framework and uses method-of-moments estimation in the M-step for parameter estimation to handle high-dimensional, small-sample m6A data. Simulation studies were used to evaluate the clustering performance of the proposed algorithm, which was then applied to mine co-methylation modules from real data. Biological significance correlation analysis was employed to explore whether the clustering results are co-methylation modules. Results and Conclusion: Simulation studies demonstrated that MBMM outperformed other clustering algorithms. In real data, seven co-methylation modules were found by MBMM. Specific analysis of six m6A-related pathways showed that six co-methylation modules were enriched in the pathways and differed from one another. Substrate-specific analysis of five enzymes revealed that the seven co-methylation modules showed varying degrees of enrichment. Gene Ontology enrichment analysis indicated that these modules may be regulated by enzymes while having potential functional specificity.
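The idea of an EM loop whose M-step uses method-of-moments estimates, as described in the abstract above, can be sketched for a simple one-dimensional beta mixture. This is not the MBMM implementation: the univariate setting, the function names, the initial parameters, and the toy data are all assumptions made for illustration, and the sketch assumes every value lies strictly between 0 and 1 and that no component becomes empty.

```python
import math


def beta_logpdf(x, a, b):
    """Log density of Beta(a, b) at x, for 0 < x < 1."""
    return (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
            + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def moment_estimate(mean, var):
    """Method-of-moments estimate of (a, b) from a mean and a variance."""
    common = mean * (1 - mean) / var - 1
    return mean * common, (1 - mean) * common


def fit_beta_mixture(data, init_params, n_iter=50):
    """EM for a univariate beta mixture with a moment-based M-step."""
    k = len(init_params)
    params = list(init_params)
    weights = [1.0 / k] * k
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in data:
            logs = [math.log(weights[j]) + beta_logpdf(x, *params[j])
                    for j in range(k)]
            top = max(logs)
            exps = [math.exp(v - top) for v in logs]
            total = sum(exps)
            resp.append([e / total for e in exps])
        # M-step: weighted mean and variance, then moment matching.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            mean = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - mean) ** 2 for r, x in zip(resp, data)) / nj
            var = max(var, 1e-6)  # guard against a degenerate zero variance
            params[j] = moment_estimate(mean, var)
            weights[j] = nj / len(data)
    return weights, params, resp


# Toy methylation-level-like values forming a low and a high group.
data = [0.05, 0.08, 0.10, 0.12, 0.15, 0.85, 0.88, 0.90, 0.92, 0.95]
weights, params, resp = fit_beta_mixture(data, [(2.0, 8.0), (8.0, 2.0)])
labels = [max(range(len(r)), key=r.__getitem__) for r in resp]
print(weights, params, labels)
```

Moment matching replaces the numerical maximisation that a maximum-likelihood M-step would need for beta parameters, which is the appeal of this kind of update when samples are few.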