How to Use Temporal-Driven Constrained Clustering to Detect Typical Evolutions

As the first step of many visual speech recognition and visual speaker authentication systems, robust and accurate lip region segmentation is of vital importance for lip image analysis. However, most current techniques break down on lip images with complex and inhomogeneous backgrounds, such as those containing mustaches and beards. To solve this problem, a Multi-class, Shape-guided FCM (MS-FCM) clustering algorithm is proposed in this chapter. In the proposed approach, one cluster is assigned to the lip region and a combination of multiple clusters to the background, which generally includes the skin region, lip shadow or beards. Using the spatial distribution of the lip cluster, a spatial penalty term that accounts for pixel location is incorporated into the objective function, so that pixels with similar color but located in different regions can be differentiated. Experimental results show that the proposed algorithm provides an accurate lip-background partition even for images with complex background features.
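For orientation, the fuzzy c-means (FCM) baseline that MS-FCM builds on can be sketched as follows. This is plain FCM only: the spatial penalty term and shape guidance described in the abstract are not reproduced, and the function name `fuzzy_c_means`, the fuzzifier `m = 2`, and the stopping rule are illustrative assumptions rather than details taken from the chapter.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Plain FCM on an (n_samples, n_features) array, e.g. pixel colors.

    Hypothetical helper: no spatial penalty, unlike the MS-FCM described above.
    """
    rng = np.random.default_rng(seed)
    # Random initial memberships; each row sums to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Cluster centers are membership-weighted means of the samples.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance from every sample to every center.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)  # avoid division by zero
        # Standard FCM membership update.
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

# Usage: one cluster could stand for the lip and several for the background.
rng = np.random.default_rng(1)
pixels = np.vstack([rng.normal(0.0, 0.1, (20, 3)), rng.normal(5.0, 0.1, (20, 3))])
centers, U = fuzzy_c_means(pixels, n_clusters=2)
labels = U.argmax(axis=1)
```

In a multi-cluster background setup, the background label would be the union of all non-lip clusters; how MS-FCM selects and merges them is specified in the chapter, not here.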

Gut mucosal virome alterations in ulcerative colitis

Gut ◽

10.1136/gutjnl-2018-318131 ◽

2019 ◽

Vol 68 (7) ◽

pp. 1169-1179 ◽

Cited By ~ 69

Author(s):

Tao Zuo ◽

Xiao-Juan Lu ◽

Yu Zhang ◽

Chun Pan Cheung ◽

Siu Lam ◽

...

Keyword(s):

Clustering Algorithm ◽

Intestinal Inflammation ◽

Rectal Mucosa ◽

Significant Loss ◽

Healthy Controls ◽

Virus Like Particle ◽

Geographical Regions ◽

Viral Communities ◽

First Time ◽

Gut Microbiota Dysbiosis

Objective: The pathogenesis of UC relates to gut microbiota dysbiosis. We postulate that alterations in the viral community populating the intestinal mucosa play an important role in UC pathogenesis. This study aims to characterise the mucosal virome and their functions in health and UC.

Design: Deep metagenomics sequencing of virus-like particle preparations and bacterial 16S rRNA sequencing were performed on the rectal mucosa of 167 subjects from three different geographical regions in China (UC=91; healthy controls=76). Virome and bacteriome alterations in UC mucosa were assessed and correlated with patient metadata. We applied partition around medoids clustering algorithm and classified mucosa viral communities into two clusters, referred to as mucosal virome metacommunities 1 and 2.

Results: In UC, there was an expansion of mucosa viruses, particularly Caudovirales bacteriophages, and a decrease in mucosa Caudovirales diversity, richness and evenness compared with healthy controls. Altered mucosal virome correlated with intestinal inflammation. Interindividual dissimilarity between mucosal viromes was higher in UC than controls. Escherichia phage and Enterobacteria phage were more abundant in the mucosa of UC than controls. Compared with metacommunity 1, metacommunity 2 was predominated by UC subjects and displayed a significant loss of various viral species. Patients with UC showed substantial abrogation of diverse viral functions, whereas multiple viral functions, particularly functions of bacteriophages associated with host bacteria fitness and pathogenicity, were markedly enriched in UC mucosa. Intensive transkingdom correlations between mucosa viruses and bacteria were significantly depleted in UC.

Conclusion: We demonstrated for the first time that UC is characterised by substantial alterations of the mucosa virobiota with functional distortion. Enrichment of Caudovirales bacteriophages, increased phage/bacteria virulence functions and loss of viral-bacterial correlations in the UC mucosa highlight that mucosal virome may play an important role in UC pathogenesis.
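The partition around medoids (PAM) step mentioned in the Design section can be illustrated with a small sketch. It is a simplified PAM (greedy BUILD, then first-improvement SWAP) operating on a precomputed dissimilarity matrix. The function name `pam` and the toy one-dimensional data are assumptions for illustration; the study's actual dissimilarity measure and software are not stated in the abstract.

```python
import numpy as np

def pam(D, k, max_iter=100):
    """Simplified PAM on a symmetric (n, n) dissimilarity matrix D."""
    n = D.shape[0]
    # BUILD: start with the point that has the smallest total dissimilarity.
    medoids = [int(np.argmin(D.sum(axis=1)))]
    while len(medoids) < k:
        nearest = D[:, medoids].min(axis=1)
        # Gain = total cost reduction if candidate c became a medoid.
        gains = [
            -1.0 if c in medoids else np.maximum(nearest - D[:, c], 0).sum()
            for c in range(n)
        ]
        medoids.append(int(np.argmax(gains)))

    def cost(meds):
        # Sum of each point's dissimilarity to its closest medoid.
        return D[:, meds].min(axis=1).sum()

    best = cost(medoids)
    # SWAP: accept any medoid/non-medoid exchange that lowers the cost.
    for _ in range(max_iter):
        improved = False
        for i in range(k):
            for c in range(n):
                if c in medoids:
                    continue
                trial = medoids[:i] + [c] + medoids[i + 1:]
                trial_cost = cost(trial)
                if trial_cost < best - 1e-12:
                    medoids, best, improved = trial, trial_cost, True
        if not improved:
            break
    labels = D[:, medoids].argmin(axis=1)
    return medoids, labels

# Toy usage: six samples on a line, forming two obvious groups.
points = np.array([0.0, 1.0, 2.0, 10.0, 11.0, 12.0])
D = np.abs(points[:, None] - points[None, :])
medoids, labels = pam(D, k=2)
```

Because PAM only needs pairwise dissimilarities, the same function would accept any community dissimilarity matrix in place of the toy absolute differences.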

Active Learning for Constrained Document Clustering with Uncertainty Region

Complexity ◽

10.1155/2020/3207306 ◽

2020 ◽

Vol 2020 ◽

pp. 1-16

Author(s):

M. A. Balafar ◽

R. Hazratgholizadeh ◽

M. R. F. Derakhshi

Keyword(s):

Iterative Process ◽

Clustering Algorithm ◽

State Of The Art ◽

Distance Matrix ◽

Support Vector ◽

Constrained Clustering ◽

Similarity Matrix ◽

Improve Accuracy ◽

Degree Of Similarity ◽

Accuracy And Stability

Constrained clustering is intended to improve accuracy and personalization based on constraints expressed by an oracle. In this paper, a new constrained clustering algorithm is proposed in which informative data pairs are selected during an iterative process. The pairs are presented to the oracle, which labels each relation as "must-link" (ML) or "cannot-link" (CL). In each iteration, a support vector machine (SVM) is first trained on the labels produced by the current clustering. A distance matrix is created from the distance of each document to the hyperplane, and a similarity matrix is created from the cosine similarity of the word2vector of each document. Two types of probability (similarity and degree of similarity) are calculated and smoothed for belonging to neighborhoods. A neighborhood consists of the samples that the oracle has labeled as belonging to the same cluster. Finally, at the end of each iteration, the data with a greater level of uncertainty (in terms of probability) are selected for querying the oracle. For evaluation, the proposed method is compared with well-known state-of-the-art methods on two criteria and over a standard dataset. The results demonstrate increased accuracy and stability with fewer questions.
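The notion of a neighborhood, that is, samples the oracle has declared to belong together, can be sketched with a union-find structure over must-link answers. This covers only the bookkeeping of ML and CL constraints. The SVM-based distances, the probability smoothing, and the uncertainty-driven pair selection from the paper are not reproduced, and both function names are hypothetical.

```python
def build_neighborhoods(n_samples, must_links):
    """Group sample indices connected by must-link answers (transitively)."""
    parent = list(range(n_samples))

    def find(i):
        # Find the representative of i, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in must_links:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    groups = {}
    for i in range(n_samples):
        groups.setdefault(find(i), []).append(i)
    # Only groups with at least two members count as neighborhoods here.
    return [g for g in groups.values() if len(g) > 1]

def violated_cannot_links(neighborhoods, cannot_links):
    """Return cannot-link pairs whose two samples share a neighborhood."""
    owner = {i: k for k, group in enumerate(neighborhoods) for i in group}
    return [
        (a, b) for a, b in cannot_links
        if a in owner and owner.get(a) == owner.get(b)
    ]

# Toy usage with six documents and a few oracle answers.
neighborhoods = build_neighborhoods(6, [(0, 1), (1, 2), (4, 5)])
conflicts = violated_cannot_links(neighborhoods, [(0, 2), (2, 4)])
```

Checking for conflicts this way is a design convenience for the sketch: it flags inconsistent oracle answers before they are passed on to a constrained clustering step.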

Optimal distributed interconnectivity of multi-robot systems by spatially-constrained clustering

Adaptive Behavior ◽

10.1177/1059712317700500 ◽

2017 ◽

Vol 25 (2) ◽

pp. 96-113 ◽

Cited By ~ 2

Author(s):

Matin Macktoobian ◽

Mahdi Aliyari Sh

Keyword(s):

Clustering Algorithm ◽

Task Assignment ◽

Constrained Clustering ◽

Loosely Coupled ◽

Robot Systems ◽

Multi Agent ◽

Probabilistic Proof ◽

Data Passing ◽

Spatially Constrained Clustering ◽

Multi Robot

A spatially-constrained clustering algorithm is presented in this paper. This algorithm is a distributed clustering approach that fine-tunes the optimal distances between agents of the system to strengthen the data passing among them using a set of spatial constraints. The method increases interconnectivity among agents and clusters, improving the overall communicative functionality of the multi-robot system. This strategy leads to the establishment of loosely-coupled connections among the clusters. These implicit interconnections mobilize the clusters to receive and transmit information within the multi-agent system. In other words, the algorithm assigns each agent to the cluster with the lowest cost of local communication with its peers. This research demonstrates, by probabilistic proof of the acquired optimality, that the presented decentralized method boosts the communicative agility of the swarm. Hence, the common assumption of full knowledge of the agents’ primary locations has been fully relaxed compared to former methods. Consequently, the algorithm’s reliability and efficiency are confirmed. Furthermore, the method’s efficacy in passing information will improve the functionality of higher-level swarm operations, such as task assignment and swarm flocking. Analytical investigations and simulations of highly-populated swarms support the claimed efficiency and coherence.

A spatially constrained clustering algorithm with no prior knowledge of the number of clusters

NeuroImage ◽

10.1016/s1053-8119(01)91404-1 ◽

2001 ◽

Vol 13 (6) ◽

pp. 61 ◽

Cited By ~ 1

Author(s):

Rita Almeida ◽

Anders Ledberg

Keyword(s):

Prior Knowledge ◽

Clustering Algorithm ◽

Constrained Clustering ◽

Number Of Clusters ◽

Spatially Constrained Clustering

OCEAN: A Non-Conventional Parameter Free Clustering Algorithm Using Relative Densities of Categories

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001421500178 ◽

2020 ◽

pp. 2150017

Author(s):

Iffat Gheyas ◽

Simon Parkinson ◽

Saad Khan

Keyword(s):

Clustering Algorithm ◽

Data Distribution ◽

Multidimensional Space ◽

Distance Metric ◽

Density Based Clustering ◽

Free Parameters ◽

The Difference ◽

Improved Performance ◽

Density Ratios

In this paper, we propose a fully autonomous density-based clustering algorithm named ‘Ocean’, which is inspired by the oceanic landscape and phenomena that occur in it. Ocean is an improvement over conventional algorithms regarding both distance metric and the clustering mechanism. Ocean defines the distance between two categories as the difference in the relative densities of categories. Unlike existing approaches, Ocean neither assigns the same distance to all pairs of categories, nor assigns arbitrary weights to matches and mismatches between categories that can lead to clustering errors. Ocean uses density ratios of adjacent regions in multidimensional space to detect the edges of the clusters. Ocean is robust against clusters of identical patterns. Unlike conventional approaches, Ocean neither makes any assumption regarding the data distribution within clusters, nor requires tuning of free parameters. Empirical evaluations demonstrate improved performance of Ocean over existing approaches.

Clustering Genes Using Heterogeneous Data Sources

International Journal of Knowledge Discovery in Bioinformatics ◽

10.4018/jkdb.2010040102 ◽

2010 ◽

Vol 1 (2) ◽

pp. 12-28 ◽

Cited By ~ 3

Author(s):

Erliang Zeng ◽

Chengyong Yang ◽

Tao Li ◽

Giri Narasimhan

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Incomplete Data ◽

Clustering Algorithm ◽

Biological Data ◽

Exploratory Analysis ◽

Data Sources ◽

Modular Organization ◽

Constrained Clustering ◽

Expression Data

Clustering of gene expression data is a standard exploratory technique used to identify closely related genes. Many other sources of data are also likely to be of great assistance in the analysis of gene expression data. These data provide a means to begin elucidating the large-scale modular organization of the cell. The authors consider the challenging task of developing exploratory analytical techniques to deal with multiple complete and incomplete information sources. The Multi-Source Clustering (MSC) algorithm developed here performs clustering with multiple, but complete, sources of data. To deal with incomplete data sources, the authors adopted the MPCK-means clustering algorithm to perform exploratory analysis on one complete source and other potentially incomplete sources provided in the form of constraints. This paper presents a new clustering algorithm, MSC, to perform exploratory analysis using two or more diverse but complete data sources; studies the effectiveness of constraint sets and the robustness of the constrained clustering algorithm using multiple sources of incomplete biological data; and incorporates such incomplete data into the constrained clustering algorithm in the form of constraint sets.

Clustering Genes Using Heterogeneous Data Sources

Computational Knowledge Discovery for Bioinformatics Research ◽

10.4018/978-1-4666-1785-8.ch005 ◽

2013 ◽

pp. 67-83

Author(s):

Erliang Zeng ◽

Chengyong Yang ◽

Tao Li ◽

Giri Narasimhan

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Incomplete Data ◽

Clustering Algorithm ◽

Clustering Algorithms ◽

Exploratory Analysis ◽

Data Sources ◽

Constrained Clustering ◽

Expression Data ◽

Multiple Sources

Clustering of gene expression data is a standard exploratory technique used to identify closely related genes. Many other sources of data are also likely to be of great assistance in the analysis of gene expression data. These data provide a means to begin elucidating the large-scale modular organization of the cell. The authors consider the challenging task of developing exploratory analytical techniques to deal with multiple complete and incomplete information sources. The Multi-Source Clustering (MSC) algorithm developed here performs clustering with multiple, but complete, sources of data. To deal with incomplete data sources, the authors adopted the MPCK-means clustering algorithm to perform exploratory analysis on one complete source and other potentially incomplete sources provided in the form of constraints. This paper presents a new clustering algorithm, MSC, to perform exploratory analysis using two or more diverse but complete data sources; studies the effectiveness of constraint sets and the robustness of the constrained clustering algorithm using multiple sources of incomplete biological data; and incorporates such incomplete data into the constrained clustering algorithm in the form of constraint sets.

Embedded System for Heart Disease Recognition using Fuzzy Clustering and Correlation

Multidisciplinary Computational Intelligence Techniques ◽

10.4018/978-1-4666-1830-5.ch019 ◽

2012 ◽

pp. 327-350

Author(s):

Helton Hugo de Carvalho Júnior ◽

Robson Luiz Moreno ◽

Tales Cleber Pimenta

Keyword(s):

Heart Disease ◽

Embedded System ◽

Fuzzy Clustering ◽

Clustering Algorithm ◽

Heart Diseases ◽

Significant Loss ◽

Data Set ◽

Viability Analysis ◽

ECG Signal Processing ◽

Electrocardiogram ECG

This chapter presents the viability analysis and development of an embedded system for heart disease identification. It reduces electrocardiogram (ECG) signal processing time by reducing the number of data samples without any significant loss. The goal of the developed system is the analysis of heart signals. The ECG signals are fed into the system, which performs an initial filtering and then uses a Gustafson-Kessel fuzzy clustering algorithm for signal classification and correlation. The classification indicates common heart diseases such as angina, myocardial infarction and coronary artery diseases. The system uses the European electrocardiogram ST-T Database (EDB) as a reference for tests and evaluation. The results show that the system can perform heart disease detection on a data set reduced from 213 to just 20 samples, a reduction to just 9.4% of the original set, while maintaining the same effectiveness. The system is validated on a Xilinx Spartan®-3A FPGA. The FPGA implements a Xilinx Microblaze® Soft-Core Processor running at a 50 MHz clock rate.
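The reported reduction figure can be checked with a line of arithmetic. The variable names below are illustrative; the two sample counts are the ones quoted in the abstract.

```python
# Sample counts quoted in the abstract.
original_samples = 213
reduced_samples = 20

# Share of the original data set that is retained after reduction.
retained_percent = 100 * reduced_samples / original_samples
print(f"{retained_percent:.1f}% of the original samples are retained")
```

Rounded to one decimal place, this matches the 9.4% stated above.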
