A Graph Clustering Algorithm for the Homology Detection

To detect homologous files (files containing plagiarism) among a large number of source program samples, a new graph-based cluster detection algorithm is proposed. The algorithm is divided into two phases. In the first phase, the pairwise similarity of the sample program files is calculated based on program keywords. In the second phase, a graph clustering algorithm is applied to the results of the first phase, so that homologous files (files containing plagiarism) form a cluster. The simulation results show that the algorithm improves the detection rate compared with the traditional homologous file detection algorithm and can determine which files are homologous.
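The two phases described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the keyword list, the cosine similarity measure, the 0.9 threshold, and the use of connected components as the graph clustering step are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
import re
from collections import Counter

# Hypothetical keyword list; the abstract does not say which keywords are used.
KEYWORDS = {"if", "else", "for", "while", "return", "int", "float", "break"}

def keyword_vector(source):
    """Count how often each keyword occurs in one source file."""
    tokens = re.findall(r"[A-Za-z_]\w*", source)
    return Counter(t for t in tokens if t in KEYWORDS)

def cosine(a, b):
    """Cosine similarity of two keyword-count vectors (0.0 if either is empty)."""
    dot = sum(a[k] * b[k] for k in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def homologous_clusters(sources, threshold=0.9):
    """Phase 1: pairwise similarity. Phase 2: connected components of the
    graph whose edges join files with similarity >= threshold (assumed)."""
    names = sorted(sources)
    vectors = {n: keyword_vector(sources[n]) for n in names}
    parent = {n: n for n in names}  # union-find forest

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if cosine(vectors[a], vectors[b]) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    # Only groups with at least two files indicate suspected homology.
    return sorted(g for g in groups.values() if len(g) > 1)
```

A file that shares no edge with any other file stays in its own singleton group and is not reported.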


Distributed Entropy Energy-Efficient Clustering algorithm for cluster head selection (DEEEC)

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-189135 ◽

2020 ◽

Vol 39 (6) ◽

pp. 8139-8147

Author(s):

Ranganathan Arun ◽

Rangaswamy Balamurugan

Keyword(s):

Energy Efficient ◽

Clustering Algorithm ◽

Cluster Head ◽

Residual Energy ◽

Energy Utilization ◽

Sensor Nodes ◽

Second Stage ◽

Energy Efficient Clustering ◽

Two Stages ◽

Ch Selection

In Wireless Sensor Networks (WSNs), the energy of sensor nodes is limited. To extend the lifetime of a WSN, it is essential to minimize energy utilization. Using a group head, or Cluster Head (CH), with higher energy to aggregate the network is a prominent method for extending the lifetime of a WSN. Intra-cluster and inter-cluster communication both depend on the CH, so the energy level of the CH governs the lifetime of its cluster. When developing clustering algorithms, the difficult task is to determine the energy utilization of heterogeneous WSNs (HWSNs). Building on Chaotic Firefly Algorithm CH (CFACH) selection, the proposed work is named the “Novel Distributed Entropy Energy-Efficient Clustering Algorithm” (DEEEC) for HWSNs. The proposed DEEEC algorithm, a CH selection scheme, has two main stages. In the first stage, temporary CHs and their entropy values are identified using a measure relating residual energy to original energy. In addition, each sensor node must automatically predict the rotating epoch and its entropy value. In the second stage, if any member of the cluster has larger residual energy, the temporary CHs are revised in the direction of the deciding set. Through these two stages of CH selection, nodes with higher energy have a higher probability of becoming CHs. The DEEEC algorithm is simulated in MATLAB. The simulation results show that DEEEC performs well in terms of energy and increased lifetime when compared with the traditional clustering protocols currently used in heterogeneous WSNs.


Community Detection Based on Graph Representation Learning in Evolutionary Networks

Applied Sciences ◽

10.3390/app11104497 ◽

2021 ◽

Vol 11 (10) ◽

pp. 4497

Author(s):

Dongming Chen ◽

Mingshuo Nie ◽

Jie Wang ◽

Yun Kong ◽

Dongqi Wang ◽

...

Keyword(s):

Community Detection ◽

Network Structure ◽

Clustering Algorithm ◽

Laplacian Matrix ◽

Representation Learning ◽

Detection Algorithm ◽

Graph Representation ◽

Time Slice ◽

Current Time ◽

Evolutionary Networks

To analyze the temporal structures in evolutionary networks, we propose a community detection algorithm based on graph representation learning. The proposed algorithm employs a Laplacian matrix to obtain the node relationship information of the directly connected edges of the network structure at the previous time slice. A deep sparse autoencoder then learns to represent the network structure under the current time slice, and the K-means clustering algorithm is used to partition the low-dimensional feature matrix of the network structure under the current time slice into communities. Experiments on three real datasets show that the proposed algorithm outperformed the baselines regarding effectiveness and feasibility.
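As a small illustration of the Laplacian step only, the sketch below builds the unnormalized graph Laplacian L = D - A for an undirected simple graph. The abstract does not say which Laplacian variant the authors use, so this choice, along with the function name and the edge-list input format, is an assumption.

```python
def laplacian(n, edges):
    """Unnormalized Laplacian L = D - A of an undirected simple graph.

    n     -- number of nodes, labeled 0 .. n-1
    edges -- iterable of (u, v) pairs with u != v (no self-loops assumed)
    """
    adj = [[0] * n for _ in range(n)]
    for u, v in edges:
        adj[u][v] = 1
        adj[v][u] = 1
    degree = [sum(row) for row in adj]
    # Diagonal holds node degrees; off-diagonal entries are -1 on edges.
    return [
        [(degree[i] if i == j else 0) - adj[i][j] for j in range(n)]
        for i in range(n)
    ]
```

Each row of the result sums to zero, which is a quick sanity check on the construction.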


Practical limitations of lane detection algorithm based on Hough transform in challenging scenarios

International Journal of Advanced Robotic Systems ◽

10.1177/17298814211008752 ◽

2021 ◽

Vol 18 (2) ◽

pp. 172988142110087

Author(s):

Qiao Huang ◽

Jinlong Liu

Keyword(s):

Hough Transform ◽

Detection Rate ◽

Real Life ◽

Weather Conditions ◽

Autonomous Driving ◽

Detection Algorithm ◽

Lane Detection ◽

Search Range ◽

Range Restriction ◽

Lighting Conditions

The vision-based road lane detection technique plays a key role in driver assistance systems. While existing lane recognition algorithms have demonstrated detection rates over 90%, validation tests were usually conducted on limited scenarios. Significant gaps still exist when they are applied in real-life autonomous driving. The goal of this article was to identify these gaps and to suggest research directions that can bridge them. The straight lane detection algorithm based on the linear Hough transform (HT) was used in this study as an example to evaluate possible perception issues under challenging scenarios, including various road types, different weather conditions and shade, changing lighting conditions, and so on. The study found that the HT-based algorithm presented an acceptable detection rate in simple backgrounds, such as driving on a highway or conditions showing distinguishable contrast between lane boundaries and their surroundings. However, it failed to recognize road dividing lines under varied lighting conditions. The failure was attributed to the binarization process failing to extract lane features before detection. In addition, the existing HT-based algorithm was disturbed by lane-like interference, such as guardrails, railways, bikeways, utility poles, pedestrian sidewalks, buildings, and so on. Overall, these findings support the need for further improvement of current road lane detection algorithms so that they are robust against interference and illumination variations. Moreover, the widely used algorithm has the potential to raise the lane boundary detection rate if an appropriate search range restriction and illumination classification process is added.
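For readers unfamiliar with the linear Hough transform, here is a minimal pure-Python sketch of the voting step, assuming the image has already been binarized into a list of foreground pixel coordinates. The function names and the `theta_degrees` parameter are hypothetical. The parameter is included only to suggest how a search range restriction of the kind the abstract mentions might be expressed; it is not the article's method.

```python
import math

def hough_accumulator(points, theta_degrees=range(0, 180)):
    """Vote in (rho, theta) space using rho = x*cos(theta) + y*sin(theta).

    points        -- iterable of (x, y) foreground pixels (binarized input)
    theta_degrees -- candidate line angles in whole degrees; narrowing this
                     range restricts the search (illustrative assumption)
    """
    votes = {}
    for x, y in points:
        for deg in theta_degrees:
            theta = math.radians(deg)
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            votes[(rho, deg)] = votes.get((rho, deg), 0) + 1
    return votes

def strongest_line(points, theta_degrees=range(0, 180)):
    """Return the (rho, theta in degrees) cell that received the most votes."""
    votes = hough_accumulator(points, theta_degrees)
    return max(votes, key=votes.get)
```

Because every foreground pixel votes, lane-like structures such as guardrails also accumulate votes, which is consistent with the interference problem described above.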


WLeidenRDF: RDF Data Query Method based on Semantic-Enhanced Graph-Clustering Algorithm

2020 International Symposium on Theoretical Aspects of Software Engineering (TASE) ◽

10.1109/tase49443.2020.00014 ◽

2020 ◽

Author(s):

Liu Yang ◽

Zhou Chen ◽

Yiqing Feng ◽

Zhifang Liao ◽

Zhigang Hu ◽

...

Keyword(s):

Clustering Algorithm ◽

Graph Clustering ◽

Data Query ◽

Rdf Data


Border Detection in Skin Lesion Images Using an Improved Clustering Algorithm

International Journal of e-Collaboration ◽

10.4018/ijec.2020100102 ◽

2020 ◽

Vol 16 (4) ◽

pp. 15-29

Author(s):

Jayalakshmi D. ◽

Dheeba J.

Keyword(s):

Skin Cancer ◽

Skin Lesion ◽

Clustering Algorithm ◽

Median Filter ◽

Second Phase ◽

Border Detection ◽

Early Computer ◽

Segmentation Methods ◽

Two Phases ◽

Skin Cancer Detection

The incidence of skin cancer has been increasing in recent years, and it can become dangerous if not detected early. Computer-aided diagnosis systems can assist dermatologists with skin cancer detection by examining the features more critically. In this article, a detailed review of pre-processing and segmentation methods for skin lesion images is carried out by investigating existing and prevalent segmentation methods for the diagnosis of skin cancer. The pre-processing stage is divided into two phases: in the first phase, a median filter is used to remove artifacts, and in the second phase, an improved K-means clustering with outlier removal (KMOR) algorithm is suggested. The proposed method was tested on the publicly available Danderm database. The improved cluster-based algorithm gives an accuracy of 92.8%, with a sensitivity of 93%, a specificity of 90%, and an AUC value of 0.90435. From the experimental results, it is evident that the clustering algorithm performed well in detecting the border of the lesion and is suitable for pre-processing dermoscopic images.
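The first pre-processing phase can be illustrated with a small median filter over a grayscale image stored as a list of rows. This is a generic sketch, not the article's code: the square window, the `radius` parameter, and the edge handling (replicating border pixels) are assumptions.

```python
from statistics import median

def median_filter(image, radius=1):
    """Replace each pixel with the median of its (2*radius+1)^2 neighborhood.

    image  -- non-empty list of equal-length rows of numeric pixel values
    radius -- half-width of the square window (assumed; 1 gives a 3x3 window)
    Border pixels are handled by clamping indices, i.e. edge replication.
    """
    height, width = len(image), len(image[0])
    result = [[0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            window = [
                image[min(max(r + dr, 0), height - 1)][min(max(c + dc, 0), width - 1)]
                for dr in range(-radius, radius + 1)
                for dc in range(-radius, radius + 1)
            ]
            result[r][c] = median(window)
    return result
```

An isolated bright pixel, such as a thin hair or specular highlight, is outvoted by its neighbors and disappears, while larger uniform regions are left unchanged.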


Efficient Vector Partitioning Algorithms for Graph Clustering

Journal of Data Intelligence ◽

10.26421/jdi1.2-1 ◽

2020 ◽

Vol 1 (2) ◽

pp. 101-123

Author(s):

Hiroaki Shiokawa ◽

Yasunori Futamura

Keyword(s):

Social Networks ◽

Large Scale ◽

Clustering Algorithm ◽

Ground Truth ◽

Graph Clustering ◽

Mining Communities ◽

Fine Grained ◽

Efficient Vector ◽

Public Datasets ◽

Many Core

This paper addresses the problem of finding clusters in graph-structured data such as Web graphs and social networks. Graph clustering is one of the fundamental techniques for understanding the structures present in such complex graphs. In the Web and data mining communities, the modularity-based graph clustering algorithm is successfully used in many applications. However, it is difficult for modularity-based methods to find fine-grained clusters hidden in large-scale graphs; the methods fail to reproduce the ground truth. In this paper, we present a novel modularity-based algorithm, CAV, that shows better clustering results than the traditional algorithm. The proposed algorithm incorporates cohesiveness-aware vector partitioning into graph spectral analysis to improve the clustering accuracy. Additionally, this paper presents a novel efficient algorithm, P-CAV, for further improving the clustering speed of CAV; P-CAV is an extension of CAV that utilizes thread-based parallelization on a many-core CPU. Our extensive experiments on synthetic and public datasets demonstrate the performance superiority of our approaches over the state-of-the-art approaches.


Self-Adaptive K-Means Based on a Covering Algorithm

Complexity ◽

10.1155/2018/7698274 ◽

2018 ◽

Vol 2018 ◽

pp. 1-16 ◽

Cited By ~ 1

Author(s):

Yiwen Zhang ◽

Yuanyuan Zhou ◽

Xing Guo ◽

Jintao Wu ◽

Qiang He ◽

...

Keyword(s):

Large Scale ◽

Clustering Algorithm ◽

Real Data ◽

Second Phase ◽

Data Sets ◽

Number Of Clusters ◽

Large Scale Data ◽

Long Time ◽

Two Phases ◽

Selection Of

The K-means algorithm is one of the ten classic algorithms in the area of data mining and has been studied by researchers in numerous fields for a long time. However, the value of the clustering number k in the K-means algorithm is not always easy to determine, and the selection of the initial centers is vulnerable to outliers. This paper proposes an improved K-means clustering algorithm called the covering K-means algorithm (C-K-means). The C-K-means algorithm can not only acquire efficient and accurate clustering results but also self-adaptively provide a reasonable number of clusters based on the data features. It includes two phases: the initialization of the covering algorithm (CA) and the Lloyd iteration of the K-means. The first phase executes the CA. CA self-organizes and recognizes the number of clusters k based on the similarities in the data, and it requires neither the number of clusters to be prespecified nor the initial centers to be manually selected. Therefore, it has a “blind” feature, that is, k is not preselected. The second phase performs the Lloyd iteration based on the results of the first phase. The C-K-means algorithm combines the advantages of CA and K-means. Experiments are carried out on the Spark platform, and the results verify the good scalability of the C-K-means algorithm. This algorithm can effectively solve the problem of large-scale data clustering. Extensive experiments on real data sets show that the accuracy and efficiency of the C-K-means algorithm outperform those of the existing algorithms under both sequential and parallel conditions.
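The two-phase structure can be sketched as follows. The abstract does not describe how the covering algorithm works internally, so the first phase here is a plainly different stand-in: a greedy radius-based cover that picks a new center whenever a point is farther than `radius` from all existing centers. The `radius` parameter and all function names are hypothetical. Only the second phase, the Lloyd iteration, is the standard K-means update.

```python
import math

def greedy_cover(points, radius):
    """Stand-in for the covering phase (assumption, not the paper's CA):
    choose centers so that every point lies within `radius` of some center.
    The number of centers found becomes k; it is not preselected."""
    centers = []
    for p in points:
        if all(math.dist(p, c) > radius for c in centers):
            centers.append(p)
    return centers

def lloyd(points, centers, max_iterations=100):
    """Standard Lloyd iteration: assign points to the nearest center, then
    move each center to the mean of its assigned points."""
    for _ in range(max_iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        updated = [
            tuple(sum(axis) / len(members) for axis in zip(*members)) if members else old
            for members, old in zip(clusters, centers)
        ]
        if updated == centers:  # converged: no center moved
            break
        centers = updated
    return centers

def covering_kmeans(points, radius):
    """Phase 1 determines k and the initial centers; phase 2 refines them."""
    return lloyd(points, greedy_cover(points, radius))
```

Unlike the covering algorithm described above, this stand-in still needs a radius to be chosen and does nothing special about outliers; it only shows how a first phase can hand both k and the initial centers to the Lloyd iteration.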


A New Echo Cancellation Algorithm without Double-Talk Detection Based on Correlation Function

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.225-226.996 ◽

2011 ◽

Vol 225-226 ◽

pp. 996-999

Author(s):

Li Jun Sun ◽

Shou Yong Zhang ◽

Wei Sheng Wang ◽

Xiao Ning Zhang

Keyword(s):

Computer Simulation ◽

Correlation Function ◽

Adaptive Filter ◽

Detection Algorithm ◽

Echo Cancellation ◽

Acoustic Echo Cancellation ◽

Error Signal ◽

Echo Canceller ◽

Filter Performance ◽

Simulation Results

In an adaptive echo canceller, a detection algorithm able to distinguish echo path change (EPC) from double-talk (DT) is vital to ensure that the adaptive filter tap coefficients are updated in case of EPC and frozen during the DT period. The paper presents a new echo cancellation algorithm that protects the adaptive filter performance during double-talk in acoustic echo cancellation for teleconferencing without requiring a detector. A judgment value, composed of the correlation of the far-end signal and the near-end received signal and the pre-correlation of the error signal, is used directly in the iteration formula to control the iteration speed of the filter. The computer simulation results verify that the algorithm has good double-talk protection performance and is useful and efficient in distinguishing EPC from DT, with lower computational complexity than comparable algorithms.


A Local Graph Clustering Algorithm for Discovering Subgoals in Reinforcement Learning

Communication and Networking - Communications in Computer and Information Science ◽

10.1007/978-3-642-17604-3_5 ◽

2010 ◽

pp. 41-50 ◽

Cited By ~ 1

Author(s):

Negin Entezari ◽

Mohammad Ebrahim Shiri ◽

Parham Moradi

Keyword(s):

Reinforcement Learning ◽

Clustering Algorithm ◽

Graph Clustering ◽

Local Graph


Detection of Anomalies in Water Networks by Functional Data Analysis

Mathematical Problems in Engineering ◽

10.1155/2018/5129735 ◽

2018 ◽

Vol 2018 ◽

pp. 1-13 ◽

Cited By ~ 8

Author(s):

Laura Millán-Roures ◽

Irene Epifanio ◽

Vicente Martínez

Keyword(s):

Data Analysis ◽

Outlier Detection ◽

Functional Data Analysis ◽

Functional Data ◽

Real Data ◽

Water Networks ◽

Archetypal Analysis ◽

Detection Techniques ◽

Second Stage ◽

Two Phases

A functional data analysis (FDA) based methodology for detecting anomalous flows in urban water networks is introduced. Primary hydraulic variables are recorded in real time by telecontrol systems, so they are functional data (FD). In the first stage, the data are validated (false data are detected) and reconstructed, since there could be not only false data but also missing and noisy data. FDA tools such as tolerance bands for FD and smoothing for dense and sparse FD are used. In the second stage, functional outlier detection tools are used in two phases. In Phase I, the data are cleared of anomalies to ensure that they are representative of the in-control system. The objective of Phase II is system monitoring. A new functional outlier detection method based on archetypal analysis is also proposed. The methodology is applied and illustrated with real data. A simulation study is also carried out to assess the performance of the outlier detection techniques, including our proposal. The results are very promising.
