Using clustering algorithms in legacy systems remodularization

Clustering is the process of grouping objects into subsets that are meaningful in the context of a particular problem. It does not rely on predefined classes, and it is called an unsupervised learning method because no information is provided about the "right answer" for any of the objects. Many clustering algorithms have been proposed, and the choice among them depends on the application. Sentence clustering is one of the best clustering techniques. A hierarchical clustering algorithm is applied at multiple levels for accuracy. A POS tagger and the Porter stemmer are used for tagging. The WordNet dictionary is used to determine similarity via the Jiang-Conrath and cosine similarity measures. Grouping is performed with respect to the highest similarity value, subject to a mean threshold. This paper incorporates many parameters for finding similarity between words. To identify the disambiguated words, sense identification is performed for the adjectives and the results are compared. The SemCor and machine learning datasets are employed. Compared with previous results for word sense disambiguation (WSD), our work shows a considerable improvement, reaching a figure of 91.2%.
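The grouping step described in the abstract (highest similarity, subject to a mean threshold) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses plain bag-of-words cosine similarity instead of WordNet-based measures, and it assumes that "mean threshold" means the mean of all pairwise similarities. The function names `cosine_similarity` and `group_by_mean_threshold` are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def group_by_mean_threshold(sentences):
    """Greedy grouping (an assumed reading of the abstract): each sentence
    joins the group of its most similar earlier sentence if that similarity
    is positive and at least the mean pairwise similarity; otherwise it
    starts a new group."""
    vectors = [Counter(s.lower().split()) for s in sentences]
    pairs = list(combinations(range(len(vectors)), 2))
    if not pairs:
        return [[s] for s in sentences]
    sims = {(i, j): cosine_similarity(vectors[i], vectors[j]) for i, j in pairs}
    threshold = sum(sims.values()) / len(sims)  # assumed "mean threshold"

    labels = []  # labels[i] is the group index of sentence i
    for j in range(len(sentences)):
        best_i, best_sim = None, -1.0
        for i in range(j):
            if sims[(i, j)] > best_sim:
                best_i, best_sim = i, sims[(i, j)]
        if best_i is not None and best_sim > 0 and best_sim >= threshold:
            labels.append(labels[best_i])
        else:
            labels.append(max(labels, default=-1) + 1)

    groups = {}
    for sentence, label in zip(sentences, labels):
        groups.setdefault(label, []).append(sentence)
    return list(groups.values())
```

Calling `group_by_mean_threshold(["the cat sat", "the cat slept", "stock prices fell", "stock prices rose"])` separates the two cat sentences from the two stock sentences, because only pairs within each topic share words and exceed the mean similarity.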

Survey on Partition based Clustering Algorithms in Big Data

International Journal of Computer Sciences and Engineering ◽ 10.26438/ijcse/v5i12.323325 ◽ 2017 ◽ Vol 5 (12) ◽ pp. 323-325

Author(s): E. Mahima Jane ◽ E. George Dharma Prakash Raj

Keyword(s): Big Data ◽ Clustering Algorithms

A SURVEY ON VARIED DISTRIBUTED CLUSTERING ALGORITHMS FOR WIRELESS SENSOR NETWORKS

i-manager's Journal on Communication Engineering and Systems ◽ 10.26634/jcs.7.1.13963 ◽ 2018 ◽ Vol 7 (1) ◽ pp. 35

Author(s): PRADHAN SWAGATIKA ◽ PATNAIK PAWAN ◽ S.D. MISHRA ◽ ...

Keyword(s): Wireless Sensor Networks ◽ Sensor Networks ◽ Clustering Algorithms ◽ Wireless Sensor ◽ Distributed Clustering

DRSA: a non-hierarchical clustering algorithm using k-NN graph and its application in vegetation classification

Vegetation of Russia ◽ 10.31111/vegrus/2015.27.125 ◽ 2015 ◽ pp. 125-138 ◽ Cited By ~ 2

Author(s): I. V. Goncharenko

Keyword(s): Cluster Analysis ◽ Clustering Algorithm ◽ Nearest Neighbor ◽ Clustering Algorithms ◽ Protein Structures ◽ Hierarchical Cluster ◽ Vegetation Classification ◽ K Nearest Neighbor ◽ Neighbor Graph ◽ Nearest Neighbor Graph

In this article we propose a new method of non-hierarchical cluster analysis using a k-nearest-neighbor graph and discuss it with respect to vegetation classification. The method of k-nearest neighbor (k-NN) classification was originally developed in 1951 (Fix, Hodges, 1951). Later the term “k-NN graph” and a few k-NN clustering algorithms appeared (Cover, Hart, 1967; Brito et al., 1997). In biology, k-NN is used in the analysis of protein structures and genome sequences. Most k-NN clustering algorithms first build an “excessive” graph, a so-called hypergraph, and then truncate it into subgraphs by partitioning and coarsening the hypergraph. We developed a different strategy: “upward” clustering, which forms (assembles sequentially) one cluster after another. Until now, graph-based cluster analysis has not been considered for the classification of vegetation datasets.
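To make the notion of a k-NN graph concrete, here is a minimal sketch that builds one from points under Euclidean distance and then groups the points by the connected components of the mutual k-NN graph. This is a generic illustration, not the DRSA algorithm or its “upward” strategy, whose details the abstract does not give. The function names `knn_graph` and `mutual_knn_components` are hypothetical.

```python
import math


def knn_graph(points, k):
    """Return {i: [indices of the k nearest other points]} (Euclidean distance).
    Ties are broken by the lower index."""
    graph = {}
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: (math.dist(p, points[j]), j))
        graph[i] = others[:k]
    return graph


def mutual_knn_components(graph):
    """Connected components of the mutual k-NN graph: an edge i-j is kept
    only if i lists j and j lists i among their k nearest neighbors."""
    adjacency = {i: {j for j in nbrs if i in graph[j]} for i, nbrs in graph.items()}
    seen, components = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, component = [start], []
        seen.add(start)
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            component.append(node)
            for nxt in adjacency[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        components.append(sorted(component))
    return components
```

For two well-separated triples of points, such as `[(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]` with `k=2`, each point's nearest neighbors lie within its own triple, so the mutual k-NN graph splits into two components.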

A Distributed-Ledger, Edge-Computing Architecture for Automation and Computer Integration in Semiconductor Manufacturing

Global Journal of Computer Science and Technology ◽ 10.34257/gjcstgvol20is3pg13 ◽ 2020 ◽ pp. 13-29

Author(s): Da-Yin Liao

Keyword(s): Semiconductor Manufacturing ◽ Manufacturing Systems ◽ Edge Computing ◽ Legacy Systems ◽ Software Framework ◽ Computer Integration ◽ Computing Architecture ◽ Disruptive Technologies ◽ Distributed Ledger ◽ Operational Structure

Contemporary 300 mm semiconductor manufacturing systems feature highly automated and digitalized cyber-physical integration. They suffer from the profound problem of integrating large, centralized legacy systems with small islands of automation. With recent advances in disruptive technologies, semiconductor manufacturing faces dramatic pressure to reengineer its automation and computer-integrated systems. This paper proposes a Distributed-Ledger, Edge-Computing Architecture (DLECA) for automation and computer integration in semiconductor manufacturing. Based on distributed ledger and edge computing technologies, DLECA establishes a decentralized software framework in which manufacturing data are stored in distributed ledgers and processed locally by executing smart contracts at the edge nodes. We adopt an important topic in automation and computer integration for semiconductor research & development (R&D) operations as the study vehicle to illustrate the operational structure, functionality, applications, and feasibility of the proposed DLECA software framework.
