Using clustering algorithms in legacy systems remodularization

Author(s):  
T.A. Wiggerts
Author(s):  
Mohana Priya K ◽  
Pooja Ragavi S ◽  
Krishna Priya G

Clustering is the process of grouping objects into subsets that have meaning in the context of a particular problem. It does not rely on predefined classes. It is referred to as an unsupervised learning method because no information is provided about the "right answer" for any of the objects. Many clustering algorithms have been proposed, and the choice among them depends on the application. Sentence clustering is one of the best clustering techniques. A hierarchical clustering algorithm is applied at multiple levels to improve accuracy. For tagging, a POS tagger and the Porter stemmer are used. The WordNet dictionary is used to determine similarity by invoking the Jiang-Conrath and cosine similarity measures. Grouping is performed with respect to the highest similarity value, subject to a mean threshold. This paper incorporates many parameters for finding the similarity between words. To disambiguate words, sense identification is performed for the adjectives and the results are compared. The SemCor and machine learning datasets are employed. Compared with previous results for word sense disambiguation (WSD), our work shows a considerable improvement, achieving 91.2%.
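The grouping step described above (merge by highest similarity, stop at a mean threshold) can be illustrated with a minimal sketch. It is not the authors' implementation: it uses plain bag-of-words cosine similarity only (no WordNet or Jiang-Conrath measure, no POS tagging or stemming), it assumes average linkage between clusters, and it assumes the "mean threshold" is the mean of all pairwise sentence similarities. The function names `cosine` and `cluster_sentences` are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_sentences(sentences):
    """Agglomerative grouping of sentences; returns a list of index sets."""
    vecs = [Counter(s.lower().split()) for s in sentences]
    sims = {(i, j): cosine(vecs[i], vecs[j])
            for i, j in combinations(range(len(vecs)), 2)}
    # Assumption: the stopping threshold is the mean pairwise similarity.
    threshold = sum(sims.values()) / len(sims) if sims else 0.0
    clusters = [{i} for i in range(len(vecs))]

    def link(c1, c2):
        # Assumption: average linkage between two clusters.
        total = sum(sims[(min(i, j), max(i, j))] for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Pick the pair of clusters with the highest similarity.
        a, b = max(combinations(range(len(clusters)), 2),
                   key=lambda p: link(clusters[p[0]], clusters[p[1]]))
        if link(clusters[a], clusters[b]) <= threshold:
            break  # no remaining pair is similar enough to merge
        clusters[a] |= clusters[b]
        del clusters[b]  # a < b, so index a stays valid
    return clusters


sentences = [
    "the cat sat on the mat",
    "the cat sat on a mat",
    "stock prices fell sharply",
    "stock prices rose sharply",
]
print(cluster_sentences(sentences))
```

In this toy input the two "cat" sentences and the two "stock" sentences share no words across groups, so the cross-group similarity is zero and merging stops with two clusters.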


2017 ◽  
Vol 5 (12) ◽  
pp. 323-325
Author(s):  
E. Mahima Jane ◽  
E. George Dharma Prakash Raj

2015 ◽  
pp. 125-138 ◽  
Author(s):  
I. V. Goncharenko

In this article we propose a new method of non-hierarchical cluster analysis using a k-nearest-neighbor graph and discuss it with respect to vegetation classification. The method of k-nearest neighbor (k-NN) classification was originally developed in 1951 (Fix, Hodges, 1951). Later the term “k-NN graph” and a few algorithms of k-NN clustering appeared (Cover, Hart, 1967; Brito et al., 1997). In biology, k-NN is used in the analysis of protein structures and genome sequences. Most k-NN clustering algorithms first build an “excessive” graph, a so-called hypergraph, and then truncate it into subgraphs by partitioning and coarsening the hypergraph. We developed a different strategy, “upward” clustering, which forms (assembles sequentially) one cluster after another. Until now, graph-based cluster analysis has not been considered for the classification of vegetation datasets.
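The abstract does not give the details of the proposed method, so the following is only a generic sketch of the two ideas it names: building a k-NN graph and assembling clusters one after another. It assumes Euclidean distance and assumes that a point joins the cluster being grown when it and a current member are mutual k-nearest neighbours; both choices, and the names `knn_graph` and `grow_clusters`, are illustrative and not taken from the article.

```python
import math


def knn_graph(points, k):
    """For each point index, return the set of its k nearest neighbours."""
    neighbours = []
    for i, p in enumerate(points):
        others = sorted((j for j in range(len(points)) if j != i),
                        key=lambda j: math.dist(p, points[j]))
        neighbours.append(set(others[:k]))
    return neighbours


def grow_clusters(points, k):
    """Assemble clusters sequentially, one seed at a time."""
    nbrs = knn_graph(points, k)
    unassigned = set(range(len(points)))
    clusters = []
    while unassigned:
        seed = min(unassigned)          # start a new cluster from a free point
        unassigned.discard(seed)
        cluster, frontier = {seed}, [seed]
        while frontier:
            i = frontier.pop()
            for j in nbrs[i]:
                # Assumption: only mutual k-NN links extend the cluster.
                if j in unassigned and i in nbrs[j]:
                    unassigned.discard(j)
                    cluster.add(j)
                    frontier.append(j)
        clusters.append(cluster)        # finish this cluster before the next
    return clusters


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
print(grow_clusters(points, k=2))
```

Unlike the build-then-partition strategy that the abstract contrasts itself with, this sketch never cuts an existing graph into pieces; each cluster is completed before the next seed is chosen.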


Author(s):  
Da-Yin Liao

Contemporary 300 mm semiconductor manufacturing systems feature highly automated and digitalized cyber-physical integration. They suffer from the profound problems of integrating large, centralized legacy systems with small islands of automation. With recent advances in disruptive technologies, semiconductor manufacturing has faced dramatic pressure to reengineer its automation and computer-integrated systems. This paper proposes a Distributed-Ledger, Edge-Computing Architecture (DLECA) for automation and computer integration in semiconductor manufacturing. Based on distributed ledger and edge computing technologies, DLECA establishes a decentralized software framework in which manufacturing data are stored in distributed ledgers and processed locally by executing smart contracts at the edge nodes. We adopt automation and computer integration for semiconductor research and development (R&D) operations as the study vehicle to illustrate the operational structure, functionality, applications, and feasibility of the proposed DLECA software framework.


2008 ◽  
Vol 19 (1) ◽  
pp. 48-61 ◽  
Author(s):  
Ji-Gui SUN

2009 ◽  
Vol 29 (7) ◽  
pp. 1760-1763 ◽  
Author(s):  
Fan ZHANG ◽  
Feng YUAN ◽  
Yong-ji WANG

2001 ◽  
Author(s):  
Santiago Comella-Dorda ◽  
Grace A. Lewis ◽  
Pat Place ◽  
Dan Plakosh ◽  
Robert C. Seacord