Clustering Methodologies for Software Engineering

2012 ◽  
Vol 2012 ◽  
pp. 1-18 ◽  
Author(s):  
Mark Shtern ◽  
Vassilios Tzerpos

The size and complexity of industrial strength software systems are constantly increasing. This means that the task of managing a large software project is becoming even more challenging, especially in light of high turnover of experienced personnel. Software clustering approaches can help with the task of understanding large, complex software systems by automatically decomposing them into smaller, easier-to-manage subsystems. The main objective of this paper is to identify important research directions in the area of software clustering that require further attention in order to develop more effective and efficient clustering methodologies for software engineering. To that end, we first present the state of the art in software clustering research. We discuss the clustering methods that have received the most attention from the research community and outline their strengths and weaknesses. Our paper describes each phase of a clustering algorithm separately. We also present the most important approaches for evaluating the effectiveness of software clustering.

Author(s):  
Shohag Barman ◽  
Hira Lal Gope ◽  
M M Manjurul Islam ◽  
Md Mehedi Hasan ◽  
Umme Salma

Software industries face a common problem: the maintenance cost of industrial software systems. There are many reasons behind this problem. One possible reason is that maintenance becomes costly when engineers lack an understanding of software systems that are too large and complex. Software clustering is an efficient technique for dealing with the problems that arise from the sheer size and complexity of large software systems. The size and complexity of industrial software systems are increasing rapidly, so managing them is becoming a challenging task. Software clustering can help engineers understand a large software system by decomposing it into smaller subsystems that are easier to maintain. In this paper, we give research directions in the area of software clustering in order to develop efficient clustering techniques for software engineering. We also describe the most recent clustering techniques along with their strengths and weaknesses. In addition, we propose a genetic algorithm-based software modularization clustering method. The results section demonstrates that the proposed method can effectively produce a good module structure and that it outperforms state-of-the-art methods.


Author(s):  
Haiping Xu

Software Engineering (SE) and Knowledge Engineering (KE) are closely related disciplines with goals of turning the development process of software systems and knowledge-based systems, respectively, into engineering disciplines. In particular, they together can provide systematic approaches for engineering intelligent software systems more efficiently and cost-effectively. As there is a large overlap between the two disciplines, the interplay is vital for both to be successful. In this paper, we divide the intersection of SE and KE into three subareas, namely Knowledge-Supported Software Engineering (KSSE), Engineering Knowledge as a Software (EKaaS), and Intelligent Software System Engineering (ISSE). For each subarea, we describe the challenges along with the current trends, and predict the future research directions that may have the most potential for success.


Author(s):  
JIANLIN ZHU ◽  
JIN HUANG ◽  
DAICUI ZHOU ◽  
ZHONGBAO YIN ◽  
GUOPING ZHANG ◽  
...  

Software architecture recovery aims to gain an architectural-level understanding of a software system whose architecture description does not exist. In recent years, researchers have adopted various software clustering techniques to detect the hierarchical structure of software systems. Most graph clustering techniques focus on the connectivity between program elements but ignore similarity, which is also a key measure for finding the elements of one module. In this paper we propose a novel hierarchical graph clustering algorithm, DGHC, which considers both similarity and connectivity between program elements. During the transformation of the program dependence graph, edges representing similarity between elements are added. Similar elements are then grouped by density-based approaches. An alternative strategy is adopted to find groups of closely connected and similar elements. Meanwhile, we adjust the contribution of connectivity and similarity with a flexible clustering algorithm based on a short random walk model, which can obtain more structural information about the software to find its multiple layers. Furthermore, a new method called Multi-layer Propagation Gap is proposed to suggest stable layers of the hierarchical clustering result as the multiple layers of the software system. Extensive experimental results illustrate the effectiveness and efficiency of DGHC in detecting the hierarchical structure of software through comparison with various software clustering methods.
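As a rough illustration of why a short random walk carries structural information, the sketch below computes where a walker is likely to be after a few uniform steps on an undirected dependence graph. It is not the DGHC algorithm: the function name `random_walk_distribution`, the adjacency-list input, and the uniform choice among neighbours are all assumptions made for this example.

```python
def random_walk_distribution(adjacency, start, steps):
    """Return {node: probability} after `steps` uniform random-walk steps.

    `adjacency` maps each node to a list of its neighbours (hypothetical
    input format; the abstract does not specify one).
    """
    dist = {start: 1.0}
    for _ in range(steps):
        nxt = {}
        for node, prob in dist.items():
            neighbours = adjacency[node]
            if not neighbours:
                # Dead end: the walker stays where it is.
                nxt[node] = nxt.get(node, 0.0) + prob
                continue
            share = prob / len(neighbours)
            for nb in neighbours:
                nxt[nb] = nxt.get(nb, 0.0) + share
        dist = nxt
    return dist


# Toy graph: two triangles (a, b, c) and (d, e, f) joined by the edge c-d.
toy_graph = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"],
    "d": ["c", "e", "f"], "e": ["d", "f"], "f": ["d", "e"],
}
after_two_steps = random_walk_distribution(toy_graph, "a", 2)
```

On the toy graph, most of the probability mass after two steps stays inside the triangle containing the start node, which is the intuition behind using short walks to find closely connected groups.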


Electronics ◽  
2021 ◽  
Vol 10 (5) ◽  
pp. 592
Author(s):  
Radek Silhavy ◽  
Petr Silhavy ◽  
Zdenka Prokopova

Software size estimation represents a complex task, which is based on data analysis or on an algorithmic estimation approach. Software size estimation is a nontrivial task, which is important for software project planning and management. In this paper, a new method called Actors and Use Cases Size Estimation is proposed. The new method is based on the number of actors and use cases only. The method is based on stepwise regression and led to a very significant reduction in errors when estimating the size of software systems compared to Use Case Points-based methods. The proposed method is independent of Use Case Points, which allows the elimination of the effect of the inaccurate determination of Use Case Points components, because such components are not used in the proposed method.


2007 ◽  
Vol 16 (06) ◽  
pp. 919-934
Author(s):  
YONGGUO LIU ◽  
XIAORONG PU ◽  
YIDONG SHEN ◽  
ZHANG YI ◽  
XIAOFENG LIAO

In this article, a new genetic clustering algorithm called the Improved Hybrid Genetic Clustering Algorithm (IHGCA) is proposed to deal with the clustering problem under the criterion of minimum sum of squares clustering. In IHGCA, the improvement operation including five local iteration methods is developed to tune the individual and accelerate the convergence speed of the clustering algorithm, and the partition-absorption mutation operation is designed to reassign objects among different clusters. By experimental simulations, its superiority over some known genetic clustering methods is demonstrated.
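The minimum sum-of-squares criterion mentioned above can be made concrete with a small sketch. It computes only the objective value for a given assignment of points to clusters, which a genetic algorithm could use as a fitness score; it does not reproduce IHGCA's improvement or mutation operators. The name `sum_of_squares` and the tuple-based point format are assumptions for this example.

```python
def sum_of_squares(points, labels):
    """Total squared Euclidean distance from each point to its cluster centroid.

    `points` is a sequence of equal-length numeric tuples and `labels[i]`
    is the cluster of `points[i]` (hypothetical input format).
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        centroid = [sum(m[d] for m in members) / len(members) for d in range(dim)]
        total += sum((m[d] - centroid[d]) ** 2 for m in members for d in range(dim))
    return total


points = [(0, 0), (0, 2), (10, 0), (10, 2)]
good = sum_of_squares(points, [0, 0, 1, 1])  # nearby points grouped together
bad = sum_of_squares(points, [0, 1, 0, 1])   # distant points grouped together
```

A lower value means tighter clusters, so under this criterion the first assignment is preferred over the second.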


Genetics ◽  
2001 ◽  
Vol 159 (2) ◽  
pp. 699-713
Author(s):  
Noah A Rosenberg ◽  
Terry Burke ◽  
Kari Elo ◽  
Marcus W Feldman ◽  
Paul J Freidlin ◽  
...  

We tested the utility of genetic cluster analysis in ascertaining population structure of a large data set for which population structure was previously known. Each of 600 individuals representing 20 distinct chicken breeds was genotyped for 27 microsatellite loci, and individual multilocus genotypes were used to infer genetic clusters. Individuals from each breed were inferred to belong mostly to the same cluster. The clustering success rate, measuring the fraction of individuals that were properly inferred to belong to their correct breeds, was consistently ~98%. When markers of highest expected heterozygosity were used, genotypes that included at least 8–10 highly variable markers from among the 27 markers genotyped also achieved >95% clustering success. When 12–15 highly variable markers and only 15–20 of the 30 individuals per breed were used, clustering success was at least 90%. We suggest that in species for which population structure is of interest, databases of multilocus genotypes at highly variable markers should be compiled. These genotypes could then be used as training samples for genetic cluster analysis and to facilitate assignments of individuals of unknown origin to populations. The clustering algorithm has potential applications in defining the within-species genetic units that are useful in problems of conservation.
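A clustering success rate of this kind can be approximated with a short sketch. The abstract does not spell out how clusters are matched to breeds, so the version below assumes each inferred cluster is credited to its most common breed and counts the individuals that agree with that breed. The function name `clustering_success_rate` and the matching rule are assumptions, not the authors' exact procedure.

```python
from collections import Counter, defaultdict


def clustering_success_rate(true_breeds, cluster_ids):
    """Fraction of individuals whose cluster's majority breed is their own breed.

    Assumption: each cluster is labelled with its most common breed.
    """
    if len(true_breeds) != len(cluster_ids) or not true_breeds:
        raise ValueError("inputs must be non-empty and of equal length")

    breeds_in_cluster = defaultdict(Counter)
    for breed, cluster in zip(true_breeds, cluster_ids):
        breeds_in_cluster[cluster][breed] += 1

    # For each cluster, count the members that belong to its majority breed.
    correct = sum(counts.most_common(1)[0][1] for counts in breeds_in_cluster.values())
    return correct / len(true_breeds)


breeds = ["A", "A", "A", "B", "B", "B"]
clusters = [0, 0, 1, 1, 1, 1]
rate = clustering_success_rate(breeds, clusters)
```

In the toy input, one individual of breed A falls into the cluster dominated by breed B, so five of the six individuals are counted as correctly clustered.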


2021 ◽  
Author(s):  
Feiyang Ren ◽  
Yi Han ◽  
Shaohan Wang ◽  
He Jiang

A novel marine transportation network based on high-dimensional AIS data with a multi-level clustering algorithm is proposed to discover important waypoints in trajectories based on selected navigation features. This network contains two parts: the calculation of major nodes with the CLIQUE and BIRCH clustering methods, and navigation network construction with edge construction theory. Unlike state-of-the-art work on navigation clustering that uses only ship coordinates, the proposed method includes more high-dimensional features such as drafting, weather, and fuel consumption. By comparing the historical AIS data, more than 220,133 lines of data in 30 days were used to extract 440 major nodal points in less than 4 minutes with ordinary PC specs (i5 processor). The proposed method can be applied to higher-dimensional data for better ship path planning or even national economic analysis. Current work has shown good performance in distinguishing complex ship trajectories and great potential for future analytical predictions of the shipping transportation market.

