Clustering Methodologies for Software Engineering

The size and complexity of industrial strength software systems are constantly increasing. This means that the task of managing a large software project is becoming even more challenging, especially in light of high turnover of experienced personnel. Software clustering approaches can help with the task of understanding large, complex software systems by automatically decomposing them into smaller, easier-to-manage subsystems. The main objective of this paper is to identify important research directions in the area of software clustering that require further attention in order to develop more effective and efficient clustering methodologies for software engineering. To that end, we first present the state of the art in software clustering research. We discuss the clustering methods that have received the most attention from the research community and outline their strengths and weaknesses. Our paper describes each phase of a clustering algorithm separately. We also present the most important approaches for evaluating the effectiveness of software clustering.

Clustering Techniques for Software Engineering

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v4.i2.pp465-472 ◽

2016 ◽

Vol 4 (2) ◽

pp. 465 ◽

Cited By ~ 1

Author(s):

Shohag Barman ◽

Hira Lal Gope ◽

M M Manjurul Islam ◽

Md Mehedi Hasan ◽

Umme Salma

Keyword(s):

Software Engineering ◽

Research Direction ◽

Maintenance Cost ◽

Software Systems ◽

Module Structure ◽

Result Section ◽

Clustering Techniques ◽

Software Clustering ◽

Good Module ◽

Industrial Software

A common problem for the software industry is the maintenance cost of industrial software systems. One likely cause is a lack of understanding of software systems that are too large and complex. Software clustering is an efficient technique for dealing with problems that arise from the sheer size and complexity of large software systems. The size and complexity of industrial software systems are increasing rapidly, which makes managing them a challenging task. Software clustering helps in understanding a large software system by decomposing it into smaller subsystems that are easier to maintain. In this paper, we give research directions in the area of software clustering in order to develop efficient clustering techniques for software engineering. We also describe the most recent clustering techniques along with their strengths and weaknesses. In addition, we propose a genetic-algorithm-based clustering method for software modularization. The results section demonstrates that the proposed method can effectively produce a good module structure and that it outperforms state-of-the-art methods.
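
To make the notion of a "good module structure" concrete, here is a minimal sketch of how a candidate decomposition might be scored inside a search such as a genetic algorithm. It is an illustration only and not the method proposed in the paper: the function name `intra_cluster_ratio`, the example dependency graph, and the scoring rule (the share of dependencies that stay inside a module) are all assumptions.

```python
def intra_cluster_ratio(dependencies, assignment):
    """Return the fraction of dependency edges whose endpoints share a module.

    dependencies: iterable of (source, target) pairs between software entities.
    assignment:   dict mapping each entity to a module label.
    Hypothetical fitness function; a higher value means more cohesive modules.
    """
    edges = list(dependencies)
    if not edges:
        return 0.0
    internal = sum(1 for a, b in edges if assignment[a] == assignment[b])
    return internal / len(edges)


# Hypothetical dependency graph of four source files.
deps = [("parser.c", "lexer.c"), ("lexer.c", "parser.c"),
        ("ui.c", "widgets.c"), ("ui.c", "parser.c")]

good = {"parser.c": 0, "lexer.c": 0, "ui.c": 1, "widgets.c": 1}
poor = {"parser.c": 0, "lexer.c": 1, "ui.c": 0, "widgets.c": 1}

print(intra_cluster_ratio(deps, good))  # 3 of 4 edges are internal
print(intra_cluster_ratio(deps, poor))  # 1 of 4 edges is internal
```

A score like this is trivially maximized by placing every entity in a single module, so a practical fitness function would also need a term that penalizes coupling or rewards a balanced number of modules.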

Future Research Directions of Software Engineering and Knowledge Engineering

International Journal of Software Engineering and Knowledge Engineering ◽

10.1142/s0218194015500035 ◽

2015 ◽

Vol 25 (02) ◽

pp. 415-421 ◽

Cited By ~ 2

Author(s):

Haiping Xu

Keyword(s):

Software Engineering ◽

Development Process ◽

Knowledge Engineering ◽

Future Research ◽

Software Systems ◽

Research Directions ◽

Knowledge Based ◽

Intelligent Software ◽

Current Trends ◽

Future Research Directions

Software Engineering (SE) and Knowledge Engineering (KE) are closely related disciplines with goals of turning the development process of software systems and knowledge-based systems, respectively, into engineering disciplines. In particular, they together can provide systematic approaches for engineering intelligent software systems more efficiently and cost-effectively. As there is a large overlap between the two disciplines, the interplay is vital for both to be successful. In this paper, we divide the intersection of SE and KE into three subareas, namely Knowledge-Supported Software Engineering (KSSE), Engineering Knowledge as a Software (EKaaS), and Intelligent Software System Engineering (ISSE). For each subarea, we describe the challenges along with the current trends, and predict the future research directions that may have the most potential for success.

SOFTWARE ARCHITECTURE RECOVERY THROUGH SIMILARITY-BASED GRAPH CLUSTERING

International Journal of Software Engineering and Knowledge Engineering ◽

10.1142/s0218194013500162 ◽

2013 ◽

Vol 23 (04) ◽

pp. 559-586 ◽

Cited By ~ 5

Author(s):

JIANLIN ZHU ◽

JIN HUANG ◽

DAICUI ZHOU ◽

ZHONGBAO YIN ◽

GUOPING ZHANG ◽

...

Keyword(s):

Software Architecture ◽

Clustering Algorithm ◽

Graph Clustering ◽

Software System ◽

Clustering Methods ◽

Architecture Recovery ◽

Clustering Techniques ◽

Software Clustering ◽

Software Architecture Recovery ◽

Similar Elements

Software architecture recovery aims to gain an architectural-level understanding of a software system for which no architecture description exists. In recent years, researchers have adopted various software clustering techniques to detect the hierarchical structure of software systems. Most graph clustering techniques focus on the connectivity between program elements but ignore similarity, which is also a key measure for finding the elements of one module. In this paper, we propose a novel hierarchy graph clustering algorithm, DGHC, which considers both similarity and connectivity between program elements. During the transformation of the program dependence graph, edges representing similarity between elements are added. Similar elements are then grouped by density-based approaches. The alternative strategy is adopted to find groups of closely connected and similar elements. Meanwhile, we adjust the contribution of connectivity and similarity with a flexible clustering algorithm based on a short random walk model, which can obtain more structural information about the software in order to find its multiple layers. Furthermore, a new method called Multi-layer Propagation Gap is proposed to suggest stable layers of the hierarchy clustering result as the multiple layers of the software system. Extensive experimental results illustrate the effectiveness and efficiency of DGHC in detecting the hierarchy structure of software through comparison with various software clustering methods.

Correction: Control Software: Research Directions in the Intersection of Control Theory and Software Engineering

AIAA Scitech 2020 Forum ◽

10.2514/6.2020-2102.c1 ◽

2020 ◽

Author(s):

Justin M. Bradley ◽

Hamid Bagheri

Keyword(s):

Software Engineering ◽

Control Theory ◽

Control Software ◽

Research Directions

Architectures for Software Systems: A Curriculum Development Proposal in Undergraduate Software Engineering

10.21236/ada266703 ◽

1993 ◽

Author(s):

David Garlan ◽

Mary Shaw

Keyword(s):

Software Engineering ◽

Curriculum Development ◽

Software Systems

Using Actors and Use Cases for Software Size Estimation

Electronics ◽

10.3390/electronics10050592 ◽

2021 ◽

Vol 10 (5) ◽

pp. 592

Author(s):

Radek Silhavy ◽

Petr Silhavy ◽

Zdenka Prokopova

Keyword(s):

Project Planning ◽

Use Cases ◽

New Method ◽

Software Systems ◽

Software Project ◽

Use Case ◽

Size Estimation ◽

Software Size Estimation ◽

Use Case Points

Software size estimation represents a complex task, which is based on data analysis or on an algorithmic estimation approach. Software size estimation is a nontrivial task, which is important for software project planning and management. In this paper, a new method called Actors and Use Cases Size Estimation is proposed. The new method is based on the number of actors and use cases only. The method is based on stepwise regression and led to a very significant reduction in errors when estimating the size of software systems compared to Use Case Points-based methods. The proposed method is independent of Use Case Points, which allows the elimination of the effect of the inaccurate determination of Use Case Points components, because such components are not used in the proposed method.

CLUSTERING USING AN IMPROVED HYBRID GENETIC ALGORITHM

International Journal of Artificial Intelligence Tools ◽

10.1142/s021821300700362x ◽

2007 ◽

Vol 16 (06) ◽

pp. 919-934

Author(s):

YONGGUO LIU ◽

XIAORONG PU ◽

YIDONG SHEN ◽

ZHANG YI ◽

XIAOFENG LIAO

Keyword(s):

Genetic Algorithm ◽

Clustering Algorithm ◽

Hybrid Genetic Algorithm ◽

Sum Of Squares ◽

Clustering Methods ◽

Clustering Problem ◽

Mutation Operation ◽

Iteration Methods ◽

Genetic Clustering ◽

The Individual

In this article, a new genetic clustering algorithm called the Improved Hybrid Genetic Clustering Algorithm (IHGCA) is proposed to deal with the clustering problem under the criterion of minimum sum of squares clustering. In IHGCA, the improvement operation including five local iteration methods is developed to tune the individual and accelerate the convergence speed of the clustering algorithm, and the partition-absorption mutation operation is designed to reassign objects among different clusters. By experimental simulations, its superiority over some known genetic clustering methods is demonstrated.
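
The minimum sum-of-squares criterion mentioned above is the standard objective that sums, over all clusters, the squared Euclidean distance from each object to its cluster centroid. The sketch below only evaluates that objective for a given partition; it does not implement IHGCA or any of its operators, and the function name `sum_of_squares` and the sample points are assumptions made for illustration.

```python
def sum_of_squares(points, labels):
    """Within-cluster sum of squared Euclidean distances to each centroid.

    points: list of equal-length numeric tuples.
    labels: cluster label for each point, in the same order.
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        # Centroid is the coordinate-wise mean of the cluster members.
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        total += sum((p[d] - centroid[d]) ** 2 for p in members for d in range(dim))
    return total


# Hypothetical data: two tight pairs of points that are far apart.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]

print(sum_of_squares(points, [0, 0, 1, 1]))  # partition that keeps each pair together
print(sum_of_squares(points, [0, 1, 0, 1]))  # partition that splits each pair
```

A genetic clustering method would treat a label vector such as `[0, 0, 1, 1]` as an individual and prefer individuals with a lower value of this objective.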

(ISEF): an integrated industrial-strength software engineering framework

ACM SIGSOFT Software Engineering Notes ◽

10.1145/64137.65008 ◽

1988 ◽

Vol 13 (5) ◽

pp. 45-54

Author(s):

Shaye Koenig

Keyword(s):

Software Engineering ◽

Industrial Strength

Empirical Evaluation of Genetic Clustering Methods Using Multilocus Genotypes From 20 Chicken Breeds

Genetics ◽

10.1093/genetics/159.2.699 ◽

2001 ◽

Vol 159 (2) ◽

pp. 699-713

Author(s):

Noah A Rosenberg ◽

Terry Burke ◽

Kari Elo ◽

Marcus W Feldman ◽

Paul J Freidlin ◽

...

Keyword(s):

Cluster Analysis ◽

Population Structure ◽

Clustering Algorithm ◽

Empirical Evaluation ◽

Unknown Origin ◽

Clustering Methods ◽

Genetic Cluster ◽

Data Set ◽

Multilocus Genotypes ◽

Chicken Breeds

We tested the utility of genetic cluster analysis in ascertaining population structure of a large data set for which population structure was previously known. Each of 600 individuals representing 20 distinct chicken breeds was genotyped for 27 microsatellite loci, and individual multilocus genotypes were used to infer genetic clusters. Individuals from each breed were inferred to belong mostly to the same cluster. The clustering success rate, measuring the fraction of individuals that were properly inferred to belong to their correct breeds, was consistently ~98%. When markers of highest expected heterozygosity were used, genotypes that included at least 8–10 highly variable markers from among the 27 markers genotyped also achieved >95% clustering success. When 12–15 highly variable markers and only 15–20 of the 30 individuals per breed were used, clustering success was at least 90%. We suggest that in species for which population structure is of interest, databases of multilocus genotypes at highly variable markers should be compiled. These genotypes could then be used as training samples for genetic cluster analysis and to facilitate assignments of individuals of unknown origin to populations. The clustering algorithm has potential applications in defining the within-species genetic units that are useful in problems of conservation.
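
The clustering success rate described above is a fraction of correctly placed individuals. The sketch below shows one way such a fraction could be computed; the abstract does not say how inferred clusters were matched to breeds, so the majority-breed rule used here is an assumption, as are the function name `clustering_success_rate` and the toy labels.

```python
from collections import Counter


def clustering_success_rate(breeds, clusters):
    """Fraction of individuals whose cluster's majority breed is their own breed.

    breeds:   known breed of each individual.
    clusters: inferred cluster of each individual, in the same order.
    Assumes each cluster is labeled with its most common breed.
    """
    if not breeds:
        return 0.0

    # Count the breeds that appear in each inferred cluster.
    by_cluster = {}
    for breed, cluster in zip(breeds, clusters):
        by_cluster.setdefault(cluster, Counter())[breed] += 1

    # Label each cluster with its most common breed.
    majority = {c: counts.most_common(1)[0][0] for c, counts in by_cluster.items()}

    correct = sum(1 for b, c in zip(breeds, clusters) if majority[c] == b)
    return correct / len(breeds)


# Hypothetical example: one individual of breed "A" lands in the "B" cluster.
breeds = ["A", "A", "A", "B", "B", "B"]
clusters = [1, 1, 2, 2, 2, 2]

print(clustering_success_rate(breeds, clusters))  # 5 of 6 individuals are placed correctly
```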

A Novel High-Dimensional Trajectories Construction Network based on Multi-Clustering Algorithm

10.21203/rs.3.rs-1060086/v1 ◽

2021 ◽

Author(s):

Feiyang Ren ◽

Yi Han ◽

Shaohan Wang ◽

He Jiang

Keyword(s):

Economic Analysis ◽

Clustering Algorithm ◽

Transportation Network ◽

High Dimensional ◽

Clustering Methods ◽

Marine Transportation ◽

Network Construction ◽

National Economic ◽

Multi Level ◽

State Of Art

A novel marine transportation network based on high-dimensional AIS data with a multi-level clustering algorithm is proposed to discover important waypoints in trajectories based on selected navigation features. This network contains two parts: the calculation of major nodes with the CLIQUE and BIRCH clustering methods, and navigation network construction with edge construction theory. Unlike state-of-the-art work on navigation clustering, which uses only ship coordinates, the proposed method includes additional high-dimensional features such as drafting, weather, and fuel consumption. By comparing the historical AIS data, more than 220,133 lines of data in 30 days were used to extract 440 major nodal points in less than 4 minutes on an ordinary PC (i5 processor). The proposed method can be applied to data with more dimensions for better ship path planning or even national economic analysis. The current work has shown good performance in distinguishing complex ship trajectories and great potential for future analytical predictions of the shipping transportation market.
