Identifying Key Classes Algorithm in Directed Weighted Class Interaction Network Based on the Structure Entropy Weighted LeaderRank

Identifying key classes can help software maintainers quickly understand software systems. Existing key class identification algorithms consider the weight of class interactions, but their weighting mechanisms are either single-type or arbitrary. In this paper, a multitype weighting mechanism is considered, and key classes are accurately identified by using four kinds of interaction. By abstracting the software system into a directed weighted class interaction network, a novel algorithm for identifying key classes, Structure Entropy Weighted LeaderRank, is proposed. First, considering the multiple types and directions of interactions between every pair of classes, the directed weighted class interaction software network (DWCIS-Network) is built. Second, the Class Entropy of each class is initialized by the software structural entropy in the DWCIS-Network; Structure Entropy Weighted LeaderRank then applies a biased random walk process to iterate the Class Entropy. Finally, when the iteration completes, the Final Class Entropy (FCE) of each class is taken as its importance score, the top-k classes are obtained, and the key classes are identified. In two sets of experiments on Ant and JHotDraw, our approach effectively identifies key classes in class-level software networks for different top-k values, and the recall rates of our approach are the highest, 80% and 100%, respectively. From top-15% to top-5%, the precision of our approach is improved by 13.39%, which is the highest in comparison with the other two classical approaches. Compared with the best performance of the two classical approaches, the RankingScore of our approach is improved by 16.51% in JHotDraw.
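
The biased random walk described in this abstract can be illustrated with a minimal sketch of a weighted LeaderRank iteration. The abstract does not give the paper's entropy formula or its edge-weight scheme, so everything specific here is an assumption: the function name `weighted_leaderrank`, the optional `init` scores (a stand-in for the entropy-based Class Entropy initialization), the unit weight on ground-node links, and the toy edge weights.

```python
def weighted_leaderrank(edges, init=None, tol=1e-10, max_iter=1000):
    """Rank nodes of a directed weighted graph with a LeaderRank-style walk.

    edges: dict mapping (src, dst) -> positive weight (hypothetical input format).
    init:  optional dict node -> initial score; defaults to 1.0 per node.
    """
    nodes = list({n for pair in edges for n in pair})
    ground = object()  # ground node linked in both directions to every node
    out = {n: {} for n in nodes}
    out[ground] = {}
    for (u, v), w in edges.items():
        out[u][v] = out[u].get(v, 0.0) + w
    for n in nodes:
        out[n][ground] = 1.0  # assumption: unit weight on ground links
        out[ground][n] = 1.0

    score = {n: (init[n] if init else 1.0) for n in nodes}
    score[ground] = 0.0
    for _ in range(max_iter):
        new = {n: 0.0 for n in out}
        for u, nbrs in out.items():
            total = sum(nbrs.values())
            for v, w in nbrs.items():
                # biased step: score flows in proportion to edge weight
                new[v] += score[u] * w / total
        delta = sum(abs(new[n] - score[n]) for n in out)
        score = new
        if delta < tol:
            break

    # at the end, the ground node's score is shared equally among all nodes
    share = score[ground] / len(nodes)
    return {n: score[n] + share for n in nodes}


# Toy network: class C receives the heaviest incoming interaction.
toy_edges = {("A", "C"): 3.0, ("B", "C"): 1.0, ("C", "A"): 1.0}
toy_scores = weighted_leaderrank(toy_edges)
top_k = sorted(toy_scores, key=toy_scores.get, reverse=True)[:1]
```

Sorting the returned scores and taking the first k entries gives the top-k candidates; with entropy-based initial scores passed through `init`, the same loop would iterate those values instead of the uniform default.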

STRUCTURAL PROPERTIES OF MULTILAYER SOFTWARE NETWORKS: A CASE STUDY IN TOMCAT

Advances in Complex Systems ◽

10.1142/s0219525918500042 ◽

2018 ◽

Vol 21 (02) ◽

pp. 1850004 ◽

Cited By ~ 3

Author(s):

WEIFENG PAN ◽

BO HU ◽

JILEI DONG ◽

KUN LIU ◽

BO JIANG

Keyword(s):

Structural Properties ◽

Path Length ◽

Single Layer ◽

Clustering Coefficient ◽

Software Systems ◽

Software Network ◽

Class Level ◽

Single Layer Network ◽

Coupling Type

Statistical properties of software networks have been extensively studied. However, in previous works, software networks are usually treated as single-layer networks, which cannot capture the authentic characteristics of software, since software is by nature multilayer. In this paper, we explore the structural properties of the multilayer software network at the class level by progressively merging layers together, where each coupling type, such as inheritance, implements, and method call, defines a specific layer. A case study on the Tomcat software is conducted using a set of 10 measures widely used in the complex network literature. The results show that some structural properties that are widely observed in software network research can only emerge when several layers are merged together, such as a high clustering coefficient, a small average shortest path length, and high global efficiency. Our study highlights the importance of taking into consideration the multilayer nature of software systems. The results we found can provide valuable insights into our understanding and modeling of the dynamical processes taking place in the design and development of software systems.
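
To make the layer-merging idea concrete, the following is a small sketch that merges coupling layers and recomputes two of the measures named in the abstract. It assumes undirected, unweighted layers given as edge lists; the function names and the three-class toy layers are illustrative and not taken from the paper.

```python
from collections import deque
from itertools import combinations


def merge_layers(layers):
    """Union the undirected edges of several layers into one adjacency map."""
    adj = {}
    for layer_edges in layers:
        for u, v in layer_edges:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
    return adj


def average_clustering(adj):
    """Mean local clustering coefficient (nodes with degree < 2 count as 0)."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj) if adj else 0.0


def average_path_length(adj):
    """Mean BFS distance over all reachable ordered node pairs."""
    dist_sum = pairs = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        dist_sum += sum(dist.values())
        pairs += len(dist) - 1
    return dist_sum / pairs if pairs else 0.0


# Hypothetical layers over three classes, one per coupling type.
inheritance = [("A", "B")]
implements = [("B", "C")]
method_call = [("A", "C")]

for merged in ([inheritance], [inheritance, implements],
               [inheritance, implements, method_call]):
    adj = merge_layers(merged)
    print(len(merged), average_clustering(adj), average_path_length(adj))
```

In this toy case the triangle, and with it a nonzero clustering coefficient, appears only once all three layers are merged, which mirrors the kind of effect the abstract reports.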

Network-Based Analysis of Software Change Propagation

The Scientific World JOURNAL ◽

10.1155/2014/237243 ◽

2014 ◽

Vol 2014 ◽

pp. 1-10 ◽

Cited By ~ 3

Author(s):

Rongcun Wang ◽

Rubing Huang ◽

Binbin Qu

Keyword(s):

Open Source Software ◽

Rank Correlation ◽

Object Oriented ◽

Centrality Measures ◽

Software Systems ◽

Change Propagation ◽

Software Projects ◽

Dependency Networks ◽

Class Level

Object-oriented software systems frequently evolve to meet new change requirements. Understanding the characteristics of changes helps testers and system designers improve software quality. Identifying important modules becomes a key issue in the process of evolution. In this context, a novel network-based approach is proposed to comprehensively investigate change distributions and the correlation between centrality measures and the scope of change propagation. First, software dependency networks are constructed at the class level. Then, the number of cochanges among classes is mined from software repositories. From the dependency relationships and the number of cochanges among classes, the scope of change propagation is calculated. Spearman rank correlation is used to analyze the correlation between centrality measures and the scope of change propagation. Three case studies on the open source Java projects Findbugs, Hibernate, and Spring are conducted to study the characteristics of change propagation. Experimental results show that (i) change distribution is very uneven; (ii) PageRank, Degree, and CIRank are significantly correlated with the scope of change propagation. In particular, CIRank shows a higher correlation coefficient, which suggests it can be a more useful indicator for measuring the scope of change propagation of classes in object-oriented software systems.
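
As an illustration of the correlation step, here is a dependency-free sketch of Spearman rank correlation with average ranks for ties. The helper names and the sample centrality and propagation-scope values are made up for the example and are not data from the study.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one variable is constant
    return cov / (var_x * var_y) ** 0.5


# Hypothetical per-class values: a centrality score and a propagation scope.
centrality = [0.1, 0.4, 0.3, 0.9]
scope = [2, 5, 7, 11]
rho = spearman(centrality, scope)
```

Because only ranks enter the computation, the coefficient is insensitive to the scale of the centrality measure, which makes it convenient for comparing measures such as PageRank and Degree on equal footing.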

Measuring Software Modularity Based on Software Networks

Entropy ◽

10.3390/e21040344 ◽

2019 ◽

Vol 21 (4) ◽

pp. 344 ◽

Cited By ~ 8

Author(s):

Yiming Xiang ◽

Weifeng Pan ◽

Haibo Jiang ◽

Yunfang Zhu ◽

Hao Li

Keyword(s):

Complex Systems ◽

Complex Network ◽

Software Metrics ◽

Empirical Evaluation ◽

Network Models ◽

Software Systems ◽

Complex Network Theory ◽

Software Metric ◽

Software Network ◽

Software Modularity

Modularity has been regarded as one of the most important properties of a successful software design. It has a significant impact on many external quality attributes such as reusability, maintainability, and understandability. Thus, proposing metrics to measure software modularity can be very useful. Although several metrics have been proposed to characterize some modularity-related attributes, they fail to characterize software modularity as a whole. A complex network uses network models to abstract the internal structure of complex systems, providing a general way to analyze complex systems as a whole. In this paper, we introduce complex network theory into software engineering and employ modularity, a metric widely used in the field of community detection in complex network research, to measure software modularity as a whole. First, a specific piece of software is represented by a software network, the feature coupling network (FCN), where methods and attributes are nodes, couplings between methods and attributes are edges, and the weight on the edges denotes the coupling strength. Then, modularity is applied to the FCN to measure software modularity. We apply Weyuker's criteria, which are widely used in the field of software metrics, to validate modularity as a software metric theoretically, and also perform an empirical evaluation using open-source Java software systems to show its effectiveness as a software metric for measuring software modularity.
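
The community-detection modularity mentioned here can be sketched for a weighted undirected network as follows. This uses the standard Newman form, Q = sum over communities of (internal weight / m) minus (community strength / 2m) squared, where m is the total edge weight. Treating each node's enclosing class as its community is an assumption made for the example, as are the function name and the toy network.

```python
def modularity(edges, community):
    """Weighted modularity Q of a partition.

    edges:     iterable of (u, v, weight) for an undirected network.
    community: dict mapping node -> community label (here, a class name).
    """
    total_weight = 0.0
    inside = {}    # weight of edges with both endpoints in the community
    strength = {}  # sum of node strengths per community
    for u, v, w in edges:
        total_weight += w
        cu, cv = community[u], community[v]
        strength[cu] = strength.get(cu, 0.0) + w
        strength[cv] = strength.get(cv, 0.0) + w
        if cu == cv:
            inside[cu] = inside.get(cu, 0.0) + w
    if total_weight == 0:
        return 0.0
    return sum(
        inside.get(c, 0.0) / total_weight - (s / (2.0 * total_weight)) ** 2
        for c, s in strength.items()
    )


# Toy network: two tightly coupled groups of features joined by one edge.
toy_edges = [
    ("m1", "m2", 1.0), ("m2", "a1", 1.0), ("m1", "a1", 1.0),
    ("m3", "m4", 1.0), ("m4", "a2", 1.0), ("m3", "a2", 1.0),
    ("a1", "m3", 1.0),
]
toy_classes = {"m1": "X", "m2": "X", "a1": "X", "m3": "Y", "m4": "Y", "a2": "Y"}
q = modularity(toy_edges, toy_classes)
```

For this toy partition Q works out to 5/14, while placing every node in a single community gives 0, so a higher value indicates that couplings concentrate inside the groups.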

A Weighted Multi-Local-World Network Evolving Model and Its Application in Software Network Modeling

Mathematical Problems in Engineering ◽

10.1155/2018/2048525 ◽

2018 ◽

Vol 2018 ◽

pp. 1-9

Author(s):

Zengyang Li ◽

Hui Liu ◽

Jun-An Lu ◽

Bing Li

Keyword(s):

Power Law ◽

Network Modeling ◽

Real Life ◽

Strength Distribution ◽

Software Systems ◽

Distribution Analysis ◽

Power Law Distribution ◽

Software Network ◽

Evolving Model ◽

Theoretical Results

The phenomenon of local worlds (also known as communities) exists in numerous real-life networks, for example, computer networks and social networks. We propose the Weighted Multi-Local-World (WMLW) network evolving model, taking into account (1) the dense links between nodes in a local world, (2) the sparse links between nodes from different local worlds, and (3) the different importance of intra-local-world links and inter-local-world links. In terms of topology evolution, new links between existing local worlds and new local worlds are added to the network, while new nodes and links are added to existing local worlds. In terms of the weighting mechanism, the weights of links within a local world and the weights of links between different local worlds are given different meanings. It is theoretically proven that the strength distribution of the network generated by the WMLW model follows a power-law distribution. Simulations confirm the theoretical results. Meanwhile, the degree distribution also follows a power-law distribution. Analysis and simulation results show that the proposed WMLW model can be used to model the evolution of class diagrams of software systems.

Characterizing Software Stability via Change Propagation Simulation

Complexity ◽

10.1155/2019/9414162 ◽

2019 ◽

Vol 2019 ◽

pp. 1-17 ◽

Cited By ~ 4

Author(s):

Weifeng Pan ◽

Haibo Jiang ◽

Hua Ming ◽

Chunlai Chai ◽

Bi Chen ◽

...

Keyword(s):

Quality Improvement ◽

Open Source ◽

Software Quality ◽

Maintenance Cost ◽

Software Systems ◽

Change Propagation ◽

Propagation Process ◽

Empirical Results ◽

Class Level ◽

Theoretical Results

Software stability means the resistance to the amplification of changes in software. It has become one of the most important attributes that affect maintenance cost. To control the maintenance cost, many approaches have been proposed to measure software stability. However, it is still very difficult to evaluate software stability, especially when software becomes very large and complex. In this paper, we propose to characterize software stability via change propagation simulation. First, we propose a class coupling network (CCN) to model software structure at the class level. Then, we analyze the change propagation process in the CCN through simulation, and by doing so, we develop a novel metric, SS (software stability), to measure software stability. Our SS metric is validated theoretically using the widely accepted Weyuker’s properties and empirically using a set of open source Java software systems. The theoretical results show that our SS metric satisfies most of Weyuker’s properties with only two exceptions, and the empirical results show that our metric is an effective indicator for software quality improvement and class importance. Empirical results also show that our approach can be applied to large software systems.

Identifying Important Packages of Object-Oriented Software Using Weighted k-Core Decomposition

Journal of Intelligent Systems ◽

10.1515/jisys-2014-0015 ◽

2014 ◽

Vol 23 (4) ◽

pp. 461-476 ◽

Cited By ~ 4

Author(s):

Weifeng Pan ◽

Bo Hu ◽

Bo Jiang ◽

Bo Xie

Keyword(s):

Resource Allocation ◽

Decomposition Method ◽

Object Oriented ◽

Closeness Centrality ◽

The Other ◽

Software Systems ◽

Degree Centrality ◽

Software Network ◽

Influential Nodes ◽

Better Than

Identifying important entities in software systems has many implications for effective resource allocation. Complex network research opens new opportunities for identifying important entities from software networks. However, the existing methods only focus on identifying important classes. Little work has been done on the identification of important packages. Moreover, the metrics they use to quantify class importance are designed only for unweighted software networks and do not fit weighted software networks. To overcome these limitations, in this article, we introduce the weighted k-core decomposition method (Wk-core) to identify the important packages. First, we use a weighted software network to describe packages and their internal dependencies. Second, we use Wk-core to partition a software network into a layered structure. Then, the packages that are denoted by the nodes within the main core are the identified important packages. To evaluate our method, we use a variant of the susceptible–infectious–recovered model to examine the spreading influence of the nodes in six real weighted software networks. The results show that our method identifies influential nodes better than four other methods (i.e., original k-core decomposition, degree centrality, closeness centrality, and betweenness centrality). Furthermore, we demonstrate our method on two software networks and show that the important packages identified by our method are more meaningful from a software engineering perspective when compared with the other methods.
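
The layered partition described above can be illustrated with a simplified peeling procedure. This sketch uses node strength (the sum of incident edge weights) as the weighted degree, which is an assumption: the abstract does not state the weighted-degree definition that Wk-core uses, and the function names and toy package network are hypothetical.

```python
def weighted_core_numbers(edges):
    """Assign each node a core number by peeling on node strength.

    edges: iterable of (u, v, weight) for an undirected weighted network.
    """
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, {})
        adj.setdefault(v, {})
        if u != v:  # ignore self-loops
            adj[u][v] = adj[u].get(v, 0) + w
            adj[v][u] = adj[v].get(u, 0) + w

    strength = {n: sum(nbrs.values()) for n, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # the current level is the smallest strength still present
        k = max(k, min(strength[n] for n in remaining))
        stack = [n for n in remaining if strength[n] <= k]
        while stack:
            n = stack.pop()
            if n not in remaining:
                continue  # already peeled via a duplicate stack entry
            remaining.discard(n)
            core[n] = k
            for nbr, w in adj[n].items():
                if nbr in remaining:
                    strength[nbr] -= w
                    if strength[nbr] <= k:
                        stack.append(nbr)
    return core


def main_core(core):
    """Nodes in the innermost layer, i.e. with the largest core number."""
    top = max(core.values())
    return {n for n, c in core.items() if c == top}


# Toy package network: a strongly coupled triangle plus one peripheral package.
toy_edges = [("A", "B", 2), ("B", "C", 2), ("A", "C", 2), ("A", "D", 1)]
toy_core = weighted_core_numbers(toy_edges)
important = main_core(toy_core)
```

Here the peripheral package D is peeled first and the triangle forms the main core, so A, B, and C would be the candidates flagged as important under this simplified definition.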

Software Quality and Community Structure in Java Software Networks

International Journal of Software Engineering and Knowledge Engineering ◽

10.1142/s0218194017500401 ◽

2017 ◽

Vol 27 (07) ◽

pp. 1063-1096 ◽

Cited By ~ 2

Author(s):

Giulio Concas ◽

Michele Marchesi ◽

Cristina Monni ◽

Matteo Orrù ◽

Roberto Tonelli

Keyword(s):

Community Structure ◽

Software Quality ◽

Defect Density ◽

Power Laws ◽

Clustering Coefficient ◽

Software Systems ◽

Confounding Effect ◽

Software Network ◽

Community Metrics ◽

The Relationship

We present a study of 600 Java software networks with the aim of characterizing the relationship between their defectiveness and community metrics. We analyze the community structure of such networks, defined as their topological division into subnetworks of densely connected nodes. A high density of connections represents a higher level of cooperation between classes, so a well-defined division into communities could indicate that the software system has been designed in a modular fashion and all its functionalities are well separated. We show how the community structure can be an indicator of well-written, high quality code by retrieving the communities of the analyzed systems and by ranking their division into communities through the built-in metric called modularity. We found that the software systems with the highest modularity possess the majority of bugs, and tested whether this result is related to some confounding effect. We found two power laws relating the maximum defect density to two different metrics: the number of detected communities inside a software network and the clustering coefficient. We finally found a linear correlation between the clustering coefficient and the number of communities. Our results can be used to make predictive hypotheses about the software defectiveness of future releases of the analyzed systems.

DNA Nano Devices as a Biased Random Walk Process: A Case Study of Isothermal Ratchet?

Materials Sciences and Applications ◽

10.4236/msa.2015.65045 ◽

2015 ◽

Vol 06 (05) ◽

pp. 401-419

Author(s):

Jean-Pierre Aimé ◽

Juan Elezgaray

Keyword(s):

Random Walk ◽

Random Walk Process ◽

Biased Random Walk

Using predictive models to evaluate the quality of a test suite at class and method level.

10.5753/cbsoft_estendido.2020.14613 ◽

2020 ◽

Author(s):

Keslley Lima Silva ◽

Érika Cota

Keyword(s):

Predictive Models ◽

Continuous Process ◽

Machine Learning Algorithms ◽

Test Suite ◽

Software Systems ◽

Test Case ◽

Test Coverage ◽

Coverage Criteria ◽

Class Level ◽

Path Coverage

Testing is an indispensable part of the software development process and is a continuous process during the development life cycle. In this context, examining the behavior of software systems to reveal potential problems is a crucial task. To this end, test suites are usually used to examine software quality. However, test suite quality control is hard for the tester, especially in an evolving system. Such control is needed to assure and improve the test suite's quality and, as a consequence, the application's. Currently, test coverage criteria are used as a mechanism to assist the tester in analyzing the test suite (e.g., finding weaknesses and adding new test cases or test inputs). However, stronger coverage criteria (potentially revealing less glaring weaknesses) are challenging to assess. In this work, we propose a different approach to support the developer in evaluating test suite quality based on more powerful test coverage criteria. We will follow the Knowledge Discovery in Databases process, using machine learning algorithms to estimate prime path coverage at the method and class level. For this purpose, we will create two large datasets consisting of source code metrics and test case metrics from 12 open-source Java projects, and these datasets will be used in the training process to build the predictive models. Using the built models, we expect to predict prime path coverage at the method and class level with reliable prediction performance.

The Cover Time of a Biased Random Walk on a Random Regular Graph of Odd Degree

The Electronic Journal of Combinatorics ◽

10.37236/8327 ◽

2020 ◽

Vol 27 (4) ◽

Author(s):

Tony Johansson

Keyword(s):

Random Walk ◽

Regular Graph ◽

Vertex Cover ◽

Random Walk Process ◽

Cover Time ◽

Edge Cover ◽

Biased Random Walk ◽

Random Regular Graph ◽

Odd Degree ◽

Leading Term

We consider a random walk process on graphs introduced by Orenshtein and Shinkar (2014). At any time, the random walk moves from its current position along a previously unvisited edge chosen uniformly at random, if such an edge exists. Otherwise, it walks along a previously visited edge chosen uniformly at random. For the random $r$-regular graph, with $r$ a constant odd integer, we show that this random walk process has asymptotic vertex and edge cover times $\frac{1}{r-2}n\log n$ and $\frac{r}{2(r-2)}n\log n$, respectively, generalizing a result of Cooper, Frieze and the author (2018) from $r = 3$ to any odd $r\geqslant 3$. The leading term of the asymptotic vertex cover time is now known for all fixed $r\geqslant 3$, with Berenbrink, Cooper and Friedetzky (2015) having shown that $G_r$ has vertex cover time asymptotic to $\frac{rn}{2}$ when $r\geqslant 4$ is even.
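
The walk rule stated in this abstract (prefer a uniformly random unvisited edge, otherwise a uniformly random visited one) is simple enough to simulate directly. The sketch below measures the vertex cover time on a fixed small graph; using the complete graph on four vertices as a 3-regular example, the function name, and the seed are all choices made for illustration, and a single small run says nothing about the asymptotic results.

```python
import random


def vertex_cover_time(adj, start, rng):
    """Number of steps until every vertex has been visited.

    adj: dict mapping vertex -> list of neighbours (connected, undirected).
    rng: a random.Random instance, so that runs are reproducible.
    """
    visited_edges = set()
    seen = {start}
    current = start
    steps = 0
    while len(seen) < len(adj):
        # edges at the current vertex that the walk has not traversed yet
        fresh = [v for v in adj[current]
                 if frozenset((current, v)) not in visited_edges]
        # prefer an unvisited edge; fall back to any incident edge
        nxt = rng.choice(fresh if fresh else adj[current])
        visited_edges.add(frozenset((current, nxt)))
        current = nxt
        seen.add(current)
        steps += 1
    return steps


# K4 is 3-regular, so it is a tiny instance of the odd-degree setting.
k4 = {i: [j for j in range(4) if j != i] for i in range(4)}
runs = [vertex_cover_time(k4, 0, random.Random(seed)) for seed in range(100)]
mean_cover_time = sum(runs) / len(runs)
```

The loop assumes a connected graph, since otherwise some vertex is never reached and the walk does not terminate. Averaging over many seeds and over larger random regular graphs would be needed before comparing against the $\frac{1}{r-2}n\log n$ leading term.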
