Identifying Key Classes Algorithm in Directed Weighted Class Interaction Network Based on the Structure Entropy Weighted LeaderRank

2020 ◽  
Vol 2020 ◽  
pp. 1-12
Author(s):  
Wanchang Jiang ◽  
Ning Dai

Identifying key classes can help software maintainers quickly understand software systems. Existing key class identification algorithms consider the weight of class interactions, but their weighting mechanisms are either single-type or arbitrary. In this paper, a multitype weighting mechanism is considered, and key classes are accurately identified by using four kinds of interaction. By abstracting the software system into a directed weighted class interaction network, a novel key class identification algorithm, Structure Entropy Weighted LeaderRank, is proposed. First, considering multiple types and directions of interactions between every pair of classes, the directed weighted class interaction software network (DWCIS-Network) is built. Second, the Class Entropy of each class is initialized by the software structural entropy in the DWCIS-Network; the Structure Entropy Weighted LeaderRank then applies a biased random walk process to iterate Class Entropy. Finally, when the iteration completes, the Final Class Entropy (FCE) of each class is taken as its importance score, the top-k classes are obtained, and key classes are identified. In two sets of experiments on Ant and JHotDraw, our approach effectively identifies key classes in class-level software networks for different top-k values, and the recall rates of our approach are the highest, 80% and 100%, respectively. From top-15% to top-5%, the precision of our approach improves by 13.39%, which is the highest in comparison with the precisions of the other two classical approaches. Compared with the best performance of the two classical approaches, the RankingScore of our approach is improved by 16.51% in JHotDraw.
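
The biased random walk described above can be illustrated with a generic weighted LeaderRank. This is only a sketch under stated assumptions, not the authors' algorithm: every class starts with a uniform score of 1 rather than a Class Entropy value, the ground node is linked to every class with weight 1, and the function name `weighted_leaderrank` and the toy edge weights are hypothetical.

```python
def weighted_leaderrank(edges, tol=1e-9, max_iter=1000):
    """Generic weighted LeaderRank on a directed weighted graph.

    edges: dict mapping (src, dst) -> positive weight.
    Assumptions (not from the paper): uniform initial score of 1 per node,
    ground-node links of weight 1 in both directions.
    """
    nodes = sorted({n for edge in edges for n in edge})
    ground = object()  # extra node linked to and from every real node

    # Outgoing weighted adjacency, including the ground node.
    out = {n: {} for n in nodes}
    for (u, v), w in edges.items():
        out[u][v] = out[u].get(v, 0.0) + w
    for n in nodes:
        out[n][ground] = 1.0
    out[ground] = {n: 1.0 for n in nodes}

    score = {n: 1.0 for n in nodes}
    score[ground] = 0.0

    for _ in range(max_iter):
        nxt = {n: 0.0 for n in out}
        for u, nbrs in out.items():
            total = sum(nbrs.values())
            for v, w in nbrs.items():
                # The walker leaves u along each edge in proportion to its weight.
                nxt[v] += score[u] * w / total
        delta = sum(abs(nxt[n] - score[n]) for n in out)
        score = nxt
        if delta < tol:
            break

    # Redistribute the ground node's score evenly over the real nodes.
    share = score[ground] / len(nodes)
    return {n: score[n] + share for n in nodes}


# Toy class interaction network (made-up weights).
toy_edges = {("A", "C"): 3.0, ("B", "C"): 1.0, ("C", "A"): 1.0}
toy_scores = weighted_leaderrank(toy_edges)
toy_ranking = sorted(toy_scores, key=toy_scores.get, reverse=True)
```

Sorting the returned scores in descending order and keeping the first k entries gives a top-k list in the sense used by the abstract.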

2018 ◽  
Vol 21 (02) ◽  
pp. 1850004 ◽  
Author(s):  
WEIFENG PAN ◽  
BO HU ◽  
JILEI DONG ◽  
KUN LIU ◽  
BO JIANG

Statistical properties of software networks have been extensively studied. However, in previous work, software networks are usually treated as single-layer networks, which cannot capture the authentic characteristics of software, since software is by nature multilayer. In this paper, we explore the structural properties of the multilayer software network at the class level by progressively merging layers together, where each coupling type such as inheritance, implements, and method call defines a specific layer. A case study on the Tomcat software is conducted using a set of 10 measures widely used in the complex network literature. The results show that some structural properties that are widely observed in software network research can only emerge when several layers are merged together, such as a high clustering coefficient, a small average shortest path length, and high global efficiency. Our study highlights the importance of taking into consideration the multilayer nature of software systems. The results can provide valuable insights into our understanding and modeling of the dynamical processes taking place in the design and development of software systems.


2014 ◽  
Vol 2014 ◽  
pp. 1-10 ◽  
Author(s):  
Rongcun Wang ◽  
Rubing Huang ◽  
Binbin Qu

Object-oriented software systems frequently evolve to meet new change requirements. Understanding the characteristics of changes helps testers and system designers improve software quality. Identifying important modules becomes a key issue in the process of evolution. In this context, a novel network-based approach is proposed to comprehensively investigate change distributions and the correlation between centrality measures and the scope of change propagation. First, software dependency networks are constructed at the class level. Then, the number of cochanges among classes is mined from software repositories. According to the dependency relationships and the number of cochanges among classes, the scope of change propagation is calculated. Spearman rank correlation is used to analyze the correlation between centrality measures and the scope of change propagation. Three case studies on the open source Java projects Findbugs, Hibernate, and Spring are conducted to examine the characteristics of change propagation. Experimental results show that (i) change distribution is very uneven; (ii) PageRank, Degree, and CIRank are significantly correlated with the scope of change propagation. In particular, CIRank shows a higher correlation coefficient, which suggests it can be a more useful indicator for measuring the scope of change propagation of classes in object-oriented software systems.
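
Spearman rank correlation, the statistic named above, is the Pearson correlation of the two rank vectors, with tied values sharing their average rank. A minimal standard-library sketch follows; the function names and the sample numbers are illustrative assumptions, not data from the study.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


# Made-up example: a centrality score per class vs. its change-propagation scope.
centrality = [0.12, 0.40, 0.33, 0.05]
scope = [3, 9, 7, 1]
rho = spearman(centrality, scope)
```

A value of rho near 1 means that classes ranked high by the centrality measure also tend to have a large propagation scope.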


Entropy ◽  
2019 ◽  
Vol 21 (4) ◽  
pp. 344 ◽  
Author(s):  
Yiming Xiang ◽  
Weifeng Pan ◽  
Haibo Jiang ◽  
Yunfang Zhu ◽  
Hao Li

Modularity has been regarded as one of the most important properties of a successful software design. It has significant impact on many external quality attributes such as reusability, maintainability, and understandability. Thus, proposing metrics to measure software modularity can be very useful. Although several metrics have been proposed to characterize some modularity-related attributes, they fail to characterize software modularity as a whole. Complex network theory uses network models to abstract the internal structure of complex systems, providing a general way to analyze complex systems as a whole. In this paper, we introduce complex network theory into software engineering and employ modularity, a metric widely used in the field of community detection in complex network research, to measure software modularity as a whole. First, a specific piece of software is represented by a software network, the feature coupling network (FCN), where methods and attributes are nodes, couplings between methods and attributes are edges, and the weight on the edges denotes the coupling strength. Then, modularity is applied to the FCN to measure software modularity. We apply Weyuker’s criteria, which are widely used in the field of software metrics, to validate modularity as a software metric theoretically, and also perform an empirical evaluation using open-source Java software systems to show its effectiveness as a software metric to measure software modularity.
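
For reference, the modularity used in community detection can be computed per community as the fraction of edge weight inside it minus the squared fraction of total node strength it holds. The sketch below assumes an undirected weighted graph without self-loops; the exact variant applied to the FCN is not specified here, and the names `modularity`, `two_triangles`, and `partition` are hypothetical.

```python
def modularity(edges, community):
    """Newman modularity for an undirected weighted graph without self-loops.

    edges: dict mapping (u, v) -> weight, each undirected edge listed once.
    community: dict mapping node -> community label.
    """
    two_m = 2 * sum(edges.values())
    strength = {}   # sum of incident edge weights per node
    internal = {}   # twice the intra-community edge weight per community
    for (u, v), w in edges.items():
        strength[u] = strength.get(u, 0) + w
        strength[v] = strength.get(v, 0) + w
        if community[u] == community[v]:
            c = community[u]
            internal[c] = internal.get(c, 0) + 2 * w
    total = {}      # total node strength per community
    for node, s in strength.items():
        c = community[node]
        total[c] = total.get(c, 0) + s
    return sum(internal.get(c, 0) / two_m - (t / two_m) ** 2
               for c, t in total.items())


# Two triangles joined by a single edge, unit weights (toy example).
two_triangles = {
    ("a", "b"): 1, ("b", "c"): 1, ("a", "c"): 1,
    ("d", "e"): 1, ("e", "f"): 1, ("d", "f"): 1,
    ("c", "d"): 1,
}
partition = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
q = modularity(two_triangles, partition)
```

Placing every node in a single community gives a modularity of 0, which is why a positive value indicates more intra-module coupling than the strength distribution alone would predict.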


2018 ◽  
Vol 2018 ◽  
pp. 1-9
Author(s):  
Zengyang Li ◽  
Hui Liu ◽  
Jun-An Lu ◽  
Bing Li

The phenomenon of local worlds (also known as communities) exists in numerous real-life networks, for example, computer networks and social networks. We propose the Weighted Multi-Local-World (WMLW) network evolving model, taking into account (1) the dense links between nodes in a local world, (2) the sparse links between nodes from different local worlds, and (3) the different importance of intra-local-world links and inter-local-world links. In terms of topology evolution, new links between existing local worlds and new local worlds are added to the network, while new nodes and links are added to existing local worlds. In terms of the weighting mechanism, the weights of links within a local world and the weights of links between different local worlds are endowed with different meanings. It is theoretically proven that the strength distribution of the network generated by the WMLW model follows a power-law distribution. Simulations confirm the theoretical results. Meanwhile, the degree distribution also follows a power-law distribution. Analysis and simulation results show that the proposed WMLW model can be used to model the evolution of class diagrams of software systems.


Complexity ◽  
2019 ◽  
Vol 2019 ◽  
pp. 1-17 ◽  
Author(s):  
Weifeng Pan ◽  
Haibo Jiang ◽  
Hua Ming ◽  
Chunlai Chai ◽  
Bi Chen ◽  
...  

Software stability means the resistance to the amplification of changes in software. It has become one of the most important attributes that affect maintenance cost. To control the maintenance cost, many approaches have been proposed to measure software stability. However, it is still a very difficult task to evaluate software stability, especially when software becomes very large and complex. In this paper, we propose to characterize software stability via change propagation simulation. First, we propose a class coupling network (CCN) to model software structure at the class level. Then, we analyze the change propagation process in the CCN through simulation, and by doing so, we develop a novel metric, SS (software stability), to measure software stability. Our SS metric is validated theoretically using the widely accepted Weyuker’s properties and empirically using a set of open source Java software systems. The theoretical results show that our SS metric satisfies most of Weyuker’s properties with only two exceptions, and the empirical results show that our metric is an effective indicator for software quality improvement and class importance. Empirical results also show that our approach can be applied to large software systems.


2014 ◽  
Vol 23 (4) ◽  
pp. 461-476 ◽  
Author(s):  
Weifeng Pan ◽  
Bo Hu ◽  
Bo Jiang ◽  
Bo Xie

Identifying important entities in software systems has many implications for effective resource allocation. Complex network research opens new opportunities for identifying important entities from software networks. However, the existing methods only focus on identifying important classes. Little work has been done on the identification of important packages. Moreover, the metrics they use to quantify class importance are designed only for unweighted software networks and do not fit weighted software networks. To overcome these limitations, in this article, we introduce the weighted k-core decomposition method (Wk-core) to identify the important packages. First, we use a weighted software network to describe packages and their internal dependencies. Second, we use Wk-core to partition a software network into a layered structure. Then, the packages that are denoted by the nodes within the main core are the identified important packages. To evaluate our method, we use a variant of the susceptible–infectious–recovered model to examine the spreading influence of the nodes in six real weighted software networks. The results show that our method identifies influential nodes better than four other methods (i.e., original k-core decomposition, degree centrality, closeness centrality, and betweenness centrality). Furthermore, we demonstrate our method on two software networks and show that the important packages identified by our method are more meaningful from a software engineering perspective when compared with the other methods.
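
The layered partition produced by core decomposition can be sketched by repeatedly peeling off the weakest node. The version below is a simplified stand-in, not the Wk-core method itself: it assumes an undirected graph and uses plain node strength (the sum of incident edge weights) as the weighted degree, whereas the article's weighted-degree definition may differ. The name `strength_core` and the toy package graph are hypothetical.

```python
def strength_core(edges):
    """Peel nodes in order of current strength and record each node's core level.

    edges: dict mapping (u, v) -> weight for an undirected graph.
    Assumption: node strength stands in for the weighted degree.
    """
    adj = {}
    for (u, v), w in edges.items():
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    strength = {n: sum(nbrs.values()) for n, nbrs in adj.items()}

    core, level = {}, 0
    remaining = set(adj)
    while remaining:
        # Remove the weakest remaining node (ties broken by name).
        n = min(remaining, key=lambda x: (strength[x], x))
        level = max(level, strength[n])
        core[n] = level
        remaining.remove(n)
        for m, w in adj[n].items():
            if m in remaining:
                strength[m] -= w
    return core


# Toy package graph: a heavy triangle plus one loosely attached package.
pkg_edges = {("A", "B"): 2, ("B", "C"): 2, ("A", "C"): 2, ("A", "D"): 1}
cores = strength_core(pkg_edges)
main_core = {n for n, c in cores.items() if c == max(cores.values())}
```

In this toy graph the loosely attached package is peeled first, and the three tightly coupled packages form the main core.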


Author(s):  
Giulio Concas ◽  
Michele Marchesi ◽  
Cristina Monni ◽  
Matteo Orrù ◽  
Roberto Tonelli

We present a study of 600 Java software networks with the aim of characterizing the relationship between their defectiveness and community metrics. We analyze the community structure of such networks, defined as their topological division into subnetworks of densely connected nodes. A high density of connections represents a higher level of cooperation between classes, so a well-defined division into communities could indicate that the software system has been designed in a modular fashion and all its functionalities are well separated. We show how the community structure can be an indicator of well-written, high quality code by retrieving the communities of the analyzed systems and by ranking their division into communities through the built-in metric called modularity. We found that the software systems with the highest modularity possess the majority of bugs, and tested whether this result is related to some confounding effect. We found two power laws relating the maximum defect density to two different metrics: the number of detected communities inside a software network and the clustering coefficient. Finally, we found a linear correlation between clustering coefficient and number of communities. Our results can be used to make predictive hypotheses about software defectiveness of future releases of the analyzed systems.


Author(s):  
Keslley Lima Silva ◽  
Érika Cota

Testing is an indispensable part of the software development process and is a continuous process during the development life cycle. In this context, examining the behavior of software systems to reveal potential problems is a crucial task. To this end, test suites are usually used to examine software quality. However, test suite quality control is hard for the tester, especially in an evolving system. Such control is needed to assure and improve the test suite's quality and, as a consequence, that of the application. Currently, test coverage criteria are used as a mechanism to assist the tester in analyzing the test suite (e.g., finding weaknesses and adding new test cases or test inputs). However, stronger coverage criteria (potentially revealing less glaring weaknesses) are challenging to assess. In this work, we propose a different approach to support the developer in evaluating test suite quality based on more powerful test coverage criteria. We will follow the Knowledge Discovery in Database process using machine learning algorithms to estimate the prime path coverage at the method and class level. For this purpose, we will create two large datasets consisting of source code metrics and test case metrics from 12 open-source Java projects, and these datasets will be used in the training process to build the predictive models. Using the built models, we expect to predict the prime path coverage at the method and class level with reliable prediction performance.


10.37236/8327 ◽  
2020 ◽  
Vol 27 (4) ◽  
Author(s):  
Tony Johansson

We consider a random walk process on graphs introduced by Orenshtein and Shinkar (2014). At any time, the random walk moves from its current position along a previously unvisited edge chosen uniformly at random, if such an edge exists. Otherwise, it walks along a previously visited edge chosen uniformly at random. For the random $r$-regular graph, with $r$ a constant odd integer, we show that this random walk process has asymptotic vertex and edge cover times $\frac{1}{r-2}n\log n$ and $\frac{r}{2(r-2)}n\log n$, respectively, generalizing a result of Cooper, Frieze and the author (2018) from $r = 3$ to any odd $r\geqslant 3$. The leading term of the asymptotic vertex cover time is now known for all fixed $r\geqslant 3$, with Berenbrink, Cooper and Friedetzky (2015) having shown that $G_r$ has vertex cover time asymptotic to $\frac{rn}{2}$ when $r\geqslant 4$ is even.
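
The walk rule stated above (take a uniformly random unvisited incident edge if one exists, otherwise a uniformly random incident edge) is easy to simulate on a small graph. The sketch below assumes a connected simple graph given as adjacency lists; the function name `greedy_walk_cover_times`, the fixed seed, and the choice of the complete graph K4 as a small 3-regular example are illustrative assumptions. It makes no attempt to reproduce the asymptotic results.

```python
import random


def greedy_walk_cover_times(adj, start, seed=0):
    """Simulate the walk that prefers unvisited edges.

    adj: dict mapping vertex -> list of neighbours (connected simple graph).
    Returns (vertex_cover_time, edge_cover_time) measured in steps.
    """
    rng = random.Random(seed)
    n_edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    visited_vertices = {start}
    visited_edges = set()
    current, steps = start, 0
    vertex_cover = 0 if len(adj) == 1 else None

    while len(visited_edges) < n_edges:
        # Prefer incident edges that have never been traversed.
        fresh = [v for v in adj[current]
                 if frozenset((current, v)) not in visited_edges]
        nxt = rng.choice(fresh if fresh else adj[current])
        visited_edges.add(frozenset((current, nxt)))
        visited_vertices.add(nxt)
        current = nxt
        steps += 1
        if vertex_cover is None and len(visited_vertices) == len(adj):
            vertex_cover = steps
    return vertex_cover, steps


# K4 is 3-regular, so it is a tiny instance of the odd-degree case.
k4 = {v: [u for u in range(4) if u != v] for v in range(4)}
vertex_time, edge_time = greedy_walk_cover_times(k4, start=0)
```

Averaging the two returned values over many seeds and larger random regular graphs would be the natural way to compare simulations against the stated cover-time asymptotics.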

