Product Metrics for Automatic Identification of "Bad Smell" Design Problems in Java Source-Code

Author(s):  
M.J. Munro
2020 ◽  
Author(s):  
Willian N. Oizumi ◽  
Alessandro F. Garcia

Design problems affect most software projects and make their maintenance expensive and impeditive. Thus, the identification of potential design problems in the source code, which is very often the only available and up-to-date artifact in a project, becomes essential in long-living software systems. This identification task is challenging because the reification of design problems in the source code tends to be scattered across several code elements. However, state-of-the-art techniques do not provide enough information to effectively help developers in this task. In this work, we address this challenge by proposing a new technique to support developers in revealing design problems. This technique synthesizes information about potential design problems, which are materialized in the implementation in the form of syntactic and semantic anomaly agglomerations. Our evaluation shows that the proposed synthesis technique helps to reveal more than 1200 design problems across 7 industry-strength systems, with a median precision of 71% and a median recall of 78%. The relevance of our work has been widely recognized by the software engineering community through 2 awards and 7 publications in international and national venues.


2019 ◽  
Vol 25 (1) ◽  
pp. 790-823
Author(s):  
Yusuf Sulistyo Nugroho ◽  
Hideaki Hata ◽  
Kenichi Matsumoto

Automatic identification of the differences between two versions of a file is a common and basic task in several applications of mining code repositories. Git, a version control system, has a diff utility, and users can choose among diff algorithms, from the default Myers algorithm to the more advanced Histogram algorithm. From our systematic mapping, we identified three popular applications of diff in recent studies. Regarding the impact on code churn metrics in 14 Java projects, we obtained different values in 1.7% to 8.2% of commits depending on the diff algorithm. Regarding bug-introducing change identification, we found that 6.0% and 13.3% of the identified bug-fix commits from 10 Java projects had different bug-introducing change results. For patch application, our manual analysis found that Histogram is more suitable than Myers for presenting code changes. Thus, we strongly recommend using the Histogram algorithm when mining Git repositories to consider differences in source code.
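The churn comparison described in the abstract above can be approximated with Git's own `--diff-algorithm` option. The following is a minimal sketch, not the authors' tooling: the helper names `numstat_command`, `parse_numstat`, and `churn` are hypothetical, and it assumes `git` is on the PATH and that `repo` points at a local clone.

```python
import subprocess

# Algorithm names accepted by git's --diff-algorithm option.
ALGORITHMS = ("myers", "minimal", "patience", "histogram")


def numstat_command(commit, algorithm):
    """Build a `git show --numstat` call for one commit and one diff algorithm."""
    if algorithm not in ALGORITHMS:
        raise ValueError(f"unknown diff algorithm: {algorithm}")
    # --format= suppresses the commit header so only numstat rows are printed.
    return ["git", "show", "--numstat", "--format=",
            f"--diff-algorithm={algorithm}", commit]


def parse_numstat(output):
    """Sum added and deleted lines from numstat rows: added<TAB>deleted<TAB>path."""
    added = deleted = 0
    for line in output.splitlines():
        parts = line.split("\t")
        # Binary files are reported as "-<TAB>-<TAB>path"; skip them.
        if len(parts) != 3 or parts[0] == "-":
            continue
        added += int(parts[0])
        deleted += int(parts[1])
    return added, deleted


def churn(commit, algorithm, repo="."):
    """Return (added, deleted) for a commit under the chosen diff algorithm."""
    result = subprocess.run(numstat_command(commit, algorithm), cwd=repo,
                            capture_output=True, text=True, check=True)
    return parse_numstat(result.stdout)
```

Calling `churn("HEAD", "myers")` and `churn("HEAD", "histogram")` on the same commit and comparing the two tuples gives a rough per-commit view of whether the algorithm choice changes the churn count.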


Author(s):  
J. Gu ◽  
G. Y. Li ◽  
Z. Dong

Metamodeling techniques are increasingly used in solving computation-intensive design optimization problems today. In this work, the issue of automatic identification of appropriate metamodeling techniques in global optimization is addressed. A generic, new hybrid metamodel-based global optimization method, particularly suitable for design problems involving computation-intensive, black-box analyses and simulations, is introduced. The method employs three representative metamodels concurrently in the search process and selects sample data points adaptively according to the values calculated using the three metamodels to improve the accuracy of modeling. The global optimum is identified when the metamodels become reasonably accurate. The new method is tested using various benchmark global optimization problems and applied to a real industrial design optimization problem involving vehicle crash simulation, to demonstrate the superior performance of the new algorithm over existing search methods. Present limitations of the proposed method are also discussed.


Author(s):  
Ryosuke Ishizue ◽  
Hironori Washizaki ◽  
Yoshiaki Fukazawa ◽  
Sakae Inoue ◽  
Yoshiiku Hanai ◽  
...  

Software developments involving multiple organizations such as Open Source Software (OSS)-based projects tend to have numerous defects when one organization develops and another organization edits the program source code files. Developments with complex file creation, modification history (origin), and software architecture (functional layer) are increasing in OSS-based development. As an example, we focus on an Android smart phone and a VirtualBox development project, and propose new visualization techniques for product metrics based on file origin and functional layers. One is the Metrics Area Figure, which can express duplication of edits by multiple organizations intuitively using overlapping figures. The other is Origin City, which was inspired by Code City. It can represent the scale and other measurements, while simultaneously stacking functional layers as 3D buildings. The contributions of our paper are to propose new techniques, implement them as web applications, and share the results of our questionnaire. Our proposed techniques are useful not only to visualize the measured metrics, but also to improve the product quality.


2021 ◽  
Vol 18 (3) ◽  
pp. 674-710
Author(s):  
Jiaze Tu ◽  
Huiling Chen ◽  
Mingjing Wang ◽  
Amir H. Gandomi

This paper proposes a new stochastic optimizer called the Colony Predation Algorithm (CPA) based on the corporate predation of animals in nature. CPA utilizes a mathematical mapping following the strategies used by animal hunting groups, such as dispersing prey, encircling prey, supporting the most likely successful hunter, and seeking another target. Moreover, the proposed CPA introduces new features of a unique mathematical model that uses a success rate to adjust the strategy and simulate hunting animals’ selective abandonment behavior. This paper also presents a new way to deal with cross-border situations, whereby the optimal position value of a cross-border situation replaces the cross-border value to improve the algorithm’s exploitation ability. The proposed CPA was compared with state-of-the-art metaheuristics on a comprehensive set of benchmark functions for performance verification and on five classical engineering design problems to evaluate the algorithm’s efficacy in optimizing engineering problems. The results show that the proposed algorithm exhibits competitive, superior performance in different search landscapes over the other algorithms. Moreover, the source code of the CPA will be publicly available after publication.
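The cross-border handling mentioned in the abstract above can be read as follows: when a candidate coordinate leaves the search bounds, it is replaced by the corresponding coordinate of the best position found so far, rather than being clamped to the bound. The sketch below illustrates that reading only. The function name `repair_position` and its parameters are hypothetical, and the exact rule used in the paper is not stated in this abstract.

```python
def repair_position(position, best, lower, upper):
    """Replace out-of-bounds coordinates with those of the best-known position.

    position: candidate solution, one value per dimension
    best:     best position found so far (assumed to lie within the bounds)
    lower, upper: per-dimension search bounds
    """
    repaired = []
    for x, b, lo, hi in zip(position, best, lower, upper):
        # Keep in-range coordinates; substitute the best position's
        # coordinate wherever the candidate crossed a bound.
        repaired.append(x if lo <= x <= hi else b)
    return repaired
```

Compared with clamping, this choice pulls stray coordinates toward the current best solution, which is consistent with the abstract's stated aim of improving exploitation.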


Software quality aims at having quality as part of all aspects of the developed software. Design smells are considered enemies of software source code quality. There are varieties of design problems with different terminologies. Researchers and practitioners accept it as true that whenever there is a design smell, there is a security issue or concern. In this work, we want to explore the connection between design smells and security vulnerabilities. This work provides experimental evidence about this connection. We conducted an empirical study to explore the connection between design smells and security issues by evaluating four C# open-source systems. We found interesting results showing that classes with design smells are more likely to have security issues.


Author(s):  
Leonardo Sousa ◽  
Anderson Oliveira ◽  
Willian Oizumi ◽  
Simone Barbosa ◽  
Alessandro Garcia ◽  
...  

2001 ◽  
Vol 10 (2) ◽  
pp. 180-188 ◽  
Author(s):  
Steven H. Long ◽  
Ron W. Channell

Most software for language analysis has relied on an interaction between the metalinguistic skills of a human coder and the calculating ability of the machine to produce reliable results. However, probabilistic parsing algorithms are now capable of highly accurate and completely automatic identification of grammatical word classes. The program Computerized Profiling combines a probabilistic parser with modules customized to produce four clinical grammatical analyses: MLU, LARSP, IPSyn, and DSS. The accuracy of these analyses was assessed on 69 language samples from typically developing, speech-impaired, and language-impaired children, 2 years 6 months to 7 years 10 months. Values obtained with human coding and by the software alone were compared. Results for all four analyses produced automatically were comparable to published data on the manual interrater reliability of these procedures. Clinical decisions based on cutoff scores and productivity data were little affected by the use of automatic rather than human-generated analyses. These findings bode well for future clinical and research use of automatic language analysis software.

