Product Metrics for Automatic Identification of "Bad Smell" Design Problems in Java Source-Code

Synthesis of Code Anomalies: Revealing Design Problems in the Source Code

10.5753/ctd.2016.9131 ◽

2020 ◽

Author(s):

Willian N. Oizumi ◽

Alessandro F. Garcia

Keyword(s):

Software Engineering ◽

Source Code ◽

New Technique ◽

Software Systems ◽

Identification Task ◽

Software Projects ◽

Design Problems ◽

Engineering Community ◽

Synthesis Technique ◽

A New Technique

Design problems affect most software projects and make their maintenance expensive and impeditive. Thus, the identification of potential design problems in the source code – which is very often the only available and up-to-date artifact in a project – becomes essential in long-living software systems. This identification task is challenging because the reification of design problems in the source code tends to be scattered across several code elements. However, state-of-the-art techniques do not provide enough information to effectively help developers in this task. In this work, we address this challenge by proposing a new technique to support developers in revealing design problems. This technique synthesizes information about potential design problems, which are materialized in the implementation in the form of syntactic and semantic anomaly agglomerations. Our evaluation shows that the proposed synthesis technique helps to reveal more than 1200 design problems across 7 industry-strength systems, with a median precision of 71% and a median recall of 78%. The relevance of our work has been widely recognized by the software engineering community through 2 awards and 7 publications in international and national venues.


How different are different diff algorithms in Git?

Empirical Software Engineering ◽

10.1007/s10664-019-09772-z ◽

2019 ◽

Vol 25 (1) ◽

pp. 790-823

Author(s):

Yusuf Sulistyo Nugroho ◽

Hideaki Hata ◽

Kenichi Matsumoto

Keyword(s):

Source Code ◽

Version Control ◽

Automatic Identification ◽

Systematic Mapping ◽

Patch Application ◽

Basic Task ◽

Manual Analysis ◽

Version Control System ◽

Change Identification ◽

The Impact

Automatic identification of the differences between two versions of a file is a common and basic task in several applications of mining code repositories. Git, a version control system, has a diff utility, and users can select among diff algorithms ranging from the default Myers algorithm to the advanced Histogram algorithm. From our systematic mapping, we identified three popular applications of diff in recent studies. Regarding the impact on code churn metrics in 14 Java projects, we obtained different values in 1.7% to 8.2% of commits depending on the diff algorithm. Regarding bug-introducing change identification, we found that 6.0% and 13.3% of the identified bug-fix commits had different results of bug-introducing changes from 10 Java projects. For patch application, we found from our manual analysis that Histogram is more suitable than Myers for providing the changes of code. Thus, we strongly recommend using the Histogram algorithm when mining Git repositories to consider differences in source code.
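As a rough illustration of the code churn comparison described in this abstract, the sketch below runs `git diff --numstat` with different `--diff-algorithm` values and sums added plus deleted lines. The helper names `numstat_churn` and `churn_by_algorithm` are hypothetical, and churn defined as "added + deleted" is an assumption made here, not necessarily the study's exact metric.

```python
import subprocess


def numstat_churn(numstat_output: str) -> int:
    """Sum added + deleted lines from `git diff --numstat` output.

    Each line has the form "<added>\t<deleted>\t<path>". Binary files are
    reported as "-\t-\t<path>" and are skipped.
    """
    churn = 0
    for line in numstat_output.splitlines():
        parts = line.split("\t")
        if len(parts) < 3:
            continue  # ignore blank or malformed lines
        added, deleted = parts[0], parts[1]
        if added == "-" or deleted == "-":
            continue  # binary file: no line counts available
        churn += int(added) + int(deleted)
    return churn


def churn_by_algorithm(repo, old, new, algorithms=("myers", "histogram")):
    """Return {algorithm: churn} for the diff between two revisions.

    `repo` is a path to a local Git repository; `old` and `new` are any
    revisions Git accepts (hypothetical parameters for this sketch).
    """
    results = {}
    for algo in algorithms:
        completed = subprocess.run(
            ["git", "-C", repo, "diff", "--numstat",
             f"--diff-algorithm={algo}", old, new],
            capture_output=True, text=True, check=True,
        )
        results[algo] = numstat_churn(completed.stdout)
    return results
```

Calling `churn_by_algorithm(".", "HEAD~1", "HEAD")` inside a repository would give one churn value per algorithm. Commits where the two values differ are the kind of disagreement the abstract reports.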


Hybrid and Adaptive Metamodel Based Global Optimization

Volume 5: 35th Design Automation Conference, Parts A and B ◽

10.1115/detc2009-87121 ◽

2009 ◽

Cited By ~ 1

Author(s):

J. Gu ◽

G. Y. Li ◽

Z. Dong

Keyword(s):

Global Optimization ◽

Design Optimization ◽

Optimization Problems ◽

Optimization Method ◽

Global Optimum ◽

Superior Performance ◽

Automatic Identification ◽

Design Problems ◽

Sample Data ◽

Data Points

Metamodeling techniques are increasingly used in solving computation-intensive design optimization problems today. In this work, the issue of automatic identification of appropriate metamodeling techniques in global optimization is addressed. A generic, new hybrid metamodel-based global optimization method, particularly suitable for design problems involving computation-intensive, black-box analyses and simulations, is introduced. The method employs three representative metamodels concurrently in the search process and selects sample data points adaptively according to the values calculated using the three metamodels to improve the accuracy of modeling. The global optimum is identified when the metamodels become reasonably accurate. The new method is tested using various benchmark global optimization problems and applied to a real industrial design optimization problem involving vehicle crash simulation, to demonstrate the superior performance of the new algorithm over existing search methods. Present limitations of the proposed method are also discussed.


Improving Automatic Identification of Outdated Requirements by Using Closeness Analysis Based on Source Code Changes

Communications in Computer and Information Science - Software Engineering and Methodology for Emerging Domains ◽

10.1007/978-981-10-3482-4_4 ◽

2016 ◽

pp. 52-67 ◽

Cited By ~ 3

Author(s):

Hongyu Kuang ◽

Jia Nie ◽

Hao Hu ◽

Jian Lü

Keyword(s):

Source Code ◽

Automatic Identification ◽

Code Changes ◽

Source Code Changes


Metrics Visualization Techniques Based on Historical Origins and Functional Layers for Developments by Multiple Organizations

International Journal of Software Engineering and Knowledge Engineering ◽

10.1142/s0218194018500067 ◽

2018 ◽

Vol 28 (01) ◽

pp. 123-147 ◽

Cited By ~ 1

Author(s):

Ryosuke Ishizue ◽

Hironori Washizaki ◽

Yoshiaki Fukazawa ◽

Sakae Inoue ◽

Yoshiiku Hanai ◽

...

Keyword(s):

Open Source Software ◽

Web Applications ◽

Source Code ◽

Smart Phone ◽

Development Project ◽

Functional Layer ◽

New Techniques ◽

Product Metrics ◽

Visualization Techniques ◽

Historical Origins

Software developments involving multiple organizations such as Open Source Software (OSS)-based projects tend to have numerous defects when one organization develops and another organization edits the program source code files. Developments with complex file creation, modification history (origin), and software architecture (functional layer) are increasing in OSS-based development. As an example, we focus on an Android smart phone and a VirtualBox development project, and propose new visualization techniques for product metrics based on file origin and functional layers. One is the Metrics Area Figure, which can express duplication of edits by multiple organizations intuitively using overlapping figures. The other is Origin City, which was inspired by Code City. It can represent the scale and other measurements, while simultaneously stacking functional layers as 3D buildings. The contributions of our paper are to propose new techniques, implement them as web applications, and share the results of our questionnaire. Our proposed techniques are useful not only to visualize the measured metrics, but also to improve the product quality.


The Colony Predation Algorithm

Journal of Bionic Engineering ◽

10.1007/s42235-021-0050-y ◽

2021 ◽

Vol 18 (3) ◽

pp. 674-710

Author(s):

Jiaze Tu ◽

Huiling Chen ◽

Mingjing Wang ◽

Amir H. Gandomi

Keyword(s):

Mathematical Model ◽

State Of The Art ◽

Source Code ◽

The Other ◽

Superior Performance ◽

Design Problems ◽

Optimal Position ◽

Cross Border ◽

Engineering Design Problems ◽

Engineering Problems

This paper proposes a new stochastic optimizer called the Colony Predation Algorithm (CPA) based on the corporate predation of animals in nature. CPA utilizes a mathematical mapping following the strategies used by animal hunting groups, such as dispersing prey, encircling prey, supporting the most likely successful hunter, and seeking another target. Moreover, the proposed CPA introduces new features of a unique mathematical model that uses a success rate to adjust the strategy and simulate hunting animals’ selective abandonment behavior. This paper also presents a new way to deal with cross-border situations, whereby the optimal position value of a cross-border situation replaces the cross-border value to improve the algorithm’s exploitation ability. The proposed CPA was compared with state-of-the-art metaheuristics on a comprehensive set of benchmark functions for performance verification and on five classical engineering design problems to evaluate the algorithm’s efficacy in optimizing engineering problems. The results show that the proposed algorithm exhibits competitive, superior performance in different search landscapes over the other algorithms. Moreover, the source code of the CPA will be publicly available after publication.
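The cross-border handling mentioned in this abstract can be read as follows: when a coordinate of a candidate solution leaves the search bounds, it is replaced by the corresponding coordinate of the best position found so far, instead of being clamped to the bound. The sketch below illustrates that reading only. The function name `repair_position` and its parameters are hypothetical, and the exact rule in the paper may differ.

```python
def repair_position(position, best_position, lower, upper):
    """Replace out-of-bounds coordinates with the best position's coordinates.

    All four arguments are equal-length sequences of floats. A coordinate
    that is within [lower, upper] is kept unchanged. This is an assumed
    interpretation of the abstract, not the published CPA update rule.
    """
    repaired = []
    for x, best, lo, hi in zip(position, best_position, lower, upper):
        if x < lo or x > hi:
            repaired.append(best)  # cross-border value: fall back to best
        else:
            repaired.append(x)     # in bounds: keep as is
    return repaired
```

For example, `repair_position([0.5, 12.0, -3.0], [0.1, 4.0, 2.0], [0, 0, 0], [10, 10, 10])` keeps the first coordinate and replaces the other two with the best position's values. Compared with clamping to the bound, this pulls stray candidates back toward the current best solution, which is consistent with the abstract's claim about improved exploitation.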


Exploring the Connection between Design Smells and Security Vulnerabilities

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.h6508.069820 ◽

2020 ◽

Vol 9 (8) ◽

pp. 449-452

Keyword(s):

Software Quality ◽

Software Design ◽

Source Code ◽

Security Issue ◽

Security Vulnerabilities ◽

Design Problems ◽

Security Issues ◽

Design Smells ◽

Code Quality ◽

Open Source Systems

Software quality aims at having quality as part of all aspects of the developed software. Design smells are considered enemies of software source code quality. There are varieties of design problems with different terminologies. Researchers and practitioners accept it as true that whenever there is a design smell, there is a security issue or concern. In this work, we explore the connection between design smells and security vulnerabilities and provide experimental evidence about this connection. We conducted an empirical study to explore the connection between design smells and security issues by evaluating four C# open-source systems. We found interesting results showing that classes with design smells have more chances of having security issues.


Identifying design problems in the source code

Proceedings of the 40th International Conference on Software Engineering - ICSE '18 ◽

10.1145/3180155.3180239 ◽

2018 ◽

Cited By ~ 14

Author(s):

Leonardo Sousa ◽

Anderson Oliveira ◽

Willian Oizumi ◽

Simone Barbosa ◽

Alessandro Garcia ◽

...

Keyword(s):

Source Code ◽

Design Problems


SYNTHESIS OF CODE ANOMALIES: REVEALING DESIGN PROBLEMS IN THE SOURCE CODE

10.17771/pucrio.acad.25718 ◽

2015 ◽

Author(s):

WILLIAN NALEPA OIZUMI

Keyword(s):

Source Code ◽

Design Problems


Accuracy of Four Language Analysis Procedures Performed Automatically

American Journal of Speech-Language Pathology ◽

10.1044/1058-0360(2001/017) ◽

2001 ◽

Vol 10 (2) ◽

pp. 180-188 ◽

Cited By ~ 9

Author(s):

Steven H. Long ◽

Ron W. Channell

Keyword(s):

Automatic Identification ◽

Published Data ◽

Typically Developing ◽

Language Impaired ◽

Productivity Data ◽

Language Analysis ◽

Word Classes ◽

Human Coder ◽

Impaired Children ◽

Metalinguistic Skills

Most software for language analysis has relied on an interaction between the metalinguistic skills of a human coder and the calculating ability of the machine to produce reliable results. However, probabilistic parsing algorithms are now capable of highly accurate and completely automatic identification of grammatical word classes. The program Computerized Profiling combines a probabilistic parser with modules customized to produce four clinical grammatical analyses: MLU, LARSP, IPSyn, and DSS. The accuracy of these analyses was assessed on 69 language samples from typically developing, speech-impaired, and language-impaired children, 2 years 6 months to 7 years 10 months. Values obtained with human coding and by the software alone were compared. Results for all four analyses produced automatically were comparable to published data on the manual interrater reliability of these procedures. Clinical decisions based on cutoff scores and productivity data were little affected by the use of automatic rather than human-generated analyses. These findings bode well for future clinical and research use of automatic language analysis software.
