Mining Change Logs and Release Notes to Understand Software Maintenance and Evolution

2009 ◽  
Vol 12 (2) ◽  
Author(s):  
Liguo Yu

Software change logs and release notes are documents released together with new versions of a software product. They describe the changes made to the previous version and the new features introduced in the new version. In this paper, we present a keyword-based approach to mining and analyzing non-source-code documents and define a mathematical framework to represent the data. This approach is applied in a study of the change logs of Linux and the release notes of FreeBSD. The results show that the software maintenance process and evolution process share some common properties, and that the keyword-based text mining technique can be used as a systematic method to study software maintenance and evolution.
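The keyword-based mining idea can be illustrated with a minimal sketch: count occurrences of maintenance-related keywords across change-log entries. The keyword set and log lines below are illustrative, not the paper's actual keyword list or data.

```python
from collections import Counter
import re

# Hypothetical maintenance-related keywords; the paper's actual set may differ.
KEYWORDS = {"fix", "bug", "add", "remove", "update", "support"}

def keyword_profile(log_entries):
    """Count occurrences of maintenance keywords across change-log entries."""
    counts = Counter()
    for entry in log_entries:
        for token in re.findall(r"[a-z]+", entry.lower()):
            if token in KEYWORDS:
                counts[token] += 1
    return counts

logs = [
    "Fix memory leak in scheduler",
    "Add support for new driver; fix typo",
    "Update documentation",
]
profile = keyword_profile(logs)
```

A profile like this, computed per release, gives the frequency data that a mathematical framework can then compare across versions.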


Author(s):  
Xing Hu ◽  
Ge Li ◽  
Xin Xia ◽  
David Lo ◽  
Shuai Lu ◽  
...  

Code summarization, which aims to generate a succinct natural language description of source code, is extremely useful for code search and code comprehension, and it has played an important role in software maintenance and evolution. Previous approaches generate summaries by retrieving them from similar code snippets. However, these approaches heavily rely on whether similar code snippets can be retrieved and on how similar the snippets are, and they fail to capture the API knowledge in the source code, which carries vital information about its functionality. In this paper, we propose a novel approach, named TL-CodeSum, which successfully applies API knowledge learned in a different but related task to code summarization. Experiments on large-scale real-world industry Java projects indicate that our approach is effective and outperforms the state-of-the-art in code summarization.
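The retrieval-based baseline that TL-CodeSum is contrasted against can be sketched in a few lines: pick the summary of the most lexically similar snippet in a corpus. The corpus and similarity measure (token-set Jaccard) here are illustrative stand-ins, not the baselines evaluated in the paper.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two token sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

def retrieve_summary(query_code: str, corpus: list) -> str:
    """Return the summary paired with the most similar code snippet.
    corpus: list of (code, summary) pairs."""
    q = set(query_code.split())
    best = max(corpus, key=lambda pair: jaccard(q, set(pair[0].split())))
    return best[1]

corpus = [
    ("public int add ( int a , int b ) { return a + b ; }", "adds two integers"),
    ("public void close ( ) { stream . close ( ) ; }", "closes the stream"),
]
summary = retrieve_summary("int sum ( int a , int b ) { return a + b ; }", corpus)
```

The sketch makes the abstract's criticism concrete: if no sufficiently similar snippet exists in the corpus, the retrieved summary is arbitrary, and lexical overlap ignores API semantics entirely.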


Author(s):  
Shu-Ting Shi ◽  
Ming Li ◽  
David Lo ◽  
Ferdian Thung ◽  
Xuan Huo

Code review is the process of manually inspecting a revision of the source code to determine whether the revised code meets the revision requirements. However, manual code review is time-consuming, and automating the code review process would alleviate the burden on code reviewers and speed up the software maintenance process. To construct a model for automatic code review, the characteristics of source code revisions (i.e., the difference between two pieces of source code) should be properly captured and modeled. Unfortunately, most existing techniques can easily model the overall correlation between two pieces of source code, but not the "difference" between them. In this paper, we propose a novel deep model named DACE for automatic code review. The model learns revision features by contrasting the revised hunks from the original and revised source code with respect to the code context containing the hunks. Experimental results on six open source software projects indicate that by learning the revision features, DACE can outperform competing approaches in automatic code review.
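The "revised hunks" that such a model contrasts can be extracted with standard diffing. A minimal sketch using Python's `difflib` is below; DACE itself is a deep model, so this only illustrates the hunk-extraction input step, not the paper's method.

```python
import difflib

def revised_hunks(original: str, revised: str):
    """Extract the changed hunks between two versions of a source file."""
    old_lines = original.splitlines()
    new_lines = revised.splitlines()
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines)
    hunks = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # 'replace', 'delete', or 'insert'
            hunks.append({
                "op": tag,
                "original": old_lines[i1:i2],
                "revised": new_lines[j1:j2],
            })
    return hunks

old = "def f(x):\n    return x\n"
new = "def f(x):\n    return x + 1\n"
hunks = revised_hunks(old, new)
```

Each hunk pairs the removed lines with the added lines, which is exactly the "difference" signal the abstract argues correlation-based models fail to capture.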


2017 ◽  
Vol 8 (2) ◽  
pp. 17-26
Author(s):  
Liguo Yu

In C-alike programs, the source code is separated into header files and source files. During the software evolution process, both these two kinds of files need to adapt to changing requirement and changing environment. This paper studies the coevolution of header files and source files of C-alike programs. Using normalized compression distance that is derived from Kolmogorov complexity, we measure the header file difference and source file difference between versions of an evolving software product. Header files distance and source files distance are compared to understand their difference in pace of evolution. Mantel tests are performed to investigate the correlation of header file evolution and source file evolution. The study is performed on the source code of Apache HTTP web server.
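The normalized compression distance used here has a compact standard definition: NCD(x, y) = (C(xy) − min(C(x), C(y))) / max(C(x), C(y)), where C(s) is the compressed length of s. A minimal sketch with zlib as the compressor follows; the paper does not specify zlib, so the compressor choice and sample inputs are assumptions.

```python
import zlib

def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance:
    NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)),
    where C(s) is the compressed length of s."""
    cx = len(zlib.compress(x))
    cy = len(zlib.compress(y))
    cxy = len(zlib.compress(x + y))
    return (cxy - min(cx, cy)) / max(cx, cy)

# Illustrative inputs: two near-identical C files and one unrelated file.
a = b"int main() { return 0; }\n" * 20
b_ = b"int main() { return 1; }\n" * 20
c = b"def main():\n    pass\n" * 20
```

Because similar files compress well when concatenated, NCD is small for near-identical versions and larger for unrelated ones, which is what makes it usable as a version-to-version distance for header and source files.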


2018 ◽  
Vol 2 (1) ◽  
pp. 10-15
Author(s):  
Rozita Kadar ◽  
Sharifah Mashita Syed-Mohamad ◽  
Putra Sumari ◽  
Nur 'Aini Abdul Rashid

Program comprehension is an important and effort-intensive process in software maintenance. A key challenge for developers in program comprehension is understanding the source code. Software systems have grown in size, increasing the burden on developers of exploring and understanding millions of lines of source code. Meanwhile, source code is a crucial resource for developers becoming familiar with a software system, since system documentation is often unavailable or outdated. However, understanding source code is difficult: programming styles vary, and comments are often insufficient. Although many researchers have discussed strategies and techniques to overcome the program comprehension problem, only shallow knowledge exists about the challenges of understanding a software system by reading its source code. This study therefore attempts to overcome the problems of source code comprehension by suggesting a suitable comprehension technique based on an ontology approach to knowledge representation. This approach can readily express the concepts and relationships of the program domain. Thus, the proposed work creates a better way to improve program comprehension.
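An ontology of the program domain can be represented, at its simplest, as subject–relation–object triples that link code concepts. The concept and relation names below are illustrative, not taken from the study.

```python
# A toy program-domain ontology as subject-relation-object triples.
# Names are hypothetical, for illustration only.
triples = [
    ("Parser", "is_a", "Class"),
    ("parse", "member_of", "Parser"),
    ("parse", "calls", "tokenize"),
    ("tokenize", "member_of", "Lexer"),
]

def related(entity, relation):
    """All objects linked to `entity` by `relation`."""
    return [o for s, r, o in triples if s == entity and r == relation]
```

Queries over such triples (e.g., "what does `parse` call?") are the kind of concept-and-relationship questions the ontology approach is meant to answer for a developer reading unfamiliar code.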


Technologies ◽  
2020 ◽  
Vol 9 (1) ◽  
pp. 3
Author(s):  
Gábor Antal ◽  
Zoltán Tóth ◽  
Péter Hegedűs ◽  
Rudolf Ferenc

Bug prediction aims at finding source code elements in a software system that are likely to contain defects. Being aware of the most error-prone parts of the program, one can efficiently allocate the limited amount of testing and code review resources. Therefore, bug prediction can support software maintenance and evolution to a great extent. In this paper, we propose a function-level JavaScript bug prediction model based on static source code metrics, with the addition of hybrid (static and dynamic) code-analysis-based metrics of the number of incoming and outgoing function calls (HNII and HNOI). Our motivation is that JavaScript is a highly dynamic scripting language for which static code analysis might be very imprecise; therefore, using purely static source code features for bug prediction might not be enough. Based on a study where we extracted 824 buggy and 1943 non-buggy functions from the publicly available BugsJS dataset for the ESLint JavaScript project, we can confirm the positive impact of hybrid code metrics on the prediction performance of the ML models. Depending on the ML algorithm, applied hyper-parameters, and target measures we consider, hybrid invocation metrics bring a 2–10% increase in model performance (i.e., precision, recall, F-measure). Interestingly, replacing the static NOI and NII metrics with their hybrid counterparts HNOI and HNII in itself improves model performance; however, using them all together yields the best results.
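The target measures named in the abstract (precision, recall, F-measure) have standard definitions for a bug prediction model: compare the set of functions predicted buggy against the set actually buggy. A minimal sketch with illustrative function names follows.

```python
def prf(predicted: set, actual: set):
    """Precision, recall, and F-measure for bug prediction:
    precision = TP / |predicted|, recall = TP / |actual|,
    F = harmonic mean of the two."""
    tp = len(predicted & actual)  # true positives
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(actual) if actual else 0.0
    f = (2 * precision * recall / (precision + recall)
         if precision + recall else 0.0)
    return precision, recall, f

# Hypothetical example: 4 functions flagged, 3 actually buggy, 2 overlap.
predicted_buggy = {"f1", "f2", "f3", "f4"}
actually_buggy = {"f2", "f3", "f5"}
p, r, f = prf(predicted_buggy, actually_buggy)
```

A "2–10% increase" in these measures means, for example, a precision of 0.50 rising to somewhere between 0.52 and 0.60 when the hybrid HNOI/HNII metrics are added.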


Author(s):  
Yu Zhou ◽  
Yanxiang Tong ◽  
Taolue Chen ◽  
Jin Han

Bug localization represents one of the most expensive and time-consuming activities during software maintenance and evolution. To alleviate the workload of developers, numerous methods have been proposed to automate this process and narrow down the scope of reviewing buggy files. In this paper, we present a novel buggy source-file localization approach that uses information from both bug reports and source files. We leverage the part-of-speech features of bug reports and the invocation relationships among source files. We also integrate an adaptive technique to further optimize the performance of the approach. The adaptive technique discriminates between Top 1 and Top N recommendations for a given bug report and consists of two modules: one maximizes the accuracy of the first recommended file, and the other aims at improving the accuracy of the fixed-defect file list. We evaluate our approach on six large-scale open source projects, i.e., AspectJ, Eclipse, SWT, ZXing, Birt, and Tomcat. Compared to previous work, empirical results show that our approach improves the overall prediction performance in all of these cases. In particular, in terms of Top 1 recommendation accuracy, our approach achieves an enhancement from 22.73% to 39.86% for AspectJ, from 24.36% to 30.76% for Eclipse, from 31.63% to 46.94% for SWT, from 40% to 55% for ZXing, from 7.97% to 21.99% for Birt, and from 33.37% to 38.90% for Tomcat.
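The Top 1 / Top N accuracy the abstract reports has a simple definition: the fraction of bug reports whose actually fixed file appears among the first n recommended files. A minimal sketch with hypothetical rankings follows.

```python
def top_n_accuracy(rankings, fixed_files, n):
    """Fraction of bug reports whose truly fixed file appears in the
    top n of its ranked recommendation list."""
    hits = sum(1 for ranked, fixed in zip(rankings, fixed_files)
               if fixed in ranked[:n])
    return hits / len(rankings)

# Hypothetical data: one ranked file list per bug report,
# plus the file that was actually fixed for that report.
rankings = [
    ["a.java", "b.java", "c.java"],
    ["d.java", "e.java", "f.java"],
    ["g.java", "h.java", "i.java"],
]
fixed = ["a.java", "e.java", "i.java"]
```

In this toy data only the first report is hit at n = 1, so Top 1 accuracy is 1/3, while Top 2 accuracy rises to 2/3; the adaptive technique in the paper tunes the two cases separately.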

