A Conceptual Dependency Graph Based Keyword Extraction Model for Source Code to API Documentation Mapping

2019 ◽  
Vol 8 (2) ◽  
pp. 5888-5895

Natural language processing on software systems usually yields high-dimensional, noisy, and irrelevant features, which lead to inaccurate and poor contextual similarity between the project source code and its API documentation. Most traditional source code analysis models make no attempt to find and extract the features relevant to contextual similarity, and as the size of the project source code and its related API documentation grows, they struggle to capture the contextual similarity between the source code and the API documentation for code analysis. One promising solution to this problem is finding the essential features using the source code dependency graph. In this paper, the dependency graph is used to compute the contextual similarity between the source code metrics and the API documents, and a novel contextual similarity measure is used to relate the project source code metrics to the API documents. The proposed model is evaluated on different project source codes and API documents in terms of pre-processing, contextual similarity, and runtime. Experimental results show that the proposed model has higher computational efficiency than existing models on large datasets.
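The paper's actual model is not spelled out in the abstract, but the core idea of weighting code keywords by their position in a dependency graph and comparing them to API documentation can be sketched. Below is a minimal illustration assuming networkx for the graph and a TF-IDF/cosine comparison; every identifier and weighting choice here is an assumption for illustration, not the authors' method.

```python
# Illustrative sketch: dependency-graph-weighted contextual similarity
# between source code identifiers and an API document. All names and
# the PageRank weighting are assumptions, not the paper's exact model.
import networkx as nx
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def contextual_similarity(dep_edges, code_tokens, api_doc_text):
    """dep_edges: (caller, callee) pairs from static analysis.
    code_tokens: node -> identifier text extracted from the code.
    api_doc_text: the API documentation as one string."""
    graph = nx.DiGraph(dep_edges)
    # Weight each node's tokens by PageRank so that structurally central
    # code elements contribute more to the extracted keyword profile.
    rank = nx.pagerank(graph)
    weighted_code = " ".join(
        " ".join([code_tokens[node]] * max(1, int(rank.get(node, 0) * 100)))
        for node in graph.nodes if node in code_tokens
    )
    vec = TfidfVectorizer(stop_words="english")
    tfidf = vec.fit_transform([weighted_code, api_doc_text])
    return cosine_similarity(tfidf[0], tfidf[1])[0, 0]

edges = [("App.main", "HttpClient.get"), ("HttpClient.get", "Url.parse")]
tokens = {"App.main": "main entry", "HttpClient.get": "http client get request",
          "Url.parse": "url parse"}
print(contextual_similarity(edges, tokens,
                            "Performs an HTTP GET request for a parsed URL."))
```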

2020 ◽  
Vol 28 (4) ◽  
pp. 1447-1506 ◽  
Author(s):  
Rudolf Ferenc ◽  
Zoltán Tóth ◽  
Gergely Ladányi ◽  
István Siket ◽  
Tibor Gyimóthy

Bug datasets have been created and used by many researchers to build and validate novel bug prediction models. In this work, our aim is to collect existing public source code metric-based bug datasets and unify their contents. Furthermore, we wish to assess the plethora of collected metrics and the capabilities of the unified bug dataset in bug prediction. We considered 5 public datasets, downloaded the corresponding source code for each system in the datasets, and performed source code analysis to obtain a common set of source code metrics. This way, we produced a unified bug dataset at both class and file level. We investigated the divergence of metric definitions and values across the different bug datasets. Finally, we used a decision tree algorithm to show the capabilities of the dataset in bug prediction. We found that there are statistically significant differences between the values of the original and the newly calculated metrics; furthermore, notations and definitions can differ severely. We compared the bug prediction capabilities of the original and the extended metric suites (within-project learning). Afterwards, we merged all classes (and files) into one large dataset consisting of 47,618 elements (43,744 for files) and evaluated the bug prediction model built on this large dataset as well. Finally, we also investigated the cross-project capabilities of the bug prediction models and datasets. We made the unified dataset publicly available for everyone. By using a public unified dataset as input for different bug prediction related investigations, researchers can make their studies reproducible, and thus able to be validated and verified.
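The decision tree step the abstract mentions is a standard supervised setup over per-class metric rows. A minimal sketch follows; the CSV path, column names, and the 'bug' label layout are assumptions, not the unified dataset's actual schema.

```python
# Sketch of the within-project bug prediction step: a decision tree
# trained on class-level source code metrics. File name and column
# names are hypothetical stand-ins for the unified dataset's schema.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import f1_score

df = pd.read_csv("unified_bug_dataset_class_level.csv")  # hypothetical path
X = df.drop(columns=["bug"])        # source code metrics (LOC, WMC, CBO, ...)
y = (df["bug"] > 0).astype(int)     # binarize bug counts into buggy / not buggy

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42)

clf = DecisionTreeClassifier(max_depth=10, random_state=42)
clf.fit(X_train, y_train)
print("F-measure:", f1_score(y_test, clf.predict(X_test)))
```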


2021 ◽  
pp. 026-035
Author(s):  
A.M. Pokrovskyi

With the rapid development of software quality measurement methods, the need for efficient and versatile reengineering automation tools grows ever larger. This becomes even more apparent when the programming language and the respective coding practices develop slowly alongside each other over a long period of time, while the legacy code base grows bigger and remains highly relevant. In this paper, a source code metrics measurement tool for Fortran program quality evaluation is developed. It is implemented as a code module for the Photran integrated development environment and is based on a set of syntax tree walking algorithms. The module utilizes Photran's built-in syntax analysis engine and the tree data structure it builds from the source code. The developed tool is also compared to existing source code analysis instruments. The results show that the developed tool is most effective when used in combination with Photran's built-in refactoring system, and that Photran's application programming interface facilitates easy scaling of the existing infrastructure by introducing other code analysis methods.
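The module itself is Java code written against Photran's AST API, which is not shown in the abstract. As a language-neutral illustration of the same tree-walking idea, the sketch below accumulates simple metrics in a single pass over a syntax tree, using Python's own ast module as a stand-in parser; nothing here is Photran's actual API.

```python
# Language-neutral sketch of a syntax-tree-walking metrics pass,
# analogous to the Photran module's algorithms. Python's ast module
# stands in for Photran's Fortran syntax analysis engine.
import ast

class SyntaxTreeMetrics:
    """One recursive pass over the tree, accumulating size/complexity metrics."""
    def __init__(self):
        self.functions = 0
        self.branches = 0     # rough cyclomatic-complexity contribution
        self.max_depth = 0    # deepest nesting encountered

    def walk(self, node, depth=0):
        self.max_depth = max(self.max_depth, depth)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            self.functions += 1
        elif isinstance(node, (ast.If, ast.For, ast.While, ast.Try)):
            self.branches += 1
        for child in ast.iter_child_nodes(node):
            self.walk(child, depth + 1)

    def measure(self, source):
        self.walk(ast.parse(source))
        return {"functions": self.functions,
                "branches": self.branches,
                "nesting_depth": self.max_depth}

print(SyntaxTreeMetrics().measure(
    "def f(x):\n    if x > 0:\n        return 1\n    return 0"))
```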


Technologies ◽  
2020 ◽  
Vol 9 (1) ◽  
pp. 3
Author(s):  
Gábor Antal ◽  
Zoltán Tóth ◽  
Péter Hegedűs ◽  
Rudolf Ferenc

Bug prediction aims at finding source code elements in a software system that are likely to contain defects. Being aware of the most error-prone parts of the program, one can efficiently allocate the limited amount of testing and code review resources. Therefore, bug prediction can support software maintenance and evolution to a great extent. In this paper, we propose a function level JavaScript bug prediction model based on static source code metrics with the addition of a hybrid (static and dynamic) code analysis based metric of the number of incoming and outgoing function calls (HNII and HNOI). Our motivation for this is that JavaScript is a highly dynamic scripting language for which static code analysis might be very imprecise; therefore, using purely static source code features for bug prediction might not be enough. Based on a study where we extracted 824 buggy and 1943 non-buggy functions from the publicly available BugsJS dataset for the ESLint JavaScript project, we can confirm the positive impact of hybrid code metrics on the prediction performance of the ML models. Depending on the ML algorithm, the applied hyper-parameters, and the target measures we consider, hybrid invocation metrics bring a 2–10% increase in model performance (i.e., precision, recall, F-measure). Interestingly, replacing the static NOI and NII metrics with their hybrid counterparts HNOI and HNII in itself improves model performance; however, using them all together yields the best results.
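The static-versus-hybrid comparison reduces to training the same model on two feature sets that differ only in the invocation columns. A minimal sketch follows; the metric column names mirror the paper's acronyms (NOI, NII, HNOI, HNII), but the CSV layout and model choice are assumptions for illustration.

```python
# Sketch of the static-vs-hybrid comparison: train the same classifier
# twice, once with static NOI/NII and once with hybrid HNOI/HNII swapped
# in. File name and exact schema are hypothetical.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

df = pd.read_csv("eslint_function_metrics.csv")  # hypothetical BugsJS export
y = df["buggy"]
static_cols = [c for c in df.columns if c not in ("buggy", "HNOI", "HNII")]
hybrid_cols = [c for c in static_cols if c not in ("NOI", "NII")] + ["HNOI", "HNII"]

for name, cols in [("static", static_cols), ("hybrid", hybrid_cols)]:
    scores = cross_val_score(RandomForestClassifier(random_state=42),
                             df[cols], y, cv=10, scoring="f1")
    print(f"{name:6s} F-measure: {scores.mean():.3f}")
```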


Author(s):  
Oscar E. Perez-Cham ◽  
Carlos Soubervielle-Montalvo ◽
Alberto S. Nunez-Varela ◽  
Cesar Puente ◽  
Luis J. Ontanon-Garcia

Author(s):  
Paulo Meirelles ◽  
Carlos Santos Jr. ◽  
Joao Miranda ◽  
Fabio Kon ◽  
Antonio Terceiro ◽  
...  

2019 ◽  
Vol 2019 (2) ◽  
pp. 117-126
Author(s):  
Chinmay Hota ◽  
Lov Kumar ◽  
Lalita Bhanu Murthy Neti

Author(s):  
Raquel Fialho de Queiroz Lafetá ◽  
Thiago Fialho de Queiroz Lafetá ◽  
Marcelo de Almeida Maia

A substantial effort, in general, is required for understanding the APIs of application frameworks. High-quality API documentation may alleviate this effort, but producing such documentation still poses a major challenge for modern frameworks. To facilitate the production of framework instantiation documentation, we hypothesize that the framework code itself and the code of existing instantiations provide useful information. However, given the size and complexity of the existing code, automated approaches are required to assist documentation production. Our goal is to assess an automated approach for constructing relevant documentation for framework instantiation based on source code analysis of the framework itself and of existing instantiations. The criterion for defining whether documentation is relevant is a comparison with traditional framework documentation, considering the time spent and correctness during instantiation activities, information usefulness, complexity of the activity, navigation, satisfaction, information localization, and clarity. The proposed approach generates documentation in a cookbook style, where the recipes are programming activities using the necessary API elements, driven by the framework features. We performed an empirical study, consisting of three experiments with 44 human subjects executing real framework instantiations, aimed at comparing the use of the proposed cookbooks to traditional manual framework documentation (baseline). Our empirical assessment shows that the generated cookbooks performed better or, at least, with no significant difference when compared to the traditional documentation, evidencing the effectiveness of the approach.
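The mining step behind such cookbook generation can be sketched as follows: scan existing instantiation code for calls into the framework's API and group them into recipe steps per client function. Python's ast module is used as a stand-in parser and the framework package name is hypothetical; the paper's own analysis and target language may differ.

```python
# Illustrative sketch of mining recipe steps from existing framework
# instantiations: collect, per client function, the framework API
# elements it calls. The "framework." prefix is a hypothetical package.
import ast
from collections import defaultdict

FRAMEWORK_PREFIXES = ("framework.",)  # hypothetical framework package

def mine_recipes(instantiation_sources):
    """Map each client function to the framework API elements it uses."""
    recipes = defaultdict(set)
    for source in instantiation_sources:
        tree = ast.parse(source)
        for func in (n for n in ast.walk(tree) if isinstance(n, ast.FunctionDef)):
            for call in (n for n in ast.walk(func) if isinstance(n, ast.Call)):
                name = ast.unparse(call.func)  # requires Python 3.9+
                if name.startswith(FRAMEWORK_PREFIXES):
                    recipes[func.name].add(name)
    return recipes

client = """
import framework
def setup_view():
    framework.View.register("main")
    framework.Router.bind("/", "main")
"""
for step, apis in mine_recipes([client]).items():
    print(f"Recipe step '{step}': uses {sorted(apis)}")
```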


2014 ◽  
Vol 56 (2) ◽  
pp. 183-198 ◽  
Author(s):  
Jacqui Finlay ◽  
Russel Pears ◽  
Andy M. Connor
