code changes Latest Research Papers

Patchworking: Exploring the code changes induced by vulnerability fixing activities

Information and Software Technology, 2022, Vol 142, pp. 106745. DOI: 10.1016/j.infsof.2021.106745

Author(s): Gerardo Canfora, Andrea Di Sorbo, Sara Forootani, Matias Martinez, Corrado A. Visaggio

Keyword(s): Code Changes

Contrasting dedicated model transformation languages versus general purpose languages: a historical perspective on ATL versus Java based on complexity and size

Software & Systems Modeling, 2021. DOI: 10.1007/s10270-021-00937-3

Author(s): Stefan Höppner, Timo Kehrer, Matthias Tichy

Keyword(s): Model Transformation, Historical Perspective, General Purpose, Model Transformations, Future Research, Model Driven Engineering, Model Driven, Transformation Language, Key Concepts, Code Changes

Model transformations are among the key concepts of model-driven engineering (MDE), and dedicated model transformation languages (MTLs) emerged with the popularity of the MDE paradigm about 15 to 20 years ago. MTLs claim to increase the ease of development of model transformations by abstracting from recurring transformation aspects and hiding complex semantics behind a simple and intuitive syntax. Nonetheless, MTLs are rarely adopted in practice, there is still no empirical evidence for the claim of easier development, and the argument of abstraction deserves a fresh look in the light of modern general purpose languages (GPLs), which have undergone a significant evolution in the last two decades. In this paper, we report on a study in which we compare the complexity and size of model transformations written in three different languages, namely (i) the Atlas Transformation Language (ATL), (ii) Java SE5 (2004–2009), and (iii) Java SE14 (2020); the Java transformations are derived from an ATL specification using a translation schema we developed for our study. In a nutshell, we found that some of the new features in Java SE14 compared to Java SE5 help to significantly reduce the complexity of transformations written in Java by as much as 45%. At the same time, however, the relative amount of complexity that stems from aspects that ATL can hide from the developer, which is about 40% of the total complexity, stays about the same. Furthermore, we discovered that while transformation code in Java SE14 requires up to 25% fewer lines of code, the number of words written in both versions stays about the same. And while the number of words written stays about the same, their distribution throughout the code changes significantly. Based on these results, we discuss the concrete advancements in newer Java versions. We also discuss to what extent new language advancements justify writing transformations in a general purpose language rather than a dedicated transformation language. We further indicate potential avenues for future research on the comparison of MTLs and GPLs in a model transformation context.

Quick remedy commits and their impact on mining software repositories

Empirical Software Engineering, 2021, Vol 27 (1). DOI: 10.1007/s10664-021-10051-z

Author(s): Fengcai Wen, Csaba Nagy, Michele Lanza, Gabriele Bavota

Keyword(s): Software Maintenance, Noisy Data, Mining Software Repositories, Software Repositories, Bug Fixing, Manual Analysis, Data Points, Software Maintenance And Evolution, Code Changes, Different Parts

Most changes during software maintenance and evolution are not atomic changes, but rather the result of several related changes affecting different parts of the code. It may happen that developers omit needed changes, thus leaving a task partially unfinished, introducing technical debt or injecting bugs. We present a study investigating “quick remedy commits” performed by developers to implement changes omitted in previous commits. With quick remedy commits we refer to commits that (i) quickly follow a commit performed by the same developer, and (ii) aim at remedying issues introduced as the result of code changes omitted in the previous commit (e.g., fix references to code components that have been broken as a consequence of a rename refactoring) or simply improve the previously committed change (e.g., improve the name of a newly introduced variable). Through a manual analysis of 500 quick remedy commits, we define a taxonomy categorizing the types of changes that developers tend to omit. The taxonomy can (i) guide the development of tools aimed at detecting omitted changes and (ii) help researchers in identifying corner cases that must be properly handled. For example, one of the categories in our taxonomy groups the reverted commits, meaning changes that are undone in a subsequent commit. We show that not accounting for such commits when mining software repositories can undermine one’s findings. In particular, our results show that considering completely reverted commits when mining software repositories accounts, on average, for 0.07 and 0.27 noisy data points when dealing with two typical MSR data collection tasks (i.e., bug-fixing commits identification and refactoring operations mining, respectively).
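
Criterion (i) of the definition above can be illustrated with a minimal sketch. This is not the authors' tooling: the `Commit` record, the `candidate_quick_remedies` function, and the 30-minute gap are all hypothetical choices made for illustration, and "follows" is assumed here to mean adjacent in timestamp order. Criterion (ii), whether the follow-up actually remedies an omission, needs manual or semantic analysis and is not covered.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Commit:
    # Hypothetical minimal commit record; real histories carry far more data.
    sha: str
    author: str
    timestamp: datetime


def candidate_quick_remedies(commits, max_gap=timedelta(minutes=30)):
    """Return (previous, follow_up) pairs of adjacent commits by the same
    author that are at most max_gap apart. The threshold is an assumption."""
    ordered = sorted(commits, key=lambda c: c.timestamp)
    pairs = []
    for prev, curr in zip(ordered, ordered[1:]):
        same_author = curr.author == prev.author
        close_in_time = curr.timestamp - prev.timestamp <= max_gap
        if same_author and close_in_time:
            pairs.append((prev, curr))
    return pairs
```

The pairs returned are only candidates; each would still have to be inspected to decide whether the second commit fixes or improves something left out of the first.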

On the relation between architectural smells and source code changes

Journal of Software Evolution and Process, 2021. DOI: 10.1002/smr.2398

Author(s): Darius Sas, Paris Avgeriou, Ilaria Pigazzini, Francesca Arcelli Fontana

Keyword(s): Source Code, Code Changes, Source Code Changes

Comparing Commit Messages and Source Code Metrics for the Prediction Refactoring Activities

Algorithms, 2021, Vol 14 (10), pp. 289. DOI: 10.3390/a14100289

Author(s): Priyadarshni Suresh Sagar, Eman Abdulah AlOmar, Mohamed Wiem Mkaouer, Ali Ouni, Christian D. Newman

Keyword(s): Source Code, Classification Problem, Prediction Performance, Design Improvement, Average Accuracy, Code Metrics, Code Changes, Multi Class Classification, Push Down, Source Code Metrics

Understanding how developers refactor their code is critical to support the design improvement process of software. This paper investigates to what extent code metrics are good indicators for predicting refactoring activity in the source code. In order to perform this, we formulated the prediction of refactoring operation types as a multi-class classification problem. Our solution relies on measuring metrics extracted from committed code changes in order to extract the corresponding features (i.e., metric variations) that better represent each class (i.e., refactoring type) in order to automatically predict, for a given commit, the method-level type of refactoring being applied, namely Move Method, Rename Method, Extract Method, Inline Method, Pull-up Method, and Push-down Method. We compared various classifiers, in terms of their prediction performance, using a dataset of 5004 commits and extracted 800 Java projects. Our main findings show that the random forest model trained with code metrics resulted in the best average accuracy of 75%. However, we detected a variation in the results per class, which means that some refactoring types are harder to detect than others.

An Effective Model to Predict the Extension of Code Changes in Bug Fixing Process Using Text Classifiers

Iranian Journal of Science and Technology Transactions of Electrical Engineering, 2021. DOI: 10.1007/s40998-021-00458-1

Author(s): Reza Sepahvand, Reza Akbari, Sattar Hashemi, Omid Boushehrian

Keyword(s): Effective Model, Bug Fixing, Text Classifiers, Code Changes

Modularis

Proceedings of the VLDB Endowment, 2021, Vol 14 (13), pp. 3308-3321. DOI: 10.14778/3484224.3484229

Author(s): Dimitrios Koutsoukos, Ingo Müller, Renato Marroquín, Ana Klimovic, Gustavo Alonso

Keyword(s): Data Analytics, Modular Design, Processing System, Building Blocks, Short Term, Distributed Query Processing, Distributed Query, Code Changes, Hardware Platforms, Cluster Database

The enormous quantity of data produced every day together with advances in data analytics has led to a proliferation of data management and analysis systems. Typically, these systems are built around highly specialized monolithic operators optimized for the underlying hardware. While effective in the short term, such an approach makes the operators cumbersome to port and adapt, which is increasingly required due to the speed at which algorithms and hardware evolve. To address this limitation, we present Modularis, an execution layer for data analytics based on sub-operators, i.e., composable building blocks resembling traditional database operators but at a finer granularity. To demonstrate the feasibility and advantages of our approach, we use Modularis to build a distributed query processing system supporting relational queries running on an RDMA cluster, a serverless cloud platform, and a smart storage engine. Modularis requires minimal code changes to execute queries across these three diverse hardware platforms, showing that the sub-operator approach reduces the amount and complexity of the code to maintain. In fact, changes in the platform affect only those sub-operators that depend on the underlying hardware (in our use cases, mainly the sub-operators related to network communication). We show the end-to-end performance of Modularis by comparing it with a framework for SQL processing (Presto), a commercial cluster database (SingleStore), as well as Query-as-a-Service systems (Athena, BigQuery). Modularis outperforms all these systems, proving that the design and architectural advantages of a modular design can be achieved without degrading performance. We also compare Modularis with a hand-optimized implementation of a join for RDMA clusters. We show that Modularis has the advantage of being easily extensible to a wider range of join variants and group by queries, all of which are not supported in the hand-tuned join.

Static Analysis at GitHub

Queue, 2021, Vol 19 (4), pp. 42-67. DOI: 10.1145/3487019.3487022

Author(s): Timothy Clem, Patrick Thomson

Keyword(s): Static Analysis, Human Behavior, Complex Analysis, User Behavior, Semantic Code, Analysis Techniques, Analysis Tools, Symbolic Code, Incremental Improvement, Code Changes

The Semantic Code team at GitHub builds and operates a suite of technologies that power symbolic code navigation on github.com. We learned that scale is about adoption, user behavior, incremental improvement, and utility. Static analysis in particular is difficult to scale with respect to human behavior; we often think of complex analysis tools working to find potentially problematic patterns in code and then trying to convince the humans to fix them. Our approach took a different tack: use basic analysis techniques to quickly put information that augments our ability to understand programs in front of everyone reading code on GitHub with zero configuration required and almost immediate availability after code changes.

A classification of code changes and test types dependencies for improving machine learning based test selection

2021. DOI: 10.1145/3475960.3475987

Author(s): Khaled Al-Sabbagh, Miroslaw Staron, Regina Hebig, Francisco Gomes

Keyword(s): Machine Learning, Test Selection, Code Changes

Unsupervised learning of general-purpose embeddings for code changes

2021. DOI: 10.1145/3472674.3473979

Author(s): Mikhail Pravilov, Egor Bogomolov, Yaroslav Golubev, Timofey Bryksin

Keyword(s): Unsupervised Learning, General Purpose, Code Changes
