Is It Dangerous to Use Version Control Histories to Study Source Code Evolution?

Author(s):  
Stas Negara ◽  
Mohsen Vakilian ◽  
Nicholas Chen ◽  
Ralph E. Johnson ◽  
Danny Dig

Version control software, also called revision control software, is among the most important tools in software development. In this paper, we describe two version control tools: Git and Apache Subversion. Git is a free and open source code management and version control system distributed under the GNU General Public License. Apache Subversion, abbreviated as SVN, is a software versioning and revision control system released as open source under the Apache License. This paper discusses the design and functionality of Git and the usage of Git and SVN. The goal of this research paper is to highlight the Git and SVN tools and to evaluate and compare five version control tools to ascertain their usage and efficacy.


2019 ◽  
Vol 25 (1) ◽  
pp. 790-823
Author(s):  
Yusuf Sulistyo Nugroho ◽  
Hideaki Hata ◽  
Kenichi Matsumoto

Automatic identification of the differences between two versions of a file is a common and basic task in several applications of mining code repositories. Git, a version control system, has a diff utility, and users can select among diff algorithms ranging from the default Myers algorithm to the advanced Histogram algorithm. From our systematic mapping, we identified three popular applications of diff in recent studies. For code churn metrics in 14 Java projects, we obtained different values in 1.7% to 8.2% of commits depending on the diff algorithm. Regarding bug-introducing change identification, we found that 6.0% and 13.3% of the identified bug-fix commits had different results of bug-introducing changes from 10 Java projects. For patch application, our manual analysis found that Histogram is more suitable than Myers for providing the changes of code. Thus, we strongly recommend using the Histogram algorithm when mining Git repositories to consider differences in source code.
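As a rough illustration of how such a churn comparison might be set up, the following Python sketch builds `git diff` commands with the `--diff-algorithm` and `--numstat` options and sums added and deleted lines from the output. The function names (`diff_command`, `churn_from_numstat`, `compare_churn`) are hypothetical, and defining churn as added plus deleted lines is an assumption here, not necessarily the metric used in the study.

```python
import subprocess
from typing import Dict, List


def diff_command(algorithm: str, old: str, new: str) -> List[str]:
    """Build a `git diff --numstat` command for one diff algorithm."""
    return ["git", "diff", f"--diff-algorithm={algorithm}", "--numstat", old, new]


def churn_from_numstat(numstat: str) -> int:
    """Sum added and deleted lines from `git diff --numstat` output.

    Each numstat line has the form "<added>\t<deleted>\t<path>".
    Binary files are reported as "-\t-\t<path>" and are skipped.
    """
    total = 0
    for line in numstat.splitlines():
        parts = line.split("\t")
        if len(parts) < 3:
            continue  # ignore blank or malformed lines
        added, deleted = parts[0], parts[1]
        if added == "-" or deleted == "-":
            continue  # binary file: no line counts available
        total += int(added) + int(deleted)
    return total


def compare_churn(repo: str, old: str, new: str) -> Dict[str, int]:
    """Run git with the Myers and Histogram algorithms and return churn for each.

    Assumes `git` is installed and `repo` is a path to a Git working tree.
    """
    result = {}
    for algorithm in ("myers", "histogram"):
        completed = subprocess.run(
            diff_command(algorithm, old, new),
            cwd=repo, capture_output=True, text=True, check=True,
        )
        result[algorithm] = churn_from_numstat(completed.stdout)
    return result
```

Calling `compare_churn(".", "HEAD~1", "HEAD")` inside a repository would return one churn value per algorithm; a commit where the two values differ is the kind of case the abstract counts.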


F1000Research ◽  
2019 ◽  
Vol 8 ◽  
pp. 1749 ◽  
Author(s):  
John D. Blischak ◽  
Peter Carbonetto ◽  
Matthew Stephens

Making scientific analyses reproducible, well documented, and easily shareable is crucial to maximizing their impact and ensuring that others can build on them. However, accomplishing these goals is not easy, requiring careful attention to organization, workflow, and familiarity with tools that are not a regular part of every scientist's toolbox. We have developed an R package, workflowr, to help all scientists, regardless of background, overcome these challenges. Workflowr aims to instill a particular "workflow" (a sequence of steps to be repeated and integrated into research practice) that helps make projects more reproducible and accessible. This workflow integrates four key elements: (1) version control (via Git); (2) literate programming (via R Markdown); (3) automatic checks and safeguards that improve code reproducibility; and (4) sharing code and results via a browsable website. These features exploit powerful existing tools, whose mastery would take considerable study. However, the workflowr interface is simple enough that novice users can quickly enjoy its many benefits. By simply following the workflowr "workflow", R users can create projects whose results, figures, and development history are easily accessible on a static website (and thereby conveniently shareable with collaborators by sending them a URL) and accompanied by source code and reproducibility safeguards. The workflowr R package is open source and available on CRAN, with full documentation and source code available at https://github.com/jdblischak/workflowr.


2021 ◽  
Vol 24 (2) ◽  
Author(s):  
Sivana Hamer ◽  
Christian Quesada-López ◽  
Alexandra Martínez ◽  
Marcelo Jenkins

Many software engineering courses are centered around team-based project development. Analyzing the source code contributions during the projects' development could provide both instructors and students with constant feedback to identify common trends and behaviors that can be improved during the courses. Evaluating course projects is a challenge due to the difficulty of measuring individual student contributions versus team contributions during the development. The adoption of distributed version control systems like Git enables the measurement of students' and teams' contributions to the project. In this work, we analyze the contributions within eight software development projects, with 150 students in total, from undergraduate courses that used project-based learning. We generate visualizations of aggregated Git metrics using inequality measures and the contribution per module, which offer insights into the practices and processes followed by students and teams throughout the project development. This approach allowed us to identify inequality among students' contributions, the modules where students contributed, development processes with a non-steady pace, and integration practices, rendering a useful feedback tool for instructors and students during the project's development. Further studies can be conducted to assess the quality, complexity, and ownership of the contributions by analyzing software artifacts.
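To make the idea of an inequality measure over contributions concrete, the following Python sketch computes the Gini coefficient of per-student commit counts. The abstract does not name the specific measures used, so the choice of the Gini coefficient, the function name `gini`, and the sample counts are all assumptions for illustration.

```python
from typing import Sequence


def gini(values: Sequence[float]) -> float:
    """Return the Gini coefficient of non-negative contribution counts.

    0.0 means every student contributed equally; values approaching 1.0
    mean the contributions are concentrated in a few students.
    """
    ordered = sorted(values)
    n = len(ordered)
    total = sum(ordered)
    if n == 0 or total == 0:
        return 0.0  # no contributions at all: treat as perfectly equal
    # Rank-weighted sum over the ascending-sorted values (ranks start at 1).
    weighted = sum(rank * value for rank, value in enumerate(ordered, start=1))
    return (2.0 * weighted) / (n * total) - (n + 1.0) / n


# Hypothetical commit counts for one four-person team.
team_commits = [5, 5, 5, 5]
balanced = gini(team_commits)        # equal contributions
skewed = gini([0, 0, 0, 10])         # one student made every commit
```

With equal counts the coefficient is 0, and when one of four students makes every commit it is 0.75, so a higher value for a team flags a more uneven distribution of work.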

