An Empirical Investigation of Relevant Changes and Automation Needs in Modern Code Review

2020 ◽  
Vol 25 (6) ◽  
pp. 4833-4872
Author(s):  
Sebastiano Panichella ◽  
Nik Zaugg

Recent research has shown that available tools for Modern Code Review (MCR) are still far from meeting the current expectations of developers. The objective of this paper is to investigate the approaches and tools that, from a developer’s point of view, are still needed to facilitate MCR activities. To that end, we first empirically elicited a taxonomy of recurrent review change types that characterize MCR. The taxonomy was designed by performing three steps: (i) we generated an initial version of the taxonomy by qualitatively and quantitatively analyzing 211 review changes/commits and 648 review comments of ten open-source projects; then (ii) we integrated into this initial taxonomy the topics and MCR change types of an existing taxonomy available from the literature; finally, (iii) we surveyed 52 developers to add any change types still missing from the taxonomy. Results of our study highlight that the availability of new emerging development technologies (e.g., Cloud-based technologies) and practices (e.g., Continuous delivery) has pushed developers to perform additional activities during MCR and that additional types of feedback are expected by reviewers. Our participants provided recommendations, specified techniques to employ, and highlighted the data to analyze for building recommender systems able to automate the code review activities composing our taxonomy. We surveyed 14 additional participants (12 developers and 2 researchers), not involved in the previous survey, to qualitatively assess the relevance and completeness of the identified MCR change types and to assess how critical and feasible to implement some of the identified techniques to support MCR activities are. Thus, with a study involving 21 additional developers, we qualitatively assess the feasibility and usefulness of leveraging natural language feedback (automation considered critical/feasible to implement) in supporting developers during MCR activities.
In summary, this study sheds some more light on the approaches and tools that are still needed to facilitate MCR activities, confirming the feasibility and usefulness of using summarization techniques during MCR activities. We believe that the results of our work represent an essential step for meeting the expectations of developers and supporting the vision of full or partial automation in MCR.

2021 ◽  
Vol 11 (7) ◽  
pp. 3095
Author(s):  
Suhyune Son ◽  
Seonjeong Hwang ◽  
Sohyeun Bae ◽  
Soo Jun Park ◽  
Jang-Hwan Choi

Multi-task learning (MTL) approaches are actively used for various natural language processing (NLP) tasks. The Multi-Task Deep Neural Network (MT-DNN) has contributed significantly to improving the performance of natural language understanding (NLU) tasks. However, one drawback is that confusion about the language representation of various tasks arises during the training of the MT-DNN model. Inspired by the internal-transfer weighting of MTL in medical imaging, we introduce a Sequential and Intensive Weighted Language Modeling (SIWLM) scheme. The SIWLM consists of two stages: (1) Sequential weighted learning (SWL), which trains a model to learn entire tasks sequentially and concentrically, and (2) Intensive weighted learning (IWL), which enables the model to focus on the central task. We apply this scheme to the MT-DNN model and call this model the MTDNN-SIWLM. Our model achieves higher performance than the existing reference algorithms on six out of the eight GLUE benchmark tasks. Moreover, our model outperforms MT-DNN by 0.77 on average on the overall task. Finally, we conducted a thorough empirical investigation to determine the optimal weight for each GLUE task.


2022 ◽  
Vol 31 (2) ◽  
pp. 1-23
Author(s):  
Jevgenija Pantiuchina ◽  
Bin Lin ◽  
Fiorella Zampetti ◽  
Massimiliano Di Penta ◽  
Michele Lanza ◽  
...  

Refactoring operations are behavior-preserving changes aimed at improving source code quality. While refactoring is largely considered a good practice, refactoring proposals in pull requests are often rejected after the code review. Understanding the reasons behind the rejection of refactoring contributions can shed light on how such contributions can be improved, essentially benefiting software quality. This article reports a study in which we manually coded rejection reasons inferred from 330 refactoring-related pull requests from 207 open-source Java projects. We surveyed 267 developers to assess their perceived prevalence of these identified rejection reasons, further complementing the reasons. Our study resulted in a comprehensive taxonomy consisting of 26 refactoring-related rejection reasons and 21 process-related rejection reasons. The taxonomy, accompanied with representative examples and highlighted implications, provides developers with valuable insights on how to ponder and polish their refactoring contributions, and indicates a number of directions researchers can pursue toward better refactoring recommenders.


2020 ◽  
Vol 34 (05) ◽  
pp. 8592-8599
Author(s):  
Sheena Panthaplackel ◽  
Milos Gligoric ◽  
Raymond J. Mooney ◽  
Junyi Jessy Li

Comments are an integral part of software development; they are natural language descriptions associated with source code elements. Understanding explicit associations can be useful in improving code comprehensibility and maintaining the consistency between code and comments. As an initial step towards this larger goal, we address the task of associating entities in Javadoc comments with elements in Java source code. We propose an approach for automatically extracting supervised data using revision histories of open source projects and present a manually annotated evaluation dataset for this task. We develop a binary classifier and a sequence labeling model by crafting a rich feature set which encompasses various aspects of code, comments, and the relationships between them. Experiments show that our systems outperform several baselines learning from the proposed supervision.


2019 ◽  
Author(s):  
Daniel M. Bean ◽  
James Teo ◽  
Honghan Wu ◽  
Ricardo Oliveira ◽  
Raj Patel ◽  
...  

Atrial fibrillation (AF) is the most common arrhythmia and significantly increases stroke risk. This risk is effectively managed by oral anticoagulation. Recent studies using national registry data indicate increased use of anticoagulation resulting from changes in guidelines and the availability of newer drugs. The aim of this study is to develop and validate an open source risk scoring pipeline for free-text electronic health record data using natural language processing. AF patients discharged from 1st January 2011 to 1st October 2017 were identified from discharge summaries (N=10,030, 64.6% male, average age 75.3 ± 12.3 years). A natural language processing pipeline was developed to identify risk factors in clinical text and calculate risk for ischaemic stroke (CHA2DS2-VASc) and bleeding (HAS-BLED). Scores were validated vs two independent experts for 40 patients. Automatic risk scores were in strong agreement with the two independent experts for CHA2DS2-VASc (average kappa 0.78 vs experts, compared to 0.85 between experts). Agreement was lower for HAS-BLED (average kappa 0.54 vs experts, compared to 0.74 between experts). In high-risk patients (CHA2DS2-VASc ≥2) OAC use has increased significantly over the last 7 years, driven by the availability of DOACs and the transitioning of patients from AP medication alone to OAC. Factors independently associated with OAC use included components of the CHA2DS2-VASc and HAS-BLED scores as well as discharging specialty and frailty. OAC use was highest in patients discharged under cardiology (69%). Electronic health record text can be used for automatic calculation of clinical risk scores at scale. Open source tools are available today for this task but require further validation. Analysis of routinely-collected EHR data can replicate findings from large-scale curated registries.
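
To make the scoring step concrete, the following is a minimal sketch of how a CHA2DS2-VASc score could be computed once risk factors have been extracted from text. It is not the authors' pipeline: the RiskFactors container, its field names, and the cha2ds2_vasc function are hypothetical, and the point weights follow the standard published definition of the score rather than anything stated in the abstract.

```python
from dataclasses import dataclass


@dataclass
class RiskFactors:
    """Hypothetical container for risk factors already extracted from clinical text."""
    congestive_heart_failure: bool = False
    hypertension: bool = False
    age: int = 0
    diabetes: bool = False
    prior_stroke_or_tia: bool = False  # stroke, TIA, or thromboembolism
    vascular_disease: bool = False
    female: bool = False


def cha2ds2_vasc(factors: RiskFactors) -> int:
    """Sum the standard CHA2DS2-VASc points (range 0 to 9)."""
    score = 0
    score += 1 if factors.congestive_heart_failure else 0
    score += 1 if factors.hypertension else 0
    # Age contributes 2 points at 75 or older, 1 point from 65 to 74.
    if factors.age >= 75:
        score += 2
    elif factors.age >= 65:
        score += 1
    score += 1 if factors.diabetes else 0
    score += 2 if factors.prior_stroke_or_tia else 0
    score += 1 if factors.vascular_disease else 0
    score += 1 if factors.female else 0
    return score


# Illustrative patient, not taken from the study data.
patient = RiskFactors(age=78, hypertension=True, diabetes=True, female=True)
print(cha2ds2_vasc(patient))  # 2 (age) + 1 + 1 + 1 = 5
```

Under the abstract's threshold (CHA2DS2-VASc ≥2), this illustrative patient would fall in the high-risk group. The extraction of each factor from free text, which is the hard part of the pipeline, is not shown.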


Author(s):  
Puspadhar Das

Mifos is an open source enterprise solution for microfinance. This chapter is a practitioner’s point of view on implementation of Mifos in an organization, based on the author’s experience in implementing Mifos at Asomi, a microfinance institution operating in the state of Assam, India. The factors to be considered in selection and implementation of Mifos are discussed, along with the various inputs, analyses and resources required for implementation. To succeed, any organization must have a concrete set of operational strategies that enables it to track its borrowers and loan portfolio effectively and on time. Wrong assumptions and the choice of the wrong technology may only complicate MIS implementation. Technological development has removed all the barriers to technology and has enabled organizations to develop computerised systems streamlined to their operational needs, and not the other way round. The chapter attempts to justify this claim using the case of Mifos.


Author(s):  
Ramanjit Singh

Wikipedia is a free encyclopedia that operates worldwide on the Internet. Articles on Wikipedia are developed with close collaboration of volunteers and anyone can edit the content (Wikipedia, 2006e). Although there are many advantages of using Wikipedia as a group collaboration tool, there are important implications. First, the Wikipedia community is diverse and intercultural differences can distort the communication process. Second, the neutral point of view (NPOV) policy can lead to disputes. Third, lack of supervision and the open source policy can be another source of conflict. Fourth, administration of articles can be complex due to differing cultural and political standpoints (Smith & Kollock, 1999). Lastly, differences in time and space as well as low levels of access to the Internet can significantly impede collaboration efforts at Wikipedia (Berry, 2006; Madon, 2000; Parayil, 2006; Sahay, Nicholson, & Krishna, 2003). Hence, the aim of this paper is to examine sociocultural implications of using Wikipedia as a group collaboration tool spanning multiple countries and how social and cultural climate, differences in time and space, as well as technological infrastructure of countries affect collaboration between individuals given the distinctive operational and administration policies at Wikipedia. It is believed that findings from this research will increase the awareness of the underlying cause of many disputes arising at Wikipedia. In addition, this research will lead to cultural relativism and provide neutral grounds for collaborative efforts at Wikipedia in the future.


Author(s):  
B. Rossi ◽  
M. Scotto ◽  
A. Sillitti ◽  
G. Succi

The aim of the article is to report the results of a migration to Open Source Software (OSS) in one public administration. The migration focuses on the office automation field and, in particular, on the OpenOffice.org suite. We have analysed the transition to OSS considering qualitative and quantitative data collected with the aid of different tools. All the data have always been considered from the point of view of the different stakeholders involved: IT managers, IT technicians, and users. The results of the project have been largely satisfactory. However, the results cannot be generalised due to some constraints, like the environment considered and the parallel use of the old solution. Nevertheless, we think that the data collected can be of valuable aid to managers wishing to evaluate a possible transition to OSS.

