Curating GitHub for engineered software projects

Author(s):  
Nuthan Munaiah ◽  
Steven Kroh ◽  
Craig Cabrey ◽  
Meiyappan Nagappan

Software forges like GitHub host millions of repositories. Software engineering researchers have been able to take advantage of such a large corpus of potential study subjects with the help of tools like GHTorrent and Boa. However, the simplicity of querying comes with a caveat: there are limited means of separating the signal (e.g. repositories containing engineered software projects) from the noise (e.g. repositories containing homework assignments). The proportion of noise in a random sample of repositories could skew a study and may lead researchers to unrealistic, potentially inaccurate, conclusions. We argue that it is imperative to have the ability to sieve out the noise in such large repository forges. We propose a framework, and present a reference implementation of the framework as a tool called reaper, to enable researchers to select GitHub repositories that contain evidence of an engineered software project. We identify software engineering practices (called dimensions) and propose means of validating their existence in a GitHub repository. We used reaper to measure the dimensions of 1,994,977 GitHub repositories. We then used the data set to train classifiers capable of predicting if a given GitHub repository contains an engineered software project. The performance of the classifiers was evaluated using a set of 200 repositories with known ground-truth classification. We also compared the performance of the classifiers to other approaches to classification (e.g. number of GitHub Stargazers) and found our classifiers to outperform existing approaches. The stargazers-based classifier exhibited high precision (96%) but correspondingly low recall (27%), whereas our best classifier exhibited both high precision (82%) and high recall (83%). The stargazer-based criterion thus offers precision but fails to recall a significant portion of the population.
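
The abstract describes the pipeline only at a high level; the following Python sketch is a loose illustration of that kind of workflow, not reaper's actual feature set, model, or data: hypothetical "dimension" scores feed a scikit-learn classifier whose precision and recall are compared against a stargazer-count threshold baseline. All feature names, thresholds, and data here are invented for illustration.

```python
# Illustrative sketch only: hypothetical dimension features and a simple
# classifier, loosely mirroring the pipeline the abstract describes.
# It is not the reaper tool's actual feature set or model.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical per-repository "dimension" scores (e.g. documentation,
# testing, CI, history, license), plus a stargazer count for the baseline.
n = 1000
X = rng.random((n, 5))                      # five assumed dimensions in [0, 1]
stars = rng.integers(0, 500, size=n)        # assumed stargazer counts
y = (X.mean(axis=1) > 0.5).astype(int)      # stand-in ground-truth labels

X_tr, X_te, s_tr, s_te, y_tr, y_te = train_test_split(
    X, stars, y, test_size=0.2, random_state=0)

# Dimension-based classifier.
clf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
pred = clf.predict(X_te)

# Stargazer baseline: call a repository "engineered" above a fixed threshold.
baseline = (s_te >= 100).astype(int)

for name, p in [("dimensions", pred), ("stargazers>=100", baseline)]:
    print(name,
          "precision=%.2f" % precision_score(y_te, p, zero_division=0),
          "recall=%.2f" % recall_score(y_te, p, zero_division=0))
```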


Author(s):  
Yves Wautelet ◽  
Christophe Schinckus ◽  
Manuel Kolp

This article presents an epistemological reading of knowledge evolution in software engineering (SE), both within a software project and within SE's theoretical frameworks, principally modeling languages and software development life cycles (SDLC). The article envisages SE as an artificial science and notably points to iterative development as a more adequate framework for enterprise applications. Iterative development has become popular in SE because it allows a more efficient knowledge acquisition process, especially in user-intensive applications, through continuous organizational modeling and requirements acquisition, early implementation and testing, modularity, and so on. SE is by nature a human activity: analysts, designers, developers and other project managers confront their visions of the software system they are building with users’ requirements. The study of software projects’ actors and stakeholders through Simon’s bounded rationality points to the use of an iterative development life cycle; the latter, indeed, allows their rationality to be better apprehended. Popper’s principle of knowledge growth could at first seem suited to analyzing the evolution of knowledge in the SE field. However, this epistemology is better adapted to purely hard sciences such as physics than to SE, which is also rooted in human activities and hence in the social sciences. Consequently, we nuance this vision using Lakatosian epistemology, notably applying Lakatos’s criticism of the falsification principle to SE as an evolving science. Finally, the authors point to adaptive rationality for a reading of the rationality of SE theorists and researchers.


Author(s):  
Roy Gelbard ◽  
Jeffrey Kantor ◽  
Liran Edelist

This study proposes and prototypes a model that integrates three aspects of software projects (software engineering, project management, and accounting/costing) by automatically mapping SE objects and accounting–costing objects onto PM objects. To validate the feasibility of the model, and without loss of generality, it is demonstrated using a former research platform focused on the conversion of data flow diagrams (DFD), which effectively constitute a full enterprise set of use-case diagrams reflecting an entire software project, into Gantt charts.
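
The abstract does not detail the mapping itself; as a minimal sketch of the general idea of converting DFD elements into Gantt-chart tasks, the following Python example maps each DFD process to a sequentially scheduled task. The DFD representation, duration heuristic, and field names are assumptions for illustration, not the authors' model.

```python
# Minimal illustrative sketch: mapping DFD processes to Gantt tasks.
# The DFD representation and the one-process-one-task heuristic are
# assumptions for illustration, not the model proposed in the article.
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class DFDProcess:
    name: str
    inputs: list        # names of incoming data flows
    outputs: list       # names of outgoing data flows

@dataclass
class GanttTask:
    name: str
    start: date
    end: date
    depends_on: list

def dfd_to_gantt(processes, project_start, days_per_flow=2):
    """Map each DFD process to a task whose assumed duration grows with
    the number of data flows it touches, scheduling tasks sequentially."""
    tasks, cursor = [], project_start
    for p in processes:
        duration = timedelta(days=days_per_flow * (len(p.inputs) + len(p.outputs)))
        task = GanttTask(name=p.name, start=cursor, end=cursor + duration,
                         depends_on=[t.name for t in tasks[-1:]])
        tasks.append(task)
        cursor = task.end
    return tasks

if __name__ == "__main__":
    dfd = [DFDProcess("Validate order", ["order"], ["validated order"]),
           DFDProcess("Invoice customer", ["validated order"], ["invoice"])]
    for t in dfd_to_gantt(dfd, date(2024, 1, 1)):
        print(t.name, t.start, "->", t.end, "after", t.depends_on)
```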


2020 ◽  
Vol 6 (3) ◽  
pp. 27-34
Author(s):  
E.J. Robles Gómez ◽  
J.A. Flores Lara ◽  
J.C. Ontiveros Neri

The getKanban game is a tool for teaching the Kanban and SCRUM methodologies in an entertaining way. It facilitates the teaching of software project management through a board game in which players learn to formulate project management strategies and implement them to deliver quality projects on time. This article presents the results of implementing the game at a higher-education institution with eighth-semester Computer Systems Engineering students. The results show that the game effectively supports the teaching of Kanban and SCRUM for software project management. It is therefore recommended to adopt this type of game as a didactic strategy for the teaching and learning of Software Engineering applied to the management of software development projects.


Author(s):  
Pankaj Kamthan

The steady rise of open source software (OSS) (Raymond, 1999) over the last few decades has made a noticeable impact on many sectors of society where software has a role to play. As reflected in the frequency of media articles, traffic on mailing lists, and a growing research literature, OSS has garnered much support in the software community. Indeed, from the early days of GNU software, to the X Window System, to Linux and its utilities, and more recently the Apache Software Project, OSS has changed the way software is developed and used. As the deployment of OSS increases, the issue of its quality with respect to its stakeholders arises. We contend that the open source community collectively bears the responsibility of producing “high-quality” OSS. A lack of quality raises various risks for organizations adopting OSS (Golden, 2004). This article discusses the manifestation of quality in open source software development (OSSD) from a traditional software engineering standpoint. The article is organized as follows. We first outline the background and related work necessary for the discussion that follows, and state our position. This is followed by a detailed treatment of key software engineering practices that directly or indirectly impact the quality of OSS. Next, challenges and directions for future research are outlined and, finally, concluding remarks are given.


Author(s):  
Michael Hahsler

Several successful projects (Linux, FreeBSD, BIND, Apache, etc.) have shown that the collaborative and self-organizing process of developing open source software produces reliable, high-quality software. Without doubt, the open source software development process differs in many ways from the traditional development process in a commercial environment. An interesting research question is how these differences influence the adoption of traditional software engineering practices. In this chapter we investigate how design patterns, a widely accepted software engineering practice, are adopted by open source developers for documenting changes. We analyze the development process of almost 1,000 open source software projects using version control information and explore differences in pattern adoption using characteristics of projects and developers. By analyzing these differences, we provide evidence that design patterns are an important practice in open source projects and that there exist significant differences between developers who use design patterns and those who do not.
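
The chapter's methodology is summarized only briefly; as a hedged sketch of the general idea of mining version-control history for design-pattern mentions, the following Python example counts pattern names appearing in commit messages, grouped by author. The keyword list, log format, and per-author grouping are assumptions for illustration, not the study's actual analysis.

```python
# Illustrative sketch: count design-pattern mentions in commit messages,
# grouped by author. The pattern keywords and log format are assumptions;
# the study's actual analysis of ~1,000 projects was more involved.
import re
import subprocess
from collections import Counter, defaultdict

PATTERNS = ["singleton", "factory", "observer", "adapter",
            "decorator", "strategy", "visitor", "facade"]
PATTERN_RE = re.compile("|".join(PATTERNS), re.IGNORECASE)

def pattern_mentions(repo_path="."):
    """Return, per author, how often each pattern name appears in commits."""
    log = subprocess.run(
        ["git", "-C", repo_path, "log", "--pretty=format:%ae|%s"],
        capture_output=True, text=True, check=True).stdout
    mentions = defaultdict(Counter)
    for line in log.splitlines():
        author, _, subject = line.partition("|")
        for match in PATTERN_RE.findall(subject):
            mentions[author][match.lower()] += 1
    return mentions

if __name__ == "__main__":
    for author, counts in pattern_mentions().items():
        print(author, dict(counts))
```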


2020 ◽  
Vol 12 (11) ◽  
pp. 4663 ◽  
Author(s):  
Mehwish Naseer ◽  
Wu Zhang ◽  
Wenhao Zhu

Software engineering is a competitive field in education and practice. Software projects are key elements of software engineering courses; they feature a fusion of process and product. The process reflects the methodology of performing the overall software engineering practice, while the software product is the final product produced by applying that process. As in any other academic domain, an early evaluation of the software product being developed is vital to identify at-risk teams for sustainable education in software engineering. Guidance and instructor attention can help overcome the confusion and difficulties of low-performing teams. This study proposes a hybrid approach of information-gain feature selection with a J48 decision tree to identify the earliest possible phase for final performance prediction. The proposed technique was compared with state-of-the-art machine learning (ML) classifiers: naïve Bayes (NB), artificial neural network (ANN), logistic regression (LR), simple logistic regression (SLR), repeated incremental pruning to produce error reduction (RIPPER), and sequential minimal optimization (SMO). The goal is to predict the teams expected to obtain a below-average grade in software product development. The proposed technique outperforms the others in predicting low-performing teams at an early assessment stage; the J48-based technique makes 89% correct predictions.
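
As a rough sketch of the named pipeline, the following scikit-learn example ranks features by mutual information (a stand-in for information gain), keeps the top-scoring ones, and fits an entropy-based decision tree as a stand-in for J48 (C4.5). The feature names, data, and parameter choices are invented for illustration and are not the study's dataset or exact configuration.

```python
# Hedged sketch of the named pipeline: rank features by mutual information,
# keep the top-k, then fit an entropy-based decision tree as a stand-in for
# J48/C4.5. Data and feature semantics are invented for illustration.
import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)

# Hypothetical early-phase team metrics (e.g. requirements score, design
# score, meeting attendance, commit count) and a below-average-grade label.
X = rng.random((120, 8))
y = (X[:, 0] + X[:, 3] < 0.9).astype(int)   # 1 = assumed low-performing team

model = make_pipeline(
    SelectKBest(score_func=mutual_info_classif, k=4),             # feature selection
    DecisionTreeClassifier(criterion="entropy", random_state=0),  # J48 stand-in
)

scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
print("cross-validated accuracy: %.2f +/- %.2f" % (scores.mean(), scores.std()))
```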


2013 ◽  
Vol 13 (Special-Issue) ◽  
pp. 75-87
Author(s):  
Yirui Zhang ◽  
Ying Jin ◽  
Jianxiu Bai ◽  
Jing Zhang

A consistent system requirements set is the basis of a successful software project. Requirements changes are very common in a software project; they may cause inconsistency in the requirements set and become a key factor affecting the quality of the requirements and of the software. Addressing the problem of requirements inconsistencies caused by requirements changes, this paper proposes a compromise-based negotiation framework for managing requirements changes, illustrates the efficiency of the proposed method with a software engineering case, reports a comparative experiment against a current mainstream method, and finally compares the approach with related work and draws conclusions. The experimental results show that the proposed framework is more flexible and accurate than the currently popular framework, making it more suitable for requirements change management.
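
The abstract does not specify how compromises are computed; purely as an illustration of one way conflicting requirement changes might be resolved, the following Python sketch keeps the higher-priority requirement in each conflicting pair and defers the other for renegotiation. This priority rule and the data model are assumptions, not the framework proposed in the paper.

```python
# Loose illustration only: resolve pairwise requirement conflicts by keeping
# the higher-priority requirement and flagging the other for renegotiation.
# This is an assumed compromise rule, not the paper's negotiation framework.
from dataclasses import dataclass

@dataclass
class Requirement:
    rid: str
    text: str
    priority: int        # higher value = more important to stakeholders

def negotiate(requirements, conflicts):
    """Given requirements and a list of conflicting id pairs, return the ids
    kept as-is and the ids deferred for compromise discussion."""
    by_id = {r.rid: r for r in requirements}
    deferred = set()
    for a, b in conflicts:
        loser = a if by_id[a].priority < by_id[b].priority else b
        deferred.add(loser)
    kept = [r.rid for r in requirements if r.rid not in deferred]
    return kept, sorted(deferred)

if __name__ == "__main__":
    reqs = [Requirement("R1", "Respond within 1s", 3),
            Requirement("R2", "Run on low-power hardware", 2),
            Requirement("R3", "Encrypt all traffic", 5)]
    print(negotiate(reqs, conflicts=[("R1", "R2")]))
```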

