The Influence of Code Retrieval from the Web on Programmer's Skills, Methodologies, and Coding Behaviors

2021 ◽  
Vol 36 (2) ◽  
pp. 160-166
Author(s):  
Alfaroq O.M. Mohammed ◽  
Ziad A. Abdelnabi ◽  
Abdalmunam Abdalla

The development of software projects consists of several stages, such as analysis and design, and requires a set of skills that the software developer applies throughout the project, such as specifying requirements and writing code. Developers commonly search for source code on the internet to remix and reuse in software production. This paper investigates the influence of code retrieved from the web on programmers' views, decisions, and skills. A questionnaire instrument was designed and distributed to programmers for their feedback. As a result, we were able to address several points and gained a better understanding of the interaction between programmers and code from the web, especially code from programming forums such as Stack Overflow.


Author(s):  
Harry H. Cheng ◽  
Dung T. Trang

We have developed a Ch Mechanism Toolkit for analysis and design of mechanisms. It was developed using Ch, an embeddable C/C++ interpreter with extensions. The Ch Mechanism Toolkit allows users to write simple programs for solving complicated planar mechanism problems. As an extension to the toolkit, a Web-based system was created for performing mechanism design and analysis through the internet. This paper will discuss the design and implementation of the Ch Mechanism Toolkit as well as its corresponding web-based system. The web-based mechanism system is especially suitable for distance learning. The web-based system for mechanism design and analysis is available on the Web at http://www.softintegration/webservices/mechanism/.
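The toolkit's Ch code is not shown in the abstract; purely as an illustration of the kind of planar-mechanism position analysis such a toolkit automates, here is a short Python sketch of the standard in-line crank-slider closure equation (the link lengths are made up):

```python
import math

def slider_position(a, b, theta):
    """Slider displacement x for an in-line crank-slider mechanism.

    a: crank length, b: connecting-rod length, theta: crank angle (rad).
    Vector-loop closure gives x = a*cos(theta) + sqrt(b^2 - (a*sin(theta))^2).
    """
    if b <= a:
        raise ValueError("rod must be longer than crank for full rotation")
    return a * math.cos(theta) + math.sqrt(b**2 - (a * math.sin(theta))**2)

# Sweep one crank revolution; the stroke of an in-line slider equals 2*a.
positions = [slider_position(1.0, 3.0, math.radians(d)) for d in range(0, 360, 10)]
print(round(max(positions) - min(positions), 6))  # 2.0
```

A real mechanism toolkit generalizes this to four-bar and multi-loop linkages, velocity, and acceleration analysis; the closed-form position solution above is the simplest instance of the approach.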


2016 ◽  
Author(s):  
Stephen Romansky ◽  
Sadegh Charmchi ◽  
Abram Hindle

The business models of software/platform as a service have contributed to developers' dependence on the Internet. Developers can rapidly point each other and consumers to the newest software changes with the power of the hyperlink. However, developers are not limited to referencing software changes to one another through the web; other shared hypermedia might include links to Stack Overflow, Twitter, and issue trackers. This work explores the software traceability of Uniform Resource Locators (URLs) that software developers leave in commit messages and software repositories. URLs are easily extracted from commit messages and source code, so it would be useful to researchers if URLs provided additional insight into project development. To assess traceability, manual topic labelling is evaluated against automated topic labelling on URL data sets. This work also shows differences between URL data collected from commit messages and URL data collected from source code, and examines outlying software projects with many URLs in case these projects do not provide meaningful software relationship information. Results from manual topic labelling show promise under evaluation, while automated topic labelling did not yield precise topics. Further investigation of manual and automated topic analysis would be useful.
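As a rough illustration of the extraction step described above, here is a minimal Python sketch that pulls URLs out of commit messages. The sample messages and the deliberately simplified pattern are illustrative, not the paper's actual extraction rules:

```python
import re

# Simplified URL matcher: anything starting with http(s):// up to whitespace.
URL_RE = re.compile(r"https?://\S+")

# Hypothetical commit messages for illustration.
commits = [
    "Fix crash on empty input, see https://stackoverflow.com/q/12345",
    "Refactor parser (no external references)",
    "Track regression in https://github.com/example/project/issues/42",
]

urls = [u for msg in commits for u in URL_RE.findall(msg)]
print(urls)
# ['https://stackoverflow.com/q/12345',
#  'https://github.com/example/project/issues/42']
```

A production extractor would also handle trailing punctuation and URLs embedded in source comments, but the core step is this kind of pattern match over commit text.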


2012 ◽  
pp. 564-582
Author(s):  
Gregorio Robles ◽  
Jesús M. González-Barahona ◽  
Daniel Izquierdo-Cortazar ◽  
Israel Herraiz

Thanks to the open nature of libre (free, open source) software projects, researchers have gained access to a rich set of data related to various aspects of software development. Although it is usually publicly available on the Internet, obtaining and analyzing the data in a convenient way is not an easy task, and many considerations have to be taken into account. In this chapter we introduce the most relevant data sources that can be found in libre software projects and that are commonly studied by scholars: source code releases, source code management systems, mailing lists and issue (bug) tracking systems. The chapter also provides some advice on the problems that can be found when retrieving and preparing the data sources for a later analysis, as well as information about the tools and datasets that support these tasks.
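To make the source code management data source concrete, here is a minimal Python sketch that parses `git log` output formatted with `--pretty=format:'%H|%an|%aI'` into records; the sample log lines are invented for illustration:

```python
# Made-up sample of `git log --pretty=format:'%H|%an|%aI'` output.
sample_log = """\
a1b2c3d|Alice Example|2011-04-02T10:15:00+00:00
d4e5f6a|Bob Example|2011-04-03T09:00:00+00:00"""

def parse_log(text):
    """Split pipe-delimited log lines into commit/author/date records."""
    records = []
    for line in text.splitlines():
        commit, author, date = line.split("|", 2)
        records.append({"commit": commit, "author": author, "date": date})
    return records

for rec in parse_log(sample_log):
    print(rec["author"], rec["commit"])
```

In practice the chapter's point holds: real repositories need far more care (merge commits, renamed authors, encoding issues), which is why dedicated mining tools and curated datasets exist.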


2021 ◽  
Vol 33 (3) ◽  
pp. 87-100
Author(s):  
Denis Eyzenakh ◽  
Anton Rameykov ◽  
Igor Nikiforov

Over the past decade, the Internet has become a gigantic and rich source of data, which is used for knowledge extraction through machine learning analysis. To perform data mining on web information, the data must be extracted from its source and placed in analytical storage; this is the ETL process. Different web sources provide different ways to access their data: either an API over the HTTP protocol or parsing of the HTML source code. The article is devoted to an approach for high-performance data extraction from sources that do not provide an API to access the data. Distinctive features of the proposed approach are load balancing, two levels of data storage, and separation of the file-download process from the scraping process. The approach is implemented in a solution with the following technologies: Docker, Kubernetes, Scrapy, Python, MongoDB, Redis Cluster, and CephFS. The results of testing the solution are described in this article as well.
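When a web source exposes no API, the extraction step of the ETL process falls back to parsing the page's HTML. This stdlib-only Python sketch pulls link targets out of a page; the markup is a made-up stand-in for a real downloaded page (in the article's setup, downloading and scraping run as separate processes, and Scrapy handles this at scale):

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collect href targets from anchor tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)

# Illustrative page content; a real pipeline would read this from storage
# written by a separate downloader process.
page = '<html><body><a href="/item/1">One</a> <a href="/item/2">Two</a></body></html>'
parser = LinkExtractor()
parser.feed(page)
print(parser.links)  # ['/item/1', '/item/2']
```

Separating download from parsing, as the article proposes, lets each stage scale independently: downloaders are network-bound, parsers are CPU-bound.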


Author(s):  
John D. Ferguson ◽  
James Miller

It is now widely accepted that software projects utilizing the Web (e-projects) face many of the same problems and risks experienced with more traditional software projects, only to a greater degree. Further, their rapid development cycles combined with a high frequency of software releases and adaptations make many of the traditional tools and techniques for modeling defects unsuitable. This paper proposes a simple model to explain and quantify the interaction between generic defect injection and removal processes in e-projects. The model is based upon long-standing and highly regarded work from the field of quantitative ecological population modeling. This basic modeling approach is then tailored to fit the software production process within an e-project context.
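The paper's ecological model is not reproduced in the abstract; as a generic sketch in the same spirit, the following Python snippet integrates a simple injection/removal balance with an Euler step: defects enter at a constant rate and are removed in proportion to how many are present. The rates are purely illustrative:

```python
def simulate_defects(injection_rate, removal_rate, steps, dt=0.1):
    """Euler-integrate d' = injection_rate - removal_rate * d from d(0) = 0."""
    d = 0.0
    history = []
    for _ in range(steps):
        d += (injection_rate - removal_rate * d) * dt
        history.append(d)
    return history

# With constant rates the defect count settles at injection/removal.
trajectory = simulate_defects(injection_rate=5.0, removal_rate=0.5, steps=500)
print(round(trajectory[-1], 2))  # ~10.0
```

Population-style models add exactly the couplings this toy version omits, such as removal effort that itself varies with release frequency, which is what makes them attractive for the rapid-release e-project setting.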



