Building trustworthy refactoring tools

Mapping Intimacies ◽

10.29007/z7pq ◽

2018 ◽

Author(s):

Simon Thompson

Keyword(s):

Functional Programming ◽

Source Code ◽

Unit Testing ◽

Refactoring Tools ◽

Extract Information

The bar for adoption of refactoring tools is high: not only does a refactoring extract information from your source code, it also transforms it, often in a radical way.After discussing what users require from their tools, we will examine ways in which tool builders can try to increase their users' confidence in the tools. These mechanisms include visualisation, unit testing, property-based testing and verification, and are based on the Kent functional programming group's experience of building the HaRe and Wrangler refactoring systems for Haskell and Erlang.

Download Full-text

Automatic Acquisition of Annotated Training Corpora for Test-Code Generation

Information ◽

10.3390/info10020066 ◽

2019 ◽

Vol 10 (2) ◽

pp. 66

Author(s):

Magdalena Kacmajor ◽

John Kelleher

Keyword(s):

Natural Language ◽

Code Generation ◽

Source Code ◽

Generative Models ◽

Training Data ◽

Training Dataset ◽

Unit Testing ◽

Test Automation ◽

Parallel Corpora ◽

Parallel Text

Open software repositories make large amounts of source code publicly available. Potentially, this source code could be used as training data to develop new, machine learning-based programming tools. For many applications, however, raw code scraped from online repositories does not constitute an adequate training dataset. Building on the recent and rapid improvements in machine translation (MT), one possibly very interesting application is code generation from natural language descriptions. One of the bottlenecks in developing these MT-inspired systems is the acquisition of parallel text-code corpora required for training code-generative models. This paper addresses the problem of automatically synthetizing parallel text-code corpora in the software testing domain. Our approach is based on the observation that self-documentation through descriptive method names is widely adopted in test automation, in particular for unit testing. Therefore, we propose synthesizing parallel corpora comprised of parsed test function names serving as code descriptions, aligned with the corresponding function bodies. We present the results of applying one of the state-of-the-art MT methods on such a generated dataset. Our experiments show that a neural MT model trained on our dataset can generate syntactically correct and semantically relevant short Java functions from quasi-natural language descriptions of functionality.

Download Full-text

GeCaP: Unit Testing Case Generation from Java Source Code

Polibits ◽

10.17562/pb-57-7 ◽

2018 ◽

Vol 57 ◽

pp. 67-73

Author(s):

D. Larrosa ◽

P. Fernandez ◽

M. Delgado

Keyword(s):

Source Code ◽

Unit Testing ◽

Testing Case

Download Full-text

Kronos: A Declarative Metaprogramming Language for Digital Signal Processing

Computer Music Journal ◽

10.1162/comj_a_00330 ◽

2015 ◽

Vol 39 (4) ◽

pp. 30-48 ◽

Cited By ~ 4

Author(s):

Vesa Norilo

Keyword(s):

Signal Processing ◽

Functional Programming ◽

Programming Model ◽

Source Code ◽

Reactive Systems ◽

Digital Signal ◽

Elementary Level ◽

Programming Paradigm ◽

Wide Range ◽

Programming Paradigms

Kronos is a signal-processing programming language based on the principles of semifunctional reactive systems. It is aimed at efficient signal processing at the elementary level, and built to scale towards higher-level tasks by utilizing the powerful programming paradigms of “metaprogramming” and reactive multirate systems. The Kronos language features expressive source code as well as a streamlined, efficient runtime. The programming model presented is adaptable for both sample-stream and event processing, offering a cleanly functional programming paradigm for a wide range of musical signal-processing problems, exemplified herein by a selection and discussion of code examples.

Download Full-text

Direct-Indirect Link Matrix

International Journal of Information Technology Project Management ◽

10.4018/ijitpm.2020100105 ◽

2020 ◽

Vol 11 (4) ◽

pp. 56-69

Author(s):

Saurabh Rawat ◽

Anushree Sah ◽

Ankur Dumka

Keyword(s):

Software Development ◽

Software Testing ◽

Source Code ◽

Black Box ◽

Test Cases ◽

Unit Testing ◽

Integration Testing ◽

Testing Technique ◽

Testing Unit ◽

Black Box Testing

Testing of software remains a fundamentally significant way to check that software behaves as required. Component-based software testing (CBST) is a crucial activity of component-based software development (CBSD) and is based on two crucial proportions: components testing by developers with the source code (e.g., system testing, integration testing, unit testing, etc.) and components testing by end users without source code (black box testing). This work proposes a black box testing technique that calculates the total number of interactions made by component-based software. This technique is helpful to identify the number of test cases for those components where availability of source code is questionable. On the basis of interaction among components, the authors draw a component-link graph and a direct-indirect-link matrix, which helps to calculate the number of interactions in component-based software.

Download Full-text

Investigating the Effect of Aspect-Oriented Refactoring on the Unit Testing Effort of Classes: An Empirical Evaluation

International Journal of Software Engineering and Knowledge Engineering ◽

10.1142/s0218194017500280 ◽

2017 ◽

Vol 27 (05) ◽

pp. 749-789 ◽

Cited By ~ 1

Author(s):

Mourad Badri ◽

Aymen Kout ◽

Linda Badri

Keyword(s):

Statistical Tests ◽

Source Code ◽

Empirical Evaluation ◽

Software Systems ◽

Test Cases ◽

Unit Testing ◽

Unit Test ◽

Testing Effort ◽

Using Data ◽

The Impact

This paper aims at investigating empirically the effect of aspect-oriented (AO) refactoring on the unit testability of classes in object-oriented software. The unit testability of classes has been addressed from the perspective of the unit testing effort, and particularly from the perspective of the unit test cases (TCs) construction. We investigated, in fact, different research questions: (1) the impact of AO refactoring on source code attributes (size, complexity, coupling, cohesion and inheritance), attributes that are mostly related to the unit testability of classes, (2) the impact of AO refactoring on unit test code attributes (size, assertions, invocations and data creation), attributes that are indicators of the effort involved to write the code of unit TCs, and (3) the relationships between the variations observed after AO refactoring in both source code and unit test code attributes. We used in the study different techniques: correlation analysis, statistical tests and linear regression. We performed an empirical evaluation using data collected from three well-known open source (Java) software systems (JHOTDRAW, HSQLBD and PETSTORE) that have been refactored using AO programming (AspectJ). Results suggest that: (1) overall, the effort involved in the construction of unit TCs of refactored classes has been reduced, (2) the variations of source code attributes have more impact on methods invocation between unit TCs, and finally (3) the variations of unit test code attributes are more influenced by the variation of the complexity of refactored classes compared to the other class attributes.

Download Full-text

RedOak: a reference-free and alignment-free structure for indexing a collection of similar genomes

10.1101/2020.12.19.423583 ◽

2020 ◽

Author(s):

Clement Agret ◽

Annie Chateau ◽

Gaetan Droc ◽

Gautier Sarah ◽

Alban Mancheron ◽

...

Keyword(s):

Software Package ◽

High Throughput Sequencing ◽

Source Code ◽

Free Software ◽

Large Collection ◽

Sequencing Technologies ◽

Alignment Free ◽

The Cost ◽

New Algorithms ◽

Extract Information

AbstractBackgroundAs the cost of DNA sequencing decreases, high-throughput sequencing technologies become increasingly accessible to many laboratories. Consequently, new issues emerge that require new algorithms, including tools for indexing and compressing hundred to thousands of complete genomes.ResultsThis paper presents RedOak, a reference-free and alignment-free software package that allows for the indexing of a large collection of similar genomes.RedOak can also be applied to reads from unassembled genomes, and it provides a nucleotide sequence query function. This software is based on a k-mer approach and has been developed to be heavily parallelized and distributed on several nodes of a cluster. The source code of our RedOak algorithm is available at https://gitlab.info-ufr.univ-montp2.fr/DoccY/RedOak.ConclusionsRedOak may be really useful for biologists and bioinformaticians expecting to extract information from large sequence datasets.

Download Full-text

Pipelined functional tree accesses and updates: scheduling, synchronization, caching and coherence

Journal of Functional Programming ◽

10.1017/s0956796801003793 ◽

2001 ◽

Vol 11 (4) ◽

pp. 359-393

Author(s):

ANDREW J. BENNETT ◽

PAUL H. J. KELLY ◽

ROSS A. PATERSON

Keyword(s):

Functional Programming ◽

Program Transformation ◽

Source Code ◽

Search Tree ◽

Binary Search ◽

Binary Search Tree ◽

Functional Program ◽

Spatial Locality ◽

Random Keys ◽

Parallel Graph

This paper is an exploration of the parallel graph reduction approach to parallel functional programming, illustrated by a particular example: pipelined, dynamically-scheduled implementation of search, updates and read-modify-write transactions on an in-store binary search tree. We use program transformation, execution-driven simulation and analytical modelling to expose the maximum potential parallelism, the minimum communication and synchronisation overheads, and to control the overall space requirement. We begin with a lazy functional program specifying a series of transactions on a binary tree, each involving several searches and updates, in a side-effect-free fashion. Transformation of the source code produces a formulation of the program with greater locality and larger grain size than can be achieved using naive parallelization methods, and we show that, with care, these tasks can be scheduled effectively. Even with a workload using random keys, significant spatial locality is found, and we evaluate a modified cache coherency protocol which avoids false sharing so that large cache lines can be used to minimise the number of messages required. As expected with a pipeline, the application should reach a steady state as soon as the first transaction is completed. However, if the network latency is too large, the rate of completion lags behind the rate at which work is admitted, and internal queues grow without bound. We determine the conditions under which this occurs, and show how it can be avoided while maximising speedup.

Download Full-text

Using the functional programming library for solving numerical problems on graphics accelerators with CUDA technology

Proceedings of the Institute for System Programming of RAS ◽

10.15514/ispras-2021-33(5)-10 ◽

2021 ◽

Vol 33 (5) ◽

pp. 167-180

Author(s):

Mikhail Mikhailovich Krasnov ◽

Olga Borisovna Feodoritova

Keyword(s):

Shared Memory ◽

Functional Programming ◽

Source Code ◽

Single Source ◽

Computing Device ◽

Source Codes ◽

Programming Library ◽

Single Text ◽

Cuda Technology ◽

Daunting Task

Modern graphics accelerators (GPUs) can significantly speed up the execution of numerical tasks. However, porting programs to graphics accelerators is not an easy task. Sometimes the transfer of programs to such accelerators is carried out by almost completely rewriting them (for example, when using the OpenCL technology). This raises the daunting task of maintaining two independent source codes. However, CUDA graphics accelerators, thanks to technology developed by NVIDIA, allow you to have a single source code for both conventional processors (CPUs) and CUDA. The machine code generated when compiling this single text depends on which compiler it is compiled with (the usual one, such as gcc, icc and msvc, or the compiler for CUDA, nvcc). However, in this single source code, you need to somehow tell the compiler which parts of this code to parallelize on shared memory. For the CPU, this is usually done using OpenMP and special pragmas to the compiler. For CUDA, parallelization is done in a completely different way. The use of the functional programming library developed by the authors allows you to hide the use of one or another parallelization mechanism on shared memory within the library and make the user source code completely independent of the computing device used (CPU or CUDA). This article shows how this can be done.

Download Full-text