Benchmarking implementations of functional languages with ‘Pseudoknot’, a float-intensive benchmark

1996 ◽  
Vol 6 (4) ◽  
pp. 621-655 ◽  
Author(s):  
Pieter H. Hartel ◽  
Marc Feeley ◽  
Martin Alt ◽  
Lennart Augustsson ◽  
Peter Baumann ◽  
...  

Abstract: Over 25 implementations of different functional languages are benchmarked using the same program, a floating-point intensive application taken from molecular biology. The principal aspects studied are compile time and execution time for the various implementations that were benchmarked. An important consideration is how the program can be modified and tuned to obtain maximal performance on each language implementation. With few exceptions, the compilers take a significant amount of time to compile this program, though most compilers were faster than the then-current GNU C compiler (GCC version 2.5.8). Compilers that generate C or Lisp are often slower than those that generate native code directly: the cost of compiling the intermediate form is normally a large fraction of the total compilation time. There is no clear distinction between the runtime performance of eager and lazy implementations when appropriate annotations are used: lazy implementations have clearly come of age when it comes to implementing largely strict applications, such as the Pseudoknot program. The speed of C can be approached by some implementations, but to achieve this performance, special measures such as strictness annotations are required by non-strict implementations. The benchmark results have to be interpreted with care. Firstly, a benchmark based on a single program cannot cover a wide spectrum of ‘typical’ applications. Secondly, the compilers vary in the kind and level of optimisations offered, so the effort required to obtain an optimal version of the program is similarly varied.
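
The compile-time and run-time measurements described above can be reproduced in outline with a small driver script. The sketch below times both phases for each implementation; the compiler commands and file names are hypothetical placeholders, not the toolchains used in the paper.

    # Minimal sketch of the compile-time vs. run-time comparison described
    # above. The compiler commands and benchmark paths are hypothetical
    # placeholders, not the implementations benchmarked in the paper.
    import subprocess
    import time

    IMPLEMENTATIONS = {
        "gcc": ["gcc", "-O2", "pseudoknot.c", "-o", "pseudoknot"],
        # e.g. "ghc": ["ghc", "-O2", "Pseudoknot.hs", "-o", "pseudoknot"],
    }

    def timed(cmd):
        """Run a command and return its wall-clock time in seconds."""
        start = time.perf_counter()
        subprocess.run(cmd, check=True)
        return time.perf_counter() - start

    for name, compile_cmd in IMPLEMENTATIONS.items():
        t_compile = timed(compile_cmd)
        t_run = timed(["./pseudoknot"])
        print(f"{name}: compile {t_compile:.2f}s, run {t_run:.2f}s")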

Author(s):  
Julio Villalba ◽  
Javier Hormigo

Abstract: This article proposes a family of high-radix floating-point representations to deal efficiently with floating-point addition on FPGA devices with no native floating-point support. Since the variable shifter required in any FP adder is very costly to implement in FPGA, high-radix formats considerably reduce the number of possible shifts, greatly decreasing execution time and area. Although the high-radix format also incurs a significant penalty in the implementation of multipliers, the experimental results show that the adder improvement outweighs the multiplication penalty in most practical and common cases (digital filters, matrix multiplications, etc.). We also provide the designer with guidelines for selecting a suitable radix as a function of the ratio between the number of additions and multiplications in the targeted algorithm. For applications with similar numbers of additions and multiplications, the high-radix version may be up to 26% faster, while also offering a wider dynamic range and a larger number of significant bits. Furthermore, thanks to the proposed efficient converters between the standard IEEE-754 format and our internal high-radix format, the cost of input/output conversions in FPGA accelerators is negligible.
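
The shifter saving is easiest to see in software terms: in a radix-2^k format the exponent counts k-bit digits, so operand alignment for addition only ever needs shifts that are multiples of k. A minimal Python sketch of this idea follows; the choice of radix 16 (k = 4) and the operand values are illustrative assumptions, not recommendations from the article.

    # Minimal sketch: aligning significands for addition in a high-radix
    # floating-point format. With radix 2**K the exponent counts K-bit
    # digits, so the alignment shifter only needs one shift amount per
    # digit instead of one per bit. K = 4 (radix 16) is an arbitrary
    # illustrative choice, not a value from the article.
    K = 4  # radix = 2**K

    def add_high_radix(sig_a, exp_a, sig_b, exp_b):
        """Add two unsigned high-radix floats given as (significand, exponent).

        Exponents are in units of radix digits, so the shift below is
        always a multiple of K bits -- the property that shrinks the
        FPGA shifter.
        """
        if exp_a < exp_b:  # ensure operand a has the larger exponent
            sig_a, exp_a, sig_b, exp_b = sig_b, exp_b, sig_a, exp_a
        shift_digits = exp_a - exp_b
        aligned_b = sig_b >> (K * shift_digits)  # coarse, K-bit-granular shift
        return sig_a + aligned_b, exp_a

    # 0.5 * 16**3 + 0.5 * 16**1, with 16-bit significands
    print(add_high_radix(0x8000, 3, 0x8000, 1))  # -> (0x8080, 3)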


2016 ◽  
Vol 8 (2) ◽  
pp. 71-78
Author(s):  
Bartłomiej Sroka ◽  
Elżbieta Radziszewska-Zielina

Reducing the time and, by the same token, the cost of a project is a crucial factor in contemporary construction. This article presents a method for the exact optimisation of a resource-constrained scheduling problem. Based on the Critical Path Method, graph theory and linear programming, an algorithm was developed and the FROPT program was written in Matlab to minimise the execution time of the task. Using the newly created program, sample networks were calculated and the results were compared with those obtained with the MS Project scheduling program (which uses an approximation algorithm). The execution times obtained with FROPT were on average 10% shorter than those obtained with MS Project; in selected cases the improvement reached 25%. A deterministic approach to the problem may shorten planned project times and bring financial benefits. Due to the exponential complexity of the algorithm, it is most useful for solving small or highly coherent networks. The algorithm and program may offer planners of building projects benefits not provided by commercial software.
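
The Critical Path Method component on which such an algorithm builds is straightforward to sketch: a forward pass over a topologically ordered activity graph yields earliest finish times and the project duration. The four-activity network below is a made-up example, not one of the article's test cases.

    # Minimal sketch of the Critical Path Method forward pass that
    # underlies schedule optimisation. The activity network is a made-up
    # example, not a network from the article.
    from graphlib import TopologicalSorter

    durations = {"A": 3, "B": 2, "C": 4, "D": 1}
    predecessors = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}

    earliest_finish = {}
    for task in TopologicalSorter(predecessors).static_order():
        earliest_start = max(
            (earliest_finish[p] for p in predecessors[task]), default=0
        )
        earliest_finish[task] = earliest_start + durations[task]

    print(earliest_finish)                # {'A': 3, 'B': 5, 'C': 7, 'D': 8}
    print(max(earliest_finish.values()))  # project duration: 8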


2009 ◽  
Vol 132 (1) ◽  
Author(s):  
Joachim Göttsche ◽  
Bernhard Hoffschmidt ◽  
Stefan Schmitz ◽  
Markus Sauerborn ◽  
Reiner Buck ◽  
...  

The cost of solar tower power plants is dominated by the heliostat field, which makes up roughly 50% of investment costs. Classical heliostat design is dominated by mirrors brought into position by steel structures and drives that guarantee high accuracy under wind loads and thermal stress. A large fraction of the cost is caused by the stiffness requirements of the steel structure, which typically demand ∼20 kg of steel per m2 of mirror area. The typical cost of heliostats (a figure quoted by Solucar at the SolarPACES Conference, Seville, 2006) is currently around 150 €/m2, driven by the rising price of the necessary raw materials. An interesting option for reducing costs lies in a heliostat design in which all moving parts are protected from wind loads. In this way, drives and the mechanical layout may be made less robust, reducing material input and costs. In order to keep the heliostat at an appropriate size, small mirrors (around 10×10 cm2) have to be used, placed in a box with a transparent cover. Innovative drive systems are developed in order to obtain a cost-effective design. A 0.5×0.5 m2 demonstration unit will be constructed. Tests of the unit are carried out with a high-precision artificial sun unit that imitates the sun's path to within 0.5 mrad and creates a beam of parallel light with a divergence of less than 4 mrad.


2014 ◽  
Vol 27 (2) ◽  
pp. 235-249 ◽  
Author(s):  
Anirban Sengupta ◽  
Reza Sedaghat ◽  
Vipul Mishra

Design space exploration is an indispensable part of High Level Synthesis (HLS) design of hardware accelerators. This paper presents a novel technique for area-execution time tradeoff using residual load decoding heuristics in genetic algorithms (GAs) for integrated design space exploration (DSE) of scheduling and allocation. The approach also resolves issues encountered during DSE of data paths for hardware accelerators, such as the accuracy of the solution found and the total exploration time. The integrated solution found by the proposed approach satisfies the user-specified constraints on hardware area and total execution time (not just latency), while at the same time offering a twofold unified solution of chaining-based scheduling and allocation. The cost function proposed in the genetic algorithm takes into account the functional units, multiplexers and demultiplexers needed during implementation. The proposed exploration system (ExpSys) was tested on a large number of benchmarks drawn from the literature to assess its efficiency. Results indicate an average improvement in Quality of Results (QoR) of more than 26% compared to a recent well-known GA-based exploration method.
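
The general shape of such a cost function can be sketched in a few lines: a weighted penalty on how far a candidate design's area and execution time exceed the user constraints, with area accumulated over the functional units, multiplexers and demultiplexers it instantiates. The unit areas, weights and constraint values below are illustrative assumptions, not values from the paper.

    # Minimal sketch of an area/execution-time cost function of the kind
    # used to rank candidates in GA-based design space exploration. Unit
    # areas, weights and constraints are illustrative assumptions only.
    UNIT_AREA = {"adder": 120, "multiplier": 900, "mux": 15, "demux": 15}

    def cost(candidate, max_area, max_time, w_area=0.5, w_time=0.5):
        """Lower is better; constraint violations are penalised linearly."""
        area = sum(UNIT_AREA[u] * n for u, n in candidate["units"].items())
        time = candidate["exec_time"]  # total execution time, not just latency
        return (w_area * max(0.0, (area - max_area) / max_area)
                + w_time * max(0.0, (time - max_time) / max_time))

    candidate = {"units": {"adder": 2, "multiplier": 1, "mux": 4, "demux": 4},
                 "exec_time": 48.0}
    print(cost(candidate, max_area=1200, max_time=40.0))  # -> 0.125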


Author(s):  
Saptarshi Chakraborty

Some countries spend a relatively large percentage of GDP on their militaries in order to preserve or secure their status as global powers. Others do so because they are ruled by military governments or aggressive regimes that pose a military threat to their neighbors or their own populations. Whether there is a causal relationship between military spending and economic growth is debatable, and how much to allocate to civilian versus military expenditure is an ongoing policy question. The impact of military expenditure on economic growth is frequently found to be non-significant or negative, yet most countries spend a large fraction of their GDP on defense and the military. Against this puzzling background, the chapter investigates the relationship between military spending and economic growth in India. It also examines whether external threats, corruption, and other relevant controls have any causal effect. The chapter finds that additional expenditure on the Indian military in the presence of additional threat is significantly detrimental to growth, implying that India cannot afford to fight or demonstrate power at the cost of its development.


2020 ◽  
Vol 12 (22) ◽  
pp. 9721
Author(s):  
Ana Belén Alonso-Conde ◽  
Javier Rojo-Suárez

Using stock return data for the Japanese equity market, for the period from July 1983 to June 2018, we analyze the effect of major nuclear disasters worldwide on Japanese discount rates. For that purpose, we compare the performance of the capital asset pricing model (CAPM) conditional on the event of nuclear disasters with that of the classic CAPM and the Fama–French three- and five-factor models. In order to control for nuclear disasters, we use an instrument that allows us to parameterize the linear stochastic discount factor of the conditional CAPM and transform the classic CAPM into a three-factor model. In this regard, the use of nuclear disasters as an explanatory variable for the cross-sectional behavior of stock returns is a novel contribution of this research. Our results suggest that nuclear disasters account for a large fraction of the variation of stock returns, allowing the CAPM to perform similarly to the Fama–French three- and five-factor models. Furthermore, our results show that, in general, nuclear disasters are positively related to the expected returns of a large number of assets under study. Our results have important implications for the task of estimating the cost of equity and constitute a step forward in understanding the relationship between equity risk premiums and nuclear disasters.
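
The transformation the abstract describes is the standard scaled-factor construction for a conditional model with one instrument. A sketch in general notation is given below, with z_t standing for the nuclear-disaster instrument; the specific parameterization used in the paper may differ.

    % Sketch of the scaled-factor construction, with z_t denoting the
    % nuclear-disaster instrument and r_{m,t+1} the market excess return.
    % Let the conditional CAPM have a linear SDF with time-varying
    % coefficients driven by the instrument:
    m_{t+1} = a_t + b_t\, r_{m,t+1}, \qquad
    a_t = a_0 + a_1 z_t, \qquad
    b_t = b_0 + b_1 z_t .
    % Substituting gives an unconditional model whose three factors are
    % the market return, the instrument and the scaled factor:
    m_{t+1} = a_0 + a_1 z_t + b_0\, r_{m,t+1} + b_1 \left(z_t\, r_{m,t+1}\right).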


2001 ◽  
Vol 11 (3) ◽  
pp. 441-447
Author(s):  
Jonathan Nakane ◽  
David Broemeling ◽  
Roger Donaldson ◽  
Andre Marziali ◽  
Thomas D. Willis ◽  
...  

A large fraction of the cost of DNA sequencing and other DNA-analysis processes results from the reagent costs incurred during cycle sequencing or PCR. In particular, the high cost of the enzymes and dyes used in these processes often results in thermal cycling costs exceeding $0.50 per sample. In the case of high-throughput DNA sequencing, this is a significant and unnecessary expense. Improved detection efficiency of new sequencing instrumentation allows the reaction volumes for cycle sequencing to be scaled down to one-tenth of presently used volumes, resulting in at least a 10-fold decrease in the cost of this process. However, commercially available thermal cyclers and automated reaction setup devices have inherent design limitations which make handling volumes of <1 μL extremely difficult. In this paper, we describe a method for thermal cycling aimed at reliable, automated cycling of submicroliter reaction volumes.


2012 ◽  
Vol 22 (02) ◽  
pp. 1240005 ◽  
Author(s):  
ALEXANDER COLLINS ◽  
CHRISTIAN FENSCH ◽  
HUGH LEATHER

Parallel skeletons are a structured parallel programming abstraction that provide programmers with a predefined set of algorithmic templates that can be combined, nested and parameterized with sequential code to produce complex programs. The implementation of these skeletons is currently a manual process, requiring human expertise to choose suitable implementation parameters that provide good performance. This paper presents an empirical exploration of the optimization space of the FastFlow parallel skeleton framework. We performed this using a Monte Carlo search of a random subset of the space, for a representative set of platforms and programs. The results show that the space is program and platform dependent, non-linear, and that automatic search achieves a significant average speedup in program execution time of 1.6× over a human expert. An exploratory data analysis of the results shows a linear dependence between two of the parameters, and that another two parameters have little effect on performance. These properties are then used to reduce the size of the space by a factor of 6, reducing the cost of the search. This provides a starting point for automatically optimizing parallel skeleton programs without the need for human expertise, and with a large improvement in execution time compared to that achievable using human expert tuning.
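
The Monte Carlo search itself is simple to sketch: sample random points from the skeleton's parameter space, run the program at each, and keep the fastest configuration. The parameter names and ranges below are illustrative stand-ins, not FastFlow's actual tuning knobs.

    # Minimal sketch of Monte Carlo search over a parallel-skeleton
    # optimization space. The parameter space and the timing stub are
    # illustrative assumptions, not FastFlow's real configuration API.
    import random
    import time

    PARAM_SPACE = {
        "n_workers": range(1, 33),
        "chunk_size": [64, 256, 1024, 4096],
        "use_mapping": [True, False],
    }

    def run_program(config):
        """Stand-in for building/running the skeleton program and timing it."""
        start = time.perf_counter()
        # ... invoke the actual program with `config` here ...
        return time.perf_counter() - start

    def monte_carlo_search(samples=100, seed=0):
        rng = random.Random(seed)
        best_config, best_time = None, float("inf")
        for _ in range(samples):
            config = {k: rng.choice(list(v)) for k, v in PARAM_SPACE.items()}
            t = run_program(config)
            if t < best_time:
                best_config, best_time = config, t
        return best_config, best_time

    print(monte_carlo_search())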


2017 ◽  
Vol 14 (2) ◽  
pp. 172988141666366 ◽  
Author(s):  
Imen Chaari ◽  
Anis Koubaa ◽  
Hachemi Bennaceur ◽  
Adel Ammar ◽  
Maram Alajlan ◽  
...  

This article presents the results of the two-year iroboapp research project, which aims at devising path planning algorithms for large grid maps with much faster execution times while tolerating very small slacks with respect to the optimal path. We investigated both exact and heuristic methods, and contributed the design, analysis, evaluation, implementation and experimentation of several grid-map path planning algorithms of both kinds. We also designed an innovative algorithm called relaxed A-star that has linear complexity with relaxed constraints and provides near-optimal solutions with an extremely reduced execution time compared to A-star. We evaluated the performance of the different algorithms and concluded that relaxed A-star is the best path planner, as it provides a good trade-off among all the metrics, but we noticed that heuristic methods have good features that can be exploited to improve the solution of the relaxed exact method. This led us to design new hybrid algorithms that combine our relaxed A-star with heuristic methods, improving the solution quality of relaxed A-star at the cost of slightly higher execution time while remaining much faster than A-star for large-scale problems. Finally, we demonstrate how to integrate the relaxed A-star algorithm in the robot operating system as a global path planner and show that it outperforms the default path planner, with an execution time on average 38% faster.
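
The relaxation can be sketched as follows: each grid cell's cost-from-start is fixed the first time the cell is reached and never revised, so every cell is processed at most once. The Python sketch below is a schematic reading of that idea on a 4-connected grid, not the authors' exact algorithm.

    # Schematic sketch of the "relaxed A*" idea on a 4-connected grid:
    # each cell's g-value is fixed at first visit and never revised, so
    # every cell is processed at most once, trading a small loss of
    # optimality for speed. A reading of the idea, not the authors' code.
    import heapq

    def relaxed_a_star(grid, start, goal):
        """grid[r][c] == 0 means free; returns approximate path cost."""
        rows, cols = len(grid), len(grid[0])
        h = lambda r, c: abs(r - goal[0]) + abs(c - goal[1])  # Manhattan
        g = {start: 0}
        frontier = [(h(*start), start)]
        while frontier:
            _, (r, c) = heapq.heappop(frontier)
            if (r, c) == goal:
                return g[(r, c)]
            for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0 \
                        and (nr, nc) not in g:  # never revisit: the relaxation
                    g[(nr, nc)] = g[(r, c)] + 1
                    heapq.heappush(frontier, (g[(nr, nc)] + h(nr, nc), (nr, nc)))
        return None

    grid = [[0, 0, 0],
            [1, 1, 0],
            [0, 0, 0]]
    print(relaxed_a_star(grid, (0, 0), (2, 0)))  # -> 6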


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Tarek Frikha ◽  
Faten Chaabane ◽  
Nadhir Aouinti ◽  
Omar Cheikhrouhou ◽  
Nader Ben Amor ◽  
...  

The adoption of Internet of Things (IoT) technology across many applications, such as autonomous systems, communication, and healthcare, is driving the market's growth at a positive rate. The emergence of advanced data analytics techniques such as blockchain for connected IoT devices has the potential to reduce costs and increase cloud platform adoption. Blockchain is a key technology for real-time IoT applications, providing trust in distributed robotic systems running on embedded hardware without the need for certification authorities. Blockchain IoT applications face many challenges, such as power consumption and execution time. These specific constraints have to be carefully considered alongside others, such as the number of nodes and data security. In this paper, a novel approach is discussed, based on a hybrid HW/SW architecture and designed for Proof of Work (PoW) consensus, the most widely used consensus mechanism in blockchain. The proposed architecture is validated using the Ethereum blockchain with Keccak-256 and the field-programmable gate array (FPGA) ZedBoard development kit. This implementation shows a 338% improvement in execution time and a 255% reduction in power consumption compared to Nvidia Maxwell GPUs.
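
The PoW kernel being accelerated is conceptually a brute-force hash search: find a nonce whose hash falls below a difficulty target. A minimal software sketch is below; note that Python's hashlib.sha3_256 implements FIPS SHA-3, whose padding differs from the original Keccak-256 used by Ethereum, so it stands in for the real kernel only schematically.

    # Minimal software sketch of a Proof-of-Work search loop. Python's
    # hashlib.sha3_256 is FIPS SHA-3, whose padding differs from
    # Ethereum's original Keccak-256, so this is a schematic stand-in
    # for the accelerated kernel, not Ethash.
    import hashlib

    def proof_of_work(block_header: bytes, difficulty_bits: int) -> int:
        target = 1 << (256 - difficulty_bits)  # smaller target = harder puzzle
        nonce = 0
        while True:
            digest = hashlib.sha3_256(
                block_header + nonce.to_bytes(8, "big")).digest()
            if int.from_bytes(digest, "big") < target:
                return nonce
            nonce += 1

    # ~2**16 hash attempts expected at 16 difficulty bits
    print(proof_of_work(b"example-header", difficulty_bits=16))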

