PNS302 IMPROVING THE PERFORMANCE OF PATIENT-LEVEL SIMULATION MODELS USING MULTI-THREADING AND SINGLE INSTRUCTION MULTIPLE DATA (SIMD) OPERATIONS

Abstract

Summary: Partial order alignment, which aligns a sequence to a directed acyclic graph, is now frequently used as a key component in long-read error correction and assembly. We present abPOA (adaptive banded Partial Order Alignment), a Single Instruction Multiple Data (SIMD) based C library for fast partial order alignment using adaptive banded dynamic programming. It can work as a stand-alone multiple sequence alignment and consensus calling tool or be easily integrated into any long-read error correction and assembly workflow. Compared to a state-of-the-art tool (SPOA), abPOA is up to 15 times faster with a comparable alignment accuracy.

Availability and implementation: abPOA is implemented in C. A stand-alone tool and a C/Python software interface are freely available at https://github.com/yangao07/[email protected] or [email protected]
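The idea of restricting dynamic programming to a band can be sketched in a much simpler setting than the one the abstract describes. The block below is an illustrative assumption, not abPOA's code or API: it computes the edit distance between two linear strings (not a sequence against a graph), uses a fixed rather than adaptive band, and is scalar rather than SIMD. The function name `banded_edit_distance` and its parameters are hypothetical.

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: edit distance between s and t, considering only
 * DP cells with |i - j| <= band. Returns -1 if the end cell lies outside
 * the band or if allocation fails. */
int banded_edit_distance(const char *s, const char *t, int band)
{
    const int n = (int)strlen(s);
    const int m = (int)strlen(t);
    const int inf = INT_MAX / 2;          /* large, but inf + 1 cannot overflow */

    if (band < 0 || abs(n - m) > band)
        return -1;

    int *prev = malloc((size_t)(m + 1) * sizeof *prev);
    int *cur = malloc((size_t)(m + 1) * sizeof *cur);
    if (prev == NULL || cur == NULL) {
        free(prev);
        free(cur);
        return -1;
    }

    /* Row 0: j insertions inside the band, "unreachable" outside it. */
    for (int j = 0; j <= m; ++j) {
        prev[j] = (j <= band) ? j : inf;
        cur[j] = inf;
    }

    for (int i = 1; i <= n; ++i) {
        const int lo = (i - band > 0) ? i - band : 0;
        const int hi = (i + band < m) ? i + band : m;
        int j = lo;

        if (lo == 0) {
            cur[0] = i;                   /* i deletions */
            j = 1;
        } else {
            cur[lo - 1] = inf;            /* cell just left of the band */
        }
        for (; j <= hi; ++j) {
            int best = prev[j - 1] + (s[i - 1] != t[j - 1]); /* match/substitute */
            if (prev[j] + 1 < best)
                best = prev[j] + 1;       /* delete from s */
            if (cur[j - 1] + 1 < best)
                best = cur[j - 1] + 1;    /* insert into s */
            cur[j] = best;
        }

        int *tmp = prev;                  /* reuse the two rows */
        prev = cur;
        cur = tmp;
    }

    const int result = prev[m];
    free(prev);
    free(cur);
    return result;
}
```

For example, `banded_edit_distance("kitten", "sitting", 3)` returns 3. Each row touches at most `2 * band + 1` cells, which is where a banded method saves work over a full table. A band that is too narrow can exclude the optimal path and overestimate the distance.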

SIMD (Single Instruction, Multiple Data) Machines

Encyclopedia of Parallel Computing ◽

10.1007/978-0-387-09766-4_2440 ◽

2011 ◽

pp. 1819-1819

Author(s):

Jack Dongarra ◽

Piotr Luszczek ◽

Felix Wolf ◽

Jesper Larsson Träff ◽

Patrice Quinton ◽

...

Keyword(s):

Single Instruction Multiple Data ◽

Multiple Data

A radix-2 FFT algorithm for modern single instruction multiple data (SIMD) architectures

IEEE International Conference on Acoustics Speech and Signal Processing ◽

10.1109/icassp.2002.1005373 ◽

2002 ◽

Cited By ~ 9

Author(s):

Rodriguez

Keyword(s):

Single Instruction Multiple Data ◽

Multiple Data

A scalable ASIP for BP Polar decoding with multiple code lengths

MATEC Web of Conferences ◽

10.1051/matecconf/201823201046 ◽

2018 ◽

Vol 232 ◽

pp. 01046

Author(s):

Wan Qiao ◽

Dake Liu

Keyword(s):

Cmos Technology ◽

Single Instruction Multiple Data ◽

Instruction Set ◽

Maximum Throughput ◽

Specific Instruction ◽

Area Efficiency ◽

Multiple Data ◽

High Area ◽

Multiple Code ◽

Application Specific

In this paper, we propose a flexible, scalable BP Polar decoding application-specific instruction set processor (PASIP) that supports multiple code lengths (64 to 4096) and any code rate. High throughput and sufficient programmability are achieved by the single-instruction-multiple-data (SIMD) based architecture and specially designed Polar decoding acceleration instructions. The synthesis result using 65 nm CMOS technology shows that the total area of PASIP is 2.71 mm². PASIP provides a maximum throughput of 1563 Mbps (for N = 1024) at an operating frequency of 400 MHz. The comparison with state-of-the-art Polar decoders reveals PASIP's high area efficiency.

A Video Specific Instruction Set Architecture for ASIP design

VLSI Design ◽

10.1155/2007/58431 ◽

2007 ◽

Vol 2007 ◽

pp. 1-7 ◽

Cited By ~ 5

Author(s):

Zheng Shen ◽

Hu He ◽

Yanjun Zhang ◽

Yihe Sun

Keyword(s):

Video Coding ◽

Digital Signal ◽

Digital Signal Processors ◽

Single Instruction Multiple Data ◽

Instruction Set ◽

Instruction Set Architecture ◽

Specific Instruction ◽

Multiple Data ◽

Signal Processors

This paper describes a novel video specific instruction set architecture for ASIP design. With single instruction multiple data (SIMD) instructions, two destination modes, and video specific instructions, an instruction set architecture is introduced to enhance the performance for video applications. Furthermore, we quantify the improvement on H.263 encoding. In this paper, we evaluate and compare the performance of VS-ISA, other DSPs (digital signal processors), and conventional SIMD media extensions in the context of video coding. Our evaluation results show that VS-ISA improves the processor's performance by approximately 5x on H.263 encoding, and VS-ISA outperforms other architectures by 1.6x to 8.57x in computing IDCT.

Speed improvements of peptide-spectrum matching using Single-Instruction Multiple-Data instructions

PROTEOMICS ◽

10.1002/pmic.201100182 ◽

2011 ◽

Vol 11 (19) ◽

pp. 3779-3785 ◽

Cited By ~ 1

Author(s):

Jian Zhang ◽

Ian McQuillan ◽

Fang-Xiang Wu

Keyword(s):

Single Instruction Multiple Data ◽

Multiple Data ◽

Spectrum Matching

A High-Performance Parallel FDTD Method Enhanced by Using SSE Instruction Set

International Journal of Antennas and Propagation ◽

10.1155/2012/851465 ◽

2012 ◽

Vol 2012 ◽

pp. 1-10 ◽

Cited By ~ 1

Author(s):

Dau-Chyrh Chang ◽

Lihong Zhang ◽

Xiaoling Yang ◽

Shao-Hsiang Yen ◽

Wenhua Yu

Keyword(s):

High Performance ◽

Fdtd Method ◽

Hardware Acceleration ◽

Single Instruction Multiple Data ◽

Instruction Set ◽

Computer Cluster ◽

Simulation Performance ◽

Acceleration Technique ◽

Multiple Data ◽

Difference Time

We introduce a hardware acceleration technique for the parallel finite difference time domain (FDTD) method using the SSE (Streaming SIMD Extensions, where SIMD stands for single instruction multiple data) instruction set. Applying the SSE instruction set to the parallel FDTD method achieves a significant improvement in simulation performance. Benchmarks of the SSE acceleration on both a multi-CPU workstation and a computer cluster demonstrate the advantages of vector arithmetic logic unit (VALU) acceleration over GPU acceleration. Several engineering applications are used to demonstrate the performance of the parallel FDTD method enhanced by the SSE instruction set.
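To give a sense of what an SSE-accelerated stencil update looks like, here is a minimal sketch under stated assumptions. It is not the paper's implementation: it uses a hypothetical 1D update rule `e[i] += coeff * (h[i] - h[i-1])`, single precision, and unaligned loads. The function name `fdtd_update_e` is invented for illustration. On targets without SSE the code falls back to the scalar loop.

```c
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h>   /* SSE intrinsics: __m128, _mm_loadu_ps, ... */
#endif

/* Hypothetical 1D electric-field update for 1 <= i < n:
 *     e[i] += coeff * (h[i] - h[i-1])
 * e[0] is left untouched (treated as a boundary value). */
void fdtd_update_e(float *e, const float *h, float coeff, size_t n)
{
    size_t i = 1;

#if defined(__SSE__)
    /* Broadcast the coefficient into all four lanes once. */
    const __m128 c = _mm_set1_ps(coeff);

    /* Process four grid points per iteration. */
    for (; i + 4 <= n; i += 4) {
        __m128 h_cur  = _mm_loadu_ps(h + i);       /* h[i .. i+3]   */
        __m128 h_prev = _mm_loadu_ps(h + i - 1);   /* h[i-1 .. i+2] */
        __m128 e_old  = _mm_loadu_ps(e + i);
        __m128 delta  = _mm_mul_ps(c, _mm_sub_ps(h_cur, h_prev));
        _mm_storeu_ps(e + i, _mm_add_ps(e_old, delta));
    }
#endif

    /* Scalar tail (and full fallback when SSE is unavailable). */
    for (; i < n; ++i)
        e[i] += coeff * (h[i] - h[i - 1]);
}
```

The update vectorizes cleanly because each `e[i]` depends only on `h`, which is read-only during this step, so the four lanes never depend on each other.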

An Implementation of Configurable SIMD Core on FPGA

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.336-338.1925 ◽

2013 ◽

Vol 336-338 ◽

pp. 1925-1929

Author(s):

Guang Wang ◽

Yin Sheng Gao

Keyword(s):

Wireless Communications ◽

Data Processing ◽

Single Instruction Multiple Data ◽

Instruction Set ◽

Instruction Set Architecture ◽

Multiple Data ◽

4G Wireless ◽

Main Components ◽

Computing Speed

To meet the computing speed required by 4G wireless communications, and to provide the different data processing widths required by different algorithms, a SIMD (Single Instruction Multiple Data) core has been designed. The ISA (Instruction Set Architecture) and main components of the SIMD core are discussed, with a focus on how the SIMD core can be configured. Finally, the simulation result of the multiplication of two 8×8 matrices is presented to show the execution of instructions in the proposed SIMD core, and the result verifies the correctness of the SIMD core design.
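An 8×8 matrix multiplication is a convenient SIMD test case because each output row splits into fixed-width lanes. The sketch below illustrates this on an x86 CPU with SSE intrinsics. It is an analogy only and does not reflect the ISA of the FPGA core described above. The function name `matmul8x8`, the row-major flat layout, and the use of single-precision floats are all assumptions.

```c
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

enum { DIM = 8 };

/* Hypothetical helper: c = a * b for row-major 8x8 float matrices.
 * c must not alias a or b. */
void matmul8x8(const float *a, const float *b, float *c)
{
    for (size_t i = 0; i < DIM; ++i) {
#if defined(__SSE__)
        /* Two 4-lane accumulators hold one full output row. */
        __m128 lo = _mm_setzero_ps();
        __m128 hi = _mm_setzero_ps();
        for (size_t k = 0; k < DIM; ++k) {
            /* Broadcast a[i][k] and scale row k of b by it. */
            __m128 aik = _mm_set1_ps(a[i * DIM + k]);
            lo = _mm_add_ps(lo, _mm_mul_ps(aik, _mm_loadu_ps(b + k * DIM)));
            hi = _mm_add_ps(hi, _mm_mul_ps(aik, _mm_loadu_ps(b + k * DIM + 4)));
        }
        _mm_storeu_ps(c + i * DIM, lo);
        _mm_storeu_ps(c + i * DIM + 4, hi);
#else
        /* Scalar fallback for targets without SSE. */
        for (size_t j = 0; j < DIM; ++j) {
            float sum = 0.0f;
            for (size_t k = 0; k < DIM; ++k)
                sum += a[i * DIM + k] * b[k * DIM + j];
            c[i * DIM + j] = sum;
        }
#endif
    }
}
```

Broadcasting one element of `a` and multiplying it by a whole row of `b` avoids horizontal reductions, which are comparatively costly on SSE. A core with a configurable data width would change how many lanes each accumulator holds, not the overall loop shape.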
