Real-Time Speech-To-Text / Text-To-Speech Converter With Automatic Text Summarizer Using Natural Language Generation And Abstract Meaning Representation

The drive for growth across sectors such as software, telecom, healthcare, and defence has led to a marked increase in both the number and the duration of meetings, conference calls, reconnaissance stakeouts, and financial reviews. The reports obtained from these play a significant role in defining plans of action. The proposed model converts real-time speech to the corresponding text, produces a summary of that text using Natural Language Generation (NLG) and Abstract Meaning Representation (AMR) graphs, and finally converts the summary back to speech. The model accomplishes this task using two major components: 1) Deep Speech 2 and 2) AMR graphs. The recommended speech-recognition model achieves a speedup of 4x when the algorithm runs on a central processing unit (CPU), and running the deep learning algorithms on dedicated graphics processing units (GPUs) gives a speedup of 21x. The performance of the summarizer is close to that of the Lead-3-AMR-Baseline model, a strong baseline on the CNN/Daily Mail dataset: the summarizer achieves a ROUGE score close to the Lead-3-AMR-Baseline model's with an accuracy of 99.37%.
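As a rough illustration of the pipeline shape (not the authors' implementation), the sketch below wires caller-supplied speech-to-text and text-to-speech callables around a simple Lead-3-style extractive summarizer; all names here are assumptions made for illustration.

    import re

    # Sketch of the speech -> text -> summary -> speech pipeline.
    # lead3_summary is the simple first-n-sentences baseline mentioned above;
    # stt and tts are caller-supplied callables, not the paper's models.
    def lead3_summary(text: str, n: int = 3) -> str:
        """Lead-n baseline: keep the first n sentences of the transcript."""
        sentences = re.split(r"(?<=[.!?])\s+", text.strip())
        return " ".join(sentences[:n])

    def meeting_to_spoken_summary(audio_chunks, stt, tts):
        # Transcribe each chunk as it arrives, summarize the full transcript,
        # then synthesize the summary back to speech.
        transcript = " ".join(stt(chunk) for chunk in audio_chunks)
        summary = lead3_summary(transcript)
        return tts(summary)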

SIMULATION, 2016, Vol 93 (1), pp. 69-84
Author(s): Shailesh Tamrakar, Paul Richmond, Roshan M D’Souza

Agent-based models (ABMs) are increasingly being used to study population dynamics in complex systems, such as the human immune system. Previously, Folcik et al. (The basic immune simulator: an agent-based model to study the interactions between innate and adaptive immunity. Theor Biol Med Model 2007; 4: 39) developed a Basic Immune Simulator (BIS) and implemented it using the Recursive Porous Agent Simulation Toolkit (RePast) ABM simulation framework. However, frameworks such as RePast are designed to execute serially on central processing units and therefore cannot efficiently handle large model sizes. In this paper, we report on our implementation of the BIS using FLAME GPU, a parallel-computing ABM simulator designed to execute on graphics processing units. To benchmark our implementation, we simulate the response of the immune system to a viral infection of generic tissue cells and compare our results with those of the original RePast implementation to verify statistical accuracy. We observe that our implementation has a 13× performance advantage over the original RePast implementation.
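A minimal NumPy sketch of the data-parallel agent-update pattern that GPU ABM frameworks such as FLAME GPU map onto many GPU threads; the infection/recovery rule and probabilities below are invented for illustration and are not taken from the BIS.

    import numpy as np

    # Every agent's state is advanced in one vectorized step instead of a
    # serial per-agent loop; this is the pattern GPU ABM frameworks exploit.
    rng = np.random.default_rng(0)
    n_agents = 100_000
    infected = rng.random(n_agents) < 0.01      # initial infection state

    def step(infected, rng, p_infect=0.05, p_recover=0.02):
        # Illustrative rule: susceptible agents may become infected,
        # infected agents may recover; all agents update simultaneously.
        draws = rng.random(infected.size)
        newly_infected = ~infected & (draws < p_infect)
        recovered = infected & (draws < p_recover)
        return (infected | newly_infected) & ~recovered

    for _ in range(10):
        infected = step(infected, rng)
    print(f"infected fraction after 10 steps: {infected.mean():.3f}")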


2010, Vol 133 (2)
Author(s): Tobias Brandvik, Graham Pullan

A new three-dimensional Navier–Stokes solver for flows in turbomachines has been developed. The new solver is based on the latest version of the Denton codes but has been implemented to run on graphics processing units (GPUs) instead of the traditional central processing unit (CPU). The change in processor enables an order-of-magnitude reduction in run-time due to the higher performance of the GPU. Scaling results for a 16-node GPU cluster are also presented, showing almost linear scaling for typical turbomachinery cases. For validation purposes, a test case consisting of a three-stage turbine with complete hub and casing leakage paths is described. Good agreement is obtained with previously published experimental results. The simulation runs in less than 10 minutes on a cluster with four GPUs.


Author(s): Ana Moreton-Fernandez, Hector Ortega-Arranz, Arturo Gonzalez-Escribano

Nowadays the use of hardware accelerators, such as graphics processing units (GPUs) or Xeon Phi coprocessors, is key to solving computationally costly problems that require high-performance computing. However, programming efficient deployments for these kinds of devices is a very complex task that relies on manual management of memory transfers and configuration parameters. The programmer has to carry out a deep study of the particular data that need to be computed at each moment, across different computing platforms, while also considering architectural details. We introduce the controller concept as an abstract entity that allows the programmer to easily manage communications and kernel-launch details on hardware accelerators in a transparent way. This model also makes it possible to define and launch central processing unit (CPU) kernels on multi-core processors with the same abstraction and methodology used for the accelerators. It internally combines different native programming models and technologies to exploit the potential of each kind of device. The model also helps the programmer choose appropriate values for the configuration parameters that must be set when a kernel is launched, through a qualitative characterization of the kernel code to be executed. Finally, we present the implementation of the controller model in a prototype library, together with its application in several case studies. Its use has led to reductions in development and porting costs, with significantly low overheads in execution times when compared to manually programmed and optimized solutions that directly use CUDA and OpenMP.
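A minimal Python sketch of the controller idea, with invented class and method names (this is not the prototype library's API): one object hides device selection, data transfer, and kernel-launch configuration behind a single interface.

    import numpy as np

    # Hypothetical controller: the programmer writes one launch call, and the
    # controller handles data placement and launch parameters per device.
    class Controller:
        def __init__(self, device="cpu"):
            self.device = device              # e.g. "cpu" or "gpu"

        def to_device(self, array):
            # A real controller would copy to accelerator memory for "gpu";
            # here the transfer is a no-op stand-in.
            return np.asarray(array)

        def launch(self, kernel, *arrays):
            # A real controller would also pick grid/block sizes from a
            # qualitative characterization of the kernel code.
            device_arrays = [self.to_device(a) for a in arrays]
            return kernel(*device_arrays)

    def saxpy(x, y, a=2.0):
        # Simple example kernel: a*x + y on whole arrays.
        return a * x + y

    ctrl = Controller(device="cpu")
    result = ctrl.launch(saxpy, np.ones(4), np.arange(4.0))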


Author(s): Liam Dunn, Patrick Clearwater, Andrew Melatos, Karl Wette

Abstract: The F-statistic is a detection statistic used widely in searches for continuous gravitational waves with terrestrial, long-baseline interferometers. A new implementation of the F-statistic is presented which accelerates the existing "resampling" algorithm using graphics processing units (GPUs). The new implementation runs between 10 and 100 times faster than the existing implementation on central processing units without sacrificing numerical accuracy. The utility of the GPU implementation is demonstrated on a pilot narrowband search for four newly discovered millisecond pulsars in the globular cluster Omega Centauri using data from the second Laser Interferometer Gravitational-Wave Observatory observing run. The computational cost is 17.2 GPU-hours with the new implementation, compared to 1092 core-hours with the existing implementation.
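Much of that speedup comes from executing the resampling algorithm's large FFTs on the GPU; the sketch below shows the kind of CPU-versus-GPU FFT comparison involved, assuming CuPy and a CUDA-capable GPU are available. It is an illustration only, not the search pipeline.

    import time
    import numpy as np
    import cupy as cp   # assumes CuPy and a CUDA-capable GPU

    # Time one large complex FFT on the CPU and on the GPU.
    x_cpu = (np.random.standard_normal(2**22)).astype(np.complex64)
    x_gpu = cp.asarray(x_cpu)

    t0 = time.perf_counter()
    np.fft.fft(x_cpu)
    cpu_s = time.perf_counter() - t0

    cp.fft.fft(x_gpu)                      # warm-up run
    cp.cuda.Device().synchronize()
    t0 = time.perf_counter()
    cp.fft.fft(x_gpu)
    cp.cuda.Device().synchronize()
    gpu_s = time.perf_counter() - t0

    print(f"CPU: {cpu_s:.4f} s, GPU: {gpu_s:.4f} s, speedup ~{cpu_s / gpu_s:.1f}x")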


2020, Vol 22 (5), pp. 1217-1235
Author(s): M. Morales-Hernández, M. B. Sharif, S. Gangrade, T. T. Dullo, S.-C. Kao, ...

Abstract: This work presents a vision of future water resources hydrodynamics codes that can fully utilize the strengths of modern high-performance computing (HPC). Advances in computing power, formerly driven by improvements to central processing unit (CPU) processors, now come from parallel computing and, in particular, the use of graphics processing units (GPUs). However, this shift to a parallel framework requires refactoring the code to make efficient use of the data, and even changing the nature of the algorithm that solves the system of equations. These concepts, along with other features such as the precision of the computations, dry-region management, and input/output of data, are analyzed in this paper. A 2D multi-GPU flood code applied to a large-scale test case is used to corroborate our statements and to identify the challenges facing the next generation of parallel water resources codes.
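As a small illustration of the dry-region management mentioned above, the NumPy sketch below masks dry cells out of a 2D grid update; the update rule and depth threshold are placeholders, not the authors' numerical scheme.

    import numpy as np

    # Cells below a small depth threshold are treated as dry and excluded
    # from the (placeholder) update, one of the concerns when refactoring
    # flood codes for parallel hardware.
    rng = np.random.default_rng(1)
    h = np.clip(rng.random((512, 512)) - 0.4, 0.0, None)   # water depth; some cells dry
    DRY_TOL = 1e-6

    wet = h > DRY_TOL
    update = np.zeros_like(h)
    update[wet] = -0.01 * h[wet]       # stand-in for the real flux divergence
    h += update                        # dry cells remain untouched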




10.2196/17037, 2020, Vol 8 (3), pp. e17037
Author(s): Eunjoo Jeon, Kyusam Oh, Soonhwan Kwon, HyeongGwan Son, Yongkeun Yun, ...

Background: Electrocardiographic (ECG) monitors have been widely used for diagnosing cardiac arrhythmias for decades. However, accurate analysis of ECG signals is difficult and time-consuming work because large numbers of beats need to be inspected. In order to enhance ECG beat classification, machine learning and deep learning methods have been studied. However, existing studies have limitations in model rigidity, model complexity, and inference speed. Objective: To classify ECG beats effectively and efficiently, we propose a baseline model with recurrent neural networks (RNNs). Furthermore, we also propose a lightweight model with a fused RNN for speeding up prediction on central processing units (CPUs). Methods: We used 48 ECGs from the MIT-BIH (Massachusetts Institute of Technology-Beth Israel Hospital) Arrhythmia Database, and 76 ECGs were collected with S-Patch devices developed by Samsung SDS. We developed both the baseline and lightweight models on the MXNet framework. We trained both models on graphics processing units and measured both models’ inference times on CPUs. Results: Our models achieved overall beat classification accuracies of 99.72% for the baseline model with RNN and 99.80% for the lightweight model with fused RNN. Moreover, our lightweight model reduced the inference time on CPUs without any loss of accuracy: its inference time for 24-hour ECGs was 3 minutes, which is 5 times faster than the baseline model. Conclusions: Both our baseline and lightweight models achieved cardiologist-level accuracies, and the lightweight model is competitive for CPU-based wearable hardware.
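A minimal PyTorch sketch of an RNN beat classifier of the kind described; the authors' models were built with MXNet and a fused RNN, so the framework, layer sizes, and class count below are illustrative assumptions only.

    import torch
    from torch import nn

    # A single-beat ECG segment passes through an LSTM and the final hidden
    # state is classified into one of a few beat types.
    class BeatClassifier(nn.Module):
        def __init__(self, n_classes=5, hidden=64):
            super().__init__()
            self.rnn = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
            self.head = nn.Linear(hidden, n_classes)

        def forward(self, x):                # x: (batch, beat_length, 1)
            _, (h_last, _) = self.rnn(x)
            return self.head(h_last[-1])     # logits per beat class

    model = BeatClassifier()
    beats = torch.randn(8, 250, 1)           # 8 beats, 250 samples each (assumed)
    logits = model(beats)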


Geophysics, 2019, Vol 84 (5), pp. S425-S436
Author(s): Martin Sarajaervi, Henk Keers

In seismic data processing, the amplitude loss caused by attenuation should be taken into account. The basis for this is provided by a 3D attenuation model described by the quality factor Q, which is used in viscoelastic modeling and imaging. We have accomplished viscoelastic modeling and imaging using ray theory and the ray-Born approximation. This makes it possible to take Q into account using complex-valued and frequency-dependent traveltimes. We have developed a unified parallel implementation for modeling and imaging in the frequency domain and carried out the numerical integration on a graphics processing unit. A central part of the implementation is an efficient technique for computing large integrals. We applied the integration method to the 3D SEG/EAGE overthrust model to generate synthetic seismograms and imaging results. The attenuation effects are accurately modeled in the seismograms and compensated for in the imaging algorithm. The results indicate a significant improvement in computational efficiency compared to a parallel central processing unit baseline.
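For a constant-Q ray, attenuation can be applied in the frequency domain through a complex, frequency-dependent factor; the NumPy sketch below damps a spectrum by exp(-omega*t/(2Q)) along a ray of traveltime t, omitting dispersion, and is a simplified illustration rather than the authors' exact operator.

    import numpy as np

    # Constant-Q attenuation along one ray: t* = t / Q, amplitude damped by
    # exp(-omega * t* / 2) and delayed by the propagation phase exp(-i omega t).
    # Dispersion (the frequency-dependent phase correction) is omitted.
    def attenuated_spectrum(spectrum, freqs, traveltime, Q):
        omega = 2.0 * np.pi * freqs
        t_star = traveltime / Q
        return spectrum * np.exp(-omega * t_star / 2.0) * np.exp(-1j * omega * traveltime)

    freqs = np.fft.rfftfreq(1024, d=0.004)            # 4 ms sampling (assumed)
    source = np.ones_like(freqs, dtype=complex)       # flat spectrum for illustration
    received = attenuated_spectrum(source, freqs, traveltime=1.2, Q=50.0)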


2011, Vol 24 (3), pp. 483-499
Author(s): Dusan Gajic, Radomir Stankovic

This paper discusses techniques for accelerated computation of several fast spectral transforms on graphics processing units (GPUs) using the Open Computing Language (OpenCL). We present a reformulation of the fast algorithms which takes into account particular properties of the transforms to make them suitable for GPU implementation. Special attention is paid to the organization of computations, the reduction of memory transfers, the impact of integer and Boolean arithmetic, differences in algorithm structure, etc. The performance of the GPU implementations is compared with that of classical C/C++ implementations for the central processing unit (CPU). Experiments confirm that, even though the spectral transforms considered involve only simple arithmetic, significant speedups are achieved by implementing the algorithms in OpenCL and running them on the GPU.
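One transform of this kind is the fast Walsh-Hadamard transform, whose butterfly stages use only additions and subtractions; the NumPy sketch below is a plain CPU reference for it, not the paper's OpenCL kernels, which parallelize the same butterfly stages on the GPU.

    import numpy as np

    # In-place fast Walsh-Hadamard transform over a vector whose length is a
    # power of two; each stage combines pairs of blocks with sums and differences.
    def fwht(a):
        a = np.asarray(a, dtype=np.int64).copy()
        n = a.size
        assert n and (n & (n - 1)) == 0, "length must be a power of two"
        h = 1
        while h < n:
            for i in range(0, n, 2 * h):
                x = a[i:i + h].copy()
                y = a[i + h:i + 2 * h].copy()
                a[i:i + h] = x + y
                a[i + h:i + 2 * h] = x - y
            h *= 2
        return a

    print(fwht([1, 0, 1, 0, 0, 1, 1, 0]))   # Walsh spectrum of an example Boolean vector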

