many integrated core Latest Research Papers

Parallelization implementation of the multi‐scale retinex image‐enhancement algorithm based on a many integrated core platform

Concurrency and Computation Practice and Experience ◽

10.1002/cpe.5832 ◽

2020 ◽

Vol 32 (22) ◽

Author(s):

Fang Huang

Keyword(s):

Many Integrated Core ◽

Enhancement Algorithm

MCtandem: an efficient tool for large-scale peptide identification on many integrated core (MIC) architecture

BMC Bioinformatics ◽

10.1186/s12859-019-2980-5 ◽

2019 ◽

Vol 20 (1) ◽

Author(s):

Chuang Li ◽

Kenli Li ◽

Keqin Li ◽

Feng Lin

Keyword(s):

Large Scale ◽

Peptide Identification ◽

Efficient Tool ◽

Many Integrated Core

Performance Modelling of Deep Learning on Intel Many Integrated Core Architectures

2019 International Conference on High Performance Computing & Simulation (HPCS) ◽

10.1109/hpcs48598.2019.9188090 ◽

2019 ◽

Author(s):

Andre Viebke ◽

Sabri Pllana ◽

Suejb Memeti ◽

Joanna Kolodziej

Keyword(s):

Deep Learning ◽

Performance Modelling ◽

Many Integrated Core

A parallel discord discovery algorithm for time series on many-core accelerators

Numerical Methods and Programming (Vychislitel'nye Metody i Programmirovanie) ◽

10.26089/nummet.v20r320 ◽

2019 ◽

pp. 211-223

Author(s):

М.Л. Цымблер

Keyword(s):

Time Series ◽

Climate Modeling ◽

Main Memory ◽

Wide Range ◽

Euclidean Distances ◽

NVIDIA GPU ◽

Two Stages ◽

Many Core ◽

Intel MIC ◽

Many Integrated Core

Discord is a refinement of the concept of an anomalous subsequence of a time series, that is, a subsequence substantially dissimilar to the other subsequences. The discord discovery problem occurs in a wide range of application areas related to time series: medicine, economics, climate modeling, etc. In this paper we propose a new parallel discord discovery algorithm for many-core accelerators for the case when the input data fit in main memory. The algorithm exploits the ability to calculate the Euclidean distances between the subsequences of the time series independently. Computations are parallelized using OpenMP and OpenACC for the Intel MIC (Many Integrated Core) and NVIDIA GPU platforms, respectively. The algorithm consists of two stages, namely precomputation and discovery. At the precomputation stage, we construct auxiliary matrix data structures that enable parallelization and efficient vectorization of the computations on an accelerator. At the discovery stage, the algorithm searches for a discord using the constructed structures. Numerical experiments confirm the scalability of the proposed algorithm.
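The notion of a discord described in the abstract above can be illustrated with a minimal serial sketch. It is not the paper's algorithm: it is a plain brute-force search, assuming that a discord is the subsequence whose nearest non-overlapping neighbour is farthest away in Euclidean distance. The function name `find_discord`, the parameter `m` (subsequence length), and the absence of any normalization are assumptions made for illustration only. The list of subsequences loosely mirrors the precomputation stage, and the nested loop mirrors the discovery stage; each iteration of the outer loop is independent, which is the property a parallel version could exploit.

```python
import math


def find_discord(series, m):
    """Return (index, distance) of the length-m subsequence whose nearest
    non-overlapping neighbour is farthest away. Brute force, O(n^2 * m)."""
    n = len(series) - m + 1
    if n < 2:
        raise ValueError("series is too short for the requested length m")

    # "Precomputation": materialize every subsequence as a row of a matrix.
    subs = [series[i:i + m] for i in range(n)]

    best_idx, best_dist = -1, -1.0
    # "Discovery": each outer iteration is independent of the others.
    for i in range(n):
        nearest = math.inf
        for j in range(n):
            if abs(i - j) >= m:  # skip overlapping (trivial) matches
                nearest = min(nearest, math.dist(subs[i], subs[j]))
        if nearest == math.inf:
            continue  # no non-overlapping neighbour exists for this row
        if nearest > best_dist:
            best_idx, best_dist = i, nearest
    return best_idx, best_dist


# Hypothetical toy input: a periodic series with one spike at position 8.
series = [0, 1, 0, 1, 0, 1, 0, 1, 9, 1, 0, 1, 0, 1, 0, 1]
idx, dist = find_discord(series, 2)
```

On this toy input the reported subsequence is one of the two that contain the spike, since every other subsequence has an identical copy elsewhere in the series.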

A parallel data clustering algorithm for Intel MIC accelerators

Numerical Methods and Programming (Vychislitel'nye Metody i Programmirovanie) ◽

10.26089/nummet.v20r211 ◽

2019 ◽

pp. 104-115

Author(s):

Т.В. Речкалов ◽

М.Л. Цымблер

Keyword(s):

DNA Microarrays ◽

Clustering Algorithm ◽

Real Data ◽

Data Sets ◽

Data Layout ◽

Partitioning Around Medoids ◽

Wide Range ◽

Input Dataset ◽

Intel MIC ◽

Many Integrated Core

PAM (Partitioning Around Medoids) is a partitioning clustering algorithm in which each cluster is represented by an object from the input dataset (called a medoid). Medoid-based clustering is used in a wide range of applications: the segmentation of medical and satellite images, the analysis of DNA microarrays and texts, etc. Currently, there are parallel implementations of PAM for GPU and FPGA systems, but not for Intel Many Integrated Core (MIC) accelerators. In this paper, we propose a novel parallel clustering algorithm, PhiPAM, for Intel MIC systems. Computations are parallelized using OpenMP. The algorithm exploits a specialized in-memory data layout and a loop tiling technique, which allow computations to be vectorized efficiently on Intel MIC systems. Experiments performed on real data sets show good scalability of the algorithm.
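For readers unfamiliar with medoid-based clustering, the following is a minimal serial sketch of the classic PAM swap procedure that the abstract above refers to. It is not PhiPAM: it has no parallelism, no special data layout, and no tiling. The function names `total_cost` and `pam`, the naive choice of the first `k` objects as initial medoids, and the `max_iter` cap are all assumptions made for illustration.

```python
import math


def total_cost(points, medoids, dist):
    """Sum over all points of the distance to the closest medoid."""
    return sum(min(dist(p, points[m]) for m in medoids) for p in points)


def pam(points, k, dist=math.dist, max_iter=100):
    """Greedy PAM: repeatedly apply the single medoid/non-medoid swap
    that lowers the total cost the most, until no swap helps."""
    medoids = list(range(k))  # naive initialization (an assumption)
    cost = total_cost(points, medoids, dist)
    for _ in range(max_iter):
        best = (cost, None, None)
        for slot in range(k):
            for h in range(len(points)):
                if h in medoids:
                    continue
                candidate = medoids[:slot] + [h] + medoids[slot + 1:]
                c = total_cost(points, candidate, dist)
                if c < best[0]:
                    best = (c, slot, h)
        if best[1] is None:
            break  # no improving swap: local optimum reached
        cost, slot, h = best
        medoids[slot] = h
    # Assign every point to the cluster of its closest medoid.
    labels = [min(range(k), key=lambda c: dist(p, points[medoids[c]]))
              for p in points]
    return medoids, labels, cost


# Hypothetical toy data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
medoids, labels, cost = pam(points, 2)
```

The cost evaluations for different candidate swaps do not depend on one another, which is what makes this step a natural target for the kind of parallelization and vectorization the abstract describes.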

Fluid-film lubrication computing with many-core processors and graphics processing units

Advances in Mechanical Engineering ◽

10.1177/1687814018804719 ◽

2018 ◽

Vol 10 (10) ◽

pp. 168781401880471

Author(s):

Nenzi Wang ◽

Hsin-Yi Chen ◽

Yu-Wen Chen

Keyword(s):

Parallel Computing ◽

Graphics Processing Units ◽

Fluid Film ◽

The Many ◽

Many Core ◽

Graphics Processing ◽

Processor Cores ◽

Many Integrated Core ◽

Film Lubrication ◽

Fluid Film Lubrication

Modern processors with many cores and large caches may offer little computational advantage if only serial computing is employed. In this study, several parallel computing approaches, using devices with multiple or many processor cores and graphics processing units, are applied and compared to illustrate their potential applications in the study of fluid-film lubrication. Two Reynolds equations and an air bearing optimum design are solved using three parallel computing paradigms, OpenMP, Compute Unified Device Architecture, and OpenACC, on standalone shared-memory computers. The newly developed many-integrated-core processors are also programmed with OpenMP to release their computing potential. The results show that OpenACC computing can perform better than OpenMP computing for the discretized Reynolds equation with a large gridwork. This is mainly due to the larger cache sizes available in the tested graphics processing units. The bearing design benefits most when the system with a many-integrated-core processor is used. This is because the many-integrated-core system can perform computation at the optimization-algorithm level and use its many processor cores effectively. A proper combination of parallel computing devices and programming models can complement efficient numerical methods or optimization algorithms to accelerate many tribological simulations or engineering designs.

Efficient computation of motif discovery on Intel Many Integrated Core (MIC) Architecture

BMC Bioinformatics ◽

10.1186/s12859-018-2276-1 ◽

2018 ◽

Vol 19 (S9) ◽

Cited By ~ 1

Author(s):

Shaoliang Peng ◽

Minxia Cheng ◽

Kaiwen Huang ◽

YingBo Cui ◽

Zhiqiang Zhang ◽

...

Keyword(s):

Motif Discovery ◽

Efficient Computation ◽

Many Integrated Core

ELT-scale adaptive optics real-time control with the Intel Xeon Phi Many Integrated Core Architecture

Monthly Notices of the Royal Astronomical Society ◽

10.1093/mnras/sty1310 ◽

2018 ◽

Vol 478 (3) ◽

pp. 3149-3158 ◽

Cited By ~ 6

Author(s):

David R Jenkins ◽

Alastair Basden ◽

Richard M Myers

Keyword(s):

Real Time ◽

Adaptive Optics ◽

Xeon Phi ◽

Intel Xeon Phi ◽

Real Time Control ◽

Time Control ◽

Many Integrated Core ◽

Intel Xeon

Many-integrated core (MIC) technology for accelerating Monte Carlo simulation of radiation transport: A study based on the code DPM

Computer Physics Communications ◽

10.1016/j.cpc.2017.12.019 ◽

2018 ◽

Vol 225 ◽

pp. 28-35 ◽

Cited By ~ 4

Author(s):

M. Rodriguez ◽

L. Brualla

Keyword(s):

Monte Carlo Simulation ◽

Monte Carlo ◽

Radiation Transport ◽

Many Integrated Core

Performance Characterization of Multi-threaded Graph Processing Applications on Many-Integrated-Core Architecture

2018 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS) ◽

10.1109/ispass.2018.00033 ◽

2018 ◽

Author(s):

Lei Jiang ◽

Langshi Chen ◽

Judy Qiu

Keyword(s):

Graph Processing ◽

Performance Characterization ◽

Many Integrated Core
