Review of Bioinformatics Application Using Intel MIC

Author(s):  
Xinyi Wang ◽  
Zhen Huang ◽  
Cangshuai Wu ◽  
Feng Liu ◽  
Congrui Wang
Author(s):  
Shakir Ullah Shah ◽  
Abdul Hameed ◽  
Jamil Ahmad ◽  
Hafeez Ur Rehman ◽  
Safia Fatima ◽  
Muhammad Amin

2017 ◽  
Vol 83 ◽  
pp. 82-90 ◽  
Author(s):  
Leyi Wei ◽  
Shixiang Wan ◽  
Jiasheng Guo ◽  
Kelvin KL Wong

Author(s):  
Miaoqing Huang ◽  
Chenggang Lai ◽  
Xuan Shi ◽  
Zhijun Hao ◽  
Haihang You

Coprocessors based on the Intel Many Integrated Core (MIC) Architecture have been adopted in many high-performance computer clusters. Typical parallel programming models, such as MPI and OpenMP, are supported on MIC processors to achieve parallelism. In this work, we conduct a detailed study of the performance and scalability of MIC processors under different programming models using the Beacon computer cluster. Our findings are as follows. (1) The native MPI programming model on MIC processors typically performs better than the offload programming model, which offloads the workload to MIC cores using OpenMP. (2) On top of the native MPI programming model, multithreading inside each MPI process can further improve the performance of parallel applications on computer clusters with MIC coprocessors. (3) Given a fixed number of MPI processes, it is a good strategy to schedule these processes on as few MIC processors as possible to reduce cross-processor communication overhead. (4) The hybrid MPI programming model, in which data processing is distributed to both MIC cores and CPU cores, can outperform the native MPI programming model.
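Finding (3) can be illustrated with a small placement sketch. It is not taken from the paper: the function names and the idea of a fixed per-coprocessor capacity (`ranks_per_card`) are assumptions made only for illustration. The sketch contrasts a "packed" policy, which fills one coprocessor before moving to the next, with a round-robin policy that spreads ranks across all available coprocessors.

```c
#include <assert.h>

/* Hypothetical helper: coprocessor index for an MPI rank under a packed
 * policy, where one coprocessor is filled before the next one is used. */
int packed_card(int rank, int ranks_per_card) {
    assert(ranks_per_card > 0);
    return rank / ranks_per_card;
}

/* Hypothetical helper: coprocessor index under round-robin placement. */
int scattered_card(int rank, int num_cards) {
    assert(num_cards > 0);
    return rank % num_cards;
}

/* Number of coprocessors that hold at least one rank (packed policy). */
int cards_used_packed(int nranks, int ranks_per_card) {
    assert(ranks_per_card > 0);
    return (nranks + ranks_per_card - 1) / ranks_per_card;
}

/* Number of coprocessors that hold at least one rank (round-robin). */
int cards_used_scattered(int nranks, int num_cards) {
    assert(num_cards > 0);
    return nranks < num_cards ? nranks : num_cards;
}
```

With 6 ranks, 4 coprocessors, and an assumed capacity of 4 ranks per coprocessor, the packed policy touches 2 coprocessors while round-robin touches all 4, so more rank pairs communicate across coprocessors in the second case.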


2014 ◽  
Author(s):  
Jarno Mielikainen ◽  
Bormin Huang ◽  
Allen H. Huang

2015 ◽  
Vol 2015 ◽  
pp. 1-14 ◽  
Author(s):  
Xinmin Tian ◽  
Hideki Saito ◽  
Serguei V. Preis ◽  
Eric N. Garcia ◽  
Sergey S. Kozhukhov ◽  
...  

Efficiently exploiting SIMD vector units is one of the most important aspects of achieving high performance for application code running on Intel Xeon Phi coprocessors. In this paper, we present several effective SIMD vectorization techniques, such as less-than-full-vector loop vectorization, Intel MIC-specific alignment optimization, and small matrix transpose/multiplication 2D vectorization, implemented in the Intel C/C++ and Fortran production compilers for Intel Xeon Phi coprocessors. A set of workloads from several application domains is employed to conduct the performance study of our SIMD vectorization techniques. The performance results show that we achieved up to a 12.5x performance gain on the Intel Xeon Phi coprocessor. We also demonstrate a 2000x performance speedup from the seamless integration of SIMD vectorization and parallelization.
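The combination of SIMD vectorization and parallelization mentioned above can be sketched at the source level with standard OpenMP 4.0 directives. This is a generic illustration, not the compiler-internal techniques the paper describes: the `saxpy` function and its parameters are assumptions chosen for the example. When the code is built without OpenMP support, the pragma is ignored and the loop runs serially with the same result.

```c
#include <stddef.h>

/* Illustrative kernel: y[i] = a * x[i] + y[i].
 * "parallel for" splits the iterations across threads, and "simd" asks the
 * compiler to vectorize each thread's chunk. `restrict` tells the compiler
 * that x and y do not overlap, which removes one obstacle to vectorization. */
void saxpy(int n, float a, const float *restrict x, float *restrict y) {
    #pragma omp parallel for simd
    for (int i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}
```

When `n` is not a multiple of the hardware vector length, the leftover iterations still have to be executed somehow; how efficiently that remainder is handled is a compiler matter and is not visible in this source-level sketch.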

