multicore processor
Recently Published Documents


TOTAL DOCUMENTS

211
(FIVE YEARS 37)

H-INDEX

11
(FIVE YEARS 3)

2021 ◽  
Vol 18 (4(Suppl.)) ◽  
pp. 1413
Author(s):  
Lee Jia Bin ◽  
Nor Asilah Wati Abdul Hamid ◽  
Zurita Ismail ◽  
Mohamed Faris Laham

RNA sequencing (RNA-Seq) is the sequencing and analysis of transcriptomes. The main purpose of RNA-Seq analysis is to determine the presence and quantity of RNA in an experimental sample under a specific condition. Raw RNA sequence data are massive and can reach hundreds of gigabytes (GB), so processing often takes several days. A multicore processor can speed up a program by dividing its work into tasks and running them concurrently, which makes it a suitable choice for this problem. This study therefore uses an Intel multicore processor to speed up RNA-Seq and analyzes the performance of RNA-Seq analysis on multiple cores. The study covers the RNA-Seq pipeline only from quality control analysis through sorting of the BAM (Binary Alignment/Map) file content. Three different sizes of paired-end RNA data were used for comparison. The experimental results showed that running RNA-Seq on an Intel multicore processor achieved a speedup. For the largest raw RNA sequence data set (66.3 megabytes), total processing time decreased from 317.638 seconds to 211.916 seconds, a reduction of about 105.7 seconds, or nearly 2 minutes. For the smallest data set, total processing time decreased from 212.380 seconds to 163.961 seconds, a reduction of about 48.4 seconds.
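The timings quoted in this abstract can be turned into speedup figures with a few lines of Python. This is only an illustrative sketch: it assumes the first time in each pair is the baseline run and the second is the multicore run, and the `speedup` helper and the `runs` labels are hypothetical names, not taken from the paper.

```python
def speedup(t_baseline, t_multicore):
    """Classic speedup ratio: baseline time divided by multicore time."""
    return t_baseline / t_multicore

# (baseline seconds, multicore seconds) as quoted in the abstract
runs = {
    "largest data set": (317.638, 211.916),
    "smallest data set": (212.380, 163.961),
}

for name, (t_base, t_multi) in runs.items():
    saved = t_base - t_multi
    print(f"{name}: saved {saved:.3f} s, speedup {speedup(t_base, t_multi):.2f}x")
```

Under that assumption, the largest data set shows a speedup of roughly 1.5 and the smallest roughly 1.3.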


2021 ◽  
Vol 4 ◽  
pp. 1-4
Author(s):  
Hao Meng ◽  
Wei-Ming Xu ◽  
Tian-Yang Liu ◽  
Zhi-Yuan Shi ◽  
Zhou-Yang Dong

Abstract. To meet the requirements of both display range and operational efficiency in ocean tide visualization, an advanced method is proposed in which tide height is rapidly computed with the global tide model EOT10a and dynamically displayed with OpenGL. Because computing global tide heights requires a large amount of calculation, the method exploits multicore processors. The experiment shows that, compared with a single-core processor, a 6-core processor achieves a speedup ratio of about 5.4 and a parallel efficiency of 90%, and can calculate 880,000 tide heights per second. The result is then output as a tide height graph by OpenGL. This method could be a useful tool for marine cartography because of its large display range and high efficiency.
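The relationship between the reported speedup and parallel efficiency can be checked directly. The sketch below also computes the Karp-Flatt metric (an experimentally determined serial fraction); that metric is not used in the abstract and is added here only as an illustration, and both function names are hypothetical.

```python
def parallel_efficiency(speedup, cores):
    """Efficiency is the speedup divided by the number of cores."""
    return speedup / cores

def karp_flatt(speedup, cores):
    """Karp-Flatt metric: the serial fraction implied by a measured speedup."""
    return (1 / speedup - 1 / cores) / (1 - 1 / cores)

S, P = 5.4, 6  # speedup and core count quoted in the abstract
print(f"efficiency: {parallel_efficiency(S, P):.0%}")
print(f"implied serial fraction: {karp_flatt(S, P):.4f}")
```

With a speedup of 5.4 on 6 cores, the efficiency is 90%, and the implied serial fraction is a little over 2%.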


2021 ◽  
Vol 9 ◽  
Author(s):  
Hao Lu ◽  
Zhiqiang Wei ◽  
Cunji Wang ◽  
Jingjing Guo ◽  
Yuandong Zhou ◽  
...  

Ultra-large-scale molecular docking can improve the accuracy of lead compounds in drug discovery. In this study, we developed molecular docking software, Vina@QNLM, which can use more than 480,000 parallel processes to search for potential lead compounds among hundreds of millions of compounds. We proposed a task scheduling mechanism for large-scale parallelism based on Vinardo and the Sunway supercomputer architecture. We then adapted the core docking algorithm to take full advantage of the heterogeneous multicore processor architecture in intensive computing. We successfully scaled it to 10,465,065 cores (161,001 management process elements and 10,304,064 computing process elements), with a strong scalability of 55.92%. To the best of our knowledge, this is the first time that 10 million cores have been used for molecular docking on Sunway. The introduction of the heterogeneous multicore processor architecture achieved the best speedup, 11x that of the management process element of Sunway. The performance of Vina@QNLM was comprehensively evaluated using the CASF-2013 and CASF-2016 protein–ligand benchmarks, and its screening power was the highest of the 27 software packages tested in the CASF-2013 benchmark. In one application, we used Vina@QNLM to dock more than 10 million molecules to nine rigid proteins related to SARS-CoV-2 within 8.5 h on 10 million cores. We also developed a platform for the general public to use the software.
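The total core count in this abstract is consistent with each management process element being paired with 64 computing process elements, as in the core groups of Sunway SW26010-family processors. That layout is an assumption here, since the abstract does not state it, and the names in the sketch below are hypothetical.

```python
MANAGEMENT_ELEMENTS = 161_001   # management process elements, from the abstract
COMPUTING_PER_MANAGEMENT = 64   # assumption: SW26010-style core group layout

def core_counts(management, computing_per_management):
    """Return (total cores, computing elements) for the assumed layout."""
    computing = management * computing_per_management
    return management + computing, computing

total, computing = core_counts(MANAGEMENT_ELEMENTS, COMPUTING_PER_MANAGEMENT)
print(f"computing process elements: {computing:,}")
print(f"total cores: {total:,}")
```

Under this assumption the total comes to 10,465,065 cores, matching the figure quoted in the abstract.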


2021 ◽  
Vol 12 (6) ◽  
pp. 295-301
Author(s):  
A. A. Titova ◽  
V. A. Roganov ◽  
G. A. Lukyanchenko ◽  
S. G. Elizarov ◽  
...  

Cryptonight is one of the possible base algorithms for cryptocurrencies. It belongs to the group of memory-bound algorithms, which are designed to prevent mining on specialized processors and ASICs by using 2 MB of memory for each hash. It is therefore not easy to adapt for parallel computing. The aim of this work is to show theoretically and experimentally that the algorithm can still be optimized for a specialized multicore processor so that mining becomes more energy-efficient than on a CPU. This article describes the optimization process, which used the following methods: data clustering, storage of repeatedly used data in local memory, use of SIMD for parallel computing, and data prefetch. These methods are first explained, their expected effectiveness is analyzed, and they are then implemented. As a result, two optimization schemes were created. The first is based on the slave cores of MALT, which compute hashes independently; although memory-boundness creates multiple problems, we were able to increase efficiency by clustering data. The second scheme is more complicated: it uses SIMD processors for most cryptographic computations and also involves data prefetch, which becomes possible when more than one hash is calculated on one core at the same time. All results are presented in the paper, and they indicate that it is indeed possible to optimize Cryptonight for the specialized multicore processor MALT. The practical results show a fivefold increase in energy efficiency compared with a CPU.


2021 ◽  
Vol 35 (111) ◽  
pp. 73-82
Author(s):  
I. M. Zhuravska ◽  
V. Yu. Savinov ◽  
K. O. Obukhova

2021 ◽  
Vol 179 (1) ◽  
pp. 35-58
Author(s):  
Sirine Marrakchi ◽  
Mohamed Jemni

This study establishes a new approach for solving triangular band linear systems that balances the load and achieves a high degree of parallelism. The approach assigns an adequate start time and processor to each task and eliminates the useless dependencies that are not used in the parallel solve stage. Processors then execute their assigned tasks in parallel while respecting the remaining precedence constraints. The theoretical lower bounds on the parallel execution time and on the number of processors required to carry out the task graph in the shortest time are determined. Experiments are conducted on a shared-memory multicore processor. The experimental results agree with the values derived from the mathematical formulas. A comparison of the results obtained by our contribution with those of the triangular system solve routine from the PLASMA library (Parallel Linear Algebra Software for Multicore Architectures) confirms the efficiency of the proposed approach.
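For orientation, the kind of system this abstract addresses can be illustrated with a plain sequential forward substitution for a lower triangular band matrix. This is a minimal sketch, not the authors' scheduling method or the PLASMA routine; `solve_lower_band` and its `bandwidth` parameter are hypothetical names. It shows that each unknown depends only on the previous `bandwidth` unknowns, which is the dependency structure a parallel scheduler has to respect.

```python
def solve_lower_band(L, b, bandwidth):
    """Solve L x = b by forward substitution.

    L is a dense list-of-lists lower triangular matrix whose entries
    L[i][j] are zero whenever i - j > bandwidth.
    """
    n = len(b)
    x = [0.0] * n
    for i in range(n):
        # Only the previous `bandwidth` unknowns contribute to row i.
        start = max(0, i - bandwidth)
        partial = sum(L[i][j] * x[j] for j in range(start, i))
        x[i] = (b[i] - partial) / L[i][i]
    return x

# Small example with bandwidth 1 (one sub-diagonal).
L = [
    [2, 0, 0, 0],
    [1, 3, 0, 0],
    [0, 1, 4, 0],
    [0, 0, 2, 5],
]
b = [2, 7, 14, 26]
print(solve_lower_band(L, b, bandwidth=1))
```

In this example the solution is x = (1, 2, 3, 4), and computing x[i] needs only x[i-1], so the dependency chain here is strictly sequential; wider bands introduce more dependencies per unknown.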


2020 ◽  
Vol 17 (24) ◽  
pp. 20200359-20200359
Author(s):  
Taejin Park ◽  
Jae Young Hur ◽  
Wooyoung Jang