efficient parallelization
Recently Published Documents


TOTAL DOCUMENTS

119
(FIVE YEARS 17)

H-INDEX

12
(FIVE YEARS 2)

Author(s):  
Hasindu Gamaarachchi ◽  
Hiruna Samarakoon ◽  
Sasha P. Jenner ◽  
James M. Ferguson ◽  
Timothy G. Amos ◽  
...  

Abstract: Nanopore sequencing depends on the FAST5 file format, which does not allow efficient parallel analysis. Here we introduce SLOW5, an alternative format engineered for efficient parallelization and acceleration of nanopore data analysis. Using the example of DNA methylation profiling of a human genome, analysis runtime is reduced from more than two weeks to approximately 10.5 h on a typical high-performance computer. SLOW5 is approximately 25% smaller than FAST5 and delivers consistent improvements on different computer architectures.


Mathematics ◽  
2021 ◽  
Vol 9 (24) ◽  
pp. 3278
Author(s):  
Petr Pařík ◽  
Jin-Gyun Kim ◽  
Martin Isoz ◽  
Chang-uk Ahn

The enhanced Craig–Bampton (ECB) method is a novel extension of the original Craig–Bampton (CB) method, which has been widely used for component mode synthesis (CMS). By compensating for the residual modes that the CB method neglects, the ECB method dramatically improves the accuracy of the reduced matrices without enlarging the eigenbasis. However, treating the residual flexibility incurs additional computational cost. In this paper, an efficient parallelization of the ECB method is presented to address this cost and extend the method's applicability to large-scale structural vibration problems. A new ECB formulation within a substructuring strategy is derived to achieve better scalability. The parallel implementation is based on OpenMP. METIS graph partitioning and the Linear Algebra Package (LAPACK) are used for automated algebraic partitioning and computational linear algebra, respectively. Numerical examples are presented to evaluate the accuracy, scalability, and capability of the proposed parallel ECB method. Based on this work, one can expect efficient computation with the ECB method as well as improved accuracy.
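
For orientation, the classical CB reduction that the ECB method extends can be written as below. This is the standard textbook form, not the formulation derived in the paper; the symbols (b for boundary, i for interior, d for the retained dominant modes) are notation assumed for this sketch, and the residual-flexibility correction that distinguishes ECB is not shown.

```latex
% Partition the substructure degrees of freedom into boundary (b) and interior (i) sets.
\[
\mathbf{K} =
\begin{bmatrix} \mathbf{K}_{bb} & \mathbf{K}_{bi} \\ \mathbf{K}_{ib} & \mathbf{K}_{ii} \end{bmatrix},
\qquad
\mathbf{M} =
\begin{bmatrix} \mathbf{M}_{bb} & \mathbf{M}_{bi} \\ \mathbf{M}_{ib} & \mathbf{M}_{ii} \end{bmatrix}
\]

% Fixed-interface normal modes: eigenpairs of the interior block.
\[
\mathbf{K}_{ii}\,\boldsymbol{\phi}_j = \lambda_j\,\mathbf{M}_{ii}\,\boldsymbol{\phi}_j
\]

% CB transformation: constraint modes plus the retained dominant modes \Phi_d.
\[
\begin{bmatrix} \mathbf{u}_b \\ \mathbf{u}_i \end{bmatrix}
\approx
\mathbf{T}_{\mathrm{CB}}
\begin{bmatrix} \mathbf{u}_b \\ \mathbf{q}_d \end{bmatrix},
\qquad
\mathbf{T}_{\mathrm{CB}} =
\begin{bmatrix} \mathbf{I} & \mathbf{0} \\ \boldsymbol{\Psi}_c & \boldsymbol{\Phi}_d \end{bmatrix},
\qquad
\boldsymbol{\Psi}_c = -\mathbf{K}_{ii}^{-1}\mathbf{K}_{ib}
\]

% Reduced matrices obtained by projection.
\[
\hat{\mathbf{K}} = \mathbf{T}_{\mathrm{CB}}^{T}\,\mathbf{K}\,\mathbf{T}_{\mathrm{CB}},
\qquad
\hat{\mathbf{M}} = \mathbf{T}_{\mathrm{CB}}^{T}\,\mathbf{M}\,\mathbf{T}_{\mathrm{CB}}
\]
```

Because each substructure has its own interior block, the eigenproblem and the solve with K_ii are independent across substructures, which is what makes a substructuring strategy a natural fit for parallel execution.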


Author(s):  
М.Л. Цымблер ◽  
А.И. Гоглачев

Discovery of typical subsequences in a time series is one of the topical problems of time series mining. The task is to find a set of subsequences that adequately reflects the course of the process or phenomenon described by the series. Solving this problem makes it possible to summarize and visualize large time series in a wide range of applications: monitoring the technical condition of complex machines and mechanisms, intelligent management of life support systems, monitoring indicators of functional diagnostics of the human body, and so on. The recently proposed snippet concept formalizes a typical time series subsequence as follows. A snippet is a subsequence to which many other subsequences of the given series are similar with respect to a specialized similarity measure based on the Euclidean distance. Although the snippet discovery algorithm shows adequate results for time series from a wide range of subject domains, it has high computational complexity. In this article, we propose a novel parallel algorithm for snippet discovery on a GPU. Parallelization is performed with the CUDA programming technology. We developed data structures that allow efficient parallelization of the computations on the GPU. The experimental results show the high performance of the proposed algorithm.
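
The snippet idea described in this abstract can be illustrated with a deliberately simplified serial sketch. It is not the authors' algorithm: it assumes a plain z-normalized Euclidean distance in place of the specialized similarity measure, considers only non-overlapping candidate segments, and returns a single snippet. The function names are hypothetical.

```python
import math


def znorm(seq):
    """Z-normalize a sequence; a constant sequence maps to all zeros."""
    n = len(seq)
    mean = sum(seq) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in seq) / n)
    if std == 0:
        return [0.0] * n
    return [(x - mean) / std for x in seq]


def distance(a, b):
    """Euclidean distance between the z-normalized forms of a and b.

    Assumption: this stands in for the specialized measure in the abstract.
    """
    return math.dist(znorm(a), znorm(b))


def find_snippet(series, m):
    """Return the start index of the most representative length-m segment.

    Candidates are the non-overlapping segments of length m. A candidate's
    score is its summed distance to every length-m sliding window, so the
    lowest score marks the segment that the rest of the series resembles most.
    """
    windows = [series[i:i + m] for i in range(len(series) - m + 1)]
    best_start, best_score = None, math.inf
    for start in range(0, len(series) - m + 1, m):
        candidate = series[start:start + m]
        score = sum(distance(candidate, w) for w in windows)
        if score < best_score:
            best_start, best_score = start, score
    return best_start
```

Each candidate's score is computed independently of the others, and each score is itself a sum of independent distance evaluations. That independence is the kind of structure a GPU implementation can exploit, although the data layout needed to do so efficiently is the subject of the paper and is not attempted here.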


2021 ◽  
Vol 33 (5) ◽  
pp. 249-258
Author(s):  
Konstantin Borisovich Koshelev ◽  
Andrei Vladimirovich Osipov ◽  
Sergei Vladimirovich Strijhak

The paper examines the capabilities of the ICELIB library, developed at ISP RAS, for modeling ice formation on aircraft surfaces. As a test case for assessing the accuracy of modeling the physical processes that arise during aircraft operation, the surface of a swept wing with a GLC-305 profile was studied. The possibilities for efficient parallelization of an algorithm that uses a liquid film model, a dynamic mesh, and the geometric method of bisectors are discussed. The ICELIB library is a collection of three solvers. The first solver, iceFoam1, is intended for preliminary estimation of the icing zones on the fuselage surface and the swept wing of an aircraft. The change in the geometric shape of the investigated body is neglected, and the ice thickness is assumed to be negligible. This solver has no restrictions on the number of cores used for parallelization. The second solver, iceDyMFoam2, is designed to simulate the formation of two types of ice, smooth (“glaze ice”) and loose (“rime ice”), whose shape often becomes complex and irregular. The effect of the changing body shape on the icing process is taken into account. Its limitations are related to the construction of the mesh near the boundary layer of the streamlined body. Different algorithms, each optimized for its own case, are used to move the front and back edges of the film. The performance gain is limited and is achieved with a fixed number of cores. The third solver, iceDyMFoam3, also accounts for the effect that changes in the solid surface during ice formation have on the icing process itself. For smooth ice formation with complex ice surface shapes, this latest solver is still inferior in its capabilities to the second one. In the third version, a somewhat simplified and more uniform approach is used to calculate the motion of both boundaries of the ice film. The calculation results are compared with the experimental data of M. Papadakis for various airfoils and a swept wing for the case of “rime ice”. Good agreement with the experimental results was obtained.


2020 ◽  
pp. paper8-1-paper8-12
Author(s):  
Dmitry Zhdanov ◽  
Andrey Zhdanov

This paper is devoted to realistic rendering methods based on bidirectional stochastic ray tracing with photon maps. A study of the backward photon mapping method, which accounts for both caustics and indirect illumination, is presented. By using backward photon maps, the authors reduced the amount of data that must be stored in the photon maps, which sped up the calculation of indirect luminance. The paper considers methods for constructing a tree of backward photon maps and methods for efficiently parallelizing the algorithms that accumulate and form the backward photon maps while tracing forward and backward rays during rendering. Methods for estimating the attained luminance error, both for single image pixels and for the entire image, are presented for the designed rendering method. Rendering results obtained with the developed methods and algorithms are presented.


Author(s):  
Jose Sergio Hleap ◽  
Melania E. Cristescu ◽  
Dirk Steinke

Abstract

Summary: Amplicons to Global Gene (A2G2) is a Python wrapper that uses MAFFT and an “Amplicon to Gene” strategy to align very large numbers of sequences while improving alignment accuracy. It was developed specifically for conserved genes, where traditional aligners introduce a significant number of gaps. A2G2 leverages the add-sequences option of MAFFT to align the sequences to a global reference gene and a local reference region. Both of these references can be consensus sequences from trusted sources. Efficient parallelization of these tasks allows A2G2 to align a very large number of sequences (> 500K) in a reasonable amount of time. A2G2 can be imported in Python for easier integration with other software, or it can be run from the command line.

Availability: A2G2 is implemented in Python 3 (3.6) and depends on MAFFT being available. Other package requirements can be found in the requirements.txt file at https://github.com/jshleap/A2G. A2G2 is also available via PyPI (https://pypi.org/project/A2G). It is licensed under the LGPLv3.

Supplementary information: Supplementary material is available on GitHub as a Jupyter notebook.
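
The chunk-and-align pattern described in this abstract might look roughly like the following. This is a minimal sketch and not A2G2's actual implementation: the function names, chunk size, and file paths are hypothetical, and the per-chunk worker is passed in as a parameter so the parallel structure can be shown without invoking MAFFT. The only MAFFT options used, --add and --keeplength, are documented MAFFT flags.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def mafft_add_command(reference_path, chunk_path):
    """Build a MAFFT call that adds a chunk's sequences to a reference.

    --add and --keeplength are MAFFT options; the paths are hypothetical.
    MAFFT writes the resulting alignment to standard output.
    """
    return ["mafft", "--add", chunk_path, "--keeplength", reference_path]


def align_in_parallel(chunks, align_chunk, workers=4):
    """Apply `align_chunk` to every chunk concurrently, preserving order.

    Threads suffice here on the assumption that the real work runs in an
    external MAFFT process rather than in the Python interpreter.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(align_chunk, chunks))
```

In practice, `align_chunk` would write its chunk to a FASTA file, run the command returned by `mafft_add_command` with `subprocess.run`, and return the captured alignment. Because every chunk is aligned against the same fixed reference and --keeplength preserves the reference alignment length, the per-chunk results can be concatenated afterwards.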

