A Parallel Tool for the Identification of Differentially Methylated Regions in Genomic Analyses

Methylation is a chemical process that modifies DNA through the addition of a methyl group to one or several nucleotides. Discovering differentially methylated regions is an important research field in genomics, as it can help to anticipate the risk of suffering from certain diseases. RADMeth is one of the most accurate tools in this field, but it has high computational complexity. In this work, we present a hybrid MPI-OpenMP parallel implementation of RADMeth to accelerate its execution on distributed-memory systems, reaching speedups of up to 189 when running on 256 cores and allowing for its application to large-scale datasets.
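To put the reported figure in context, parallel efficiency is the speedup divided by the number of cores. The following is a minimal sketch in Python using the numbers from the abstract above; the function name is illustrative and is not part of RADMeth.

```python
def parallel_efficiency(speedup: float, cores: int) -> float:
    """Return parallel efficiency: speedup divided by the number of cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return speedup / cores


# Figures reported in the abstract: a speedup of up to 189 on 256 cores.
efficiency = parallel_efficiency(189, 256)
print(f"{efficiency:.1%}")  # prints 73.8%
```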

Acceleration of a Feature Selection Algorithm Using High Performance Computing

Proceedings ◽

10.3390/proceedings2020054054 ◽

2020 ◽

Vol 54 (1) ◽

pp. 54

Author(s):

Bieito Beceiro ◽

Jorge González-Domínguez ◽

Juan Touriño

Keyword(s):

Feature Selection ◽

High Performance ◽

Parallel Implementation ◽

Feature Selection Method ◽

Memory Systems ◽

Main Memory ◽

Single Node ◽

Very Large Datasets ◽

Performance Computing ◽

High Computational Complexity

Feature selection is a subfield of data analysis that focuses on reducing the dimensionality of datasets, so that subsequent analyses over them can be performed in affordable execution times while keeping the same results. Joint Mutual Information (JMI) is a widely used feature selection method that removes irrelevant and redundant characteristics. Nevertheless, it has high computational complexity. In this work, we present a multithreaded MPI parallel implementation of JMI to accelerate its execution on distributed memory systems, reaching speedups of up to 198.60 when running on 256 cores, and allowing for the analysis of very large datasets that do not fit in the main memory of a single node.
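As a rough illustration of the JMI criterion itself (not of the authors' parallel implementation), the sketch below performs sequential greedy selection on discrete features. It assumes the usual formulation in which a candidate is scored by summing, over the already selected features, the mutual information between the (candidate, selected) pair and the class labels. The helper names `mutual_information` and `jmi_select` are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information I(X; Y) in bits for two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def jmi_select(features, labels, k):
    """Greedy JMI: return the indices of k features (each a discrete column)."""
    remaining = set(range(len(features)))
    # First pick: the single feature most informative about the labels.
    first = max(sorted(remaining),
                key=lambda i: mutual_information(features[i], labels))
    selected = [first]
    remaining.remove(first)
    while remaining and len(selected) < k:
        def score(i):
            # JMI score: sum over selected s of I((candidate, s); labels).
            return sum(
                mutual_information(list(zip(features[i], features[s])), labels)
                for s in selected
            )
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: f2 duplicates f0, so JMI prefers the complementary f1.
f0 = [0, 0, 1, 1, 0, 0, 1, 1]
f1 = [0, 1, 0, 1, 0, 1, 0, 1]
f2 = list(f0)
labels = [2 * a + b for a, b in zip(f0, f1)]
print(jmi_select([f0, f1, f2], labels, 2))  # prints [0, 1]
```

Each candidate's score is independent of the others, which is presumably what makes the method amenable to parallelisation; how the work is actually distributed across MPI processes and threads is not described in the abstract.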

LARGE-SCALE GENOMIC ANALYSES REVEAL EXTENSIVE DIVERSITY AMONGST PNEUMOCOCCAL CAPSULAR LOCUS SEQUENCES AND PUTATIVELY NOVEL SEROTYPES

10.26226/morressier.5731f0d5d462b8029237fa18 ◽

2016 ◽

Author(s):

Andries van Tonder

Keyword(s):

Large Scale ◽

Genomic Analyses

Transformer Meets Convolution: A Bilateral Awareness Network for Semantic Segmentation of Very Fine Resolution Urban Scene Images

Remote Sensing ◽

10.3390/rs13163065 ◽

2021 ◽

Vol 13 (16) ◽

pp. 3065

Author(s):

Libo Wang ◽

Rui Li ◽

Dongzhi Wang ◽

Chenxi Duan ◽

Teng Wang ◽

...

Keyword(s):

Large Scale ◽

Texture Features ◽

Semantic Segmentation ◽

Autonomous Driving ◽

Research Field ◽

Learning Approaches ◽

Fine Grained ◽

Urban Scene ◽

Fine Resolution ◽

With Memory

Semantic segmentation from very fine resolution (VFR) urban scene images plays a significant role in several application scenarios including autonomous driving, land cover classification, urban planning, etc. However, the tremendous details contained in the VFR image, especially the considerable variations in scale and appearance of objects, severely limit the potential of the existing deep learning approaches. Addressing such issues represents a promising research field in the remote sensing community, which paves the way for scene-level landscape pattern analysis and decision making. In this paper, we propose a Bilateral Awareness Network which contains a dependency path and a texture path to fully capture the long-range relationships and fine-grained details in VFR images. Specifically, the dependency path is conducted based on the ResT, a novel Transformer backbone with memory-efficient multi-head self-attention, while the texture path is built on the stacked convolution operation. In addition, using the linear attention mechanism, a feature aggregation module is designed to effectively fuse the dependency features and texture features. Extensive experiments conducted on the three large-scale urban scene image segmentation datasets, i.e., ISPRS Vaihingen dataset, ISPRS Potsdam dataset, and UAVid dataset, demonstrate the effectiveness of our BANet. Specifically, a 64.6% mIoU is achieved on the UAVid dataset.
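For readers unfamiliar with the reported metric, mean intersection-over-union (mIoU) averages the per-class IoU between predicted and ground-truth labels. The sketch below is a minimal Python version over flat label sequences. It assumes that classes absent from both prediction and ground truth are skipped; benchmark-specific evaluation protocols may differ, and the function name is illustrative.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union across classes for flat label sequences."""
    ious = []
    for c in range(num_classes):
        intersection = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both prediction and ground truth
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else 0.0


# Class 0: IoU = 1/2, class 1: IoU = 2/3, so the mean is 7/12.
print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```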

A Parallel Unmixing-Based Content Retrieval System for Distributed Hyperspectral Imagery Repository on Cloud Computing Platforms

Remote Sensing ◽

10.3390/rs13020176 ◽

2021 ◽

Vol 13 (2) ◽

pp. 176

Author(s):

Peng Zheng ◽

Zebin Wu ◽

Jin Sun ◽

Yi Zhang ◽

Yaoqin Zhu ◽

...

Keyword(s):

Cloud Computing ◽

Large Scale ◽

Retrieval System ◽

Hyperspectral Image ◽

Parallel Implementation ◽

Remotely Sensed Data ◽

Web Interfaces ◽

Content Retrieval ◽

Service Mode ◽

Computing Platforms

As the volume of remotely sensed data grows significantly, content-based image retrieval (CBIR) becomes increasingly important, especially for cloud computing platforms that facilitate processing and storing big data in a parallel and distributed way. This paper proposes a novel parallel CBIR system for hyperspectral image (HSI) repositories on cloud computing platforms that uses unmixed spectral information, i.e., endmembers and their associated fractional abundances, to retrieve hyperspectral scenes. However, existing unmixing methods suffer from an extremely high computational burden when extracting metadata from large-scale HSI data. To address this limitation, we implement a distributed and parallel unmixing method that runs on cloud computing platforms to accelerate the unmixing processing flow. In addition, we implement a global standard distributed HSI repository equipped with a large spectral library in a software-as-a-service mode, providing users with HSI storage, management, and retrieval services through web interfaces. Furthermore, the parallel implementation of unmixing processing is incorporated into the CBIR system to establish the parallel unmixing-based content retrieval system. The performance of our proposed parallel CBIR system was verified in terms of both unmixing efficiency and accuracy.

Parallel Framework for Dimensionality Reduction of Large-Scale Datasets

Scientific Programming ◽

10.1155/2015/180214 ◽

2015 ◽

Vol 2015 ◽

pp. 1-12 ◽

Cited By ~ 3

Author(s):

Sai Kiranmayee Samudrala ◽

Jaroslaw Zola ◽

Srinivas Aluru ◽

Baskar Ganapathysubramanian

Keyword(s):

Dimensionality Reduction ◽

Organic Solar Cells ◽

Large Scale ◽

Parallel Implementation ◽

High Dimensional Data ◽

Real Life ◽

Processing Parameters ◽

High Dimensional ◽

Morphology Evolution ◽

Reduction Techniques

Dimensionality reduction refers to a set of mathematical techniques used to reduce complexity of the original high-dimensional data, while preserving its selected properties. Improvements in simulation strategies and experimental data collection methods are resulting in a deluge of heterogeneous and high-dimensional data, which often makes dimensionality reduction the only viable way to gain qualitative and quantitative understanding of the data. However, existing dimensionality reduction software often does not scale to datasets arising in real-life applications, which may consist of thousands of points with millions of dimensions. In this paper, we propose a parallel framework for dimensionality reduction of large-scale data. We identify key components underlying the spectral dimensionality reduction techniques, and propose their efficient parallel implementation. We show that the resulting framework can be used to process datasets consisting of millions of points when executed on a 16,000-core cluster, which is beyond the reach of currently available methods. To further demonstrate applicability of our framework we perform dimensionality reduction of 75,000 images representing morphology evolution during manufacturing of organic solar cells in order to identify how processing parameters affect morphology evolution.

A heterogeneous parallel implementation of the Markov clustering algorithm for large-scale biological networks on distributed CPU–GPU clusters

The Journal of Supercomputing ◽

10.1007/s11227-021-04204-6 ◽

2022 ◽

Author(s):

You Fu ◽

Wei Zhou

Keyword(s):

Biological Networks ◽

Large Scale ◽

Clustering Algorithm ◽

Parallel Implementation ◽

Gpu Clusters ◽

Markov Clustering

Skill assessment of post-processing methods for ECMWF SEAS5 seasonal forecasts over Europe

10.5194/egusphere-egu21-11019 ◽

2021 ◽

Author(s):

Alice Crespi ◽

Marcello Petitta ◽

Lucas Grigis ◽

Paola Marson ◽

Jean-Michel Soubeyroux ◽

...

Keyword(s):

Large Scale ◽

Production Management ◽

Spatial Scales ◽

Research Field ◽

Skill Assessment ◽

Post Processing ◽

Seasonal Forecasts ◽

Climate Services ◽

Local Climate ◽

Climate Conditions

Seasonal forecasts provide information on climate conditions several months ahead and therefore they could represent a valuable support for decision making, warning systems as well as for the optimization of industry and energy sectors. However, forecast systems can be affected by systematic biases and have horizontal resolutions which are typically coarser than the spatial scales of the practical applications. For this reason, the reliability of forecasts needs to be carefully assessed before applying and interpreting them for specific applications. In addition, the use of post-processing approaches is recommended in order to improve the representativeness of the large-scale predictions of regional and local climate conditions. The development and evaluation of downscaling and bias-correction procedures aimed at improving the skill of the forecasts and the quality of derived climate services is currently an open research field. In this context, we evaluated the skill of ECMWF SEAS5 forecasts of monthly mean temperature, total precipitation and wind speed over Europe and we assessed the skill improvements of calibrated predictions. For the calibration, we combined a bilinear interpolation and a quantile mapping approach to obtain corrected monthly forecasts on a 0.25° × 0.25° grid from the original 1° × 1° values. The forecasts were corrected against the reference ERA5 reanalysis over the hindcast period 1993–2016. The processed forecasts were compared over the same domain and period with another calibrated set of ECMWF SEAS5 forecasts obtained by the ADAMONT statistical method. The skill assessment was performed by means of both deterministic and probabilistic verification metrics evaluated over seasonal forecasted aggregations for the first lead time. Greater skill of the forecast systems in Europe was generally observed in spring and summer, especially for temperature, with a spatial distribution varying with the seasons.
The calibration proved effective at correcting the model biases for all variables; however, the metrics not accounting for bias did not show significant improvements in most cases, and in some areas and seasons even small degradations in skill were observed. The presented study supported the activities of the H2020 European project SECLI-FIRM on the improvement of the seasonal forecast applicability for energy production, management and assessment.
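The quantile mapping step mentioned above can be illustrated with a small empirical sketch in Python. It assumes a nearest-rank empirical mapping between a model hindcast sample and a reference sample at a single grid point; the abstract does not specify which quantile mapping variant was used, and the bilinear interpolation step is omitted. The function name and the sample values are hypothetical.

```python
from bisect import bisect_right


def quantile_map(value, model_hindcast, reference):
    """Map a forecast value to the reference distribution at the same empirical quantile."""
    model_sorted = sorted(model_hindcast)
    ref_sorted = sorted(reference)
    # Number of hindcast values less than or equal to the forecast value.
    rank = bisect_right(model_sorted, value)
    # Nearest-rank index in the reference sample (integer ceiling division).
    idx = -(-rank * len(ref_sorted) // len(model_sorted)) - 1
    # Clamp so that values outside the hindcast range map to the reference extremes.
    idx = min(max(idx, 0), len(ref_sorted) - 1)
    return ref_sorted[idx]


# Hypothetical warm-biased model climatology versus a cooler reference.
model = [10, 12, 14, 16, 18]
reference = [8, 9, 10, 11, 12]
print(quantile_map(14, model, reference))  # prints 10
```

Because the mapping only reshapes the forecast distribution towards the reference, it removes bias but cannot by itself improve metrics that are insensitive to bias, which is consistent with the behaviour reported in the abstract.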

Request, Coalesce, Serve, and Forget: Miss-Optimized Memory Systems for Bandwidth-Bound Cache-Unfriendly Applications on FPGAs

ACM Transactions on Reconfigurable Technology and Systems ◽

10.1145/3466823 ◽

2022 ◽

Vol 15 (2) ◽

pp. 1-33

Author(s):

Mikhail Asiatici ◽

Paolo Ienne

Keyword(s):

Large Scale ◽

Sparse Matrix ◽

Memory Systems ◽

Graph Analytics ◽

Matrix Vector Multiplication ◽

Area Reduction ◽

Cache Line ◽

Speed Up ◽

Memory Accesses ◽

On Chip

Applications such as large-scale sparse linear algebra and graph analytics are challenging to accelerate on FPGAs due to the short irregular memory accesses, resulting in low cache hit rates. Nonblocking caches reduce the bandwidth required by misses by requesting each cache line only once, even when there are multiple misses corresponding to it. However, such reuse mechanism is traditionally implemented using an associative lookup. This limits the number of misses that are considered for reuse to a few tens, at most. In this article, we present an efficient pipeline that can process and store thousands of outstanding misses in cuckoo hash tables in on-chip SRAM with minimal stalls. This brings the same bandwidth advantage as a larger cache for a fraction of the area budget, because outstanding misses do not need a data array, which can significantly speed up irregular memory-bound latency-insensitive applications. In addition, we extend nonblocking caches to generate variable-length bursts to memory, which increases the bandwidth delivered by DRAMs and their controllers. The resulting miss-optimized memory system provides up to 25% speedup with 24× area reduction on 15 large sparse matrix-vector multiplication benchmarks evaluated on an embedded and a datacenter FPGA system.
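The request, coalesce, serve, and forget idea can be modelled in software with a table of outstanding misses keyed by cache-line address. The toy Python sketch below is only a behavioural illustration: a dictionary stands in for the cuckoo hash tables in on-chip SRAM, the class and method names are hypothetical, and no timing, pipelining, or burst generation is modelled.

```python
class MissTable:
    """Toy model of miss coalescing: one memory request per outstanding cache line."""

    def __init__(self, line_bytes=64):
        self.line_bytes = line_bytes
        self.pending = {}          # line address -> list of waiting request ids
        self.memory_requests = 0   # requests actually sent to memory

    def miss(self, request_id, address):
        """Record a miss; return True if a new memory request was issued."""
        line = address // self.line_bytes
        if line in self.pending:
            # Coalesce with the request already in flight for this line.
            self.pending[line].append(request_id)
            return False
        self.pending[line] = [request_id]
        self.memory_requests += 1
        return True

    def fill(self, address):
        """Memory response arrived: serve all waiters and forget the entry."""
        line = address // self.line_bytes
        return self.pending.pop(line, [])


table = MissTable()
table.miss(0, 0)     # first miss to line 0: a memory request is issued
table.miss(1, 8)     # same line: coalesced, no new request
table.miss(2, 64)    # different line: a second memory request
print(table.memory_requests)  # prints 2
print(table.fill(0))          # prints [0, 1]
```

Note that an entry stores only the waiting request identifiers, not the line data, which mirrors the abstract's point that outstanding misses do not need a data array.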

The large-scale blast score ratio (LS-BSR) pipeline: a method to rapidly compare genetic content between bacterial genomes

10.7287/peerj.preprints.220v1 ◽

2014 ◽

Author(s):

Jason W Sahl ◽

Greg Caporaso ◽

David A Rasko ◽

Paul S Keim

Keyword(s):

Large Scale ◽

Sequence Data ◽

Parallel Implementation ◽

Genetic Relationships ◽

Clinical Diagnostics ◽

Whole Genome Sequence ◽

Bacterial Isolates ◽

Bacterial Genomes ◽

E Coli ◽

Blast Score

Background. As whole genome sequence data from bacterial isolates becomes cheaper to generate, computational methods are needed to correlate sequence data with biological observations. Here we present the large-scale BLAST score ratio (LS-BSR) pipeline, which rapidly compares the genetic content of hundreds to thousands of bacterial genomes, and returns a matrix that describes the relatedness of all coding sequences (CDSs) in all genomes surveyed. This matrix can be easily parsed in order to identify genetic relationships between bacterial genomes. Although pipelines have been published that group peptides by sequence similarity, no other software performs the large-scale, flexible, full-genome comparative analyses carried out by LS-BSR. Results. To demonstrate the utility of the method, the LS-BSR pipeline was tested on 96 Escherichia coli and Shigella genomes; the pipeline ran in 163 minutes using 16 processors, which is a greater than 7-fold speedup compared to using a single processor. The BSR values for each CDS, which indicate a relative level of relatedness, were then mapped to each genome on an independent core genome single nucleotide polymorphism (SNP) based phylogeny. Comparisons were then used to identify clade specific CDS markers and validate the LS-BSR pipeline based on molecular markers that delineate between classical E. coli pathogenic variant (pathovar) designations. Scalability tests demonstrated that the LS-BSR pipeline can process 1,000 E. coli genomes in ~60h using 16 processors. Conclusions. LS-BSR is an open-source, parallel implementation of the BSR algorithm, enabling rapid comparison of the genetic content of large numbers of genomes. The results of the pipeline can be used to identify specific markers between user-defined phylogenetic groups, and to identify the loss and/or acquisition of genetic information between bacterial isolates. 
Taxa-specific genetic markers can then be translated into clinical diagnostics, or can be used to identify broadly conserved putative therapeutic candidates.
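The matrix described in the abstract can be illustrated with a short sketch. It assumes the usual definition of the BLAST score ratio, namely the bit score of a CDS aligned against a genome divided by the bit score of the CDS aligned against itself, and it assumes that a missing hit is recorded as 0. The function name, data layout, and scores are hypothetical and do not reflect the LS-BSR code base.

```python
def bsr_matrix(self_scores, hit_scores, genomes):
    """Build a CDS-by-genome matrix of BLAST score ratios.

    self_scores: CDS id -> bit score of the CDS aligned against itself.
    hit_scores:  CDS id -> {genome id -> best bit score in that genome}.
    genomes:     ordered list of genome ids (the matrix columns).
    """
    matrix = {}
    for cds, self_score in self_scores.items():
        if self_score <= 0:
            raise ValueError(f"self bit score for {cds} must be positive")
        hits = hit_scores.get(cds, {})
        # A genome without a hit contributes a ratio of 0.0.
        matrix[cds] = [hits.get(g, 0.0) / self_score for g in genomes]
    return matrix


# Hypothetical bit scores for two CDSs across two genomes.
self_scores = {"cdsA": 200.0, "cdsB": 100.0}
hit_scores = {"cdsA": {"g1": 200.0, "g2": 100.0}, "cdsB": {"g1": 25.0}}
print(bsr_matrix(self_scores, hit_scores, ["g1", "g2"]))
# prints {'cdsA': [1.0, 0.5], 'cdsB': [0.25, 0.0]}
```

A ratio near 1 indicates a CDS that is highly conserved in that genome, while a ratio near 0 indicates that it is absent or highly divergent.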

End-to-End Transition-Based Online Dialogue Disentanglement

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/535 ◽

2020 ◽

Author(s):

Hui Liu ◽

Zhan Shi ◽

Jia-Chen Gu ◽

Quan Liu ◽

Si Wei ◽

...

Keyword(s):

Large Scale ◽

Online Algorithm ◽

Research Field ◽

Semantic Coherence ◽

Large Scale Dataset ◽

End To End ◽

Sequential Information ◽

Almost All ◽

Classification And Clustering ◽

Sequential Semantics

Dialogue disentanglement aims to separate intermingled messages into detached sessions. The existing research focuses on two-step architectures, in which a model first retrieves the relationships between two messages and then divides the message stream into separate clusters. Almost all existing work puts significant effort into selecting features for message-pair classification and clustering, while ignoring the semantic coherence within each session. In this paper, we introduce the first end-to-end transition-based model for online dialogue disentanglement. Our model captures the sequential information of each session as the online algorithm proceeds on processing a dialogue. The coherence in a session is hence modeled when messages are sequentially added into their best-matching sessions. Meanwhile, the research field still lacks data for studying end-to-end dialogue disentanglement, so we construct a large-scale dataset by extracting coherent dialogues from online movie scripts. We evaluate our model on both the dataset we developed and the publicly available Ubuntu IRC dataset [Kummerfeld et al., 2019]. The results show that our model significantly outperforms the existing algorithms. Further experiments demonstrate that our model better captures the sequential semantics and obtains more coherent disentangled sessions.
