An open-source framework for the implementation of large-scale integral operators with flexible, modern HPC solutions - Enabling 3D Marchenko imaging by least-squares inversion

Geophysics ◽  
2021 ◽  
pp. 1-74
Author(s):  
Matteo Ravasi ◽  
Ivan Vasconcelos

Numerical integral operators of convolution type form the basis of most wave-equation-based methods for processing and imaging of seismic data. As several of these methods require the solution of an inverse problem, multiple forward and adjoint passes of the modeling operator are generally required to converge to a satisfactory solution. This work highlights the memory requirements and computational challenges that arise when implementing such operators on 3D seismic datasets and using them to solve large systems of integral equations. A Python framework is presented that leverages libraries for distributed storage and computing, and provides a high-level symbolic representation of linear operators. A driving goal of our work is not only to offer a widely deployable, ready-to-use high-performance computing (HPC) framework, but also to demonstrate that it enables addressing research questions that are otherwise difficult to tackle. To this end, the first example of 3D full-wavefield target-oriented imaging, which comprises two subsequent steps of seismic redatuming, is presented. The redatumed fields are estimated by means of gradient-based inversion using the full dataset as well as spatially decimated versions of the dataset, as a way to investigate the robustness of both inverse problems to spatial aliasing in the input data. Our numerical example shows that when one spatial direction is finely sampled, satisfactory redatuming and imaging can be accomplished even when the sampling in the other direction is coarser than a quarter of the dominant wavelength. While aliasing introduces noise into the redatumed fields, they are less sensitive to the well-known spurious artefacts of cheaper, adjoint-based redatuming techniques. These observations are shown to hold for a relatively simple geologic structure; while further testing is needed for more complex scenarios, we expect them to remain generally valid, possibly breaking down in extreme cases.
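
As a framework-agnostic illustration of the forward/adjoint pattern such gradient-based inversion relies on, the sketch below builds a matrix-free convolution-type operator with SciPy's LinearOperator and inverts it by least squares; the kernel, sizes, and reflectivity model are purely illustrative assumptions, not the paper's Marchenko operators.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, lsqr

# Toy convolution-type modelling operator (matrix-free): the forward pass
# convolves a model with a wavelet, the adjoint correlates data with it,
# mirroring the repeated forward/adjoint passes used in gradient-based inversion.
nt, nk = 256, 31
wavelet = np.exp(-np.arange(nk) / 5.0)             # illustrative causal kernel

def forward(x):
    return np.convolve(x, wavelet)                 # data of length nt + nk - 1

def adjoint(y):
    return np.correlate(y, wavelet, mode="valid")  # exact transpose of forward

G = LinearOperator((nt + nk - 1, nt), matvec=forward, rmatvec=adjoint, dtype=float)

# Model a sparse reflectivity, then recover it by least-squares inversion:
# many operator applications, but no explicit matrix is ever formed.
x_true = np.zeros(nt)
x_true[[60, 128, 200]] = [1.0, -0.5, 0.8]
d = G.matvec(x_true)
x_est = lsqr(G, d, iter_lim=50)[0]
print("relative model error:", np.linalg.norm(x_est - x_true) / np.linalg.norm(x_true))
```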

2019 ◽  
Vol 214 ◽  
pp. 04033
Author(s):  
Hervé Rousseau ◽  
Belinda Chan Kwok Cheong ◽  
Cristian Contescu ◽  
Xavier Espinal Curull ◽  
Jan Iven ◽  
...  

The CERN IT Storage group operates multiple distributed storage systems and is responsible for supporting the infrastructure that accommodates all CERN storage requirements, from the physics data generated by LHC and non-LHC experiments to personal user files. EOS is now the key component of the CERN storage strategy. It sustains high incoming throughput during experiment data-taking while running concurrent, complex production workloads. This high-performance distributed storage now provides more than 250 PB of raw disk space and is the key component behind the success of CERNBox, the CERN cloud synchronisation service, which allows syncing and sharing files on all major mobile and desktop platforms and provides offline availability for any data stored in the EOS infrastructure. CERNBox has recorded exponential growth over the last couple of years in terms of files and data stored, thanks to its increasing popularity within the CERN user community and to its integration with a multitude of other CERN services (Batch, SWAN, Microsoft Office). In parallel, CASTOR is being simplified and is transitioning from an HSM into an archival system, focused mainly on long-term recording of the primary data from the detectors and paving the way for the next-generation tape archival system, CTA. The storage services at CERN also cover the needs of the rest of our community: Ceph as the data back-end for the CERN OpenStack infrastructure, NFS services, and S3 functionality; AFS for legacy home-directory filesystem services, now in an ongoing phase-out; and CVMFS for software distribution. In this paper we summarise our experience in supporting all of our distributed storage systems and the ongoing work to evolve our infrastructure, including the testing of very dense storage building blocks (nodes with more than 1 PB of raw space) for the challenges ahead.


2018 ◽  
Vol 7 (4.6) ◽  
pp. 13
Author(s):  
Mekala Sandhya ◽  
Ashish Ladda ◽  
Dr. Uma N Dulhare

In the current generation of the Internet, information and data grow continuously across a wide variety of Internet services and applications, and the amount of information is increasing rapidly: hundreds of billions, even trillions, of web pages are indexed. Such large data gives people access to a mass of information, but at the same time makes it more difficult to discover useful knowledge within these huge volumes. Cloud computing can provide the infrastructure for large data. It offers two significant characteristics of distributed computing: scalability and high availability. Scalability means the system can seamlessly extend to large-scale clusters; high availability means cloud computing tolerates node errors, so node failures do not prevent a program from running correctly. Cloud computing combined with data mining enables significant data processing on high-performance machines. Mass data storage and distributed computing provide a new approach to mass data mining and become an effective solution to distributed storage and efficient computation in data mining.
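
As a minimal, single-machine sketch of the map-reduce pattern that such cloud data-mining systems distribute over clusters, the example below counts items across parallel chunks with Python's multiprocessing; the chunk count and data are illustrative assumptions, not a specific cloud platform's API.

```python
from collections import Counter
from multiprocessing import Pool

# Minimal map-reduce pattern: each worker mines its own chunk independently
# (here, simple item counting) and a final reduce merges the partial results.
# In a cloud cluster the same pattern spreads over many nodes, and a failed
# chunk can simply be re-run on another node.

def map_chunk(chunk):
    return Counter(chunk)            # "mine" one chunk: local item counts

def reduce_counts(partials):
    total = Counter()
    for p in partials:
        total.update(p)
    return total

if __name__ == "__main__":
    data = ["click", "view", "click", "buy", "view", "click"] * 100_000
    chunks = [data[i::4] for i in range(4)]        # split across 4 workers
    with Pool(4) as pool:
        partial_counts = pool.map(map_chunk, chunks)
    print(reduce_counts(partial_counts).most_common(3))
```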


Green ◽  
2014 ◽  
Vol 4 (1-6) ◽  
Author(s):  
Arndt Neuhaus ◽  
Frank-Detlef Drake ◽  
Gunnar Hoffmann ◽  
Friedrich Schulte

The transition to a sustainable electricity supply from renewable energy sources (RES) imposes major technical and economic challenges on market players and the legislator. In particular, the rapid growth of volatile wind power and photovoltaic generation requires a high level of flexibility of the entire electricity system, and major investments in infrastructure are therefore needed to maintain system stability. This raises the important question of what role central large-scale energy storage and/or small-scale distributed storage ("energy storage at home") will play in the energy transition. Economic analyses show that the importance of energy storage will remain rather limited in the medium term; competing options such as intelligent grid extension and flexible operation of power plants are expected to remain favourable. Nonetheless, additional storage capacities are required if the share of RES substantially exceeds 50% in the long term. Given the fundamental significance of energy storage, R&D covers a broad variety of storage types, each suitable for a specific class of application.


2019 ◽  
Vol 7 (1) ◽  
pp. 55-70
Author(s):  
Moh. Zikky ◽  
M. Jainal Arifin ◽  
Kholid Fathoni ◽  
Agus Zainal Arifin

High-performance computing (HPC) systems are built to handle demanding computational loads; they deliver high performance and shorten processing times. This technology is often used in large-scale industries and in activities that require high-level computing, such as rendering virtual reality. In this research, we present a virtual-reality simulation of the tawaf with 1,000 pilgrims and realistic surroundings of Masjidil-Haram, an interactive and immersive simulation built by imitating them with 3D models. The main purpose of this study is to measure and understand the processing time of this virtual-reality implementation of the tawaf on various platforms, such as desktop computers and Android smartphones. The results show that the outer rotation around the Kaabah generally takes the least time, even though an agent on the outer path must cover a longer distance than one on the inner path. This happens because agents in the area closest to the Kaabah face dense crowds; in this case, obstacles have a greater impact than distance.
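
A toy back-of-the-envelope model (not the paper's simulator) of why the outer path can be faster: travel time is path length divided by an effective speed that drops with crowd density. All radii, densities, and walking speeds below are assumed for illustration only.

```python
import math

# Travel time for one lap around the Kaabah at a given radius, with effective
# speed reduced by local crowd density (denser crowd = more obstacles).
def lap_time(radius_m, crowd_density, base_speed_mps=1.2):
    circumference = 2 * math.pi * radius_m
    effective_speed = base_speed_mps * (1.0 - crowd_density)
    return circumference / effective_speed

inner = lap_time(radius_m=15, crowd_density=0.9)   # short path, very crowded
outer = lap_time(radius_m=60, crowd_density=0.2)   # 4x longer path, sparse crowd
print(f"inner lap: {inner / 60:.1f} min, outer lap: {outer / 60:.1f} min")
# The outer lap finishes sooner despite the longer distance: obstacles dominate.
```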


Author(s):  
Yassine Sabri ◽  
Aouad Siham

Multi-area and multi-faceted remote sensing (RS) datasets are widely used due to the increasing demand for accurate and up-to-date information on resources and the environment for regional and global monitoring. In general, the processing of RS data involves a complex multi-step sequence that includes several independent processing steps depending on the type of RS application, and the processing of RS data for regional disaster and environmental monitoring is recognized as computationally and data intensive. By combining cloud computing and HPC technology, we propose a method to solve these problems efficiently through a large-scale RS data processing system suitable for various applications and delivered as a real-time, on-demand service. The ubiquity, elasticity, and high-level transparency of the cloud computing model make it possible to run massive RS data management and processing for monitoring dynamic environments in any cloud, via a web interface. Hilbert-based data indexing methods are used to optimally query and access RS images, RS data products, and intermediate data. The core of the cloud service provides a parallel file system for large RS data and an interface for accessing RS data that improves data locality and optimizes I/O performance. Our experimental analysis demonstrates the effectiveness of the proposed platform.
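
A minimal sketch of the idea behind Hilbert-based indexing: mapping 2D tile coordinates to positions along a Hilbert curve so spatially neighbouring tiles receive nearby keys, which keeps spatial range queries within compact key ranges. The grid order and tile coordinates are illustrative; the paper's actual indexing scheme may differ.

```python
def hilbert_index(order, x, y):
    """Map a cell (x, y) on a 2**order x 2**order grid to its 1D position
    along the Hilbert curve, so nearby cells get nearby index values."""
    d = 0
    s = 2 ** (order - 1)
    while s > 0:
        rx = 1 if (x & s) > 0 else 0
        ry = 1 if (y & s) > 0 else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/reflect the quadrant so the curve keeps its locality property.
        if ry == 0:
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        s //= 2
    return d

# Index image tiles by their (column, row) position; neighbouring tiles end up
# with nearby keys, so a spatial query touches a compact key range.
tiles = [(0, 0), (0, 1), (1, 1), (1, 0), (7, 7)]
print(sorted(tiles, key=lambda t: hilbert_index(order=3, x=t[0], y=t[1])))
```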


2021 ◽  
Vol 15 ◽  
Author(s):  
Giordana Florimbi ◽  
Emanuele Torti ◽  
Stefano Masoli ◽  
Egidio D'Angelo ◽  
Francesco Leporati

In modern computational modeling, neuroscientists need to reproduce the long-lasting activity of large-scale networks in which neurons are described by highly complex mathematical models. These aspects strongly increase the computational load of the simulations, which can be performed efficiently by exploiting parallel systems to reduce processing times. Graphics Processing Unit (GPU) devices meet this need by providing High Performance Computing on the desktop. In this work, the authors describe a novel Granular layEr Simulator implemented on a multi-GPU system, capable of reconstructing the cerebellar granular layer in a 3D space and reproducing its neuronal activity. The reconstruction is characterized by a high level of novelty and realism, considering axonal/dendritic field geometries oriented in the 3D space and following convergence/divergence rates provided in the literature. Neurons are modeled using Hodgkin and Huxley representations. The network is validated by reproducing typical behaviors that are well documented in the literature, such as the center-surround organization. The reconstruction of a network whose volume is 600 × 150 × 1,200 μm³, with 432,000 granules, 972 Golgi cells, 32,399 glomeruli, and 4,051 mossy fibers, takes 235 s on an Intel i9 processor. Reproducing 10 s of activity takes only 4.34 and 3.37 h on a single- and multi-GPU desktop system (with one or two NVIDIA RTX 2080 GPUs, respectively). Moreover, the code takes only 3.52 and 2.44 h if run on one or two NVIDIA V100 GPUs, respectively. The speedups reached (up to ~38× in the single-GPU version and ~55× in the multi-GPU version) clearly demonstrate that GPU technology is highly suitable for realistic simulations of large networks.
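
For readers unfamiliar with the neuron model, the sketch below integrates a classic single-compartment Hodgkin-Huxley cell with forward Euler, using textbook squid-axon parameters rather than the paper's cerebellar granule and Golgi cell models; it only illustrates the kind of per-neuron state update that each GPU thread evaluates in parallel.

```python
import numpy as np

# Classic single-compartment Hodgkin-Huxley neuron, forward-Euler integration.
C, gNa, gK, gL = 1.0, 120.0, 36.0, 0.3          # uF/cm^2, mS/cm^2
ENa, EK, EL = 50.0, -77.0, -54.4                 # mV

def rates(V):
    am = 0.1 * (V + 40.0) / (1.0 - np.exp(-(V + 40.0) / 10.0))
    bm = 4.0 * np.exp(-(V + 65.0) / 18.0)
    ah = 0.07 * np.exp(-(V + 65.0) / 20.0)
    bh = 1.0 / (1.0 + np.exp(-(V + 35.0) / 10.0))
    an = 0.01 * (V + 55.0) / (1.0 - np.exp(-(V + 55.0) / 10.0))
    bn = 0.125 * np.exp(-(V + 65.0) / 80.0)
    return am, bm, ah, bh, an, bn

def simulate(I_ext=10.0, dt=0.01, t_max=50.0):
    V, m, h, n = -65.0, 0.05, 0.6, 0.32
    trace = []
    for _ in range(int(t_max / dt)):
        am, bm, ah, bh, an, bn = rates(V)
        m += dt * (am * (1 - m) - bm * m)        # gating variable updates
        h += dt * (ah * (1 - h) - bh * h)
        n += dt * (an * (1 - n) - bn * n)
        I_ion = gNa * m**3 * h * (V - ENa) + gK * n**4 * (V - EK) + gL * (V - EL)
        V += dt * (I_ext - I_ion) / C            # membrane potential update
        trace.append(V)
    return np.array(trace)

print("peak membrane potential (mV):", simulate().max())
```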


2019 ◽  
Vol 7 (4) ◽  
pp. 15-23
Author(s):  
Марина Матюшкина ◽  
Marina Matyushkina ◽  
Константин Белоусов ◽  
Konstantin Belousov

The article presents the results of a series of empirical studies devoted to analysing the relationship between school performance (according to the Unified State Examination criterion), its social efficiency (according to the criterion of how frequently students turn to tutors) and various social and pedagogical characteristics of the school. A correlation analysis was carried out on an array of data obtained over 5 years of regular comprehensive surveys in schools of St. Petersburg. The sets of attributes most characteristic of schools with high performance and of schools with high social efficiency are identified and described. Distinctive features of successful schools are associated with a high level of use of tutoring services by students, with good material and technical conditions, with teachers' competence in the use of design and research methods, etc. In socially effective schools, students' academic results are achieved by drawing on the school's own strengths: the teachers' potential and innovative technologies, with large-scale use of the Internet and electronic resources. The study was carried out with the financial support of the Russian Foundation for Basic Research within the framework of the scientific project "Signs of an effective school in conditions of the mass distribution of tutoring practices", No. 19-013-00455.


2013 ◽  
Vol 21 (1-2) ◽  
pp. 1-16 ◽  
Author(s):  
Marek Blazewicz ◽  
Ian Hinder ◽  
David M. Koppelman ◽  
Steven R. Brandt ◽  
Milosz Ciznicki ◽  
...  

Starting from a high-level problem description in terms of partial differential equations using abstract tensor notation, the Chemora framework discretizes, optimizes, and generates complete high-performance codes for a wide range of compute architectures. Chemora extends the capabilities of Cactus, facilitating the usage of large-scale CPU/GPU systems in an efficient manner for complex applications, without low-level code tuning. Chemora achieves parallelism through MPI and multi-threading, combining OpenMP and CUDA. Optimizations include high-level code transformations, efficient loop traversal strategies, dynamically selected data and instruction cache usage strategies, and JIT compilation of GPU code tailored to the problem characteristics. The discretization is based on higher-order finite differences on multi-block domains. Chemora's capabilities are demonstrated by simulations of black hole collisions. This problem provides an acid test of the framework, as the Einstein equations contain hundreds of variables and thousands of terms.
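
As a hand-written illustration of the higher-order finite differences such a code generator targets, the sketch below applies a fourth-order central stencil to the 1D scalar wave equation with a leapfrog time step; grid size, wave speed, and time step are illustrative assumptions, and Chemora's generated code is of course far more general.

```python
import numpy as np

# Fourth-order central finite-difference stencil for u_xx, applied to the
# 1D scalar wave equation u_tt = c^2 u_xx with a leapfrog time step.
nx, dx, c, dt = 400, 0.01, 1.0, 0.004            # CFL number c*dt/dx = 0.4
x = np.arange(nx) * dx
u_prev = np.exp(-((x - 2.0) / 0.1) ** 2)         # Gaussian initial pulse
u_curr = u_prev.copy()                           # zero initial velocity

def d2x_4th(u, dx):
    """Fourth-order accurate second derivative on interior points."""
    d2 = np.zeros_like(u)
    d2[2:-2] = (-u[:-4] + 16 * u[1:-3] - 30 * u[2:-2]
                + 16 * u[3:-1] - u[4:]) / (12 * dx**2)
    return d2

for _ in range(300):                             # leapfrog: u_next from u_curr, u_prev
    u_next = 2 * u_curr - u_prev + (c * dt) ** 2 * d2x_4th(u_curr, dx)
    u_prev, u_curr = u_curr, u_next

# The pulse splits into two half-amplitude waves travelling in opposite directions.
print("pulse peak after propagation:", float(u_curr.max()))
```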


2021 ◽  
Vol 5 (ICFP) ◽  
pp. 1-32
Author(s):  
Farzin Houshmand ◽  
Mohsen Lesani ◽  
Keval Vora

Graph analytics elicits insights from large graphs to inform critical decisions for business, safety and security. Several large-scale graph processing frameworks feature efficient runtime systems; however, they often provide programming models that are low-level and subtly different from each other. Therefore, end users can find implementing, and especially optimizing, graph analytics error-prone and time-consuming. This paper regards the abstract interface of the graph processing frameworks as the instruction set for graph analytics, and presents Grafs, a high-level declarative specification language for graph analytics and a synthesizer that automatically generates efficient code for five high-performance graph processing frameworks. It features novel semantics-preserving fusion transformations that optimize the specifications and reduce them to three primitives: reduction over paths, mapping over vertices, and reduction over vertices. Reductions over paths are commonly calculated based on push or pull models that iteratively apply kernel functions at the vertices. This paper presents conditions, parametric in the kernel functions, for the correctness and termination of the iterative models, and uses these conditions as specifications to automatically synthesize the kernel functions. Experimental results show that the generated code matches or outperforms handwritten code, and that fusion accelerates execution.
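
A minimal sketch of the pull model for one common reduction over paths, single-source shortest path: each vertex repeatedly applies a kernel that reduces over its in-neighbours until a fixed point is reached. The graph and helper names below are illustrative, not Grafs' generated code.

```python
import math

# Pull-model iteration for single-source shortest path: every vertex pulls the
# tentative distances of its in-neighbours and keeps the minimum, repeating
# until no distance changes (the fixed point guaranteeing termination here).
def sssp_pull(vertices, in_edges, source):
    dist = {v: math.inf for v in vertices}
    dist[source] = 0.0
    changed = True
    while changed:
        changed = False
        for v in vertices:
            best = min((dist[u] + w for u, w in in_edges.get(v, [])),
                       default=math.inf)
            new = min(dist[v], best)             # kernel: reduce over in-neighbours
            if new < dist[v]:
                dist[v], changed = new, True
    return dist

vertices = ["a", "b", "c", "d"]
in_edges = {"b": [("a", 1.0)], "c": [("a", 4.0), ("b", 1.0)], "d": [("c", 2.0)]}
print(sssp_pull(vertices, in_edges, source="a"))
```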

