CONSENSUS-LIKE ALGORITHMS FOR ESTIMATION OF GAUSSIAN MIXTURES OVER LARGE SCALE NETWORKS

In this paper, we address the problem of estimating Gaussian mixtures in a sensor network. The scenario we consider is the following: a common signal is acquired by sensors, whose measurements are affected by standard Gaussian noise and by different offsets. The measurements can thus be statistically modeled as mixtures of Gaussians with equal variance and different expected values. The aim of the network is to achieve a common estimation of the signal, and to cluster the sensors according to their own offsets. For this purpose, we develop an iterative, distributed, consensus-like algorithm based on Maximum Likelihood estimation, which is well-suited to work in-network when the communication to a central processing unit is not allowed. Estimation is performed by the sensors themselves, which typically consist of devices with limited computational capabilities. Our main contribution is the analytical proof of the convergence of the algorithm. Our protocol is compared with existing methods via numerical simulations and the trade-offs between robustness, speed of convergence and implementation simplicity are discussed in detail.

Download Full-text

A Parallel-Computing Approach for Vector Road-Network Matching Using GPU Architecture

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi7120472 ◽

2018 ◽

Vol 7 (12) ◽

pp. 472 ◽

Cited By ~ 1

Author(s):

Bo Wan ◽

Lin Yang ◽

Shunping Zhou ◽

Run Wang ◽

Dezhi Wang ◽

...

Keyword(s):

Road Network ◽

Large Scale ◽

Graphics Processing Unit ◽

Road Networks ◽

Processing Unit ◽

Data Partition ◽

Matching Method ◽

The Road ◽

Central Processing ◽

Relaxation Matching

The road-network matching method is an effective tool for map integration, fusion, and update. Due to the complexity of road networks in the real world, matching methods often contain a series of complicated processes to identify homonymous roads and deal with their intricate relationship. However, traditional road-network matching algorithms, which are mainly central processing unit (CPU)-based approaches, may have performance bottleneck problems when facing big data. We developed a particle-swarm optimization (PSO)-based parallel road-network matching method on graphics-processing unit (GPU). Based on the characteristics of the two main stages (similarity computation and matching-relationship identification), data-partition and task-partition strategies were utilized, respectively, to fully use GPU threads. Experiments were conducted on datasets with 14 different scales. Results indicate that the parallel PSO-based matching algorithm (PSOM) could correctly identify most matching relationships with an average accuracy of 84.44%, which was at the same level as the accuracy of a benchmark—the probability-relaxation-matching (PRM) method. The PSOM approach significantly reduced the road-network matching time in dealing with large amounts of data in comparison with the PRM method. This paper provides a common parallel algorithm framework for road-network matching algorithms and contributes to integration and update of large-scale road-networks.

Download Full-text

Step Ring Based 3D Path Planning via GPU Simulation for Subtractive 3D Printing

Volume 2: Materials; Biomanufacturing; Properties, Applications and Systems; Sustainable Manufacturing ◽

10.1115/msec2016-8751 ◽

2016 ◽

Cited By ~ 1

Author(s):

Zhengkai Wu ◽

Thomas M. Tucker ◽

Chandra Nath ◽

Thomas R. Kurfess ◽

Richard W. Vuduc

Keyword(s):

3D Printing ◽

Path Planning ◽

Large Scale ◽

Cnc Machining ◽

Scale Model ◽

Material Surface ◽

Processing Unit ◽

Cad Model ◽

Set Partition ◽

Central Processing

In this paper, both software model visualization with path simulation and associated machining product are produced based on the step ring based 3-axis path planning to demo model-driven graphics processing unit (GPU) feature in tool path planning and 3D image model classification by GPU simulation. Subtractive 3D printing (i.e., 3D machining) is represented as integration between 3D printing modeling and CNC machining via GPU simulated software. Path planning is applied through material surface removal visualization in high resolution and 3D path simulation via ring selective path planning based on accessibility of path through pattern selection. First, the step ring selects critical features to reconstruct computer aided design (CAD) design model as STL (stereolithography) voxel, and then local optimization is attained within interested ring area for time and energy saving of GPU volume generation as compared to global all automatic path planning with longer latency. The reconstructed CAD model comes from an original sample (GATech buzz) with 2D image information. CAD model for optimization and validation is adopted to sustain manufacturing reproduction based on system simulation feedback. To avoid collision with the produced path from retraction path, we pick adaptive ring path generation and prediction in each planning iteration, which may also minimize material removal. Moreover, we did partition analysis and g-code optimization for large scale model and high density volume data. Image classification and grid analysis based on adaptive 3D tree depth are proposed for multi-level set partition of the model to define no cutting zones. After that, accessibility map is computed based on accessibility space for rotational angular space of path orientation to compare step ring based pass planning verses global all path planning. Feature analysis via central processing unit (CPU) or GPU processor for GPU map computation contributes to high performance computing and cloud computing potential through parallel computing application of subtractive 3D printing in the future.

Download Full-text

Analysis of Heat and Smoke Propagation and Oscillatory Flow through Ceiling Vents in a Large-Scale Compartment Fire

Applied Sciences ◽

10.3390/app9163305 ◽

2019 ◽

Vol 9 (16) ◽

pp. 3305 ◽

Cited By ~ 1

Author(s):

Claudio Zanzi ◽

Pablo Gómez ◽

Joaquín López ◽

Julio Hernández

Keyword(s):

Convective Heat ◽

Large Scale ◽

Natural Ventilation ◽

Heat Propagation ◽

Oscillatory Behavior ◽

Combustion Model ◽

Processing Unit ◽

Fire Model ◽

Central Processing ◽

Mass Fluxes

One question that often arises is whether a specialized code or a more general code may be equally suitable for fire modeling. This paper investigates the performance and capabilities of a specialized code (FDS) and a general-purpose code (FLUENT) to simulate a fire in the commercial area of an underground intermodal transportation station. In order to facilitate a more precise comparison between the two codes, especially with regard to ventilation issues, the number of factors that may affect the fire evolution is reduced by simplifying the scenario and the fire model. The codes are applied to the same fire scenario using a simplified fire model, which considers a source of mass, heat and species to characterize the fire focus, and whose results are also compared with those obtained using FDS and a combustion model. An oscillating behavior of the fire-induced convective heat and mass fluxes through the natural vents is predicted, whose frequency compares well with experimental results for the ranges of compartment heights and heat release rates considered. The results obtained with the two codes for the smoke and heat propagation patterns and convective fluxes through the forced and natural ventilation systems are discussed and compared to each other. The agreement is very good for the temperature and species concentration distributions and the overall flow pattern, whereas appreciable discrepancies are only found in the oscillatory behavior of the fire-induced convective heat and mass fluxes through the natural vents. The relative performance of the codes in terms of central processing unit (CPU) time consumption is also discussed.

Download Full-text

Randomized algorithms of maximum likelihood estimation with spatial autoregressive models for large-scale networks

Statistics and Computing ◽

10.1007/s11222-019-09862-4 ◽

2019 ◽

Vol 29 (5) ◽

pp. 1165-1179

Author(s):

Miaoqi Li ◽

Emily L. Kang

Keyword(s):

Maximum Likelihood ◽

Maximum Likelihood Estimation ◽

Randomized Algorithms ◽

Large Scale ◽

Likelihood Estimation ◽

Autoregressive Models ◽

Spatial Autoregressive ◽

Spatial Autoregressive Models ◽

Large Scale Networks

Download Full-text

High-performance computing in water resources hydrodynamics

Journal of Hydroinformatics ◽

10.2166/hydro.2020.163 ◽

2020 ◽

Vol 22 (5) ◽

pp. 1217-1235 ◽

Cited By ~ 3

Author(s):

M. Morales-Hernández ◽

M. B. Sharif ◽

S. Gangrade ◽

T. T. Dullo ◽

S.-C. Kao ◽

...

Keyword(s):

Water Resources ◽

High Performance Computing ◽

Graphics Processing Units ◽

High Performance ◽

Large Scale ◽

Test Case ◽

Processing Unit ◽

Central Processing ◽

Graphics Processing ◽

Performance Computing

Abstract This work presents a vision of future water resources hydrodynamics codes that can fully utilize the strengths of modern high-performance computing (HPC). The advances to computing power, formerly driven by the improvement of central processing unit processors, now focus on parallel computing and, in particular, the use of graphics processing units (GPUs). However, this shift to a parallel framework requires refactoring the code to make efficient use of the data as well as changing even the nature of the algorithm that solves the system of equations. These concepts along with other features such as the precision for the computations, dry regions management, and input/output data are analyzed in this paper. A 2D multi-GPU flood code applied to a large-scale test case is used to corroborate our statements and ascertain the new challenges for the next-generation parallel water resources codes.

Download Full-text

An alternative approach for collaborative simulation execution on a CPU+GPU hybrid system

SIMULATION ◽

10.1177/0037549719885178 ◽

2019 ◽

Vol 96 (3) ◽

pp. 347-361

Author(s):

Wenjie Tang ◽

Wentong Cai ◽

Yiping Yao ◽

Xiao Song ◽

Feng Zhu

Keyword(s):

Hybrid System ◽

Large Scale ◽

Scheduling Algorithm ◽

Discrete Event ◽

Processing Unit ◽

Model Computation ◽

Central Processing ◽

Collaborative Simulation ◽

Alternative Approach ◽

Parallel Discrete Event

In the past few years, the graphics processing unit (GPU) has been widely used to accelerate time-consuming models in simulations. Since both model computation and simulation management are main factors that affect the performance of large-scale simulations, only accelerating model computation will limit the potential speedup. Moreover, models that can be well accelerated by a GPU could be insufficient, especially for simulations with many lightweight models. Traditionally, the parallel discrete event simulation (PDES) method is used to solve this class of simulation, but most PDES simulators only utilize the central processing unit (CPU) even though the GPU is commonly available now. Hence, we propose an alternative approach for collaborative simulation execution on a CPU+GPU hybrid system. The GPU supports both simulation management and model computation as CPUs. A concurrency-oriented scheduling algorithm was proposed to enable cooperation between the CPU and the GPU, so that multiple computation and communication resources can be efficiently utilized. In addition, GPU functions have also been carefully designed to adapt the algorithm. The combination of those efforts allows the proposed approach to achieve significant speedup compared to the traditional PDES on a CPU.

Download Full-text

Survey of deployment locations and underlying hardware architectures for contemporary deep neural networks

International Journal of Distributed Sensor Networks ◽

10.1177/1550147719868669 ◽

2019 ◽

Vol 15 (8) ◽

pp. 155014771986866

Author(s):

Miloš Kotlar ◽

Dragan Bojić ◽

Marija Punt ◽

Veljko Milutinović

Keyword(s):

Neural Networks ◽

Integrated Circuits ◽

Data Processing ◽

Deep Neural Networks ◽

Multicore Processors ◽

Two Dimensions ◽

Processing Unit ◽

Central Processing ◽

Trade Offs ◽

Dew Computing

This article overviews the emerging use of deep neural networks in data analytics and explores which type of underlying hardware and architectural approach is best used in various deployment locations when implementing deep neural networks. The locations which are discussed are in the cloud, fog, and dew computing (dew computing is performed by end devices). Covered architectural approaches include multicore processors (central processing unit), manycore processors (graphics processing unit), field programmable gate arrays, and application-specific integrated circuits. The proposed classification in this article divides the existing solutions into 12 different categories, organized in two dimensions. The proposed classification allows a comparison of existing architectures, which are predominantly cloud-based, and anticipated future architectures, which are expected to be hybrid cloud-fog-dew architectures for applications in Internet of Things and Wireless Sensor Networks. Researchers interested in studying trade-offs among data processing bandwidth, data processing latency, and processing power consumption would benefit from the classification made in this article.

Download Full-text

A robust system reliability analysis using partitioning and parallel processing of Markov chain

Artificial intelligence for engineering design analysis and manufacturing ◽

10.1017/s0890060414000493 ◽

2014 ◽

Vol 28 (4) ◽

pp. 311-322 ◽

Cited By ~ 1

Author(s):

Po Ting Lin ◽

Yu-Cheng Chou ◽

Yung Ting ◽

Shian-Shing Shyu ◽

Chang-Kuo Chen

Keyword(s):

Markov Chain ◽

Parallel Processing ◽

Reliability Analysis ◽

System Reliability ◽

Large Scale ◽

Transition Probability ◽

Transition Probability Matrix ◽

Processing Unit ◽

Central Processing ◽

System Reliability Analysis

AbstractThis paper presents a robust reliability analysis method for systems of multimodular redundant (MMR) controllers using the method of partitioning and parallel processing of a Markov chain (PPMC). A Markov chain is formulated to represent the N distinct states of the MMR controllers. Such a Markov chain has N2 directed edges, and each edge corresponds to a transition probability between a pair of start and end states. Because N can be easily increased substantially, the system reliability analysis may require large computational resources, such as the central processing unit usage and memory occupation. By the PPMC, a Markov chain's transition probability matrix can be partitioned and reordered, such that the system reliability can be evaluated through only the diagonal submatrices of the transition probability matrix. In addition, calculations regarding the submatrices are independent of each other and thus can be conducted in parallel to assure the efficiency. The simulation results show that, compared with the sequential method applied to an intact Markov chain, the proposed PPMC can improve the performance and produce allowable accuracy for the reliability analysis on large-scale systems of MMR controllers.

Download Full-text

Energy and performance costs evaluation for BLE mesh links

10.5753/sbrc.2018.2452 ◽

2018 ◽

Author(s):

Henrique Carvalho Silva ◽

Cíntia Borges Margi

Keyword(s):

Energy Consumption ◽

Large Scale ◽

Mesh Networks ◽

Lower Boundary ◽

Low Energy ◽

Trade Offs ◽

Main Challenge ◽

And Performance ◽

Large Scale Networks ◽

The Internet Of Things

Bluetooth Low Energy (BLE for short) is among the favorites to become a de facto standard in the context of the Internet of Things (IoT). However, its main challenge is the lack of standards for efficient mesh networking. Furthermore, the literature lacks works analyzing energy consumption trade-offs for BLE mesh networks. We address this issue by experimentally evaluating three minimal topologies for linking separate BLE star networks. We aim to determine a lower boundary in terms of energy and performance costs using the metrics of energy consumption, delivery rate, and goodput. We perform our experiments using a testbed comprised of TI CC1350 nodes running Contiki OS. Our results enable us to estimate similar costs for large scale networks.

Download Full-text

Large Scale Finite Element Analysis Via Assembly-Free Deflated Conjugate Gradient

Journal of Computing and Information Science in Engineering ◽

10.1115/1.4028591 ◽

2014 ◽

Vol 14 (4) ◽

Cited By ~ 11

Author(s):

Praveen Yadav ◽

Krishnan Suresh

Keyword(s):

Finite Element Analysis ◽

Finite Element ◽

Conjugate Gradient ◽

Degrees Of Freedom ◽

Large Scale ◽

Solid Mechanics ◽

Processing Unit ◽

Element Analysis ◽

Central Processing ◽

Computational Bottleneck

Large-scale finite element analysis (FEA) with millions of degrees of freedom (DOF) is becoming commonplace in solid mechanics. The primary computational bottleneck in such problems is the solution of large linear systems of equations. In this paper, we propose an assembly-free version of the deflated conjugate gradient (DCG) for solving such equations, where neither the stiffness matrix nor the deflation matrix is assembled. While assembly-free FEA is a well-known concept, the novelty pursued in this paper is the use of assembly-free deflation. The resulting implementation is particularly well suited for large-scale problems and can be easily ported to multicore central processing unit (CPU) and graphics-programmable unit (GPU) architectures. For demonstration, we show that one can solve a 50 × 106 degree of freedom system on a single GPU card, equipped with 3 GB of memory. The second contribution is an extension of the “rigid-body agglomeration” concept used in DCG to a “curvature-sensitive agglomeration.” The latter exploits classic plate and beam theories for efficient deflation of highly ill-conditioned problems arising from thin structures.

Download Full-text