Adaptive In-Network Collaborative Caching for Enhanced Ensemble Deep Learning at Edge

2021, Vol 2021, pp. 1-14
Author(s):  
Yana Qin, Danye Wu, Zhiwei Xu, Jie Tian, Yujun Zhang

To enhance the quality and speed of data processing and to protect data privacy and security, edge computing has been widely applied to support data-intensive intelligent processing services at the edge. Among these data-intensive services, ensemble learning-based services can naturally leverage the distributed computation and storage resources at edge devices to achieve efficient data collection, processing, and analysis. Collaborative caching has been applied in edge computing to bring services close to the data source, so that the limited resources at edge devices can support high-performance ensemble learning solutions. To achieve this goal, we propose an adaptive in-network collaborative caching scheme for ensemble learning at the edge. First, an efficient data representation structure is proposed to record the data cached across different nodes. In addition, we design a collaboration scheme that helps edge nodes cache valuable data for local ensemble learning, by scheduling local caching according to a summary of the data representations from different edge nodes. Our extensive simulations demonstrate the high performance of the proposed collaborative caching scheme, which significantly reduces learning latency and transmission overhead.
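To make the collaboration idea concrete, the following minimal sketch shows an edge node that collects its neighbours' cache summaries and then prioritizes data that is in demand locally but poorly covered nearby. The summary structure, the scoring rule, and the capacity handling here are assumptions made for illustration, not the paper's actual data representation or scheduler.

    from collections import Counter

    class EdgeCacheNode:
        def __init__(self, node_id, capacity):
            self.node_id = node_id
            self.capacity = capacity          # how many data items fit locally
            self.cache = set()                # ids of locally cached data items

        def summary(self):
            # compact representation of what this node already caches,
            # exchanged with neighbouring edge nodes
            return frozenset(self.cache)

        def schedule(self, candidates, neighbor_summaries, local_demand):
            # favour items that are requested locally but held by few neighbours
            coverage = Counter()
            for s in neighbor_summaries:
                coverage.update(s)
            scored = sorted(candidates,
                            key=lambda item: local_demand.get(item, 0) - coverage[item],
                            reverse=True)
            self.cache = set(scored[:self.capacity])
            return self.cache

    # usage: node A schedules its cache given two neighbour summaries
    node_a = EdgeCacheNode("A", capacity=2)
    neighbor_summaries = [frozenset({"x", "y"}), frozenset({"x", "y"})]
    local_demand = {"z": 5, "x": 4, "w": 3}
    print(node_a.schedule(["x", "y", "z", "w"], neighbor_summaries, local_demand))
    # "x" and "y" are down-weighted because both neighbours already hold them,
    # so the node caches {"z", "w"}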

Author(s):  
Ewa Niewiadomska-Szynkiewicz, Michał P. Karpowicz

Progress in the life and physical sciences and in technology depends on efficient data mining and modern computing technologies. The rapid growth of data-intensive domains requires continuous development of new solutions for network infrastructure, servers, and storage in order to address Big Data-related problems. The development of software frameworks, including smart calculation, communication management, and data decomposition and allocation algorithms, is clearly one of the major technological challenges we face. Reducing energy consumption is another challenge arising from the development of efficient HPC infrastructures. This paper addresses the vital problem of energy-efficient, high-performance distributed and parallel computing. An overview of recent technologies for Big Data processing is presented, with attention focused on the most popular middleware and software platforms. Various energy-saving approaches are presented and discussed as well.


2021, Vol 8 (4), pp. 1-25
Author(s):  
Saleh Khalaj Monfared, Omid Hajihassani, Vahid Mohsseni, Dara Rahmati, Saeid Gorgin

In this work, we present a novel bitsliced, high-performance Viterbi algorithm suitable for high-throughput and data-intensive communication. A new column-major data representation scheme coupled with the bitsliced architecture is employed in our proposed Viterbi decoder, enabling maximum utilization of the parallel processing units in modern parallel accelerators. With the proposed data scheme, each processing unit operates on 32-bit chunks of data instead of performing the conventional bit-by-bit operations. This means that a single bitsliced parallel Viterbi decoder is capable of decoding 32 different chunks of data simultaneously. Here, the Viterbi Add-Compare-Select procedure is implemented with our proposed bitslicing technique, and the bitsliced operations for the Viterbi internals are shown to be efficient in terms of both performance and complexity. We achieve this level of parallelism while maintaining an acceptable bit error rate for the proposed methodology. Our hard- and soft-decision Viterbi decoder implementations on GPU platforms outperform the fastest previously proposed works by 4.3× and 2.3×, achieving 21.41 and 8.24 Gbps on a Tesla V100, respectively.
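To make the column-major bitslicing concrete, the sketch below stores each bit-plane of a path metric in one 32-bit word, so a single Add-Compare-Select step serves 32 independent decoder lanes at once. This is only an illustration under assumed parameters (metric width, no metric renormalization), not the authors' GPU kernel.

    MASK = 0xFFFFFFFF  # one bit per lane; 32 independent decoders per word

    def bitsliced_add(a, b):
        # a, b: lists of 32-bit words; index 0 is the least significant bit-plane
        out, carry = [], 0
        for ai, bi in zip(a, b):
            out.append((ai ^ bi ^ carry) & MASK)
            carry = ((ai & bi) | (carry & (ai ^ bi))) & MASK
        return out  # overflow plane dropped; real decoders renormalize metrics

    def bitsliced_less_than(a, b):
        # per-lane mask with a 1 wherever a < b (unsigned), scanning MSB first
        lt, decided = 0, 0
        for ai, bi in zip(reversed(a), reversed(b)):
            lt |= (~ai & bi) & ~decided & MASK
            decided |= (ai ^ bi)
        return lt

    def add_compare_select(metric0, branch0, metric1, branch1):
        # one ACS step for all 32 lanes: pick min(metric0+branch0, metric1+branch1)
        cand0 = bitsliced_add(metric0, branch0)
        cand1 = bitsliced_add(metric1, branch1)
        take1 = bitsliced_less_than(cand1, cand0)        # lanes where path 1 survives
        survivor = [((c0 & ~take1) | (c1 & take1)) & MASK
                    for c0, c1 in zip(cand0, cand1)]
        return survivor, take1                           # take1 is the per-lane decision bit

    # example: 4-bit metrics; all 32 lanes carry the same values here for brevity
    to_planes = lambda v, bits=4: [MASK if (v >> i) & 1 else 0 for i in range(bits)]
    surv, dec = add_compare_select(to_planes(5), to_planes(2), to_planes(3), to_planes(1))
    # every lane picks path 1 (3 + 1 = 4 < 5 + 2 = 7), so dec == MASK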


2021, Vol 21 (3), pp. 1-17
Author(s):  
Wu Chen, Yong Yu, Keke Gai, Jiamou Liu, Kim-Kwang Raymond Choo

In existing ensemble learning algorithms (e.g., random forest), each base learner's model needs the entire dataset for sampling and training. However, this may not be practical in many real-world applications, and it incurs additional computational costs. To achieve better efficiency, we propose a decentralized framework: Multi-Agent Ensemble. The framework leverages edge computing to facilitate ensemble learning while balancing access restrictions (each learner sees only a small sub-dataset) against accuracy. Specifically, network edge nodes (learners) perform classification and prediction in our framework. Data is distributed to multiple base learners, which exchange data via an interaction mechanism to achieve improved prediction. The proposed approach relies on this decentralized training model rather than conventional centralized learning. Findings from experimental evaluations on 20 real-world datasets suggest that Multi-Agent Ensemble outperforms other ensemble approaches in terms of accuracy even though the base learners require fewer samples (i.e., a significant reduction in computation costs).
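A minimal sketch in the spirit of this framework is shown below: each agent trains on a small disjoint sub-dataset, agents exchange a handful of samples their local model is least confident about (an interaction mechanism assumed here for illustration, not necessarily the paper's), and the ensemble predicts by majority vote.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

    n_agents = 5
    parts = np.array_split(np.random.RandomState(0).permutation(len(X_tr)), n_agents)

    # round 1: each agent trains locally on its small disjoint sub-dataset
    agents = [(DecisionTreeClassifier(max_depth=6, random_state=0).fit(X_tr[idx], y_tr[idx]), idx)
              for idx in parts]

    # interaction: every agent shares its 20 least-confident samples with the others
    shared = []
    for clf, idx in agents:
        proba = clf.predict_proba(X_tr[idx])
        margin = np.abs(proba[:, 0] - proba[:, 1])          # small margin = low confidence
        shared.append(idx[np.argsort(margin)[:20]])
    shared = np.concatenate(shared)

    # round 2: retrain each agent on its own slice plus the shared samples
    models = [DecisionTreeClassifier(max_depth=6, random_state=0)
              .fit(X_tr[np.concatenate([idx, shared])], y_tr[np.concatenate([idx, shared])])
              for _, idx in agents]

    # majority vote over the agents
    votes = np.stack([m.predict(X_te) for m in models])
    pred = (votes.mean(axis=0) >= 0.5).astype(int)
    print("ensemble accuracy:", (pred == y_te).mean())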


2021, pp. 1-13
Author(s):  
Yikai Zhang, Yong Peng, Hongyu Bian, Yuan Ge, Feiwei Qin, ...

Concept factorization (CF) is an effective matrix factorization model that has been widely used in many applications. In CF, a linear combination of the data points serves as the dictionary, so CF can be performed both in the original feature space and in a reproducing kernel Hilbert space (RKHS). Conventional CF treats each dimension of the feature vector equally during data reconstruction, which conflicts with the common observation that different features have different discriminative abilities and therefore contribute differently to pattern recognition. In this paper, we introduce an auto-weighting variable into the conventional CF objective function to adaptively learn the contribution of each feature, and we propose a new model termed Auto-Weighted Concept Factorization (AWCF). In AWCF, on one hand, feature importance is quantitatively measured by the auto-weighting variable, with more discriminative features receiving larger weights; on the other hand, we obtain a more efficient data representation that better depicts the semantic information of the data. A detailed optimization procedure for the AWCF objective function is derived, and its complexity and convergence are analyzed. Experiments are conducted on both synthetic and representative benchmark data sets, and the clustering results demonstrate the effectiveness of AWCF in comparison with related models.
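For illustration only, an auto-weighted CF objective of the kind described above can be sketched as follows, where X (D features by N samples) is the data matrix, W and V are the CF factors, theta_d is the learnable weight of feature d, and gamma > 1 is an assumed smoothing exponent that keeps the weights from collapsing onto a single feature; the actual AWCF objective may differ in its constraints and regularization:

    \min_{\mathbf{W}\ge 0,\ \mathbf{V}\ge 0,\ \boldsymbol{\theta}\ge 0}\ \sum_{d=1}^{D}\theta_d^{\gamma}\,\bigl\|\mathbf{X}_{d\cdot}-(\mathbf{X}\mathbf{W}\mathbf{V}^{\top})_{d\cdot}\bigr\|_2^2 \quad \text{s.t.}\quad \sum_{d=1}^{D}\theta_d = 1 .

With W and V fixed, theta has a closed-form update proportional to the per-feature reconstruction error raised to the power 1/(1-gamma), so features that are reconstructed more accurately receive larger weights, which matches the behaviour described in the abstract.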


2021, Vol 49 (4), pp. 12-17
Author(s):  
Feilong Liu, Claude Barthels, Spyros Blanas, Hideaki Kimura, Garret Swart

Networks with Remote Direct Memory Access (RDMA) support are becoming increasingly common. RDMA, however, offers a limited programming interface to remote memory that consists of read, write and atomic operations. With RDMA alone, completing the most basic operations on remote data structures often requires multiple round-trips over the network. Data-intensive systems strongly desire higher-level communication abstractions that support more complex interaction patterns. A natural candidate to consider is MPI, the de facto standard for developing high-performance applications in the HPC community. This paper critically evaluates the communication primitives of MPI and shows that using MPI in the context of a data processing system comes with its own set of insurmountable challenges. Based on this analysis, we propose a new communication abstraction named RDMO, or Remote Direct Memory Operation, that dispatches a short sequence of reads, writes and atomic operations to remote memory and executes them in a single round-trip.
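As a purely hypothetical sketch of the RDMO idea (the names RdmoOp, RdmoProgram, and submit are invented for illustration and are not a real RDMA or MPI API), the snippet below packages a short sequence of reads, writes, and a compare-and-swap into one descriptor that the remote side executes in a single logical round-trip; with plain one-sided RDMA, each step would typically cost its own round-trip.

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class RdmoOp:
        kind: str            # "read", "write", or "cas"
        addr: int            # remote virtual address
        value: int = 0       # payload for write / swap value for cas
        expected: int = 0    # compare value for cas

    @dataclass
    class RdmoProgram:
        ops: List[RdmoOp] = field(default_factory=list)

        def read(self, addr):
            self.ops.append(RdmoOp("read", addr))
            return self

        def write(self, addr, value):
            self.ops.append(RdmoOp("write", addr, value))
            return self

        def cas(self, addr, expected, value):
            self.ops.append(RdmoOp("cas", addr, value, expected))
            return self

    def submit(program, remote_memory):
        # executes the whole program on the "remote" side in one logical round-trip
        results = []
        for op in program.ops:
            if op.kind == "read":
                results.append(remote_memory.get(op.addr, 0))
            elif op.kind == "write":
                remote_memory[op.addr] = op.value
            elif op.kind == "cas":
                old = remote_memory.get(op.addr, 0)
                if old == op.expected:
                    remote_memory[op.addr] = op.value
                results.append(old)
        return results

    # example: push a new node (payload 42 at 0x100) onto a remote linked list
    remote = {0x10: 0x20}                      # head pointer currently points to 0x20
    prog = (RdmoProgram()
            .read(0x10)                        # read current head
            .write(0x100, 42)                  # write the new node's payload
            .write(0x108, 0x20)                # new node's next = old head
            .cas(0x10, 0x20, 0x100))           # swing head to the new node
    submit(prog, remote)
    print(hex(remote[0x10]))                   # head now points at the new node (0x100)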


2020, Vol 2 (1), pp. 92
Author(s):  
Rahim Rahmani, Ramin Firouzi, Sachiko Lim, Mahbub Alam

The major challenges in operating data-intensive Distributed Ledger Technology (DLT) are (1) reaching consensus on the main chain as a set of validators cast public votes to decide which blocks to finalize and (2) scalability, i.e., how to increase the number of chains running in parallel. In this paper, we introduce a new proximal algorithm that scales DLT to a large-scale network of Internet of Things (IoT) devices. We discuss how the algorithm benefits the integration of DLT in IoT by using edge computing technology, taking the scalability and heterogeneous capabilities of IoT devices into consideration. IoT devices are clustered dynamically into groups based on proximity context information. A cluster head is used to bridge the IoT devices with the DLT network, where a smart contract is deployed. In this way, the security of the IoT is improved, and scalability and latency issues are addressed. We elaborate on our mechanism, discuss issues that should be considered when implementing the proposed algorithm, and show how it behaves under varying parameters such as latency and clustering.
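A minimal sketch of the clustering step is shown below: devices are grouped by proximity and the most capable member of each group becomes the cluster head that bridges the group to the DLT network. The grid-based grouping rule and the capability score are assumptions made for illustration, not the paper's exact algorithm.

    from collections import defaultdict

    def cluster_by_proximity(devices, cell_size=10.0):
        # devices: list of dicts with "id", "x", "y", and a "capability" score
        clusters = defaultdict(list)
        for d in devices:
            cell = (int(d["x"] // cell_size), int(d["y"] // cell_size))
            clusters[cell].append(d)
        return list(clusters.values())

    def elect_cluster_heads(clusters):
        # the head (highest capability) is the member that talks to the DLT network
        return {tuple(d["id"] for d in group): max(group, key=lambda d: d["capability"])["id"]
                for group in clusters}

    devices = [
        {"id": "sensor-1", "x": 1.0,  "y": 2.0,  "capability": 0.2},
        {"id": "sensor-2", "x": 3.0,  "y": 4.0,  "capability": 0.9},
        {"id": "sensor-3", "x": 25.0, "y": 30.0, "capability": 0.5},
    ]
    groups = cluster_by_proximity(devices)
    print(elect_cluster_heads(groups))
    # e.g. {('sensor-1', 'sensor-2'): 'sensor-2', ('sensor-3',): 'sensor-3'}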


2020
Author(s):  
Ronny Bazan Antequera

[ACCESS RESTRICTED TO THE UNIVERSITY OF MISSOURI-COLUMBIA AT REQUEST OF AUTHOR.] The increase of data-intensive applications in science and engineering fields (i.e., bioinformatics, cybermanufacturing) demands the use of high-performance computing resources. However, the local resources available to data-intensive applications usually have limited capacity and availability due to sizable upfront costs. Moreover, using remote public resources presents constraints at the private edge network domain. Specifically, misconfigured network policies cause bottlenecks when other application cross-traffic attempts to use shared networking resources. Additionally, selecting the right remote resources can be cumbersome, especially for users who care about application execution with respect to non-functional requirements such as performance, security, and cost. Data-intensive applications have recurrent deployments and similar infrastructure requirements that can be addressed by creating templates. In this thesis, we handle application requirements through intelligent resource 'abstractions' coupled with 'reusable' approaches that save time and effort in deploying new cloud architectures. Specifically, we design a novel custom template middleware that can retrieve blueprints of resource configuration, technical/policy information, and benchmarks of workflow performance to facilitate repeatable/reusable resource composition. The middleware uses a hybrid recommendation methodology (online and offline recommendation) and leverages a catalog to rapidly check custom template solution correctness before and during resource consumption. Further, it prescribes application adaptations by fostering effective social interactions during the application's scaling stages. Based on the above approach, we organize the thesis contributions under two main thrusts: (i) Custom Templates for Cloud Networking for Data-Intensive Applications: this involves scheduling transit selection and engineering at the campus edge based upon real-time policy control. Our solution ensures prioritized application performance delivery for multi-tenant traffic profiles from a diverse set of actual data-intensive applications in bioinformatics. (ii) Custom Templates for Cloud Computing for Data-Intensive Applications: this involves recommending cloud resources for data-intensive applications based on a custom template catalog. We develop a novel expert system approach, implemented as middleware, that abstracts data-intensive application requirements for custom template composition. We uniquely consider heterogeneous cloud resource selection for the deployment of cloud architectures for real data-intensive applications in cybermanufacturing.
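As a hypothetical illustration of the catalog-lookup idea behind the custom template middleware (the template fields and the scoring rule are invented for this sketch, not the thesis's actual interface), requirements expressed as non-functional scores can be matched against stored templates and the closest one recommended for reuse.

    catalog = [
        {"name": "bioinformatics-hpc",       "performance": 0.9, "security": 0.6, "cost": 0.7},
        {"name": "cybermanufacturing-edge",  "performance": 0.6, "security": 0.9, "cost": 0.4},
    ]

    def recommend(requirements, catalog):
        # return the template whose non-functional profile is closest to the request
        def distance(template):
            return sum((template[k] - requirements[k]) ** 2 for k in requirements)
        return min(catalog, key=distance)

    print(recommend({"performance": 0.8, "security": 0.7, "cost": 0.6}, catalog)["name"])
    # -> bioinformatics-hpc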

