Distributed Machine Learning through Heterogeneous Edge Systems

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6207 ◽

2020 ◽

Vol 34 (05) ◽

pp. 7179-7186

Author(s):

Hanpeng Hu ◽

Dan Wang ◽

Chuan Wu

Keyword(s):

Machine Learning ◽

Convergence Time ◽

Privacy Concerns ◽

Model Update ◽

Large Heterogeneity ◽

Testbed Implementation ◽

Model Training ◽

IoT Devices ◽

Core Idea ◽

Distributed Machine Learning

Many emerging AI applications require distributed machine learning (ML) among edge systems (e.g., IoT devices and PCs at the edge of the Internet), where data cannot be uploaded to a central venue for model training because of its large volume and/or security and privacy concerns. Edge devices are intrinsically heterogeneous in computing capacity, which poses significant challenges to parameter synchronization for parallel training with the parameter server (PS) architecture. This paper proposes ADSP, a parameter synchronization model for distributed ML with heterogeneous edge systems. To eliminate the significant waiting time that occurs with existing parameter synchronization models, the core idea of ADSP is to let faster edge devices continue training while committing their model updates at strategically decided intervals. We design algorithms that decide the time points at which each worker commits its model update, ensuring not only global model convergence but also faster convergence. Our testbed implementation and experiments show that ADSP significantly outperforms existing parameter synchronization models in terms of ML model convergence time, scalability, and adaptability to large heterogeneity.
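The core idea in the abstract above (faster devices keep training and commit accumulated updates at set intervals instead of waiting) can be illustrated with a toy simulation. This is a minimal sketch and not the ADSP algorithm from the paper: the `Worker` class, the `simulate` function, the scalar quadratic loss, and the fixed commit interval are all assumptions made for illustration. The paper decides commit times with its own algorithms, which are not reproduced here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Worker:
    steps_per_tick: int    # hypothetical device speed: local SGD steps per time unit
    commit_every: int      # hypothetical commit interval, in time units
    local_w: float = 0.0   # local copy of the model (a single scalar weight here)
    pending: float = 0.0   # update accumulated since the last commit


def grad(w: float, target: float) -> float:
    """Gradient of the toy loss 0.5 * (w - target)**2."""
    return w - target


def simulate(workers: List[Worker], target: float = 3.0,
             lr: float = 0.05, ticks: int = 40) -> float:
    """Run a toy parameter server where no worker ever idles."""
    global_w = 0.0
    for wk in workers:
        wk.local_w, wk.pending = global_w, 0.0

    for tick in range(1, ticks + 1):
        # Every worker trains for the whole tick; fast workers take more steps.
        for wk in workers:
            for _ in range(wk.steps_per_tick):
                step = -lr * grad(wk.local_w, target)
                wk.local_w += step
                wk.pending += step

        # Workers whose interval has elapsed push their accumulated update.
        committers = [wk for wk in workers if tick % wk.commit_every == 0]
        for wk in committers:
            global_w += wk.pending
            wk.pending = 0.0
        # After all pushes, the committers pull the refreshed global model.
        for wk in committers:
            wk.local_w = global_w

    return global_w


if __name__ == "__main__":
    # Three devices with a 4x spread in speed, all committing every 2 ticks.
    devices = [Worker(1, 2), Worker(2, 2), Worker(4, 2)]
    print(simulate(devices))
```

In this sketch the slow device contributes a small update and the fast device a large one at each commit point, so no device sits idle between synchronizations. Summing the updates of several workers can overshoot if the learning rate is too large, which is one reason the choice of commit points matters.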

Machine Learning at the Network Edge: A Survey

ACM Computing Surveys ◽

10.1145/3469029 ◽

2022 ◽

Vol 54 (8) ◽

pp. 1-37

Author(s):

M. G. Sarwar Murshed ◽

Christopher Murphy ◽

Daqing Hou ◽

Nazar Khan ◽

Ganesh Ananthanarayanan ◽

...

Keyword(s):

Machine Learning ◽

Learning Systems ◽

Major Research ◽

Sensors And Actuators ◽

Computing Systems ◽

Privacy Concerns ◽

IoT Devices ◽

Cloud Servers ◽

Typical Solution ◽

Operational Aspects

Resource-constrained IoT devices, such as sensors and actuators, have become ubiquitous in recent years. This has led to the generation of large quantities of data in real-time, which is an appealing target for AI systems. However, deploying machine learning models on such end-devices is nearly impossible. A typical solution involves offloading data to external computing systems (such as cloud servers) for further processing but this worsens latency, leads to increased communication costs, and adds to privacy concerns. To address this issue, efforts have been made to place additional computing devices at the edge of the network, i.e., close to the IoT devices where the data is generated. Deploying machine learning systems on such edge computing devices alleviates the above issues by allowing computations to be performed close to the data sources. This survey describes major research efforts where machine learning systems have been deployed at the edge of computer networks, focusing on the operational aspects including compression techniques, tools, frameworks, and hardware used in successful applications of intelligent edge systems.

FedPARL: Client Activity and Resource-Oriented Lightweight Federated Learning Model for Resource-Constrained Heterogeneous IoT Environment

Frontiers in Communications and Networks ◽

10.3389/frcmn.2021.657653 ◽

2021 ◽

Vol 2 ◽

Author(s):

Ahmed Imteaj ◽

M. Hadi Amini

Keyword(s):

Machine Learning ◽

Resource Availability ◽

Resource Constraints ◽

Training Model ◽

Convergence Time ◽

Battery Life ◽

Sensitive Information ◽

Learning Approaches ◽

Resource Constrained ◽

Distributed Machine Learning

Federated Learning (FL) is a recently invented distributed machine learning technique that allows available network clients to perform model training at the edge, rather than sharing their data with a centralized server. Unlike conventional distributed machine learning approaches, the hallmark feature of FL is that local computation and model generation are performed on the client side, ultimately protecting sensitive information. Most existing FL approaches assume that each FL client has sufficient computational resources and can accomplish a given task without facing any resource-related issues. However, if we consider FL for a heterogeneous Internet of Things (IoT) environment, a major portion of the FL clients may face low resource availability (e.g., lower computational power, limited bandwidth, and battery life). Consequently, the resource-constrained FL clients may respond very slowly, or may be unable to execute the expected number of local iterations. Further, any FL client can inject an inappropriate model during a training phase, which can prolong convergence time and waste the resources of all the network clients. In this paper, we propose a novel tri-layer FL scheme, Federated Proximal, Activity and Resource-Aware Lightweight model (FedPARL), that reduces model size by performing sample-based pruning, avoids misbehaved clients by examining their trust score, and allows a partial amount of work by considering their resource availability. The pruning mechanism is particularly useful when dealing with resource-constrained FL-based IoT (FL-IoT) clients. In this scenario, the lightweight training model consumes fewer resources to reach a target convergence. We evaluate each interested client's resource availability before assigning a task, monitor their activities, and update their trust scores based on their previous performance. To tackle system and statistical heterogeneities, we adapt a re-parameterization and generalization of the current state-of-the-art Federated Averaging (FedAvg) algorithm. The modification of the FedAvg algorithm allows clients to perform variable or partial amounts of work according to their resource constraints. We demonstrate that simultaneously adapting the coupling of pruning, resource and activity awareness, and re-parameterization of the FedAvg algorithm leads to more robust convergence of FL in an IoT environment.
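The idea of letting clients perform variable or partial amounts of work can be sketched with a FedAvg-style round in which each client runs a different number of local steps and a proximal term keeps its model close to the global one. This is an illustrative sketch under stated assumptions, not the FedPARL implementation: the `Client` class, the one-parameter linear model, the proximal coefficient `mu`, and the sample-count weighting are hypothetical choices, and pruning and trust scoring are left out.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Client:
    data: List[Tuple[float, float]]  # local (x, y) samples that never leave the client
    local_steps: int                 # amount of work this client can afford per round


def local_train(client: Client, w_global: float,
                lr: float = 0.1, mu: float = 0.1) -> float:
    """Run the client's affordable number of gradient steps on y ~ w * x."""
    w = w_global
    n = len(client.data)
    for _ in range(client.local_steps):
        # Gradient of the mean squared error 0.5 * (w*x - y)**2 over local data.
        data_grad = sum((w * x - y) * x for x, y in client.data) / n
        # Proximal term (assumed): penalizes drifting away from the global model.
        prox_grad = mu * (w - w_global)
        w -= lr * (data_grad + prox_grad)
    return w


def federated_round(clients: List[Client], w_global: float) -> float:
    """Aggregate client models, weighted by local sample count."""
    total = sum(len(c.data) for c in clients)
    return sum(len(c.data) * local_train(c, w_global) for c in clients) / total


def make_clients(true_slope: float = 2.0) -> List[Client]:
    """Three toy clients with different data sizes and work budgets."""
    specs = [([1.0, 2.0], 5), ([0.5, 1.5, 2.5], 2), ([1.0], 1)]
    return [Client([(x, true_slope * x) for x in xs], steps) for xs, steps in specs]


def run(rounds: int = 30) -> float:
    w = 0.0
    clients = make_clients()
    for _ in range(rounds):
        w = federated_round(clients, w)
    return w


if __name__ == "__main__":
    print(run())
```

The client that can afford only one local step still contributes to every round instead of being dropped, which is the behavior the abstract describes for resource-constrained devices.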

Communication-efficient and Scalable Decentralized Federated Edge Learning

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2021/720 ◽

2021 ◽

Author(s):

Austine Zong Han Yapp ◽

Hong Soo Nicholas Koh ◽

Yan Ting Lai ◽

Jiawen Kang ◽

Xuandi Li ◽

...

Keyword(s):

Machine Learning ◽

Data Privacy ◽

Large Scale ◽

Local Model ◽

Communication Overhead ◽

Model Training ◽

Collaborative Training ◽

Distributed Machine Learning ◽

The Coordinator

Federated Edge Learning (FEL) is a distributed Machine Learning (ML) framework for collaborative training on edge devices. FEL improves data privacy over traditional centralized ML model training by keeping data on the devices and only sending local model updates to a central coordinator for aggregation. However, challenges remain in existing FEL architectures, where there is high communication overhead between edge devices and the coordinator. In this paper, we present a working prototype of a blockchain-empowered and communication-efficient FEL framework, which enhances security and scalability towards large-scale implementation of FEL.

Evaluating Federated Learning Scenarios in a Tumor Classification Application

10.5753/eradrj.2021.18558 ◽

2021 ◽

Author(s):

Rafaela C. Brum ◽

George Teodoro ◽

Lúcia Drummond ◽

Luciana Arantes ◽

Maria Clicia Castro ◽

...

Keyword(s):

Machine Learning ◽

Execution Time ◽

Data Privacy ◽

Tumor Infiltrating Lymphocytes ◽

Tumor Classification ◽

Financial Cost ◽

Privacy Concerns ◽

Learning Scenarios ◽

Infiltrating Lymphocytes ◽

Distributed Machine Learning

Federated Learning is a new area of distributed Machine Learning (ML) that emerged to deal with data privacy concerns. In this approach, each client has access to a local and private dataset, and clients exchange only model weights and updates. This paper presents a Federated Learning (FL) approach to a cloud Tumor-Infiltrating Lymphocytes (TIL) application. The results show that the FL approach outperformed the centralized one in all evaluated ML metrics. It also reduced the execution time, although the financial cost increased.

Joint Data Collection and Resource Allocation for Distributed Machine Learning at the Edge

IEEE Transactions on Mobile Computing ◽

10.1109/tmc.2020.3045436 ◽

2020 ◽

pp. 1-1

Author(s):

Min Chen ◽

Haichuan Wang ◽

Zeyu Meng ◽

Hongli Xu ◽

Yang Xu ◽

...

Keyword(s):

Machine Learning ◽

Resource Allocation ◽

Data Collection ◽

Distributed Machine Learning

PODC 2020 Review

ACM SIGACT News ◽

10.1145/3444815.3444827 ◽

2021 ◽

Vol 51 (4) ◽

pp. 75-81

Author(s):

Ahad Mirza Baig ◽

Alkida Balliu ◽

Peter Davies ◽

Michal Dory

Keyword(s):

Machine Learning ◽

Distributed Computing ◽

Keynote Speaker ◽

Lively Discussion ◽

Theoretical Understanding ◽

New Directions ◽

New Ideas ◽

New Challenges ◽

The Impact ◽

Distributed Machine Learning

Rachid Guerraoui was the first keynote speaker, and he got things off to a great start by discussing the broad relevance of the research done in our community relative to both industry and academia. He first argued that, in some sense, the fact that distributed computing is so pervasive nowadays could end up stifling progress in our community by inducing people to work on marginal problems, and becoming isolated. His first suggestion was to try to understand and incorporate new ideas coming from applied fields into our research, and argued that this has been historically very successful. He illustrated this point via the distributed payment problem, which appears in the context of blockchains, in particular Bitcoin, but then turned out to be very theoretically interesting; furthermore, the theoretical understanding of the problem inspired new practical protocols. He then went further to discuss new directions in distributed computing, such as the COVID tracing problem, and new challenges in Byzantine-resilient distributed machine learning. Another source of innovation Rachid suggested was hardware innovations, which he illustrated with work studying the impact of RDMA-based primitives on fundamental problems in distributed computing. The talk concluded with a very lively discussion.

MODES: model-based optimization on distributed embedded systems

Machine Learning ◽

10.1007/s10994-021-06014-6 ◽

2021 ◽

Author(s):

Junjie Shi ◽

Jiang Bian ◽

Jakob Richter ◽

Kuan-Hsun Chen ◽

Jörg Rahnenführer ◽

...

Keyword(s):

Machine Learning ◽

Embedded Systems ◽

Learning Model ◽

Black Box ◽

Distributed Embedded Systems ◽

Data Set ◽

Individual Model ◽

Model Based ◽

Machine Learning Model ◽

Distributed Machine Learning

The predictive performance of a machine learning model highly depends on the corresponding hyper-parameter setting. Hence, hyper-parameter tuning is often indispensable. Normally such tuning requires the dedicated machine learning model to be trained and evaluated on centralized data to obtain a performance estimate. However, in a distributed machine learning scenario, it is not always possible to collect all the data from all nodes due to privacy concerns or storage limitations. Moreover, if data has to be transferred through low-bandwidth connections, the time available for tuning is reduced. Model-Based Optimization (MBO) is one state-of-the-art method for tuning hyper-parameters, but its application to distributed machine learning models or federated learning lacks research. This work proposes a framework, MODES, that allows MBO to be deployed on resource-constrained distributed embedded systems. Each node trains an individual model based on its local data. The goal is to optimize the combined prediction accuracy. The presented framework offers two optimization modes: (1) MODES-B considers the whole ensemble as a single black box and optimizes the hyper-parameters of each individual model jointly, and (2) MODES-I considers all models as clones of the same black box, which allows it to efficiently parallelize the optimization in a distributed setting. We evaluate MODES by conducting experiments on the optimization of the hyper-parameters of a random forest and a multi-layer perceptron. The experimental results demonstrate that, with an improvement in terms of mean accuracy (MODES-B), run-time efficiency (MODES-I), and statistical stability for both modes, MODES outperforms the baseline, i.e., carrying out tuning with MBO on each node individually with its local sub-data set.

Minimizing Training Time of Distributed Machine Learning by Reducing Data Communication

IEEE Transactions on Network Science and Engineering ◽

10.1109/tnse.2021.3073897 ◽

2021 ◽

pp. 1-1

Author(s):

Yubin Duan ◽

Ning Wang ◽

Jie Wu

Keyword(s):

Machine Learning ◽

Data Communication ◽

Training Time ◽

Distributed Machine Learning

Machine Learning Empowered Trust Evaluation Method for IoT Devices

IEEE Access ◽

10.1109/access.2021.3076118 ◽

2021 ◽

Vol 9 ◽

pp. 65066-65077

Author(s):

Wei Ma ◽

Xing Wang ◽

Mingsheng Hu ◽

Qinglei Zhou

Keyword(s):

Machine Learning ◽

Evaluation Method ◽

Trust Evaluation ◽

IoT Devices

Privacy Preserving Machine Learning with Homomorphic Encryption and Federated Learning

Future Internet ◽

10.3390/fi13040094 ◽

2021 ◽

Vol 13 (4) ◽

pp. 94

Author(s):

Haokun Fang ◽

Quan Qian

Keyword(s):

Machine Learning ◽

Homomorphic Encryption ◽

Privacy Preserving ◽

Great Success ◽

Learning Framework ◽

Computational Overhead ◽

Important Concern ◽

Speed Up ◽

Key Length ◽

Core Idea

Privacy protection has become an important concern with the great success of machine learning. This paper proposes a multi-party privacy-preserving machine learning framework, named PFMLP, based on partially homomorphic encryption and federated learning. The core idea is that all learning parties transmit only gradients encrypted with homomorphic encryption. In experiments, the model trained by PFMLP has almost the same accuracy, with a deviation of less than 1%. Considering the computational overhead of homomorphic encryption, we use an improved Paillier algorithm which can speed up the training by 25–28%. Moreover, comparisons on encryption key length, the learning network structure, number of learning clients, etc. are also discussed in detail in the paper.
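To make the idea of transmitting only encrypted gradients concrete, the following sketch adds gradients under textbook Paillier encryption, whose ciphertexts can be multiplied to add the underlying plaintexts. It is not the improved Paillier variant or the PFMLP implementation from the paper. The function names, the fixed-point `SCALE`, and the small Mersenne-prime key are assumptions chosen for readability, and a key this small is not secure. The modular inverse via `pow(x, -1, n)` needs Python 3.8 or later.

```python
import math
import random

SCALE = 10**6  # assumed fixed-point factor for encoding float gradients as integers


def keygen(p: int = 2**31 - 1, q: int = 2**61 - 1):
    """Toy key generation from two known Mersenne primes (far too small for real use)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid because the generator is fixed to g = n + 1
    return n, (lam, mu)


def encrypt(n: int, value: float) -> int:
    """Encrypt one gradient entry; negative values wrap around modulo n."""
    m = round(value * SCALE) % n
    n2 = n * n
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def add_encrypted(n: int, c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the plaintexts they hide."""
    return (c1 * c2) % (n * n)


def decrypt(n: int, private_key, c: int) -> float:
    lam, mu = private_key
    m = ((pow(c, lam, n * n) - 1) // n * mu) % n
    if m > n // 2:  # upper half of the range encodes negative numbers
        m -= n
    return m / SCALE


def aggregate(n: int, encrypted_gradients):
    """Sum gradient vectors element-wise without decrypting any of them."""
    total = list(encrypted_gradients[0])
    for grad in encrypted_gradients[1:]:
        total = [add_encrypted(n, a, b) for a, b in zip(total, grad)]
    return total


if __name__ == "__main__":
    n, private_key = keygen()
    client_grads = [[0.25, -1.5], [0.5, 0.25], [-0.125, 0.75]]
    encrypted = [[encrypt(n, g) for g in grads] for grads in client_grads]
    summed = aggregate(n, encrypted)
    print([decrypt(n, private_key, c) for c in summed])
```

In this sketch the aggregating party only needs the public modulus `n`, so it never sees individual gradients in the clear; only a holder of the private key can decrypt the sum.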
