Communication-efficient and Scalable Decentralized Federated Edge Learning

Author(s): Austine Zong Han Yapp, Hong Soo Nicholas Koh, Yan Ting Lai, Jiawen Kang, Xuandi Li, et al.

Federated Edge Learning (FEL) is a distributed Machine Learning (ML) framework for collaborative training on edge devices. FEL improves data privacy over traditional centralized ML training by keeping data on the devices and sending only local model updates to a central coordinator for aggregation. However, existing FEL architectures still suffer from high communication overhead between edge devices and the coordinator. In this paper, we present a working prototype of a blockchain-empowered and communication-efficient FEL framework, which enhances security and scalability toward large-scale deployment of FEL.
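
The abstract does not specify how the framework cuts communication overhead. As a minimal, hypothetical sketch of one common technique for shrinking the local model updates sent to the coordinator (not necessarily the paper's scheme), the following top-k sparsification example in NumPy keeps only the largest-magnitude entries of an update; all names are illustrative:

```python
import numpy as np

def sparsify_update(update, k_fraction=0.01):
    """Keep only the k largest-magnitude entries of a local model update.

    Illustrative only: the paper's actual compression scheme is not
    specified in the abstract.
    """
    flat = update.ravel()
    k = max(1, int(k_fraction * flat.size))
    # Indices of the k entries with the largest absolute values.
    idx = np.argpartition(np.abs(flat), -k)[-k:]
    return idx, flat[idx]

def densify(idx, values, shape):
    """Rebuild a dense update at the coordinator from the sparse message."""
    flat = np.zeros(int(np.prod(shape)), dtype=values.dtype)
    flat[idx] = values
    return flat.reshape(shape)

# A 1M-entry update is reduced to ~1% of its values before transmission.
update = np.random.randn(1000, 1000).astype(np.float32)
idx, vals = sparsify_update(update)
restored = densify(idx, vals, update.shape)
```

Transmitting (index, value) pairs instead of the dense tensor reduces the payload to roughly the kept fraction of the original size, which is the kind of saving a communication-efficient FEL design targets.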

Author(s): Wanlu Zhang, Qigang Wang, Mei Li

Background: As artificial intelligence and big data analysis develop rapidly, data privacy, especially the privacy of patient medical data, is receiving growing attention. Objective: To strengthen the protection of private data while still enabling model training, this article introduces a multi-Blockchain-based decentralized collaborative machine learning training method for medical image analysis. In this way, researchers from different medical institutions can collaborate to train models without exchanging sensitive patient data. Method: A partial parameter update method is applied to prevent indirect privacy leakage during model propagation. Through peer-to-peer communication in the multi-Blockchain system, a machine learning task can leverage auxiliary information from a similar task on another Blockchain. In addition, after the collaborative training process, personalized models are trained for the different medical institutions. Results: The experimental results show that our method achieves performance similar to that of a centralized method that collects the datasets of all participants, while preventing private data leakage. Transferring auxiliary information from a similar task on another Blockchain is also shown to effectively accelerate model convergence and improve model accuracy, especially when local data are absent. The personalized training process further improves model performance. Conclusion: Our approach can effectively help researchers from different organizations achieve collaborative training without disclosing their private data.
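
The article's exact partial parameter update rule is not given in the abstract; the sketch below uses a uniform random mask per layer purely for illustration, to show how only a fraction of each institution's parameters would ever leave the local model. All function and variable names are hypothetical:

```python
import numpy as np

def select_partial_update(local_params, share_fraction=0.1):
    """Expose only a random fraction of each layer's parameters per round.

    The random mask is a stand-in: the article's actual selection rule
    is not described in the abstract.
    """
    shared = {}
    for name, values in local_params.items():
        mask = np.random.rand(*values.shape) < share_fraction
        # Only the masked positions leave the institution.
        shared[name] = (mask, values[mask])
    return shared

def merge_partial_update(global_params, shared):
    """Write the received partial values into the global model in place."""
    for name, (mask, values) in shared.items():
        global_params[name][mask] = values
    return global_params

# One institution shares ~10% of a toy two-layer model per round.
local = {"conv1": np.random.randn(3, 3), "fc": np.random.randn(4, 2)}
global_model = {k: np.zeros_like(v) for k, v in local.items()}
merge_partial_update(global_model, select_partial_update(local))
```

Because roughly 90% of each layer never propagates in any single round, an observer of the exchanged messages sees only fragments of the model, which is the intuition behind limiting indirect privacy leakage.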


IEEE Access, 2020, Vol 8, pp. 183488-183494
Author(s): Youhe Jiang, Huaxi Gu, Yunfeng Lu, Xiaoshan Yu

2021
Author(s): Rafaela C. Brum, George Teodoro, Lúcia Drummond, Luciana Arantes, Maria Clicia Castro, et al.

Federated Learning (FL) is a new area of distributed Machine Learning (ML) that emerged to address data privacy concerns. In this approach, each client has access to a local, private dataset, and clients exchange only model weights and updates. This paper presents an FL approach to a cloud-based Tumor-Infiltrating Lymphocytes (TIL) application. The results show that the FL approach outperformed the centralized one on all evaluated ML metrics and also reduced execution time, although at a higher financial cost.
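
The abstract says only that clients exchange model weights and updates. A minimal FedAvg-style aggregation, one standard way such exchanged weights are combined (not necessarily this paper's method), would look like the following sketch; the layer names and sizes are invented for illustration:

```python
import numpy as np

def federated_average(client_weights, client_sizes):
    """Sample-size-weighted average of client model weights (FedAvg-style).

    client_weights: list of dicts mapping layer name -> np.ndarray
    client_sizes:   local training-set size of each client
    """
    total = float(sum(client_sizes))
    return {
        name: sum((n / total) * w[name]
                  for w, n in zip(client_weights, client_sizes))
        for name in client_weights[0]
    }

# Two clients; the client with 3x the data gets 3x the weight.
clients = [{"fc": np.ones((2, 2))}, {"fc": np.zeros((2, 2))}]
global_model = federated_average(clients, client_sizes=[300, 100])
# global_model["fc"] is 0.75 everywhere.
```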


2020, Vol 34 (05), pp. 7179-7186
Author(s): Hanpeng Hu, Dan Wang, Chuan Wu

Many emerging AI applications call for distributed machine learning (ML) among edge systems (e.g., IoT devices and PCs at the edge of the Internet), where data cannot be uploaded to a central venue for model training due to their large volume and/or security/privacy concerns. Edge devices are intrinsically heterogeneous in computing capacity, posing significant challenges to parameter synchronization in parallel training with the parameter server (PS) architecture. This paper proposes ADSP, a parameter synchronization model for distributed ML with heterogeneous edge systems. To eliminate the significant waiting time incurred by existing parameter synchronization models, the core idea of ADSP is to let faster edge devices continue training while committing their model updates at strategically decided intervals. We design algorithms that decide the time points at which each worker commits its model update, ensuring not only global model convergence but also faster convergence. Our testbed implementation and experiments show that ADSP significantly outperforms existing parameter synchronization models in ML model convergence time, scalability, and adaptability to large heterogeneity.
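
As a rough sketch of the commit-at-intervals idea (not ADSP's algorithm for choosing the intervals, which the paper decides strategically), a worker might keep training without blocking on slower peers and push its accumulated update once its assigned interval elapses. worker_loop, train_step, and commit_update below are hypothetical names:

```python
import time

def worker_loop(worker_id, train_step, commit_update,
                commit_interval_s, total_steps=1000):
    """Non-blocking worker: keeps training and commits its accumulated
    update whenever its assigned interval elapses, instead of waiting
    for slower peers at a synchronization barrier.

    commit_interval_s stands in for the interval ADSP decides
    strategically; its selection algorithm is not reproduced here.
    """
    steps_since_commit = 0
    last_commit = time.monotonic()
    for _ in range(total_steps):
        train_step()                       # one local mini-batch step
        steps_since_commit += 1
        if time.monotonic() - last_commit >= commit_interval_s:
            commit_update(worker_id, steps_since_commit)
            steps_since_commit = 0
            last_commit = time.monotonic()
    if steps_since_commit:                 # flush any trailing progress
        commit_update(worker_id, steps_since_commit)

# Toy run with stand-in callables.
worker_loop(worker_id=0,
            train_step=lambda: time.sleep(0.001),
            commit_update=lambda wid, n: print(f"worker {wid}: {n} steps"),
            commit_interval_s=0.05,
            total_steps=200)
```

The key design point is that fast workers never idle at a barrier; heterogeneity is absorbed by letting each worker's commit schedule differ.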


2021, Vol 2 (2), pp. 1-32
Author(s): Linshan Jiang, Rui Tan, Xin Lou, Guosheng Lin

The Internet of Things (IoT) will be a major data generation infrastructure for achieving better system intelligence. This article considers the design and implementation of a practical privacy-preserving collaborative learning scheme, in which a curious learning coordinator trains a machine learning model on data samples contributed by a number of IoT objects, while the confidentiality of the raw training data is protected against the coordinator. Existing distributed machine learning and data encryption approaches incur significant computation and communication overhead, rendering them ill-suited for resource-constrained IoT objects. We study an approach that applies an independent random projection at each IoT object to obfuscate the data and trains a deep neural network at the coordinator on the projected data. This approach introduces only light computation overhead on the IoT objects and moves most of the workload to the coordinator, which can have sufficient computing resources. Although the independent projections performed by the IoT objects address potential collusion between the curious coordinator and compromised IoT objects, they significantly increase the complexity of the projected data. In this article, we leverage the superior capability of deep learning to capture sophisticated patterns and thereby maintain good learning performance. Extensive comparative evaluation shows that this approach outperforms other lightweight approaches that apply additive noisification for differential privacy and/or support vector machines for learning, in applications with light to moderate data pattern complexity.
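
A minimal sketch of the per-object independent random projection described above; the dimensions, seeds, and Gaussian scaling are illustrative assumptions, not values from the article:

```python
import numpy as np

def make_projection(seed, input_dim, projected_dim):
    """Each IoT object draws its own matrix from a private seed, so a
    compromised object reveals nothing about another object's projection."""
    rng = np.random.default_rng(seed)
    # Gaussian random projection with Johnson-Lindenstrauss-style scaling.
    return rng.normal(0.0, 1.0 / np.sqrt(projected_dim),
                      size=(input_dim, projected_dim))

def obfuscate(sample, projection):
    """Project a raw sample; only the projected vector leaves the device."""
    return sample @ projection

# Two objects project 784-dim samples independently down to 128 dims.
proj_a = make_projection(seed=7, input_dim=784, projected_dim=128)
proj_b = make_projection(seed=42, input_dim=784, projected_dim=128)
x = np.random.rand(784)
y_a = obfuscate(x, proj_a)  # sent to the coordinator for DNN training
y_b = obfuscate(x, proj_b)  # a different object's view of similar data
```

Because each object keeps its seed private, the coordinator (and any colluding objects) cannot invert another object's projection, while a single matrix multiply keeps the on-device cost light.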

