distributed training Latest Research Papers

Exploring a Bilingual Next Word Predictor for a Federated Learning Mobile Application

10.36227/techrxiv.18058682.v1 ◽

2022 ◽

Author(s):

Natali Alfonso Burgos ◽

Karol Kiš ◽

Peter Bakarac ◽

Michal Kvasnica ◽

Giovanni Licitra

Keyword(s):

Learning Environment ◽

Mobile Application ◽

Homomorphic Encryption ◽

Aggregation Method ◽

Performance Gap ◽

Parallel Corpora ◽

Client Server ◽

Virtual Keyboard ◽

Distributed Training ◽

Target Performance

We explore a bilingual next-word predictor (NWP) under federated optimization for a mobile application. A character-based LSTM is server-trained on English and Dutch texts from a custom parallel corpora. This is used as the target performance. We simulate a federated learning environment to assess the feasibility of distributed training for the same model. The popular Federated Averaging (FedAvg) algorithm is used as the aggregation method. We show that the federated LSTM achieves decent performance, yet it is still sub-optimal. We suggest possible next steps to bridge this performance gap. Furthermore, we explore the effects of language imbalance varying the ratio of English and Dutch training texts (or clients). We show the model upholds performance (of the balanced case) up and until a 80/20 imbalance before decaying rapidly. Lastly, we describe the implementation of local client training, word prediction and client-server communication in a custom virtual keyboard for Android platforms. Additionally, homomorphic encryption is applied to provide with secure aggregation guarding the user from malicious servers.

Exploring a Bilingual Next Word Predictor for a Federated Learning Mobile Application

10.36227/techrxiv.18058682 ◽

2022 ◽

Author(s):

Natali Alfonso Burgos ◽

Karol Kiš ◽

Peter Bakarac ◽

Michal Kvasnica ◽

Giovanni Licitra

Keyword(s):

Learning Environment ◽

Mobile Application ◽

Homomorphic Encryption ◽

Aggregation Method ◽

Performance Gap ◽

Parallel Corpora ◽

Client Server ◽

Virtual Keyboard ◽

Distributed Training ◽

Target Performance

We explore a bilingual next-word predictor (NWP) under federated optimization for a mobile application. A character-based LSTM is server-trained on English and Dutch texts from a custom parallel corpora. This is used as the target performance. We simulate a federated learning environment to assess the feasibility of distributed training for the same model. The popular Federated Averaging (FedAvg) algorithm is used as the aggregation method. We show that the federated LSTM achieves decent performance, yet it is still sub-optimal. We suggest possible next steps to bridge this performance gap. Furthermore, we explore the effects of language imbalance varying the ratio of English and Dutch training texts (or clients). We show the model upholds performance (of the balanced case) up and until a 80/20 imbalance before decaying rapidly. Lastly, we describe the implementation of local client training, word prediction and client-server communication in a custom virtual keyboard for Android platforms. Additionally, homomorphic encryption is applied to provide with secure aggregation guarding the user from malicious servers.

Analyzing the impact of the MPI allreduce in distributed training of convolutional neural networks

Computing ◽

10.1007/s00607-021-01029-2 ◽

2022 ◽

Author(s):

Adrián Castelló ◽

Mar Catalán ◽

Manuel F. Dolz ◽

Enrique S. Quintana-Ortí ◽

José Duato

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Distributed Training ◽

The Impact

Evaluating Data-Parallel Distributed Training Strategies

10.1109/comsnets53615.2022.9668349 ◽

2022 ◽

Author(s):

Ganesan Ponnuswami ◽

Sriram Kailasam ◽

Dileep Aroor Dinesh

Keyword(s):

Data Parallel ◽

Distributed Training ◽

Training Strategies

SHAT: A Novel Asynchronous Training Algorithm That Provides Fast Model Convergence in Distributed Deep Learning

Applied Sciences ◽

10.3390/app12010292 ◽

2021 ◽

Vol 12 (1) ◽

pp. 292

Author(s):

Yunyong Ko ◽

Sang-Wook Kim

Keyword(s):

Large Scale ◽

Heterogeneous Environments ◽

Local Models ◽

Training Algorithm ◽

Distributed Training ◽

Large Scale Data ◽

Synchronization Overhead ◽

The Difference ◽

Asynchronous Training ◽

Scale Data

The recent unprecedented success of deep learning (DL) in various fields is underlied by its use of large-scale data and models. Training a large-scale deep neural network (DNN) model with large-scale data, however, is time-consuming. To speed up the training of massive DNN models, data-parallel distributed training based on the parameter server (PS) has been widely applied. In general, a synchronous PS-based training suffers from the synchronization overhead, especially in heterogeneous environments. To reduce the synchronization overhead, asynchronous PS-based training employs the asynchronous communication between PS and workers so that PS processes the request of each worker independently without waiting. Despite the performance improvement of asynchronous training, however, it inevitably incurs the difference among the local models of workers, where such a difference among workers may cause slower model convergence. Fro addressing this problem, in this work, we propose a novel asynchronous PS-based training algorithm, SHAT that considers (1) the scale of distributed training and (2) the heterogeneity among workers for successfully reducing the difference among the local models of workers. The extensive empirical evaluation demonstrates that (1) the model trained by SHAT converges to the higher accuracy up to 5.22% than state-of-the-art algorithms, and (2) the model convergence of SHAT is robust under various heterogeneous environments.

Intra‐cluster aggregation aware routing for distributed training in wireless sensor networks

Concurrency and Computation Practice and Experience ◽

10.1002/cpe.6795 ◽

2021 ◽

Author(s):

Zhaohong Chen ◽

Xin Long ◽

Long Chen ◽

Yalan Wu ◽

Jigang Wu ◽

...

Keyword(s):

Wireless Sensor Networks ◽

Sensor Networks ◽

Wireless Sensor ◽

Distributed Training ◽

Cluster Aggregation

Federated Learning and Differential Privacy for Medical Image Analysis

10.21203/rs.3.rs-1005694/v1 ◽

2021 ◽

Author(s):

Mohammed Adnan ◽

Shivam Kalra ◽

Jesse C. Cresswell ◽

Graham W. Taylor ◽

Hamid Tizhoosh

Keyword(s):

Machine Learning ◽

Image Analysis ◽

Medical Image ◽

Large Scale ◽

Differential Privacy ◽

Medical Image Analysis ◽

External Validation ◽

The Cancer Genome Atlas ◽

Distributed Training ◽

Histopathology Images

Abstract The artificial intelligence revolution has been spurred forward by the availability of large-scale datasets. In contrast, the paucity of large-scale medical datasets hinders the application of machine learning in healthcare. The lack of publicly available multi-centric and diverse datasets mainly stems from confidentiality and privacy concerns around sharing medical data. To demonstrate a feasible path forward in medical image imaging, we conduct a case study of applying a differentially private federated learning framework for analysis of histopathology images, the largest and perhaps most complex medical images. We study the effects of IID and non-IID distributions along with the number of healthcare providers, i.e., hospitals and clinics, and the individual dataset sizes, using The Cancer Genome Atlas (TCGA) dataset, a public repository, to simulate a distributed environment. We empirically compare the performance of private, distributed training to conventional training and demonstrate that distributed training can achieve similar performance with strong privacy guarantees. We also study the effect of different source domains for histopathology images by evaluating the performance using external validation. Our work indicates that differentially private federated learning is a viable and reliable framework for the collaborative development of machine learning models in medical image analysis.

An Intestinal Centerline Extraction Algorithm Based on Federated Framework

Wireless Communications and Mobile Computing ◽

10.1155/2021/2979214 ◽

2021 ◽

Vol 2021 ◽

pp. 1-12

Author(s):

Xiaodong Wang ◽

Zhe’nan He ◽

Ying Wang ◽

Linlin Dang ◽

Weifang Han ◽

...

Keyword(s):

Single Point ◽

Data Distribution ◽

Clinical Applications ◽

Accurate Diagnosis ◽

Training Method ◽

Single Institution ◽

Distributed Training ◽

Centerline Extraction ◽

Extraction Algorithm ◽

Sample Data

The intestine is an important organ of the human body, and its internal structure always needs to be observed in clinical applications so as to provide a basis for accurate diagnosis. However, due to the limited intestinal data obtained by a single institution, deep learning cannot effectively train the intestines, and the effect is not satisfied. For this reason, we propose a distributed training method to carry out federated learning to alleviate the situation of patient sample data shortage, not shared and uneven data distribution. And the blockchain is introduced to enhance the interaction between networks, to solve the problem of a single point of failure of the federated learning server. Fully excavate the multiscale features of samples, to construct a fusion enhancement model and intestinal segmentation module for accurate positioning. At the local end, the centerline extraction algorithm is optimized, with the edge as the main and the source as the auxiliary to realize centerline extraction.

Distributed Training for High Resolution Images: A Domain and Spatial Decomposition Approach

10.1109/rsdha54838.2021.00009 ◽

2021 ◽

Author(s):

Aristeidis Tsaris ◽

Jacob Hinkle ◽

Dalton Lunga ◽

Philipe Ambrozio Dias

Keyword(s):

High Resolution ◽

Decomposition Approach ◽

Spatial Decomposition ◽

Distributed Training ◽

High Resolution Images

Decentralized Distributed Deep Learning with Low-Bandwidth Consumption for Smart Constellations

Space: Science & Technology ◽

10.34133/2021/9879246 ◽

2021 ◽

Vol 2021 ◽

pp. 1-10

Author(s):

Qingliang Meng ◽

Meiyu Huang ◽

Yao Xu ◽

Naijin Liu ◽

Xueshuang Xiang

Keyword(s):

Deep Learning ◽

Parameter Extraction ◽

Stochastic Gradient Descent ◽

Network Bandwidth ◽

Distributed Training ◽

Intelligent Processing ◽

Benchmark Datasets ◽

Speed Up ◽

New Perspective ◽

Deep Learning Network

For the space-based remote sensing system, onboard intelligent processing based on deep learning has become an inevitable trend. To adapt to the dynamic changes of the observation scenes, there is an urgent need to perform distributed deep learning onboard to fully utilize the plentiful real-time sensing data of multiple satellites from a smart constellation. However, the network bandwidth of the smart constellation is very limited. Therefore, it is of great significance to carry out distributed training research in a low-bandwidth environment. This paper proposes a Randomized Decentralized Parallel Stochastic Gradient Descent (RD-PSGD) method for distributed training in a low-bandwidth network. To reduce the communication cost, each node in RD-PSGD just randomly transfers part of the information of the local intelligent model to its neighborhood. We further speed up the algorithm by optimizing the programming of random index generation and parameter extraction. For the first time, we theoretically analyze the convergence property of the proposed RD-PSGD and validate the advantage of this method by simulation experiments on various distributed training tasks for image classification on different benchmark datasets and deep learning network architectures. The results show that RD-PSGD can effectively save the time and bandwidth cost of distributed training and reduce the complexity of parameter selection compared with the TopK-based method. The method proposed in this paper provides a new perspective for the study of onboard intelligent processing, especially for online learning on a smart satellite constellation.

distributed training
Recently Published Documents

TOTAL DOCUMENTS

H-INDEX

Exploring a Bilingual Next Word Predictor for a Federated Learning Mobile Application

Exploring a Bilingual Next Word Predictor for a Federated Learning Mobile Application

Analyzing the impact of the MPI allreduce in distributed training of convolutional neural networks

Evaluating Data-Parallel Distributed Training Strategies

SHAT: A Novel Asynchronous Training Algorithm That Provides Fast Model Convergence in Distributed Deep Learning

Intra‐cluster aggregation aware routing for distributed training in wireless sensor networks

Federated Learning and Differential Privacy for Medical Image Analysis

An Intestinal Centerline Extraction Algorithm Based on Federated Framework

Distributed Training for High Resolution Images: A Domain and Spatial Decomposition Approach

Decentralized Distributed Deep Learning with Low-Bandwidth Consumption for Smart Constellations

Export Citation Format

distributed trainingRecently Published Documents

TOTAL DOCUMENTS

H-INDEX

Exploring a Bilingual Next Word Predictor for a Federated Learning Mobile Application

Exploring a Bilingual Next Word Predictor for a Federated Learning Mobile Application

Analyzing the impact of the MPI allreduce in distributed training of convolutional neural networks

Evaluating Data-Parallel Distributed Training Strategies

SHAT: A Novel Asynchronous Training Algorithm That Provides Fast Model Convergence in Distributed Deep Learning

Intra‐cluster aggregation aware routing for distributed training in wireless sensor networks

Federated Learning and Differential Privacy for Medical Image Analysis

An Intestinal Centerline Extraction Algorithm Based on Federated Framework

Distributed Training for High Resolution Images: A Domain and Spatial Decomposition Approach

Decentralized Distributed Deep Learning with Low-Bandwidth Consumption for Smart Constellations

distributed training
Recently Published Documents