Timed Dataflow: Reducing Communication Overhead for Distributed Machine Learning Systems

Author(s): Peng Sun, Yonggang Wen, Ta Nguyen Binh Duong, Shengen Yan
2020, pp. 101912
Author(s): Kaiyan Chang, Wei Jiang, Jinyu Zhan, Zicheng Gong, Weijia Pan

Author(s): Austine Zong Han Yapp, Hong Soo Nicholas Koh, Yan Ting Lai, Jiawen Kang, Xuandi Li, ...

Federated Edge Learning (FEL) is a distributed Machine Learning (ML) framework for collaborative training on edge devices. FEL improves data privacy over traditional centralized ML training by keeping data on the devices and sending only local model updates to a central coordinator for aggregation. However, existing FEL architectures still suffer from high communication overhead between edge devices and the coordinator. In this paper, we present a working prototype of a blockchain-empowered, communication-efficient FEL framework that enhances security and scalability towards large-scale deployment of FEL.
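The communication pattern described above can be illustrated with a minimal federated-averaging sketch in Python (the function names and the toy linear model are hypothetical stand-ins; the paper's blockchain-based protocol is not reproduced here): each device trains locally and transmits only a weight delta, and the coordinator aggregates the deltas.

```python
import numpy as np

def local_update(weights, data, labels, lr=0.1, epochs=1):
    """Run a few epochs of local training and return the weight delta.

    A toy linear least-squares model stands in for the device's real
    model; only the delta (never the raw data) leaves the device.
    """
    w = weights.copy()
    for _ in range(epochs):
        preds = data @ w
        grad = data.T @ (preds - labels) / len(labels)
        w -= lr * grad
    return w - weights  # the model update, the only thing transmitted

def aggregate(global_weights, updates, sizes):
    """Coordinator step: average the received updates, weighted by
    each device's local dataset size (FedAvg-style aggregation)."""
    total = sum(sizes)
    avg = sum(u * (n / total) for u, n in zip(updates, sizes))
    return global_weights + avg

# One communication round with three simulated edge devices.
rng = np.random.default_rng(0)
w_global = np.zeros(5)
devices = [(rng.normal(size=(20, 5)), rng.normal(size=20)) for _ in range(3)]
updates = [local_update(w_global, X, y) for X, y in devices]
w_global = aggregate(w_global, updates, [len(y) for _, y in devices])
```

In this scheme each round costs one model-sized upload per device, so the communication-efficiency techniques the paper targets amount to shrinking or skipping these deltas.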


2021
Author(s): Dun Li, Dezhi Han, Tien-Hsiung Weng, Zibin Zheng, Hongzhi Li, ...

2021, Vol 14 (8), pp. 1338-1350
Author(s): Binhang Yuan, Dimitrije Jankov, Jia Zou, Yuxin Tang, Daniel Bourgeois, ...

We consider the question: what is the abstraction that should be implemented by the computational engine of a machine learning system? Current machine learning systems typically push whole tensors through a series of compute kernels, such as matrix multiplications or activation functions, where each kernel runs on an AI accelerator (ASIC) such as a GPU. This implementation abstraction provides little built-in support for ML systems to scale past a single machine, or to handle large models whose matrices or tensors do not easily fit into the RAM of an ASIC. In this paper, we present an alternative implementation abstraction called the tensor relational algebra (TRA). The TRA is a set-based algebra built on the relational algebra. Expressions in the TRA operate over binary tensor relations, where keys are multi-dimensional arrays and values are tensors. The TRA can be executed with high efficiency in a parallel or distributed environment and is amenable to automatic optimization. Our empirical study shows that the optimized TRA-based back-end can significantly outperform alternatives for running ML workflows in distributed clusters.
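To make the abstraction concrete, here is a minimal single-process sketch (hypothetical helper names; not the paper's implementation) that represents matrices as binary tensor relations keyed by block indices and expresses blocked matrix multiply as a join on the shared key followed by a sum aggregation:

```python
import numpy as np
from collections import defaultdict

def to_relation(M, bs):
    """A tensor relation as a dict: keys are (row_block, col_block)
    index tuples, values are the bs x bs tensor blocks themselves."""
    return {(i, j): M[i*bs:(i+1)*bs, j*bs:(j+1)*bs]
            for i in range(M.shape[0] // bs)
            for j in range(M.shape[1] // bs)}

def matmul_tra(A_rel, B_rel):
    """Blocked matmul as a relational expression: join A(i,k) with
    B(k,j) on k, multiply the joined blocks, sum-aggregate on (i,j)."""
    out = defaultdict(lambda: 0)
    for (i, k), a in A_rel.items():
        for (k2, j), b in B_rel.items():
            if k == k2:                            # join predicate on k
                out[(i, j)] = out[(i, j)] + a @ b  # per-group aggregation
    return dict(out)

bs = 2
A, B = np.arange(16.).reshape(4, 4), np.ones((4, 4))
C_rel = matmul_tra(to_relation(A, bs), to_relation(B, bs))
# Reassemble the result relation into a matrix and verify it.
C = np.block([[C_rel[(i, j)] for j in range(2)] for i in range(2)])
assert np.allclose(C, A @ B)
```

Because every step operates on keyed blocks, the relation can be partitioned by key across machines, which is what makes such expressions natural to execute and to optimize automatically in a distributed setting.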


2018, Vol 12, pp. 85-98
Author(s): Bojan Kostadinov, Mile Jovanov, Emil Stankov

Data collection and machine learning are changing the world. Whether it is medicine, sports or education, companies and institutions are investing a lot of time and money in systems that gather, process and analyse data. Likewise, to improve competitiveness, many countries are changing their educational policy to support STEM disciplines. It is therefore important to put effort into using various data sources to help students succeed in STEM. In this paper, we present a platform that can analyse students' activity on various contest and e-learning systems, combine and process the data, and then present it in ways that are easy to understand. This in turn enables teachers and organizers to recognize talented and hardworking students, identify issues, and motivate students to practice in areas where they are weaker.
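As a rough illustration of the kind of data combination the platform performs (the sources, field names, and scoring below are entirely hypothetical), this sketch merges activity records from a contest judge and an e-learning system into per-student, per-topic scores, where low scores flag the weaker areas mentioned above:

```python
from collections import defaultdict

# Hypothetical record streams from two sources: a contest judge and
# an e-learning system. Real systems and schemas will differ.
contest_log = [
    {"student": "ana", "topic": "graphs", "solved": True},
    {"student": "ben", "topic": "dp", "solved": False},
]
elearning_log = [
    {"student": "ana", "topic": "dp", "completed": 0.4},
    {"student": "ben", "topic": "dp", "completed": 0.9},
]

def combine(contest, elearning):
    """Merge per-student activity across sources into one profile,
    averaging a per-topic success signal that teachers can inspect."""
    profile = defaultdict(lambda: defaultdict(list))
    for r in contest:
        profile[r["student"]][r["topic"]].append(1.0 if r["solved"] else 0.0)
    for r in elearning:
        profile[r["student"]][r["topic"]].append(r["completed"])
    return {s: {t: sum(v) / len(v) for t, v in topics.items()}
            for s, topics in profile.items()}

print(combine(contest_log, elearning_log))
```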

