From distributed machine learning to federated learning: In the view of data privacy and security

Big Data and Analytics in Retailing

NIM Marketing Intelligence Review ◽

10.2478/nimmir-2019-0006 ◽

2019 ◽

Vol 11 (1) ◽

pp. 36-40 ◽

Cited By ~ 1

Author(s):

Venky Shankar

Keyword(s):

Machine Learning ◽

Big Data ◽

Dynamic Pricing ◽

Data Privacy ◽

Data Science ◽

Shopping Behavior ◽

Computer Algorithms ◽

Multiple Sources ◽

Privacy And Security ◽

Customer Data

Big data are taking center stage for decision-making in many retail organizations. Customer data on attitudes and behavior across channels, touchpoints, devices and platforms are often readily available and constantly recorded. These data are integrated from multiple sources and stored or warehoused, often in a cloud-based environment. Statistical, econometric and data science models are developed for enabling appropriate decisions. Computer algorithms and programs are created for these models. Machine learning-based models are particularly useful for learning from the data and making predictive decisions. These machine learning models form the backbone for the generation and development of AI-assisted decisions. In many cases, such decisions are automated using systems such as chatbots and robots. Of special interest are issues such as omnichannel shopping behavior, resource allocation across channels, the effects of the mobile channel and mobile apps on shopper behavior, dynamic pricing, data privacy and security. Research on these issues reveals several interesting insights on which retailers can build. To fully leverage big data in today’s retailing environment, CRM strategies must be location specific, time specific and channel specific in addition to being customer specific.

Communication-efficient and Scalable Decentralized Federated Edge Learning

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2021/720 ◽

2021 ◽

Author(s):

Austine Zong Han Yapp ◽

Hong Soo Nicholas Koh ◽

Yan Ting Lai ◽

Jiawen Kang ◽

Xuandi Li ◽

...

Keyword(s):

Machine Learning ◽

Data Privacy ◽

Large Scale ◽

Local Model ◽

Communication Overhead ◽

Model Training ◽

Collaborative Training ◽

Distributed Machine Learning ◽

The Coordinator

Federated Edge Learning (FEL) is a distributed Machine Learning (ML) framework for collaborative training on edge devices. FEL improves data privacy over traditional centralized ML model training by keeping data on the devices and only sending local model updates to a central coordinator for aggregation. However, existing FEL architectures still suffer from high communication overhead between edge devices and the coordinator. In this paper, we present a working prototype of a blockchain-empowered and communication-efficient FEL framework, which enhances security and scalability towards large-scale implementation of FEL.
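
The aggregation step described in this abstract, where the coordinator combines local model updates without seeing any raw data, can be illustrated with a minimal sketch. It assumes each client sends a flat list of weights together with its local sample count, and it uses a sample-weighted average in the style of federated averaging. The function name `federated_average` and the data layout are hypothetical, chosen for illustration; the paper's blockchain-based framework is not reproduced here.

```python
from typing import List, Sequence


def federated_average(
    updates: Sequence[Sequence[float]],
    num_samples: Sequence[int],
) -> List[float]:
    """Combine client weight vectors into one global vector.

    Each client's contribution is weighted by its share of the total
    number of training samples (an assumption of this sketch).
    """
    if not updates or len(updates) != len(num_samples):
        raise ValueError("need exactly one sample count per client update")
    total = sum(num_samples)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(updates[0])
    if any(len(u) != dim for u in updates):
        raise ValueError("all client updates must have the same length")

    aggregated = [0.0] * dim
    for weights, n in zip(updates, num_samples):
        for i, w in enumerate(weights):
            # Only model weights are combined; no raw data is involved.
            aggregated[i] += w * n / total
    return aggregated


if __name__ == "__main__":
    # Two hypothetical clients holding 1 and 3 local samples respectively.
    print(federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

In this sketch the coordinator only ever handles weight vectors and sample counts, which is the privacy property the abstract attributes to FEL. It also shows where the communication cost comes from: every client transmits a full weight vector in every round.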

Early Detection of Type-2 Diabetes Using Federated Learning

International Journal of Scientific Research in Science Engineering and Technology ◽

10.32628/ijsrset207644 ◽

2020 ◽

pp. 257-267

Author(s):

M. Lincy ◽

A. Meena Kowshalya

Keyword(s):

Machine Learning ◽

Type 2 Diabetes ◽

Early Detection ◽

Data Privacy ◽

Learning Algorithm ◽

Learning Model ◽

Learning Approach ◽

Distributed Data ◽

Privacy And Security

Data privacy and security are critically important in the healthcare industry. Federated learning is a new way of training a machine learning algorithm using distributed data that is not hosted on a centralized server. Numerous centralized machine learning models exist in the literature, but none offers privacy for users’ data. This paper proposes a federated learning approach for early detection of Type-2 diabetes among patients, using a simple federated architecture. We compare the proposed federated learning model against our centralised approach. Experimental results indicate that the federated learning model provides significantly better privacy than the centralised learning model, while compromising accuracy to a small extent.

Privacy-Aware Data Forensics of VRUs Using Machine Learning and Big Data Analytics

Security and Communication Networks ◽

10.1155/2021/3320436 ◽

2021 ◽

Vol 2021 ◽

pp. 1-9

Author(s):

Muhammad Babar ◽

Muhammad Usman Tariq ◽

Ahmed S. Almasoud ◽

Mohammad Dahman Alshehri

Keyword(s):

Machine Learning ◽

Big Data ◽

Traffic Control ◽

Data Analytics ◽

Data Privacy ◽

Big Data Analytics ◽

Processing Unit ◽

Privacy And Security ◽

User Data ◽

Data Ingestion

The current expansion of big data has underpinned the realization of AI and machine learning. With the rise of big data and machine learning, the idea of improving accuracy and enhancing the efficacy of AI applications is also gaining prominence. In traffic applications, machine learning solutions provide improved safety in hazardous traffic circumstances. Existing architectures face various challenges, of which data privacy is the foremost for vulnerable road users (VRUs). The key reason for failure in traffic control for pedestrians is flawed handling of users' privacy. User data are at risk and are prone to several privacy and security gaps. If an intruder succeeds in infiltrating the setup, exposed data can be maliciously manipulated, fabricated, and misrepresented for illegitimate purposes. In this study, an architecture based on machine learning is proposed to analyze and process big data efficiently in a secure environment. The proposed model considers the privacy of users during big data processing. The proposed architecture is a layered framework with a parallel and distributed module that uses machine learning on big data to achieve secure big data analytics. It includes a distinct unit for privacy management using a machine learning classifier. A stream processing unit is also integrated with the architecture to process the information. The proposed system is realized using real-time datasets from various sources and experimentally tested with reliable datasets that demonstrate the effectiveness of the proposed architecture. The data ingestion results are also highlighted along with training and validation results.

VADAF: Visualization for Abnormal Client Detection and Analysis in Federated Learning

ACM Transactions on Interactive Intelligent Systems ◽

10.1145/3426866 ◽

2021 ◽

Vol 11 (3-4) ◽

pp. 1-23

Author(s):

Linhao Meng ◽

Yating Wei ◽

Rusheng Pan ◽

Shuyue Zhou ◽

Jianwei Zhang ◽

...

Keyword(s):

Machine Learning ◽

Anomaly Detection ◽

Visual Analytics ◽

Detection Method ◽

Training Process ◽

Local Data ◽

Privacy And Security ◽

Potential Client ◽

Distributed Machine Learning ◽

Large Corpus

Federated Learning (FL) provides a powerful solution to distributed machine learning on a large corpus of decentralized data. It ensures privacy and security by performing computation on devices (which we refer to as clients) based on local data to improve the shared global model. However, the inaccessibility of the data and the invisibility of the computation make it challenging to interpret and analyze the training process, especially to distinguish potential client anomalies. Identifying these anomalies can help experts diagnose and improve FL models. For this reason, we propose a visual analytics system, VADAF, to depict the training dynamics and facilitate analyzing potential client anomalies. Specifically, we design a visualization scheme that supports massive training dynamics in the FL environment. Moreover, we introduce an anomaly detection method to detect potential client anomalies, which are further analyzed based on both the client model’s visual and objective estimation. Three case studies have demonstrated the effectiveness of our system in understanding the FL training process and supporting abnormal client detection and analysis.
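
To make the idea of spotting potential client anomalies concrete, here is a deliberately simple sketch. It is not the detection method used in VADAF, which the abstract does not specify; it is a generic heuristic that flags clients whose update lies unusually far from the coordinate-wise median of all updates. The function name `flag_outlier_clients` and the `factor` threshold are assumptions made for illustration.

```python
import math
from statistics import median
from typing import List, Sequence


def flag_outlier_clients(
    updates: Sequence[Sequence[float]],
    factor: float = 3.0,
) -> List[int]:
    """Return indices of clients whose update is far from the median update.

    A client is flagged when its Euclidean distance to the coordinate-wise
    median exceeds `factor` times the median of all such distances.
    """
    if not updates:
        return []
    dim = len(updates[0])
    # Robust "typical" update: the median of each coordinate across clients.
    center = [median(u[i] for u in updates) for i in range(dim)]
    dists = [math.dist(u, center) for u in updates]
    typical = median(dists)
    if typical == 0:
        # Most clients agree exactly; any deviation at all stands out.
        return [i for i, d in enumerate(dists) if d > 0]
    return [i for i, d in enumerate(dists) if d > factor * typical]


if __name__ == "__main__":
    # Three similar hypothetical clients and one that diverges strongly.
    client_updates = [[1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [10.0, -10.0]]
    print(flag_outlier_clients(client_updates))
```

A heuristic like this only produces candidates. As the abstract notes, flagged clients still need further analysis, because a large deviation may reflect unusual but legitimate local data rather than a faulty or malicious client.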

Evaluating Federated Learning Scenarios in a Tumor Classification Application

10.5753/eradrj.2021.18558 ◽

2021 ◽

Author(s):

Rafaela C. Brum ◽

George Teodoro ◽

Lúcia Drummond ◽

Luciana Arantes ◽

Maria Clicia Castro ◽

...

Keyword(s):

Machine Learning ◽

Execution Time ◽

Data Privacy ◽

Tumor Infiltrating Lymphocytes ◽

Tumor Classification ◽

Financial Cost ◽

Privacy Concerns ◽

Learning Scenarios ◽

Infiltrating Lymphocytes ◽

Distributed Machine Learning

Federated Learning is a new area of distributed Machine Learning (ML) that emerged to deal with data privacy concerns. In this approach, each client has access to a local and private dataset. They only exchange the model weights and updates. This paper presents a Federated Learning (FL) approach to a cloud Tumor-Infiltrating Lymphocytes (TIL) application. The results show that the FL approach outperformed the centralized one in all evaluated ML metrics. It also reduced the execution time although the financial cost has increased.

An Introduction to the Federated Learning Standard

GetMobile Mobile Computing and Communications ◽

10.1145/3511285.3511291 ◽

2022 ◽

Vol 25 (3) ◽

pp. 18-22

Author(s):

Ticao Zhang ◽

Shiwen Mao

Keyword(s):

Machine Learning ◽

Performance Evaluation ◽

Data Privacy ◽

Model Building ◽

Evaluation Criteria ◽

Regulatory Requirements ◽

Privacy And Security ◽

Learning Framework ◽

Learning Tasks ◽

Decentralized Learning

With growing concern over data privacy and security, it is undesirable to collect data from all users to perform machine learning tasks. Federated learning, a decentralized learning framework, was proposed to construct a shared prediction model while keeping owners' data on their own devices. This paper presents an introduction to the emerging federated learning standard and discusses its various aspects, including i) an overview of federated learning, ii) types of federated learning, iii) major concerns and the performance evaluation criteria of federated learning, and iv) associated regulatory requirements. The purpose of this paper is to provide an understanding of the standard and facilitate its usage in model building across organizations while meeting privacy and security concerns.

Federated Learning in a Medical Context: A Systematic Literature Review

ACM Transactions on Internet Technology ◽

10.1145/3412357 ◽

2021 ◽

Vol 21 (2) ◽

pp. 1-31

Author(s):

Bjarne Pfitzner ◽

Nico Steckhan ◽

Bert Arnrich

Keyword(s):

Machine Learning ◽

Literature Review ◽

Systematic Literature Review ◽

Data Privacy ◽

Research Area ◽

Learning Models ◽

Related Data ◽

Private Data ◽

Large Databases ◽

Machine Learning Models

Data privacy is a very important issue. Especially in fields like medicine, it is paramount to abide by the existing privacy regulations to preserve patients’ anonymity. However, data is required for research and training machine learning models that could help gain insight into complex correlations or personalised treatments that may otherwise stay undiscovered. Those models generally scale with the amount of data available, but the current situation often prohibits building large databases across sites. So it would be beneficial to be able to combine similar or related data from different sites all over the world while still preserving data privacy. Federated learning has been proposed as a solution for this, because it relies on the sharing of machine learning models, instead of the raw data itself. That means private data never leaves the site or device it was collected on. Federated learning is an emerging research area, and many domains have been identified for the application of those methods. This systematic literature review provides an extensive look at the concept of and research into federated learning and its applicability for confidential healthcare datasets.

Joint Data Collection and Resource Allocation for Distributed Machine Learning at the Edge

IEEE Transactions on Mobile Computing ◽

10.1109/tmc.2020.3045436 ◽

2020 ◽

pp. 1-1

Author(s):

Min Chen ◽

Haichuan Wang ◽

Zeyu Meng ◽

Hongli Xu ◽

Yang Xu ◽

...

Keyword(s):

Machine Learning ◽

Resource Allocation ◽

Data Collection ◽

Distributed Machine Learning

Data-driven sector coupling in 5G-based smart networks

Competition and Regulation in Network Industries ◽

10.1177/1783591721992762 ◽

2021 ◽

Vol 22 (1) ◽

pp. 53-68

Author(s):

Guenter Knieps

Keyword(s):

Data Privacy ◽

Service Providers ◽

Policy Implications ◽

Network Industries ◽

5G Networks ◽

Privacy And Security ◽

Open Set ◽

Application Service ◽

Downstream Application

5G attains the role of a general purpose technology (GPT) for an open set of downstream IoT applications in various network industries and within the app economy more generally. Traditionally, sector coupling has been a rather narrow concept focusing on the horizontal synergies of urban system integration in terms of transport, energy, and waste systems, or else the creation of new intermodal markets. The transition toward 5G has fundamentally changed the framing of sector coupling in network industries by underscoring the relevance of differentiating between horizontal and vertical sector coupling. Due to fixed-mobile convergence and the large open set of complementary use cases, 5G has taken on the characteristics of a GPT in its role as the enabler of a large variety of smart network applications. Due to this vertical relationship, characterized by pervasiveness and innovational complementarities between upstream 5G networks and downstream application sectors, vertical sector coupling between the provider of an upstream GPT and different downstream application industries has acquired particular relevance. In contrast to horizontal sector coupling among different application sectors, the driver of vertical sector coupling is that each of the heterogeneous application sectors requires a critical input from the upstream 5G network provider and combines this with its own downstream technology. Of particular relevance for vertical sector coupling are the innovational complementarities between upstream GPT and downstream application sectors. The focus on vertical sector coupling also has important policy implications. Although the evolution of 5G networks strongly depends on the entrepreneurial, market-driven activities of broadband network operators and application service providers, the future of 5G as a GPT is heavily contingent on the role of frequency management authorities and European regulatory policy with regard to data privacy and security regulations.