Zen and the art of model adaptation: Low-utility-cost attack mitigations in collaborative machine learning

2021, Vol 2022 (1), pp. 274-290
Author(s): Dmitrii Usynin, Daniel Rueckert, Jonathan Passerat-Palmbach, Georgios Kaissis

Abstract: In this study, we aim to bridge the gap between the theoretical understanding of attacks against collaborative machine learning workflows and their practical ramifications by considering the effects of model architecture, learning setting and hyperparameters on resilience against attacks. We refer to such mitigations as model adaptation. Through extensive experimentation on both benchmark and real-life datasets, we establish a more practical threat model for collaborative learning scenarios. In particular, we evaluate the impact of model adaptation by implementing a range of attacks belonging to the broader categories of model inversion and membership inference. Our experiments yield two noteworthy outcomes: they demonstrate the difficulty of actually conducting successful attacks under realistic settings when model adaptation is employed, and they highlight the challenge inherent in successfully combining model adaptation with formal privacy-preserving techniques to retain the optimal balance between model utility and attack resilience.
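
To make the threat concrete, the sketch below shows a common confidence-thresholding baseline for membership inference. It is a minimal illustration, not the paper's implementation; the synthetic dataset, the random forest target model and all parameters are assumptions.

```python
# Minimal confidence-thresholding membership inference sketch (illustrative;
# the paper evaluates a range of attacks, this is only one common baseline).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
# Members = records used for training; non-members = held-out records.
X_mem, X_non, y_mem, y_non = train_test_split(X, y, test_size=0.5, random_state=0)

target = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_mem, y_mem)

# Attack signal: the target model's confidence on a record. Overfitted models
# tend to be more confident on training members than on unseen records.
conf_mem = target.predict_proba(X_mem).max(axis=1)
conf_non = target.predict_proba(X_non).max(axis=1)

scores = np.concatenate([conf_mem, conf_non])
labels = np.concatenate([np.ones(len(conf_mem)), np.zeros(len(conf_non))])
print("membership inference AUC:", roc_auc_score(labels, scores))  # 0.5 = chance
```

Model adaptation in the abstract's sense (for example, stronger regularisation or fewer training epochs) narrows the confidence gap between members and non-members, pushing this attack's AUC toward chance level.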

2021, Vol 51 (4), pp. 75-81
Author(s): Ahad Mirza Baig, Alkida Balliu, Peter Davies, Michal Dory

Rachid Guerraoui was the first keynote speaker, and he got things off to a great start by discussing the broad relevance of the research done in our community relative to both industry and academia. He first argued that, in some sense, the fact that distributed computing is so pervasive nowadays could end up stifling progress in our community by inducing people to work on marginal problems, and becoming isolated. His first suggestion was to try to understand and incorporate new ideas coming from applied fields into our research, and he argued that this has been historically very successful. He illustrated this point via the distributed payment problem, which appears in the context of blockchains, in particular Bitcoin, but then turned out to be very theoretically interesting; furthermore, the theoretical understanding of the problem inspired new practical protocols. He then went further to discuss new directions in distributed computing, such as the COVID tracing problem, and new challenges in Byzantine-resilient distributed machine learning. Another source of innovation Rachid suggested was hardware, which he illustrated with work studying the impact of RDMA-based primitives on fundamental problems in distributed computing. The talk concluded with a very lively discussion.


SAGE Open, 2021, Vol 11 (2), p. 215824402110207
Author(s): Kolja Oswald, Xiaokang Zhao

Makerspaces are a relatively new phenomenon that seems to create an innovative environment for individuals to work on projects and learn about technology. This article presents a grounded theory study, which investigates the impact that makerspaces have on innovation. Strauss and Corbin's grounded theory methodology is used to research this exploratory topic. The data sample consists of 16 interviews with members of a makerspace in Shanghai, China. Data analysis was conducted following Strauss and Corbin's coding framework, entailing open coding, axial coding, and selective coding, as well as coding tools such as the coding paradigm and the conditional matrix. Collaborative learning was identified as the core phenomenon of this research, and the Collaborative Learning and its Outcomes Theory was created. The emergent theory contributes to the understanding of how makerspaces impact outcomes such as innovation and venture creation, and explains how collaborative learning in conjunction with other modes of learning can facilitate learning at various complexities. As such, this study's contributions lie in developing the theoretical understanding of makerspaces as well as of collaborative learning. It offers managerial and pedagogical implications that can help create learning environments where collaborative learning is fostered.


2012, Vol 10, pp. 45-55
Author(s): A. Bartsch, F. Fitzek, R. H. Rasshofer

Abstract: The application of modern series-production automotive radar sensors to pedestrian recognition is an important topic in research on future driver assistance systems. The aim of this paper is to understand the potential and limits of such sensors in pedestrian recognition. This knowledge could be used to develop next-generation radar sensors with improved pedestrian recognition capabilities. A new raw radar data signal processing algorithm is proposed that allows deep insights into the object classification process. By avoiding machine learning and tracking, the impact of raw radar data properties can be directly observed in every layer of the classification system. This gives information on the limiting factors of raw radar data in terms of classification decision making. To accomplish the very challenging distinction between pedestrians and static objects, five significant and stable object features derived from the spatial distribution and Doppler information are identified. Experimental results with data from a 77 GHz automotive radar sensor show that over 95% of pedestrians can be classified correctly under optimal conditions, which is comparable to modern machine learning systems. The impact of the pedestrian's direction of movement, occlusion, antenna beam elevation angle, linear vehicle movement, and other factors is investigated and discussed. The results show that under real-life conditions, radar-only pedestrian recognition is limited by insufficient Doppler frequency and spatial resolution as well as antenna side lobe effects.
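
Since the classifier avoids machine learning, its core is a set of hand-crafted feature rules. The sketch below illustrates that style of decision making; the paper's actual five features are not enumerated in the abstract, so the feature names and thresholds here are purely hypothetical assumptions.

```python
# Hypothetical rule-based (no machine learning) classification on hand-crafted
# radar features; feature names and thresholds are assumptions, not the paper's.
import numpy as np

def classify_object(detections: np.ndarray) -> str:
    """detections: array of shape (n, 3) with columns (x_m, y_m, doppler_mps)."""
    spatial_extent = np.ptp(detections[:, :2], axis=0).max()  # max spread in x or y
    doppler_spread = np.ptp(detections[:, 2])                 # spread of radial speeds
    mean_doppler = np.abs(detections[:, 2]).mean()
    # A walking pedestrian is spatially compact but shows a wide micro-Doppler
    # signature from swinging limbs; a static object has near-zero Doppler.
    if spatial_extent < 1.0 and doppler_spread > 0.8 and mean_doppler > 0.3:
        return "pedestrian"
    return "static object"

pedestrian = np.array([[0.1, 5.0, 1.4], [0.2, 5.1, 0.3], [0.0, 5.2, 2.0]])
pole = np.array([[0.0, 8.0, 0.02], [0.1, 8.0, 0.01]])
print(classify_object(pedestrian), "/", classify_object(pole))
```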


2021, Vol ahead-of-print (ahead-of-print)
Author(s): Paschalis Charalampous, Ioannis Kostavelis, Theodora Kontodina, Dimitrios Tzovaras

Purpose: Additive manufacturing (AM) technologies are gaining immense popularity in the manufacturing sector because of their undisputed ability to construct geometrically complex prototypes and functional parts. However, the reliability of AM processes in providing high-quality products remains an open and challenging task, as it necessitates a deep understanding of the impact of process-related parameters on certain characteristics of the manufactured part. The purpose of this study is to develop a novel method for process parameter selection in order to improve the dimensional accuracy of specimens manufactured via the fused deposition modeling (FDM) process and to ensure the efficiency of the procedure.

Design/methodology/approach: The introduced methodology uses regression-based machine learning algorithms to predict the dimensional deviations between the nominal computer-aided design (CAD) model and the produced physical part. To achieve this, a database with measurements of three-dimensional (3D) printed parts possessing primitive geometry was created for the formulation of the predictive models. Additionally, adjustments to the dimensions of the 3D model are considered to compensate for the overall shape deviations and further improve the accuracy of the process.

Findings: The validity of the suggested strategy is evaluated in a real-life manufacturing scenario with a complex benchmark model and a freeform shape manufactured at different scaling factors, where various sets of printing conditions have been applied. The experimental results showed that the developed regression models can be effectively used both to recommend printing conditions and to compensate for the errors.

Originality/value: The present research paper is the first to apply machine learning-based regression models and compensation strategies to assess the quality of the FDM process.
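
The core loop is: learn a regressor from process parameters to measured deviation, then pre-scale the CAD dimensions by the predicted deviation. A minimal sketch, assuming illustrative feature names and made-up measurements; the study's database and model choice may differ.

```python
# Regression-plus-compensation sketch under assumed features; data is
# illustrative only, not from the paper's database.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Records: (layer_height_mm, print_speed_mms, nominal_dim_mm) -> measured deviation (mm)
X_train = np.array([[0.1, 40, 20], [0.2, 60, 20], [0.3, 80, 40], [0.2, 40, 40],
                    [0.1, 80, 60], [0.3, 60, 60], [0.2, 80, 20], [0.1, 60, 40]])
y_dev = np.array([0.05, 0.12, 0.25, 0.15, 0.10, 0.30, 0.14, 0.08])

model = GradientBoostingRegressor(random_state=0).fit(X_train, y_dev)

# Compensation: pre-adjust the CAD dimension so the printed part lands on nominal.
settings = np.array([[0.2, 60, 40]])
predicted_dev = model.predict(settings)[0]
nominal = 40.0
compensated = nominal - predicted_dev  # shrink/grow the 3D model before slicing
print(f"predicted deviation {predicted_dev:.3f} mm -> print at {compensated:.3f} mm")
```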


Data, 2021, Vol 6 (7), p. 77
Author(s): Kassim S. Mwitondi, Raed A. Said

Data-driven solutions to societal challenges continue to bring new dimensions to our daily lives. For example, while good-quality education is a well-acknowledged foundation of sustainable development, innovation and creativity, variations in student attainment and general performance remain commonplace. Developing data-driven solutions hinges on two fronts: technical and application. The former relates to the modelling perspective, where two of the major challenges are the impact of data randomness and general variations in definitions, typically referred to as concept drift in machine learning. The latter relates to devising data-driven solutions to address real-life challenges such as identifying potential triggers of pedagogical performance, which aligns with Sustainable Development Goal (SDG) 4, Quality Education. A total of 3145 pedagogical data points were obtained from the central data collection platform of the United Arab Emirates (UAE) Ministry of Education (MoE). Using simple data visualisation and machine learning techniques via a generic algorithm for sampling, measuring and assessing, the paper highlights research pathways for educationists and data scientists to attain unified goals in an interdisciplinary context. Its novelty derives from an embedded capacity to address data randomness and concept drift by minimising modelling variations and yielding consistent results across samples. Results show that intricate relationships among data attributes describe the invariant conditions that practitioners in the two overlapping fields of data science and education must identify.
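
A rough sketch of the sample-measure-assess idea, assuming a stand-in dataset and model: repeated stratified resampling checks whether results stay consistent across samples, which is the kind of robustness to data randomness the paper targets.

```python
# Sample-measure-assess loop (illustrative; dataset and model are stand-ins,
# not the UAE MoE data or the paper's actual algorithm).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import StratifiedShuffleSplit

X, y = make_classification(n_samples=3145, n_features=12, random_state=1)

splitter = StratifiedShuffleSplit(n_splits=20, test_size=0.3, random_state=1)
scores = []
for train_idx, test_idx in splitter.split(X, y):                  # sampling
    clf = RandomForestClassifier(random_state=1).fit(X[train_idx], y[train_idx])
    scores.append(accuracy_score(y[test_idx], clf.predict(X[test_idx])))  # measuring

# Assessing: low variance across samples suggests the results are not
# artefacts of one particular random split.
print(f"mean accuracy {np.mean(scores):.3f} +/- {np.std(scores):.3f}")
```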


Author(s): Gabriella Tognola, Marta Bonato, Emma Chiaramello, Serena Fiocchi, Isabelle Magne, ...

Characterization of children's exposure to extremely low frequency (ELF) magnetic fields is an important issue because of the possible correlation of leukemia onset with ELF exposure. Cluster analysis, a machine learning approach, was applied to personal exposure measurements from 977 children in France to characterize real-life ELF exposure scenarios. Electric networks near the child's home or school were considered as environmental factors characterizing the exposure scenarios. The following clusters were identified: children with the highest exposure, living 120-200 m from 225 kV/400 kV overhead lines; children with mid-to-high exposure, living 70-100 m from 63 kV/150 kV overhead lines; children with mid-to-low exposure, living 40 m from 400 V/20 kV substations and underground networks; and children with the lowest exposure and the lowest number of electric networks in the vicinity. 63-225 kV underground networks within 20 m and 400 V/20 kV overhead lines within 40 m played a marginal role in differentiating exposure clusters. Cluster analysis is a viable approach to discovering the variables that best characterize exposure scenarios, and thus it might be useful for better tailoring epidemiological studies. The present study did not assess the impact of indoor sources of exposure, which should be addressed in a further study.
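
A simplified sketch of the clustering step, with illustrative (assumed) exposure features and synthetic values rather than the study's actual variables:

```python
# K-means clustering of per-child exposure features (illustrative assumptions;
# the study's measured variables and preprocessing may differ).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Columns: mean ELF exposure (microtesla), distance to nearest HV line (m),
# distance to nearest substation (m) -- one row per child.
X = np.column_stack([
    rng.lognormal(mean=-2.0, sigma=0.8, size=977),
    rng.uniform(20, 400, size=977),
    rng.uniform(10, 300, size=977),
])

X_scaled = StandardScaler().fit_transform(X)   # k-means is scale-sensitive
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X_scaled)

for k in range(4):  # inspect cluster centres in original units
    centre = X[kmeans.labels_ == k].mean(axis=0)
    print(f"cluster {k}: exposure {centre[0]:.3f} uT, "
          f"line {centre[1]:.0f} m, substation {centre[2]:.0f} m")
```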


Author(s): Andrei Dmitri Gavrilov, Alex Jordache, Maya Vasdani, Jack Deng

The current discourse in the machine learning domain converges on the agreement that machine learning methods have emerged as some of the most prominent learning and classification approaches over the past decade. The convolutional neural network (CNN) has become one of the most actively researched and broadly applied deep machine learning methods. However, the training set has a large influence on the accuracy of a network, and it is paramount to create an architecture that supports its maximum training and recognition performance. The problem considered in this article is how to prevent overfitting and underfitting. These deficiencies are addressed by comparing the statistics of CNN image recognition algorithms to the Ising model. Using a two-dimensional square-lattice array, the impact that the learning rate and regularization rate parameters have on the adaptability of CNNs for image classification is evaluated. The obtained results contribute to a better theoretical understanding of CNNs and provide concrete guidance on preventing model overfitting and underfitting when a CNN is applied to image recognition tasks.
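
A compact sketch of the kind of sweep the abstract describes: vary the learning rate and the L2 regularization strength (weight decay) and read the train/validation accuracy gap as an overfitting signal. The synthetic data, tiny architecture, and PyTorch harness are assumptions for illustration; the paper's Ising-model analysis is not reproduced here.

```python
# Learning-rate / weight-decay sweep on a small CNN (illustrative harness).
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(512, 1, 16, 16)
y = (X.mean(dim=(1, 2, 3)) > 0).long()           # simple learnable rule
X_tr, y_tr, X_va, y_va = X[:384], y[:384], X[384:], y[384:]

def make_cnn():
    return nn.Sequential(
        nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Flatten(), nn.Linear(16 * 4 * 4, 2),
    )

def accuracy(model, X, y):
    with torch.no_grad():
        return (model(X).argmax(1) == y).float().mean().item()

for lr in (1e-1, 1e-2, 1e-3):
    for weight_decay in (0.0, 1e-3, 1e-1):
        model = make_cnn()
        opt = torch.optim.SGD(model.parameters(), lr=lr, weight_decay=weight_decay)
        loss_fn = nn.CrossEntropyLoss()
        for _ in range(200):
            opt.zero_grad()
            loss_fn(model(X_tr), y_tr).backward()
            opt.step()
        gap = accuracy(model, X_tr, y_tr) - accuracy(model, X_va, y_va)
        # A large positive gap suggests overfitting; low accuracy on both sets
        # suggests underfitting (e.g. lr too small, regularization too strong).
        print(f"lr={lr:g} wd={weight_decay:g} train-val gap={gap:+.3f}")
```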


Author(s): Ievgen Redko, Charlotte Laclau

Machine learning and game theory are known to exhibit a very strong link, as they mutually provide each other with solutions and models for studying and analyzing the optimal behaviour of a set of agents. In this paper, we take a closer look at a special class of games, known as fair cost sharing games, from a machine learning perspective. We show that this particular kind of game, where agents can choose between selfish behaviour and cooperation with shared costs, has a natural link to several machine learning scenarios, including collaborative learning with homogeneous and heterogeneous sources of data. We further demonstrate how the game-theoretical results bounding the ratio between the best Nash equilibrium (or its approximate counterpart) and the optimal solution of a given game can be used to upper-bound the gain achievable by collaborative learning, expressed as the expected risk and the sample complexity for the homogeneous and heterogeneous cases, respectively. We believe that the established link can spur many possible future implications for other learning scenarios as well, with privacy-aware learning being among the most noticeable examples.
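
For context, the classical result of this kind (Anshelevich et al.) bounds the price of stability, the ratio between the cost of the best Nash equilibrium and the optimum, in fair cost sharing games with n agents by the n-th harmonic number; the paper's exact constants and learning-theoretic translation may differ.

```latex
% Price-of-stability bound for fair cost sharing games with n agents
% (classical result; the paper's precise setting may differ).
\[
\mathrm{PoS} \;=\; \frac{\mathrm{cost}(\text{best Nash equilibrium})}{\mathrm{cost}(\mathrm{OPT})}
\;\le\; H_n \;=\; \sum_{k=1}^{n} \frac{1}{k} \;=\; O(\log n)
\]
```

Mapped to collaborative learning, a bound of this shape caps how far the best stable coalition of learners can sit from the socially optimal collaboration, which is what allows the gain from collaboration to be bounded in terms of risk or sample complexity.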


2020, Vol 39 (5), pp. 6579-6590
Author(s): Sandy Çağlıyor, Başar Öztayşi, Selime Sezgin

The motion picture industry is one of the largest industries worldwide and has significant importance in the global economy. Considering the high stakes and high risks in the industry, forecast models and decision support systems are gaining importance. Several attempts have been made to estimate the theatrical performance of a movie before or at the early stages of its release. Nevertheless, these models are mostly used for predicting domestic performances, and the industry still struggles to predict box office performances in overseas markets. In this study, the aim is to design a forecast model using different machine learning algorithms to estimate the theatrical success of US movies in Turkey. From various sources, a dataset of 1559 movies is constructed. First, independent variables are grouped as pre-release, distributor type, and international distribution based on their characteristics. The number of attendances is discretized into three classes. Four popular machine learning algorithms (artificial neural networks, decision tree regression, gradient boosted trees, and random forests) are employed, and the impact of each variable group is assessed by comparing model performance. Then the number of target classes is increased to five and eight, and the results are compared with previously developed models in the literature.
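
A small sketch of this setup, under assumed feature names and synthetic data: discretize the target into three attendance classes and compare two of the four algorithm families by cross-validation.

```python
# Discretized box-office classification sketch (illustrative; feature names
# and data are assumptions, not the study's dataset).
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(7)
n = 1559  # matches the abstract's dataset size
df = pd.DataFrame({
    "budget_musd": rng.lognormal(3, 1, n),        # pre-release feature (assumed)
    "major_distributor": rng.integers(0, 2, n),   # distributor-type feature (assumed)
    "intl_openings": rng.integers(1, 60, n),      # international-distribution feature (assumed)
})
attendance = df["budget_musd"] * (1 + df["major_distributor"]) * rng.lognormal(0, 0.5, n)
y = pd.qcut(attendance, q=3, labels=False)        # discretize into three classes

for clf in (RandomForestClassifier(random_state=7),
            GradientBoostingClassifier(random_state=7)):
    scores = cross_val_score(clf, df, y, cv=5)
    print(type(clf).__name__, f"accuracy {scores.mean():.3f}")
```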

