Towards scalable online machine learning collaborations with OpenML

2021 ◽  
Vol 14 (13) ◽  
pp. 3418-3418
Author(s):  
Joaquin Vanschoren

Is massively collaborative machine learning possible? Can we share and organize our collective knowledge of machine learning to solve ever more challenging problems? In a way, yes: as a community, we are already very successful at developing high-quality open-source machine learning libraries, thanks to frictionless collaboration platforms for software development. However, code is only one aspect. The answer is much less clear when we also consider the data that goes into these algorithms and the exact models that are produced. A tremendous amount of work and experience goes into the collection, cleaning, and preprocessing of data and the design, evaluation, and fine-tuning of models, yet very little of this is shared and organized in a way that lets others easily build on it. Suppose one had a global platform for sharing machine learning datasets, models, and reproducible experiments in a frictionless way so that anybody could chip in at any time to share a good model, add or improve data, or suggest an idea. OpenML is an open-source initiative to create such a platform. It allows anyone to share datasets, machine learning pipelines, and full experiments, organizes all of it online with rich metadata, and enables anyone to reuse and build on them in novel and unexpected ways. All data is open and accessible through APIs, and the platform is readily integrated into popular machine learning tools to allow easy sharing of models and experiments. This openness also allows a budding ecosystem of automated processes to scale up machine learning further, such as discovering similar datasets, creating systematic benchmarks, or learning from all collected results how to build the best machine learning models and even automatically doing so for any new dataset. We welcome all of you to become a part of it.

Data mining is now widely used to extract information from data and turn it into knowledge for decision making, by analyzing the patterns that the data contains. Many open-source and licensed tools, such as Weka, RapidMiner, KNIME, and Orange, are available for data mining and predictive analysis. This paper discusses the different tools available for data mining and machine learning, describing each along with its pros and cons. It details the algorithms these tools provide for classification, regression, characterization, discretization, clustering, visualization, and feature selection. The comparison is intended to support efficient decision making and to suggest which tool suits a given requirement.


Author(s):  
Olivier Bronchain ◽  
François-Xavier Standaert

We take advantage of a recently published open source implementation of the AES protected with a mix of countermeasures against side-channel attacks to discuss both the challenges in protecting COTS devices against such attacks and the limitations of closed source security evaluations. The target implementation has been proposed by the French ANSSI (Agence Nationale de la Sécurité des Systèmes d’Information) to stimulate research on the design and evaluation of side-channel secure implementations. It combines additive and multiplicative secret sharings into an affine masking scheme that is additionally mixed with a shuffled execution. Its preliminary leakage assessment did not detect data dependencies with up to 100,000 measurements. We first exhibit the gap between such a preliminary leakage assessment and advanced attacks by demonstrating how a countermeasures’ dissection exploiting a mix of dimensionality reduction, multivariate information extraction and key enumeration can recover the full key with fewer than 2,000 measurements. We then discuss the relevance of open source evaluations to analyze such implementations efficiently, by pointing out that certain steps of the attack are hard to automate without implementation knowledge (even with machine learning tools), while performing them manually is straightforward. Our findings stem not from design flaws but from the general difficulty of preventing side-channel attacks in COTS devices with limited noise. We anticipate that high security on such devices requires significantly more shares.


2019 ◽  
Vol 7 (4) ◽  
pp. 184-190
Author(s):  
Himani Maheshwari ◽  
Pooja Goswami ◽  
Isha Rana

2021 ◽  
Vol 192 ◽  
pp. 103181
Author(s):  
Jagadish Timsina ◽  
Sudarshan Dutta ◽  
Krishna Prasad Devkota ◽  
Somsubhra Chakraborty ◽  
Ram Krishna Neupane ◽  
...  

i-com ◽  
2021 ◽  
Vol 20 (1) ◽  
pp. 19-32
Author(s):  
Daniel Buschek ◽  
Charlotte Anlauff ◽  
Florian Lachner

This paper reflects on a case study of a user-centred concept development process for a Machine Learning (ML) based design tool, conducted at an industry partner. The resulting concept uses ML to match graphical user interface elements in sketches on paper to their digital counterparts to create consistent wireframes. A user study (N=20) with a working prototype shows that designers prefer this concept to the previous manual procedure. Reflecting on our process and findings, we discuss lessons learned for developing ML tools that respect practitioners’ needs and practices.


2020 ◽  
Vol 53 (5) ◽  
pp. 704-709
Author(s):  
Yan Liu ◽  
Zhijing Ling ◽  
Boyu Huo ◽  
Boqian Wang ◽  
Tianen Chen ◽  
...  

2021 ◽  
Vol 59 ◽  
pp. 102353
Author(s):  
Amber Grace Young ◽  
Ann Majchrzak ◽  
Gerald C. Kane
