scholarly journals Enhancing robustness of machine learning systems via data transformations

Author(s):  
Arjun Nitin Bhagoji ◽  
Daniel Cullina ◽  
Chawin Sitawarin ◽  
Prateek Mittal
2018 ◽  
Vol 12 ◽  
pp. 85-98
Author(s):  
Bojan Kostadinov ◽  
Mile Jovanov ◽  
Emil STANKOV

Data collection and machine learning are changing the world. Whether it is medicine, sports or education, companies and institutions are investing a lot of time and money in systems that gather, process and analyse data. Likewise, to improve competitiveness, a lot of countries are making changes to their educational policy by supporting STEM disciplines. Therefore, it’s important to put effort into using various data sources to help students succeed in STEM. In this paper, we present a platform that can analyse student’s activity on various contest and e-learning systems, combine and process the data, and then present it in various ways that are easy to understand. This in turn enables teachers and organizers to recognize talented and hardworking students, identify issues, and/or motivate students to practice and work on areas where they’re weaker.


2021 ◽  
Vol 13 (3) ◽  
pp. 408
Author(s):  
Charles Nickmilder ◽  
Anthony Tedde ◽  
Isabelle Dufrasne ◽  
Françoise Lessire ◽  
Bernard Tychon ◽  
...  

Accurate information about the available standing biomass on pastures is critical for the adequate management of grazing and its promotion to farmers. In this paper, machine learning models are developed to predict available biomass expressed as compressed sward height (CSH) from readily accessible meteorological, optical (Sentinel-2) and radar satellite data (Sentinel-1). This study assumed that combining heterogeneous data sources, data transformations and machine learning methods would improve the robustness and the accuracy of the developed models. A total of 72,795 records of CSH with a spatial positioning, collected in 2018 and 2019, were used and aggregated according to a pixel-like pattern. The resulting dataset was split into a training one with 11,625 pixellated records and an independent validation one with 4952 pixellated records. The models were trained with a 19-fold cross-validation. A wide range of performances was observed (with mean root mean square error (RMSE) of cross-validation ranging from 22.84 mm of CSH to infinite-like values), and the four best-performing models were a cubist, a glmnet, a neural network and a random forest. These models had an RMSE of independent validation lower than 20 mm of CSH at the pixel-level. To simulate the behavior of the model in a decision support system, performances at the paddock level were also studied. These were computed according to two scenarios: either the predictions were made at a sub-parcel level and then aggregated, or the data were aggregated at the parcel level and the predictions were made for these aggregated data. The results obtained in this study were more accurate than those found in the literature concerning pasture budgeting and grassland biomass evaluation. The training of the 124 models resulting from the described framework was part of the realization of a decision support system to help farmers in their daily decision making.


Sensors ◽  
2021 ◽  
Vol 21 (7) ◽  
pp. 2514
Author(s):  
Tharindu Kaluarachchi ◽  
Andrew Reis ◽  
Suranga Nanayakkara

After Deep Learning (DL) regained popularity recently, the Artificial Intelligence (AI) or Machine Learning (ML) field is undergoing rapid growth concerning research and real-world application development. Deep Learning has generated complexities in algorithms, and researchers and users have raised concerns regarding the usability and adoptability of Deep Learning systems. These concerns, coupled with the increasing human-AI interactions, have created the emerging field that is Human-Centered Machine Learning (HCML). We present this review paper as an overview and analysis of existing work in HCML related to DL. Firstly, we collaborated with field domain experts to develop a working definition for HCML. Secondly, through a systematic literature review, we analyze and classify 162 publications that fall within HCML. Our classification is based on aspects including contribution type, application area, and focused human categories. Finally, we analyze the topology of the HCML landscape by identifying research gaps, highlighting conflicting interpretations, addressing current challenges, and presenting future HCML research opportunities.


Entropy ◽  
2020 ◽  
Vol 23 (1) ◽  
pp. 18
Author(s):  
Pantelis Linardatos ◽  
Vasilis Papastefanopoulos ◽  
Sotiris Kotsiantis

Recent advances in artificial intelligence (AI) have led to its widespread industrial adoption, with machine learning systems demonstrating superhuman performance in a significant number of tasks. However, this surge in performance, has often been achieved through increased model complexity, turning such systems into “black box” approaches and causing uncertainty regarding the way they operate and, ultimately, the way that they come to decisions. This ambiguity has made it problematic for machine learning systems to be adopted in sensitive yet critical domains, where their value could be immense, such as healthcare. As a result, scientific interest in the field of Explainable Artificial Intelligence (XAI), a field that is concerned with the development of new methods that explain and interpret machine learning models, has been tremendously reignited over recent years. This study focuses on machine learning interpretability methods; more specifically, a literature review and taxonomy of these methods are presented, as well as links to their programming implementations, in the hope that this survey would serve as a reference point for both theorists and practitioners.


Author(s):  
Yiming Tang ◽  
Raffi Khatchadourian ◽  
Mehdi Bagherzadeh ◽  
Rhia Singh ◽  
Ajani Stewart ◽  
...  

2021 ◽  
Author(s):  
Fadheela Hussain ◽  
Mustafa Hammad ◽  
Wael El-Medany ◽  
Riadh Ksantini

2019 ◽  
Vol 20 (1) ◽  
pp. 221-256 ◽  
Author(s):  
Helen Nissenbaum

Abstract According to the theory of contextual integrity (CI), privacy norms prescribe information flows with reference to five parameters — sender, recipient, subject, information type, and transmission principle. Because privacy is grasped contextually (e.g., health, education, civic life, etc.), the values of these parameters range over contextually meaningful ontologies — of information types (or topics) and actors (subjects, senders, and recipients), in contextually defined capacities. As an alternative to predominant approaches to privacy, which were ineffective against novel information practices enabled by IT, CI was able both to pinpoint sources of disruption and provide grounds for either accepting or rejecting them. Mounting challenges from a burgeoning array of networked, sensor-enabled devices (IoT) and data-ravenous machine learning systems, similar in form though magnified in scope, call for renewed attention to theory. This Article introduces the metaphor of a data (food) chain to capture the nature of these challenges. With motion up the chain, where data of higher order is inferred from lower-order data, the crucial question is whether privacy norms governing lower-order data are sufficient for the inferred higher-order data. While CI has a response to this question, a greater challenge comes from data primitives, such as digital impulses of mouse clicks, motion detectors, and bare GPS coordinates, because they appear to have no meaning. Absent a semantics, they escape CI’s privacy norms entirely.


2022 ◽  
Vol 54 (8) ◽  
pp. 1-37
Author(s):  
M. G. Sarwar Murshed ◽  
Christopher Murphy ◽  
Daqing Hou ◽  
Nazar Khan ◽  
Ganesh Ananthanarayanan ◽  
...  

Resource-constrained IoT devices, such as sensors and actuators, have become ubiquitous in recent years. This has led to the generation of large quantities of data in real-time, which is an appealing target for AI systems. However, deploying machine learning models on such end-devices is nearly impossible. A typical solution involves offloading data to external computing systems (such as cloud servers) for further processing but this worsens latency, leads to increased communication costs, and adds to privacy concerns. To address this issue, efforts have been made to place additional computing devices at the edge of the network, i.e., close to the IoT devices where the data is generated. Deploying machine learning systems on such edge computing devices alleviates the above issues by allowing computations to be performed close to the data sources. This survey describes major research efforts where machine learning systems have been deployed at the edge of computer networks, focusing on the operational aspects including compression techniques, tools, frameworks, and hardware used in successful applications of intelligent edge systems.


Sign in / Sign up

Export Citation Format

Share Document