Orchestrating Heterogeneous Devices and AI Services as Virtual Sensors for Secure Cloud-Based IoT Applications

Sensors ◽  
2021 ◽  
Vol 21 (22) ◽  
pp. 7509
Author(s):  
Sebastian Alberternst ◽  
Alexander Anisimov ◽  
Andre Antakli ◽  
Benjamin Duppe ◽  
Hilko Hoffmann ◽  
...  

The concept of the cloud-to-thing continuum addresses advancements made possible by the widespread adoption of cloud, edge, and IoT resources. It opens the possibility of combining classical symbolic AI with advanced machine learning approaches in a meaningful way. In this paper, we present a thing registry and an agent-based orchestration framework, which we combine to support semantic orchestration of IoT use cases across several federated cloud environments. We use the concept of virtual sensors based on machine learning (ML) services as abstraction, mediating between the instance level and the semantic level. We present examples of virtual sensors based on ML models for activity recognition and describe an approach to remedy the problem of missing or scarce training data. We illustrate the approach with a use case from an assisted living scenario.
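The virtual-sensor idea described above can be sketched in a few lines: an ML inference service is wrapped so that it registers and resolves like any other sensor. This is an illustrative sketch only; the class and method names (`VirtualSensor`, `ThingRegistry`, the `sosa:Activity` term) are assumptions, not the authors' actual framework or API.

```python
# Hypothetical sketch of a "virtual sensor": an ML model wrapped with a
# semantic description so a registry can resolve it like a physical sensor.
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class VirtualSensor:
    """Wraps an ML inference function as a semantically described sensor."""
    name: str
    semantic_type: str          # e.g. an ontology term for the observed property
    infer: Callable[[Dict[str, float]], str]

class ThingRegistry:
    """Minimal registry resolving semantic types to registered (virtual) sensors."""
    def __init__(self) -> None:
        self._sensors: Dict[str, List[VirtualSensor]] = {}

    def register(self, sensor: VirtualSensor) -> None:
        self._sensors.setdefault(sensor.semantic_type, []).append(sensor)

    def lookup(self, semantic_type: str) -> List[VirtualSensor]:
        return self._sensors.get(semantic_type, [])

# Toy activity-recognition "model": thresholds an accelerometer magnitude.
def activity_model(readings: Dict[str, float]) -> str:
    return "walking" if readings["accel_magnitude"] > 1.2 else "resting"

registry = ThingRegistry()
registry.register(VirtualSensor("act-1", "sosa:Activity", activity_model))
sensor = registry.lookup("sosa:Activity")[0]
print(sensor.infer({"accel_magnitude": 1.5}))
```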

2019 ◽  
Vol 11 (3) ◽  
pp. 284 ◽  
Author(s):  
Linglin Zeng ◽  
Shun Hu ◽  
Daxiang Xiang ◽  
Xiang Zhang ◽  
Deren Li ◽  
...  

Soil moisture mapping at a regional scale is commonplace, since these data are required in many applications, such as hydrological and agricultural analyses. The use of remotely sensed data for the estimation of deep soil moisture at a regional scale has received far less emphasis. The objective of this study was to map the 500-m, 8-day average and daily soil moisture at different soil depths in Oklahoma from remotely sensed and ground-measured data using the random forest (RF) method, a machine learning approach. In order to investigate the estimation accuracy of the RF method at both spatial and temporal scales, two independent soil moisture estimation experiments were conducted using data from 2010 to 2014: a year-to-year experiment (with a root mean square error (RMSE) ranging from 0.038 to 0.050 m3/m3) and a station-to-station experiment (with an RMSE ranging from 0.044 to 0.057 m3/m3). The data requirements, importance factors, and spatial and temporal variations in estimation accuracy were then discussed based on results obtained using training data selected by iterated random sampling. The highly accurate estimates of both surface and deep soil moisture for the study area reveal the potential of RF methods for mapping soil moisture at a regional scale, especially considering the high heterogeneity of land-cover types and topography in the study area.
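A minimal sketch of the random forest setup described above, using scikit-learn on synthetic stand-in predictors; the feature set and the station-based hold-out are illustrative assumptions, not the study's actual data or pipeline.

```python
# Sketch: random forest regression of soil moisture from remotely sensed
# predictors, evaluated with a "station-to-station" style split.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
n = 500
X = rng.uniform(size=(n, 4))    # stand-ins for remotely sensed predictors
# Synthetic soil moisture (m3/m3) driven by two of the predictors plus noise.
y = 0.1 + 0.3 * X[:, 0] - 0.1 * X[:, 1] + 0.02 * rng.standard_normal(n)

# Station-to-station split: hold out whole pseudo-stations, not random rows.
station = rng.integers(0, 10, size=n)
train, test = station < 8, station >= 8

rf = RandomForestRegressor(n_estimators=200, random_state=0)
rf.fit(X[train], y[train])
rmse = float(np.sqrt(mean_squared_error(y[test], rf.predict(X[test]))))
print(f"RMSE: {rmse:.3f} m3/m3")
```

Holding out whole stations (rather than random rows) is what makes the evaluation test spatial generalization instead of interpolation.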


Author(s):  
Julien Siebert ◽  
Lisa Joeckel ◽  
Jens Heidrich ◽  
Adam Trendowicz ◽  
Koji Nakamichi ◽  
...  

Nowadays, systems containing components based on machine learning (ML) methods are becoming more widespread. In order to ensure the intended behavior of a software system, there are standards that define the necessary qualities of the system and its components (such as ISO/IEC 25010). Due to the different nature of ML, we have to re-interpret existing qualities for ML systems or add new ones (such as trustworthiness). We have to be very precise about which quality property is relevant for which entity of interest (such as completeness of training data or correctness of the trained model), and how to objectively evaluate adherence to quality requirements. In this article, we present how to systematically construct quality models for ML systems based on an industrial use case. This quality model enables practitioners to specify and assess qualities for ML systems objectively. In addition to the overall construction process described, the main outcomes include a meta-model for specifying quality models for ML systems; reference elements regarding relevant views, entities, quality properties, and measures for ML systems based on existing research; an example instantiation of a quality model for a concrete industrial use case; and lessons learned from applying the construction process. We found that it is crucial to follow a systematic process in order to come up with measurable quality properties that can be evaluated in practice. In the future, we want to learn how the term quality differs between different types of ML systems and come up with reference quality models for evaluating the qualities of ML systems.
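The core idea of attaching measurable quality properties to entities of interest can be illustrated with a small sketch. The class design, measure, and threshold below are assumptions for illustration, not the article's actual meta-model.

```python
# Illustrative sketch: a quality property bound to an entity of interest,
# with an objective measure and an acceptance threshold.
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class QualityProperty:
    entity: str                              # e.g. "training data", "trained model"
    name: str                                # e.g. "completeness", "correctness"
    measure: Callable[[Dict[str, float]], float]  # objective measure on evidence
    threshold: float                         # acceptance criterion

    def evaluate(self, evidence: Dict[str, float]) -> bool:
        """Check adherence: measured value must meet the threshold."""
        return self.measure(evidence) >= self.threshold

completeness = QualityProperty(
    entity="training data",
    name="completeness",
    measure=lambda e: e["labeled_rows"] / e["total_rows"],
    threshold=0.95,
)
print(completeness.evaluate({"labeled_rows": 980, "total_rows": 1000}))
```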


2020 ◽  
Author(s):  
Paul Francoeur ◽  
Tomohide Masuda ◽  
David R. Koes

One of the main challenges in drug discovery is predicting protein-ligand binding affinity. Recently, machine learning approaches have made substantial progress on this task. However, current methods of model evaluation are overly optimistic in measuring generalization to new targets, and there does not exist a standard dataset of sufficient size to compare performance between models. We present a new dataset for structure-based machine learning, the CrossDocked2020 set, with 22.5 million poses of ligands docked into multiple similar binding pockets across the Protein Data Bank, and perform a comprehensive evaluation of grid-based convolutional neural network models on this dataset. We also demonstrate how the partitioning of the training data and test data can impact the results of models trained with the PDBbind dataset, how performance improves by adding more, lower-quality training data, and how training with docked poses imparts pose sensitivity to the predicted affinity of a complex. Our best-performing model, an ensemble of 5 densely connected convolutional networks, achieves a root mean squared error of 1.42 and Pearson R of 0.612 on the affinity prediction task, an AUC of 0.956 at binding pose classification, and a 68.4% accuracy at pose selection on the CrossDocked2020 set. By providing data splits for clustered cross-validation and the raw data for the CrossDocked2020 set, we establish the first standardized dataset for training machine learning models to recognize ligands in non-cognate target structures while also greatly expanding the number of poses available for training. In order to facilitate community adoption of this dataset for benchmarking protein-ligand binding affinity prediction, we provide our models, weights, and the CrossDocked2020 set at https://github.com/gnina/models.
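The three reported metrics (RMSE, Pearson R, AUC) and the 5-member ensemble averaging can be sketched on toy data. This is not the gnina/CrossDocked2020 code; all numbers below are synthetic stand-ins.

```python
# Sketch: compute RMSE, Pearson R, and pose-classification AUC on toy data,
# with an "ensemble" formed by averaging 5 noisy predictors.
import numpy as np
from scipy.stats import pearsonr
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
true_affinity = rng.uniform(2, 10, size=200)      # pK-like scale, synthetic

# "Ensemble of 5 CNNs": here, 5 noisy predictors whose outputs are averaged.
members = [true_affinity + rng.normal(0, 1.5, size=200) for _ in range(5)]
pred_affinity = np.mean(members, axis=0)

rmse = float(np.sqrt(np.mean((pred_affinity - true_affinity) ** 2)))
pearson_r, _ = pearsonr(true_affinity, pred_affinity)

# Pose classification: poses within 2 A RMSD of the reference count as good;
# the classifier's score here is a noisy function of the true pose RMSD.
pose_rmsd = rng.uniform(0, 8, size=200)
labels = (pose_rmsd <= 2.0).astype(int)
auc = roc_auc_score(labels, -pose_rmsd + rng.normal(0, 0.5, size=200))
print(rmse, pearson_r, auc)
```

Averaging the members shrinks the prediction noise by roughly the square root of the ensemble size, which is the usual motivation for ensembling independent models.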


Science ◽  
2021 ◽  
Vol 371 (6535) ◽  
pp. eabe8628
Author(s):  
Marshall Burke ◽  
Anne Driscoll ◽  
David B. Lobell ◽  
Stefano Ermon

Accurate and comprehensive measurements of a range of sustainable development outcomes are fundamental inputs into both research and policy. We synthesize the growing literature that uses satellite imagery to understand these outcomes, with a focus on approaches that combine imagery with machine learning. We quantify the paucity of ground data on key human-related outcomes and the growing abundance and improving resolution (spatial, temporal, and spectral) of satellite imagery. We then review recent machine learning approaches to model-building in the context of scarce and noisy training data, highlighting how this noise often leads to incorrect assessment of model performance. We quantify recent model performance across multiple sustainable development domains, discuss research and policy applications, explore constraints to future progress, and highlight research directions for the field.



Author(s):  
Gebreab K. Zewdie ◽  
David J. Lary ◽  
Estelle Levetin ◽  
Gemechu F. Garuma

Allergies to airborne pollen are a significant issue affecting millions of Americans. Consequently, accurately predicting the daily concentration of airborne pollen is of significant public benefit in providing timely alerts. This study presents a method for the robust estimation of the concentration of airborne Ambrosia pollen using a suite of machine learning approaches, including deep learning and ensemble learners. Each of these approaches utilizes atmospheric weather and land surface reanalysis data from the European Centre for Medium-Range Weather Forecasts (ECMWF). The machine learning approaches used to develop the suite of empirical models are deep neural networks, extreme gradient boosting, random forests, and Bayesian ridge regression. The training data, comprising twenty-four years of daily pollen concentration measurements together with ECMWF weather and land surface reanalysis data from 1987 to 2011, are used to develop the predictive models. The last six years of the dataset, from 2012 to 2017, are used to independently test the performance of the models. The correlation coefficients between the estimated and actual pollen abundance on the independent validation dataset for the deep neural networks, random forest, extreme gradient boosting, and Bayesian ridge models were 0.82, 0.81, 0.81, and 0.75, respectively, showing that machine learning can be used to effectively forecast the concentrations of airborne pollen.
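A hedged sketch of the four-model comparison described above, using scikit-learn stand-ins (`GradientBoostingRegressor` in place of extreme gradient boosting, a small MLP in place of the deep neural network) on synthetic weather-like features rather than ECMWF reanalysis data:

```python
# Sketch: train four regressors on the earlier "years", test on held-out later
# "years", and compare correlation between predicted and actual pollen counts.
import numpy as np
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.linear_model import BayesianRidge
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(2)
X = rng.uniform(size=(600, 5))                 # stand-ins for weather/land features
y = 50 * X[:, 0] + 30 * X[:, 1] ** 2 + rng.normal(0, 3, size=600)  # pollen grains/m^3

# Chronological split: earlier samples for training, later ones for testing.
Xtr, Xte, ytr, yte = X[:480], X[480:], y[:480], y[480:]
ym, ys = ytr.mean(), ytr.std()                 # standardize targets (helps the MLP)

models = {
    "neural network": MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000,
                                   random_state=0),
    "random forest": RandomForestRegressor(n_estimators=100, random_state=0),
    "gradient boosting": GradientBoostingRegressor(random_state=0),
    "bayesian ridge": BayesianRidge(),
}
correlations = {}
for name, model in models.items():
    model.fit(Xtr, (ytr - ym) / ys)
    pred = model.predict(Xte) * ys + ym
    correlations[name] = np.corrcoef(yte, pred)[0, 1]
print(correlations)
```

The chronological split mirrors the paper's design of training on 1987-2011 and validating independently on 2012-2017.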


2020 ◽  
Vol 2020 (14) ◽  
pp. 341-1-341-10
Author(s):  
Han Hu ◽  
Yang Lei ◽  
Daisy Xin ◽  
Viktor Shkolnikov ◽  
Steven Barcelo ◽  
...  

Separation and isolation of living cells plays an important role in the fields of medicine and biology with label-free imaging often used for isolating cells. The analysis of label-free cell images has many challenges when examining the behavior of cells. This paper presents methods to analyze label-free cells. Many of the tools we describe are based on machine learning approaches. We also investigate ways of augmenting limited availability of training data. Our results demonstrate that our proposed methods are capable of successfully segmenting and classifying label-free cells.
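One common way to augment scarce image training data, as the paper investigates, is to generate flipped and rotated variants of each labeled patch. The specific transforms below are generic assumptions, not the authors' exact pipeline.

```python
# Sketch: expand one labeled patch into 8 variants (the dihedral group of the
# square: 4 rotations, each with and without a horizontal flip).
import numpy as np

def augment(image: np.ndarray) -> list:
    """Return 8 flipped/rotated variants of `image` (including the original)."""
    variants = []
    for k in range(4):                       # 0, 90, 180, 270 degree rotations
        rotated = np.rot90(image, k)
        variants.append(rotated)
        variants.append(np.fliplr(rotated))  # plus a horizontal flip of each
    return variants

cell_patch = np.arange(16, dtype=float).reshape(4, 4)  # stand-in label-free patch
augmented = augment(cell_patch)
print(len(augmented))
```

These geometric transforms are label-preserving for segmentation and classification of cells, which have no preferred orientation, so each labeled patch yields eight training samples.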


Author(s):  
Kai Hu ◽  
Zhaodi Zhou ◽  
Liguo Weng ◽  
Jia Liu ◽  
Lihua Wang ◽  
...  

Machine learning is a subfield of artificial intelligence concerned with techniques that allow computers to improve their outputs based on previous experience. Among the many machine learning algorithms, the Weighted Extreme Learning Machine (WELM) is a prominent recent example. It not only retains the Extreme Learning Machine (ELM)'s extremely fast training speed and better generalization performance than a traditional neural network (NN), but also handles imbalanced data well by assigning more weight to the minority class and less weight to the majority class. However, its weights are generated from the class distribution of the training data, which creates a dependency on the input data [R. Sharma and A. S. Bist, Genetic algorithm based weighted extreme learning machine for binary imbalance learning, 2015 Int. Conf. Cognitive Computing and Information Processing (CCIP) (IEEE, 2015), pp. 1–6; N. Koutsouleris, Classification/machine learning approaches, Annu. Rev. Clin. Psychol. 13(1) (2016); G. Dudek, Extreme learning machine for function approximation–interval problem of input weights and biases, 2015 IEEE 2nd Int. Conf. Cybernetics (CYBCONF) (IEEE, 2015), pp. 62–67; N. Zhang, Y. Qu and A. Deng, Evolutionary extreme learning machine based weighted nearest-neighbor equality classification, 2015 7th Int. Conf. Intelligent Human-Machine Systems and Cybernetics (IHMSC), Vol. 2 (IEEE, 2015), pp. 274–279]. As a result, WELM may fail to find the optimal weights at which good generalization performance can be achieved [Sharma and Bist (2015); Koutsouleris (2016); Dudek (2015); Zhang, Qu and Deng (2015)]. To solve this, we propose a hybrid algorithm composed of WELM and Particle Swarm Optimization (PSO). First, it distributes the weights according to the number of samples in each class, determining the weighting method; then, it combines the ELM model with this weighting method to establish the WELM model; finally, it uses PSO to optimize WELM's three parameters (input weights, biases, and the weights of the imbalanced training data). Experiments on both prediction and recognition tasks show that it outperforms classical WELM algorithms.
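The PSO component of the hybrid algorithm can be sketched as a compact loop. The objective below is a stand-in quadratic for WELM's validation error (training a real WELM is out of scope here), and the inertia/acceleration hyperparameters are common defaults, not the paper's values.

```python
# Sketch: global-best particle swarm optimization over a stand-in objective
# representing WELM validation error as a function of its tunable parameters.
import numpy as np

def pso(objective, dim, n_particles=30, iters=100, seed=0):
    """Minimize `objective` over R^dim with a standard global-best PSO."""
    rng = np.random.default_rng(seed)
    pos = rng.uniform(-5, 5, size=(n_particles, dim))
    vel = np.zeros((n_particles, dim))
    pbest = pos.copy()                                   # personal bests
    pbest_val = np.apply_along_axis(objective, 1, pos)
    gbest = pbest[np.argmin(pbest_val)].copy()           # global best
    for _ in range(iters):
        r1, r2 = rng.uniform(size=(2, n_particles, dim))
        # inertia + cognitive pull toward pbest + social pull toward gbest
        vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
        pos = pos + vel
        vals = np.apply_along_axis(objective, 1, pos)
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
        gbest = pbest[np.argmin(pbest_val)].copy()
    return gbest, pbest_val.min()

# Stand-in for WELM validation error over its three parameters
# (input weight, bias, imbalance weight); the true optimum is known here.
def welm_loss(p):
    return float(np.sum((p - np.array([0.5, -1.0, 2.0])) ** 2))

best, best_val = pso(welm_loss, dim=3)
print(best, best_val)
```

In the hybrid method, evaluating a particle would mean training a WELM with that particle's parameters and measuring its validation performance; the swarm mechanics are unchanged.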


2021 ◽  
Vol 13 (2) ◽  
pp. 275
Author(s):  
Michael Meadows ◽  
Matthew Wilson

Given the high financial and institutional cost of collecting and processing accurate topography data, many large-scale flood hazard assessments continue to rely instead on freely-available global Digital Elevation Models, despite the significant vertical biases known to affect them. To predict (and thereby reduce) these biases, we apply a fully-convolutional neural network (FCN), a form of artificial neural network originally developed for image segmentation which is capable of learning from multi-variate spatial patterns at different scales. We assess its potential by training such a model on a wide variety of remote-sensed input data (primarily multi-spectral imagery), using high-resolution, LiDAR-derived Digital Terrain Models published by the New Zealand government as the reference topography data. In parallel, two more widely used machine learning models are also trained, in order to provide benchmarks against which the novel FCN may be assessed. We find that the FCN outperforms the other models (reducing root mean square error in the testing dataset by 71%), likely due to its ability to learn from spatial patterns at multiple scales, rather than only a pixel-by-pixel basis. Significantly for flood hazard modelling applications, corrections were found to be especially effective along rivers and their floodplains. However, our results also suggest that models are likely to be biased towards the land cover and relief conditions most prevalent in their training data, with further work required to assess the importance of limiting training data inputs to those most representative of the intended application area(s).
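The pixel-wise benchmark setup that the FCN is compared against can be sketched as follows: predict the DEM's vertical error at each pixel from co-located remote-sensing features with a standard regressor, then subtract the prediction. The features, coefficients, and regressor choice below are illustrative assumptions, not the study's data or models.

```python
# Sketch: learn a per-pixel DEM vertical-error model and apply it as a
# correction, comparing RMSE before and after.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(3)
n = 2000
features = rng.uniform(size=(n, 3))    # e.g. spectral bands, slope (synthetic)
# Synthetic DEM vertical error with a learnable component plus noise (metres).
dem_error = 2.0 * features[:, 0] - 1.0 * features[:, 2] + rng.normal(0, 0.3, size=n)

tr = slice(0, 1600)
te = slice(1600, None)
model = GradientBoostingRegressor(random_state=0).fit(features[tr], dem_error[tr])

raw_rmse = float(np.sqrt(np.mean(dem_error[te] ** 2)))
corrected = dem_error[te] - model.predict(features[te])   # subtract predicted bias
corrected_rmse = float(np.sqrt(np.mean(corrected ** 2)))
print(raw_rmse, corrected_rmse)
```

A per-pixel model like this can only exploit the feature values at each location; the FCN's advantage in the study comes from additionally learning spatial patterns across neighbouring pixels at multiple scales.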


Sensors ◽  
2021 ◽  
Vol 21 (2) ◽  
pp. 546
Author(s):  
Omer Mujahid ◽  
Ivan Contreras ◽  
Josep Vehi

(1) Background: The use of machine learning techniques for anticipating hypoglycemia has increased considerably in the past few years. Hypoglycemia is the drop in blood glucose below critical levels in diabetic patients. It may cause loss of cognitive ability, seizures, and, in extreme cases, death. In almost half of all severe cases, hypoglycemia arrives unannounced and is essentially asymptomatic. The inability of a diabetic patient to anticipate and intervene in the occurrence of a hypoglycemic event often results in crisis. Hence, the prediction of hypoglycemia is a vital step in improving a diabetic patient's quality of life. The objective of this paper is to review work performed in the domain of hypoglycemia prediction using machine learning, and to explore the latest trends and challenges that researchers face in this area. (2) Methods: Literature obtained from PubMed and Google Scholar was reviewed, covering manuscripts from the last five years. A total of 903 papers were initially selected, of which 57 were eventually shortlisted for detailed review. (3) Results: A thorough dissection of the shortlisted manuscripts revealed an interesting split between works in two categories: hypoglycemia prediction and hypoglycemia detection. The entire review was carried out with this categorical distinction in perspective, while providing a thorough overview of the machine learning approaches used to anticipate hypoglycemia, the types of training data, and the prediction horizons.
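The notion of a prediction horizon mentioned above can be made concrete with a small sketch: turning a continuous glucose monitoring (CGM) series into supervised training windows labeled by future hypoglycemia. The 70 mg/dL threshold is a widely used hypoglycemia cutoff, and the window sizes are illustrative conventions, not values from any single reviewed paper.

```python
# Sketch: build (features, labels) from a glucose trace, where the label says
# whether glucose falls below the threshold `horizon` steps in the future.
import numpy as np

def make_dataset(glucose, window, horizon, threshold=70.0):
    """Return (X, y): past `window` readings, hypoglycemia `horizon` steps ahead."""
    X, y = [], []
    for t in range(window, len(glucose) - horizon):
        X.append(glucose[t - window:t])                  # input: recent history
        y.append(int(glucose[t + horizon] < threshold))  # label: future hypo event
    return np.array(X), np.array(y)

# Toy CGM trace (mg/dL), one reading every 5 minutes; it dips below 70 briefly.
trace = np.array([110, 105, 100, 95, 90, 85, 80, 75, 68, 65, 72, 80], dtype=float)
X, y = make_dataset(trace, window=4, horizon=2)          # 10-minute horizon
print(X.shape, y)
```

This framing is what separates the review's two categories: prediction uses a positive horizon (anticipating the event), while detection labels the current reading (horizon of zero).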

