scholarly journals A Generative Adversarial Network (GAN) Technique for Internet of Medical Things Data

Sensors ◽  
2021 ◽  
Vol 21 (11) ◽  
pp. 3726
Author(s):  
Ivan Vaccari ◽  
Vanessa Orani ◽  
Alessia Paglialonga ◽  
Enrico Cambiaso ◽  
Maurizio Mongelli

The application of machine learning and artificial intelligence techniques in the medical world is growing, with a range of purposes: from the identification and prediction of possible diseases to patient monitoring and clinical decision support systems. Furthermore, the widespread use of remote monitoring medical devices, under the umbrella of the “Internet of Medical Things” (IoMT), has simplified the retrieval of patient information as they allow continuous monitoring and direct access to data by healthcare providers. However, due to possible issues in real-world settings, such as loss of connectivity, irregular use, misuse, or poor adherence to a monitoring program, the data collected might not be sufficient to implement accurate algorithms. For this reason, data augmentation techniques can be used to create synthetic datasets sufficiently large to train machine learning models. In this work, we apply the concept of generative adversarial networks (GANs) to perform a data augmentation from patient data obtained through IoMT sensors for Chronic Obstructive Pulmonary Disease (COPD) monitoring. We also apply an explainable AI algorithm to demonstrate the accuracy of the synthetic data by comparing it to the real data recorded by the sensors. The results obtained demonstrate how synthetic datasets created through a well-structured GAN are comparable with a real dataset, as validated by a novel approach based on machine learning.

Author(s):  
Derek Reiman ◽  
Yang Dai

AbstractThe microbiome of the human body has been shown to have profound effects on physiological regulation and disease pathogenesis. However, association analysis based on statistical modeling of microbiome data has continued to be a challenge due to inherent noise, complexity of the data, and high cost of collecting large number of samples. To address this challenge, we employed a deep learning framework to construct a data-driven simulation of microbiome data using a conditional generative adversarial network. Conditional generative adversarial networks train two models against each other while leveraging side information learn from a given dataset to compute larger simulated datasets that are representative of the original dataset. In our study, we used a cohorts of patients with inflammatory bowel disease to show that not only can the generative adversarial network generate samples representative of the original data based on multiple diversity metrics, but also that training machine learning models on the synthetic samples can improve disease prediction through data augmentation. In addition, we also show that the synthetic samples generated by this cohort can boost disease prediction of a different external cohort.


2017 ◽  
Author(s):  
Benjamin Sanchez-Lengeling ◽  
Carlos Outeiral ◽  
Gabriel L. Guimaraes ◽  
Alan Aspuru-Guzik

Molecular discovery seeks to generate chemical species tailored to very specific needs. In this paper, we present ORGANIC, a framework based on Objective-Reinforced Generative Adversarial Networks (ORGAN), capable of producing a distribution over molecular space that matches with a certain set of desirable metrics. This methodology combines two successful techniques from the machine learning community: a Generative Adversarial Network (GAN), to create non-repetitive sensible molecular species, and Reinforcement Learning (RL), to bias this generative distribution towards certain attributes. We explore several applications, from optimization of random physicochemical properties to candidates for drug discovery and organic photovoltaic material design.


2017 ◽  
Vol 25 (3) ◽  
pp. 811-827 ◽  
Author(s):  
Dimitris Spathis ◽  
Panayiotis Vlamos

This study examines the clinical decision support systems in healthcare, in particular about the prevention, diagnosis and treatment of respiratory diseases, such as Asthma and chronic obstructive pulmonary disease. The empirical pulmonology study of a representative sample (n = 132) attempts to identify the major factors that contribute to the diagnosis of these diseases. Machine learning results show that in chronic obstructive pulmonary disease’s case, Random Forest classifier outperforms other techniques with 97.7 per cent precision, while the most prominent attributes for diagnosis are smoking, forced expiratory volume 1, age and forced vital capacity. In asthma’s case, the best precision, 80.3 per cent, is achieved again with the Random Forest classifier, while the most prominent attribute is MEF2575.


Author(s):  
Huifang Li ◽  
◽  
Rui Fan ◽  
Qisong Shi ◽  
Zijian Du

Recent advancements in machine learning and communication technologies have enabled new approaches to automated fault diagnosis and detection in industrial systems. Given wide variation in occurrence frequencies of different classes of faults, the class distribution of real-world industrial fault data is usually imbalanced. However, most prior machine learning-based classification methods do not take this imbalance into consideration, and thus tend to be biased toward recognizing the majority classes and result in poor accuracy for minority ones. To solve such problems, we propose a k-means clustering generative adversarial network (KM-GAN)-based fault diagnosis approach able to reduce imbalance in fault data and improve diagnostic accuracy for minority classes. First, we design a new k-means clustering algorithm and GAN-based oversampling method to generate diverse minority-class samples obeying the similar distribution to the original minority data. The k-means clustering algorithm is adopted to divide minority-class samples into k clusters, while a GAN is applied to learn the data distribution of the resulting clusters and generate a given number of minority-class samples as a supplement to the original dataset. Then, we construct a deep neural network (DNN) and deep belief network (DBN)-based heterogeneous ensemble model as a fault classifier to improve generalization, in which DNN and DBN models are trained separately on the resulting dataset, and then the outputs from both are averaged as the final diagnostic result. A series of comparative experiments are conducted to verify the effectiveness of our proposed method, and the experimental results show that our method can improve diagnostic accuracy for minority-class samples.


2021 ◽  
Vol 13 (18) ◽  
pp. 10435
Author(s):  
Seoro Lee ◽  
Jonggun Kim ◽  
Gwanjae Lee ◽  
Jiyeong Hong ◽  
Joo Hyun Bae ◽  
...  

Changes in hydrological characteristics and increases in various pollutant loadings due to rapid climate change and urbanization have a significant impact on the deterioration of aquatic ecosystem health (AEH). Therefore, it is important to effectively evaluate the AEH in advance and establish appropriate strategic plans. Recently, machine learning (ML) models have been widely used to solve hydrological and environmental problems in various fields. However, in general, collecting sufficient data for ML training is time-consuming and labor-intensive. Especially in classification problems, data imbalance can lead to erroneous prediction results of ML models. In this study, we proposed a method to solve the data imbalance problem through data augmentation based on Wasserstein Generative Adversarial Network (WGAN) and to efficiently predict the grades (from A to E grades) of AEH indices (i.e., Benthic Macroinvertebrate Index (BMI), Trophic Diatom Index (TDI), Fish Assessment Index (FAI)) through the ML models. Raw datasets for the AEH indices composed of various physicochemical factors (i.e., WT, DO, BOD5, SS, TN, TP, and Flow) and AEH grades were built and augmented through the WGAN. The performance of each ML model was evaluated through a 10-fold cross-validation (CV), and the performances of the ML models trained on the raw and WGAN-based training sets were compared and analyzed through AEH grade prediction on the test sets. The results showed that the ML models trained on the WGAN-based training set had an average F1-score for grades of each AEH index of 0.9 or greater for the test set, which was superior to the models trained on the raw training set (fewer data compared to other datasets) only. Through the above results, it was confirmed that by using the dataset augmented through WGAN, the ML model can yield better AEH grade predictive performance compared to the model trained on limited datasets; this approach reduces the effort needed for actual data collection from rivers which requires enormous time and cost. In the future, the results of this study can be used as basic data to construct big data of aquatic ecosystems, needed to efficiently evaluate and predict AEH in rivers based on the ML models.


Author(s):  
Md Golam Moula Mehedi Hasan ◽  
Douglas A. Talbert

Counterfactual explanations are gaining in popularity as a way of explaining machine learning models. Counterfactual examples are generally created to help interpret the decision of a model. In this case, if a model makes a certain decision for an instance, the counterfactual examples of that instance reverse the decision of the model. The counterfactual examples can be created by craftily changing particular feature values of the instance. Though counterfactual examples are generated to explain the decision of machine learning models, in this work, we explore another potential application area of counterfactual examples, whether counterfactual examples are useful for data augmentation. We demonstrate the efficacy of this approach on the widely used “Adult-Income” dataset. We consider several scenarios where we do not have enough data and use counterfactual examples to augment the dataset. We compare our approach with Generative Adversarial Networks approach for dataset augmentation. The experimental results show that our proposed approach can be an effective way to augment a dataset.


2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
Andrei Bratu ◽  
Gabriela Czibula

Data augmentation is a commonly used technique in data science for improving the robustness and performance of machine learning models. The purpose of the paper is to study the feasibility of generating synthetic data points of temporal nature towards this end. A general approach named DAuGAN (Data Augmentation using Generative Adversarial Networks) is presented for identifying poorly represented sections of a time series, studying the synthesis and integration of new data points, and performance improvement on a benchmark machine learning model. The problem is studied and applied in the domain of algorithmic trading, whose constraints are presented and taken into consideration. The experimental results highlight an improvement in performance on a benchmark reinforcement learning agent trained on a dataset enhanced with DAuGAN to trade a financial instrument.


Energies ◽  
2019 ◽  
Vol 13 (1) ◽  
pp. 130 ◽  
Author(s):  
Mohammad Navid Fekri ◽  
Ananda Mohon Ghosh ◽  
Katarina Grolinger

The smart grid employs computing and communication technologies to embed intelligence into the power grid and, consequently, make the grid more efficient. Machine learning (ML) has been applied for tasks that are important for smart grid operation including energy consumption and generation forecasting, anomaly detection, and state estimation. These ML solutions commonly require sufficient historical data; however, this data is often not readily available because of reasons such as data collection costs and concerns regarding security and privacy. This paper introduces a recurrent generative adversarial network (R-GAN) for generating realistic energy consumption data by learning from real data. Generativea adversarial networks (GANs) have been mostly used for image tasks (e.g., image generation, super-resolution), but here they are used with time series data. Convolutional neural networks (CNNs) from image GANs are replaced with recurrent neural networks (RNNs) because of RNN’s ability to capture temporal dependencies. To improve training stability and increase quality of generated data, Wasserstein GANs (WGANs) and Metropolis-Hastings GAN (MH-GAN) approaches were applied. The accuracy is further improved by adding features created with ARIMA and Fourier transform. Experiments demonstrate that data generated by R-GAN can be used for training energy forecasting models.


2021 ◽  
Author(s):  
Matthew Li ◽  
Christopher McComb

Abstract Computational Fluid Dynamics (CFD) simulations are useful to the field of engineering design as they provide deep insights on product or system performance without the need to construct and test physical prototypes. However, they can be very computationally intensive to run. Machine learning methods have been shown to reconstruct high-resolution single-phase turbulent fluid flow simulations from low-resolution inputs. This offers a potential avenue towards alleviating computational cost in iterative engineering design applications. However, little work thus far has explored the application of machine learning image super-resolution methods to multiphase fluid flow (which is important for important for emerging fields such as marine hydrokinetic energy conversion). In this work, we apply a modified version of the Super-Resolution Generative Adversarial Network (SRGAN) model to a multiphase turbulent fluid flow problem, specifically to reconstruct fluid phase fraction at a higher resolution. Two models were created in this work, one with a simple physics-constrained loss function and one without, and the results are discussed and analyzed. We found that both models were able to significantly outperform non-machine learning upsampling methods and can preserve an impressive amount of detail and nuance, showing the versatility of the SRGAN model for upsampling fluid simulations. However, the difference in accuracy between the two models is quite minimal. This indicates that, for these contexts studied here, the additional complexity of a physics-informed approach may not be justified.


Risks ◽  
2021 ◽  
Vol 9 (3) ◽  
pp. 49
Author(s):  
Kwanda Sydwell Ngwenduna ◽  
Rendani Mbuvha

To build adequate predictive models, a substantial amount of data is desirable. However, when expanding to new or unexplored territories, this required level of information is rarely always available. To build such models, actuaries often have to: procure data from local providers, use limited unsuitable industry and public research, or rely on extrapolations from other better-known markets. Another common pathology when applying machine learning techniques in actuarial domains is the prevalence of imbalanced classes where risk events of interest, such as mortality and fraud, are under-represented in data. In this work, we show how an implicit model using the Generative Adversarial Network (GAN) can alleviate these problems through the generation of adequate quality data from very limited or highly imbalanced samples. We provide an introduction to GANs and how they are used to synthesize data that accurately enhance the data resolution of very infrequent events and improve model robustness. Overall, we show a significant superiority of GANs for boosting predictive models when compared to competing approaches on benchmark data sets. This work offers numerous of contributions to actuaries with applications to inter alia new sample creation, data augmentation, boosting predictive models, anomaly detection, and missing data imputation.


Sign in / Sign up

Export Citation Format

Share Document