Beyond Simple Images: Human Knowledge-Guided GANs for Clinical Data Generation

2021
Author(s): Devendra Singh Dhami, Mayukh Das, Sriraam Natarajan

While Generative Adversarial Networks (GANs) have accelerated the use of generative modelling within the machine learning community, most adaptations of GANs are restricted to images. GANs have rarely been used to generate clinical data because they cannot faithfully capture the intrinsic relationships between features given a small amount of observational data. We hypothesize and verify that this challenge can be mitigated by incorporating rich domain knowledge, in the form of expert advice, into the generative process. Specifically, we propose human-allied GANs that use correlation advice from humans to create synthetic clinical data. We construct a system that takes a symbolic representation of the expert advice and converts it into constraints on the correlation of the features during the generative process. Our empirical evaluation demonstrates (a) the superiority of our approach over other GAN models, (b) the importance of incorporating advice over instance noise, and (c) an initial framework for incorporating privacy in our model while capturing the relationships between features.
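
One way to picture "constraints on correlation of the features" is a penalty term added to the generator loss. The abstract does not give the exact form of the constraint, so the sketch below is an assumption: the advice is encoded as a hypothetical list of `(i, j, sign)` triples, and `correlation_penalty` is a hypothetical hinge-style penalty that is zero when every advised feature pair has a Pearson correlation of the advised sign.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def correlation_penalty(batch, advice):
    """Sum of hinge penalties over advised feature pairs.

    batch  : list of rows, each a list of feature values (a generated batch)
    advice : list of (i, j, sign) triples, sign = +1 (should correlate
             positively) or -1 (should correlate negatively).
             This encoding is hypothetical, not taken from the paper.
    """
    penalty = 0.0
    for i, j, sign in advice:
        col_i = [row[i] for row in batch]
        col_j = [row[j] for row in batch]
        # Penalize only when the observed correlation disagrees with the advice.
        penalty += max(0.0, -sign * pearson(col_i, col_j))
    return penalty

# Toy generated batch: features 0 and 1 rise together, feature 2 falls.
batch = [[1.0, 2.0, 9.0], [2.0, 4.1, 7.5], [3.0, 6.2, 6.0], [4.0, 7.9, 4.0]]
print(correlation_penalty(batch, [(0, 1, +1)]))  # advice satisfied: no penalty
print(correlation_penalty(batch, [(0, 2, +1)]))  # advice violated: positive penalty
```

In a real training loop, a term like this would need to be computed with a differentiable tensor library so that gradients reach the generator; the pure-Python version only illustrates the quantity being constrained.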


2017
Author(s): Benjamin Sanchez-Lengeling, Carlos Outeiral, Gabriel L. Guimaraes, Alan Aspuru-Guzik

Molecular discovery seeks to generate chemical species tailored to very specific needs. In this paper, we present ORGANIC, a framework based on Objective-Reinforced Generative Adversarial Networks (ORGAN), capable of producing a distribution over molecular space that matches a certain set of desirable metrics. This methodology combines two successful techniques from the machine learning community: a Generative Adversarial Network (GAN), to create non-repetitive, sensible molecular species, and Reinforcement Learning (RL), to bias this generative distribution towards certain attributes. We explore several applications, from optimization of random physicochemical properties to candidates for drug discovery and organic photovoltaic material design.
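
The idea of biasing a GAN's generative distribution with RL can be pictured as a reward that mixes the discriminator's score with a domain objective. The sketch below is an illustration under assumptions, not the ORGANIC implementation: the linear mixing weight `lam`, the function `blended_reward`, and the toy candidate scores are all hypothetical.

```python
def blended_reward(disc_score, objective_score, lam=0.5):
    """Mix a discriminator score with a domain objective, both assumed in [0, 1].

    lam = 1.0 rewards only realism (the plain GAN signal);
    lam = 0.0 rewards only the desired property (the plain RL signal).
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * disc_score + (1.0 - lam) * objective_score

# Hypothetical candidates: (name, discriminator score, property score).
candidates = [("mol_a", 0.9, 0.2), ("mol_b", 0.6, 0.8), ("mol_c", 0.3, 0.9)]

for lam in (1.0, 0.5, 0.0):
    best = max(candidates, key=lambda c: blended_reward(c[1], c[2], lam))
    print(f"lam={lam}: highest-reward candidate is {best[0]}")
```

Sweeping `lam` shows the trade-off: a high weight favors candidates the discriminator finds realistic, while a low weight favors candidates that score well on the target property.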



2021, Vol 7 (8), pp. 133
Author(s): Jonas Denck, Jens Guehring, Andreas Maier, Eva Rothgang

A magnetic resonance imaging (MRI) exam typically consists of the acquisition of multiple MR pulse sequences, which are required for a reliable diagnosis. With the rise of generative deep learning models, approaches for the synthesis of MR images have been developed to synthesize additional MR contrasts, generate synthetic data, or augment existing data for AI training. While current generative approaches allow only the synthesis of specific sets of MR contrasts, we developed a method to generate synthetic MR images with adjustable image contrast. To this end, we trained a generative adversarial network (GAN) with a separate auxiliary classifier (AC) network to generate synthetic MR knee images conditioned on various acquisition parameters (repetition time, echo time, and image orientation). The AC determined the repetition time with a mean absolute error (MAE) of 239.6 ms, the echo time with an MAE of 1.6 ms, and the image orientation with an accuracy of 100%. Therefore, it can properly condition the generator network during training. Moreover, in a visual Turing test, two experts mislabeled 40.5% of real and synthetic MR images, demonstrating that the image quality of the generated synthetic and real MR images is comparable. This work can support radiologists and technologists during the parameterization of MR sequences by previewing the yielded MR contrast, can serve as a valuable tool for radiology training, and can be used for customized data generation to support AI training.
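
The metrics reported for the auxiliary classifier are standard: mean absolute error for the continuous acquisition parameters and accuracy for the categorical one. A minimal sketch of both follows; the function names and all values are made up for illustration and are not data from the study.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def accuracy(labels_true, labels_pred):
    """Fraction of predictions that match the true label exactly."""
    if len(labels_true) != len(labels_pred) or not labels_true:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(t == p for t, p in zip(labels_true, labels_pred))
    return matches / len(labels_true)

# Illustrative echo times in ms (not from the paper).
te_true = [10.0, 30.0, 80.0, 100.0]
te_pred = [12.0, 29.0, 83.0, 100.0]
print(mean_absolute_error(te_true, te_pred))  # (2 + 1 + 3 + 0) / 4 = 1.5

# Illustrative image orientations (not from the paper).
orient_true = ["sagittal", "coronal", "axial", "axial"]
orient_pred = ["sagittal", "coronal", "axial", "coronal"]
print(accuracy(orient_true, orient_pred))  # 3 of 4 correct = 0.75
```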



2020, pp. 1-13
Author(s): Yundong Li, Yi Liu, Han Dong, Wei Hu, Chen Lin

The intrusion detection of railway clearance is crucial for avoiding railway accidents caused by the invasion of abnormal objects, such as pedestrians, falling rocks, and animals. However, detecting intrusions using deep learning methods from infrared images captured at night remains a challenging task because of the lack of sufficient training samples. To address this issue, a transfer strategy that migrates daytime RGB images to the nighttime style of infrared images is proposed in this study. The proposed method consists of two stages. In the first stage, a data generation model is trained on the basis of generative adversarial networks using RGB images and a small number of infrared images, and then, synthetic samples are generated using a well-trained model. In the second stage, a single shot multibox detector (SSD) model is trained using synthetic data and utilized to detect abnormal objects from infrared images at nighttime. To validate the effectiveness of the proposed method, two groups of experiments, namely, railway and non-railway scenes, are conducted. Experimental results demonstrate the effectiveness of the proposed method, and an improvement of 17.8% is achieved for object detection at nighttime.



Author(s): Maomi Ueno

This study describes an agent that acquires domain knowledge related to the content from a learning history log database in a learning community and automatically generates motivational messages for the learner. The unique features of this system are as follows: The agent builds a learner model automatically by applying the decision tree model. The agent predicts a learner’s final status (Failed; Abandon; Successful; or Excellent) using the learner model and his/her current learning history log data. The constructed learner model becomes more exact as the amount of data accumulated in the database increases. Furthermore, the agent compares a learner’s learning processes with “Excellent” status learners’ learning processes stored in the database, diagnoses the learner’s learning processes, and generates adaptive instructional messages for the learner. A comparison between a class of students that used the system and one that did not demonstrates the effectiveness of the system.



2010, pp. 170-184
Author(s): David DiBiase, Mark Gahegan

This chapter investigates the problem of connecting advanced domain knowledge (from geography educators in this instance) with the strong pedagogic descriptions provided by colleagues from the University of Southampton, as described in Chapter IX, and then adding to this the learning materials that together comprise a learning object. Specifically, the chapter describes our efforts to enhance our open-source concept mapping tool (ConceptVista) with a variety of tools and methods that support the visualization, integration, packaging, and publishing of learning objects. We give examples of learning objects created from existing course materials, but enhanced with formal descriptions of both domain content and pedagogy. We then show how such descriptions can offer significant advantages in terms of making domain and pedagogic knowledge explicit, browsing such knowledge to better communicate educational aims and processes, tracking the development of ideas amongst the learning community, providing richer indices into learning material, and packaging these learning materials together with their descriptive knowledge. We explain how the resulting learning objects might be deployed within next-generation digital libraries that provide rich search languages to help educators locate useful learning objects from vast collections of learning materials.



Entropy, 2020, Vol 22 (8), pp. 888
Author(s): Frantzeska Lavda, Magda Gregorová, Alexandros Kalousis

One of the major shortcomings of variational autoencoders is the inability to produce generations from the individual modalities of data originating from mixture distributions. This is primarily due to the use of a simple isotropic Gaussian as the prior for the latent code in the ancestral sampling procedure for data generation. In this paper, we propose a novel formulation of variational autoencoders, conditional prior VAE (CP-VAE), with a two-level generative process for the observed data in which a continuous variable z and a discrete variable c are introduced in addition to the observed variables x. By learning data-dependent conditional priors, the new variational objective naturally encourages a better match between the posterior and prior conditionals, and the learning of latent categories that encode the major source of variation of the original data in an unsupervised manner. By sampling the continuous latent code from the data-dependent conditional priors, we are able to generate new samples from the individual mixture components corresponding to the multimodal structure of the original data. Moreover, we unify and analyse our objective under different independence assumptions for the joint distribution of the continuous and discrete latent variables. We provide an empirical evaluation on one synthetic dataset and three image datasets (FashionMNIST, MNIST, and Omniglot), illustrating the generative performance of our new model compared to multiple baselines.
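
The two-level generative process can be sketched as ancestral sampling: first draw a discrete category c, then draw the continuous code z from the prior conditioned on c. The example below is a toy under stated assumptions, not the CP-VAE model: the one-dimensional Gaussian conditional priors in `COND_PRIORS`, the category probabilities, and the function `sample_latent` are hypothetical, the prior parameters are fixed by hand rather than learned, and the decoder is omitted.

```python
import random

# Hypothetical conditional priors p(z | c): one 1-D Gaussian per category c.
# In CP-VAE these would be learned from data; here they are fixed toy values.
COND_PRIORS = {0: (-3.0, 0.5), 1: (0.0, 0.5), 2: (3.0, 0.5)}  # c -> (mean, std)
CATEGORY_PROBS = [0.2, 0.5, 0.3]                               # p(c)

def sample_latent(rng, c=None):
    """Two-level ancestral sampling: draw c ~ p(c) unless given, then z ~ p(z | c)."""
    if c is None:
        c = rng.choices(range(len(CATEGORY_PROBS)), weights=CATEGORY_PROBS)[0]
    mean, std = COND_PRIORS[c]
    z = rng.gauss(mean, std)
    return c, z  # a decoder (omitted here) would map the latent code to an observation x

rng = random.Random(0)

# Fixing c targets a single mixture component...
zs_c0 = [sample_latent(rng, c=0)[1] for _ in range(2000)]
print(sum(zs_c0) / len(zs_c0))  # close to the component mean of -3.0

# ...while leaving c free samples from the full mixture.
print([sample_latent(rng)[0] for _ in range(10)])
```

With a single isotropic Gaussian prior there is no handle like `c` to select a mode, which is the shortcoming the abstract describes.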



2011, Vol 58-60, pp. 2085-2090
Author(s): Xin Xin Liu, Shao Hua Tang, Kai Wei

This paper presents OntoRT, an ontology model for the Role-based Trust-management (RT) framework, which covers a large fragment of RT including RT0, RT1, RT2 and application domain specification documents (ADSDs). RT addresses distributed authorization problems in decentralized collaborative systems. OntoRT establishes a common vocabulary for RT roles and policies across domains. We describe OntoRT formally in the Description Logic (DL) SHOIN(D) and DL-safe SWRL rules. Based on our logical formalization, it is feasible to authorize and analyze RT policies automatically using state-of-the-art DL reasoners. Finally, we show how OntoRT can be integrated with OWL-DL ontologies, which are the W3C standard for representing information on the Web. By referring to OWL-DL ontologies that provide rich domain knowledge, the specification and management of RT policies are simplified.



2017, Vol 2017, pp. 1-10
Author(s): Jung-wei Fan, Jianrong Li, Yves A. Lussier

The exposome is a critical dimension in the precision medicine paradigm. Effective representation of exposomics knowledge is instrumental to melding nongenetic factors into data analytics for clinical research. There is still limited work in (1) modeling exposome entities and relations with proper integration into mainstream ontologies and (2) systematically studying their presence in clinical context. Through selected ontological relations, we developed a template-driven approach to identifying exposome concepts from the Unified Medical Language System (UMLS). The derived concepts were evaluated in terms of literature coverage and the ability to assist in annotating clinical text. The generated semantic model represents rich domain knowledge about exposure events (454 pairs of relations between exposure and outcome). Additionally, a list of 5667 disorder concepts with microbial etiology was created for inferred pathogen exposures. The model consistently covered about 90% of PubMed literature on exposure-induced iatrogenic diseases over 10 years (2001–2010). The model contributed to the efficiency of exposome annotation in clinical text by filtering out 78% of irrelevant machine annotations. Analysis of 50 annotated discharge summaries helped advance our understanding of the exposome information in clinical text. This pilot study demonstrated the feasibility of semiautomatically developing a useful semantic resource for exposomics.



Digital technology has been changing rapidly in recent years, and with this change the number of data systems, sources, and formats has also increased exponentially. As a result, the process of extracting data from these multiple source systems and transforming it to suit various analytics processes is rapidly gaining importance. When handling Big Data, the transformation process is quite challenging because data generation is continuous. In this paper, we extract data from various heterogeneous sources on the web and transform it into a form that is widely used in data warehousing, so that it caters to the analytical needs of the machine learning community.


