Environmental Adaptation and Differential Replication in Machine Learning

Entropy ◽  
2020 ◽  
Vol 22 (10) ◽  
pp. 1122
Author(s):  
Irene Unceta ◽  
Jordi Nin ◽  
Oriol Pujol

When deployed in the wild, machine learning models are usually confronted with an environment that imposes severe constraints. As this environment evolves, so do these constraints. As a result, the feasible set of solutions for the considered need is prone to change in time. We refer to this problem as that of environmental adaptation. In this paper, we formalize environmental adaptation and discuss how it differs from other problems in the literature. We propose solutions based on differential replication, a technique where the knowledge acquired by the deployed models is reused in specific ways to train more suitable future generations. We discuss different mechanisms to implement differential replications in practice, depending on the considered level of knowledge. Finally, we present seven examples where the problem of environmental adaptation can be solved through differential replication in real-life applications.

2021 ◽  
Vol 36 (1) ◽  
pp. 583-589
Author(s):  
Suraya Masrom ◽  
Thuraiya Mohd ◽  
Nur Syafiqah Jamil

Researchers and industry players acknowledge that machine learning applications are useful in assisting humans with solving many kinds of real-life problems, including in the real estate and property industry. In this paper, we present the empirical steps for implementing machine learning approaches in the prediction of green building prices. Green buildings conserve natural resources and reduce the negative impact of building development. This paper reports on the data collection method, the preliminary data analysis with statistical methods, and the experimental implementation of the machine learning models, from training and validation to testing. The results show that the tree-based machine learning models produced better performance on the green building properties, and these models were further tested on another five hold-out data. The testing results show that the tree-based machine learning scheme predicted a green building price higher than the observed price in eight out of the ten cases, within the acceptable valuation ranges.


2021 ◽  
Vol 11 (5) ◽  
pp. 2158
Author(s):  
Fida K. Dankar ◽  
Mahmoud Ibrahim

Synthetic data provides a privacy protecting mechanism for the broad usage and sharing of healthcare data for secondary purposes. It is considered a safe approach for the sharing of sensitive data as it generates an artificial dataset that contains no identifiable information. Synthetic data is increasing in popularity with multiple synthetic data generators developed in the past decade, yet its utility is still a subject of research. This paper is concerned with evaluating the effect of various synthetic data generation and usage settings on the utility of the generated synthetic data and its derived models. Specifically, we investigate (i) the effect of data pre-processing on the utility of the synthetic data generated, (ii) whether tuning should be applied to the synthetic datasets when generating supervised machine learning models, and (iii) whether sharing preliminary machine learning results can improve the synthetic data models. Lastly, (iv) we investigate whether one utility measure (Propensity score) can predict the accuracy of the machine learning models generated from the synthetic data when employed in real life. We use two popular measures of synthetic data utility, propensity score and classification accuracy, to compare the different settings. We adopt a recent mechanism for the calculation of propensity, which looks carefully into the choice of model for the propensity score calculation. Accordingly, this paper takes a new direction with investigating the effect of various data generation and usage settings on the quality of the generated data and its ensuing models. The goal is to inform on the best strategies to follow when generating and using synthetic data.
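As a rough illustration of how a propensity-based utility measure can work, the sketch below computes a propensity mean squared error (pMSE): a classifier is trained to tell real records from synthetic ones, and the score measures how far its predicted probabilities sit from the share of synthetic records. This is not the mechanism adopted in the paper, whose propensity model is chosen more carefully. The plain logistic regression, the pMSE formulation, the Gaussian toy data, and all function names (`fit_logistic`, `pmse`, `sample`) are assumptions made for illustration only.

```python
import math
import random


def sigmoid(z):
    # Clamp z so math.exp cannot overflow on extreme scores.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=300):
    """Fit a logistic regression by batch gradient descent (illustrative only)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def pmse(real, synthetic):
    """Propensity MSE: values near 0 mean the model cannot separate the two sets."""
    X = real + synthetic
    y = [0] * len(real) + [1] * len(synthetic)  # 1 marks a synthetic record
    w, b = fit_logistic(X, y)
    c = len(synthetic) / len(X)  # share of synthetic records
    scores = [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]
    return sum((p - c) ** 2 for p in scores) / len(X)


def sample(rng, n, mean):
    # Hypothetical toy data: two independent Gaussian features per record.
    return [[rng.gauss(mean, 1.0), rng.gauss(mean, 1.0)] for _ in range(n)]


rng = random.Random(0)
real = sample(rng, 200, 0.0)
faithful = sample(rng, 200, 0.0)  # synthetic data drawn from the same distribution
shifted = sample(rng, 200, 2.0)   # synthetic data drawn from a shifted distribution

pmse_faithful = pmse(real, faithful)
pmse_shifted = pmse(real, shifted)
print(pmse_faithful, pmse_shifted)
```

Under these assumptions, the synthetic set drawn from the same distribution as the real data should score much closer to zero than the shifted one, which is the sense in which a lower propensity score suggests higher utility.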


2020 ◽  
Vol 2 (1) ◽  
pp. 3-6
Author(s):  
Eric Holloway

Imagination Sampling is the usage of a person as an oracle for generating or improving machine learning models. Previous work demonstrated a general system for using Imagination Sampling for obtaining multibox models. Here, the possibility of importing such models as the starting point for further automatic enhancement is explored.


2021 ◽  
Author(s):  
Norberto Sánchez-Cruz ◽  
Jose L. Medina-Franco

Epigenetic targets are a significant focus for drug discovery research, as demonstrated by the eight approved epigenetic drugs for treatment of cancer and the increasing availability of chemogenomic data related to epigenetics. These data represent a large number of structure-activity relationships that have not been exploited thus far for the development of predictive models to support medicinal chemistry efforts. Herein, we report the first large-scale study of 26318 compounds with a quantitative measure of biological activity for 55 protein targets with epigenetic activity. Through a systematic comparison of machine learning models trained on molecular fingerprints of different design, we built predictive models with high accuracy for the epigenetic target profiling of small molecules. The models were thoroughly validated, showing mean precisions up to 0.952 for the epigenetic target prediction task. Our results indicate that the herein reported models have considerable potential to identify small molecules with epigenetic activity. Therefore, our results were implemented as a freely accessible and easy-to-use web application.


2020 ◽  
Author(s):  
Shreya Reddy ◽  
Lisa Ewen ◽  
Pankti Patel ◽  
Prerak Patel ◽  
Ankit Kundal ◽  
...  

As bots become more prevalent and smarter in the modern age of the internet, it becomes ever more important that they be identified and removed. Recent research has indicated that machine learning methods are accurate and the gold standard of bot identification on social media. Unfortunately, machine learning models come with drawbacks such as lengthy training times, difficult feature selection, and overwhelming pre-processing tasks. To overcome these difficulties, we propose a blockchain framework for bot identification. At present, it is unknown how this method will perform, but it serves to demonstrate the existence of a substantial research gap in this area.

