Using Distributed Cloud Computing to Solve Resource-Intensive Tasks

2021 ◽  
Vol 12 (5) ◽  
pp. 233-254
Author(s):  
D. Yu. Bulgakov ◽  

A method is proposed for solving resource-intensive, CPU-bound tasks when the computing resources of a single server become insufficient. The need to solve this class of problems arises when deploying machine learning models in a production environment, as well as in scientific research. Cloud computing makes it possible to organize distributed task processing on virtual servers that are easy to create, maintain, and replicate. An approach based on free software implemented in the Python programming language is justified and proposed. The resulting solution is analysed from the point of view of queuing theory. The effect of the proposed approach on face recognition and biomedical signal analysis tasks is described.
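The abstract does not name the specific libraries used, but the underlying pattern is a pool of workers consuming jobs from a shared queue, which is also what makes the system amenable to queuing-theory analysis. A minimal standard-library sketch, with threads standing in for the cloud virtual servers:

```python
# Illustrative sketch (not the paper's implementation): the worker/queue
# pattern behind distributed task processing. Threads stand in for the
# virtual servers that would each pull jobs from a shared queue.
import queue
import threading

def cpu_bound_task(n: int) -> int:
    """Stand-in for a resource-intensive job, e.g. one model inference."""
    return sum(i * i for i in range(n))

def worker(tasks: "queue.Queue", results: list) -> None:
    while True:
        n = tasks.get()
        if n is None:          # sentinel: no more work for this worker
            return
        results.append(cpu_bound_task(n))

tasks: "queue.Queue" = queue.Queue()
results: list = []
for n in [10, 100, 1000]:
    tasks.put(n)

threads = [threading.Thread(target=worker, args=(tasks, results))
           for _ in range(3)]  # three "virtual servers"
for t in threads:
    t.start()
for _ in threads:
    tasks.put(None)            # one sentinel per worker
for t in threads:
    t.join()
print(sorted(results))         # [285, 328350, 332833500]
```

In a real deployment the in-process queue would be replaced by a network-accessible broker, so that workers on replicated servers can be added or removed without changing the task code.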

Author(s):  
Anuja Phapale ◽  
Puja Kasture ◽  
Keshav Katkar ◽  
Omkar Karale ◽  
Atal Deshmukh

This paper describes a framework developed to enhance the quality of underwater images using machine learning models. It also covers the technologies used in the development process, which is based on the Python programming language. The system provides two major features: it accepts either an image or a video as input, and it returns an enhanced image or video of the corresponding type, with improved quality, sharpness, and colour correctness.
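The abstract does not describe the enhancement algorithm itself. As a minimal illustration of one classical step often used in underwater image pipelines, the sketch below (not the paper's method; all names are hypothetical) applies linear contrast stretching to a grayscale image represented as nested lists of pixel intensities:

```python
def contrast_stretch(image, lo=0, hi=255):
    """Linearly rescale pixel intensities so they span [lo, hi].

    `image` is a list of rows of integer intensities. Underwater images
    often have a compressed intensity range, which this step widens.
    """
    flat = [p for row in image for p in row]
    p_min, p_max = min(flat), max(flat)
    if p_max == p_min:                      # flat image: nothing to stretch
        return [[lo for _ in row] for row in image]
    scale = (hi - lo) / (p_max - p_min)
    return [[round(lo + (p - p_min) * scale) for p in row] for row in image]

# A 2x2 image whose intensities occupy only [50, 200] before stretching.
print(contrast_stretch([[50, 100], [150, 200]]))  # [[0, 85], [170, 255]]
```

A learning-based enhancer would replace this fixed transform with one whose parameters are fitted to paired degraded/clean training images.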


2021 ◽  
Author(s):  
Michael Tarasiou

This paper presents DeepSatData, a pipeline for automatically generating satellite imagery datasets for training machine learning models. We also discuss design considerations, with emphasis on dense classification tasks such as semantic segmentation. The implementation makes use of freely available Sentinel-2 data, which allows the generation of the large-scale datasets required for training deep neural networks (DNNs). We discuss issues faced from the point of view of DNN training and evaluation, such as checking the quality of ground-truth data, and comment on the scalability of the approach.


Entropy ◽  
2021 ◽  
Vol 23 (3) ◽  
pp. 275
Author(s):  
Pol Borrellas ◽  
Irene Unceta

The deployment of machine learning models is expected to bring several benefits. Nevertheless, as a result of the complexity of the ecosystem in which models are generally trained and deployed, this technology also raises concerns regarding its (1) interpretability, (2) fairness, (3) safety, and (4) privacy. These issues can have substantial economic implications because they may hinder the development and mass adoption of machine learning. In light of this, the purpose of this paper was to determine, from a positive economics point of view, whether the free use of machine learning models maximizes aggregate social welfare or whether, alternatively, regulations are required. In cases in which restrictions should be enacted, policies are proposed. The adaptation of current tort and anti-discrimination laws is found to guarantee an optimal level of interpretability and fairness. Additionally, existing market solutions appear to incentivize machine learning operators to equip models with a degree of security and privacy that maximizes aggregate social welfare. These findings are expected to inform the design of efficient public policies.


2020 ◽  
Author(s):  
Ashleigh Massam ◽  
Ashley Barnes ◽  
Siân Lane ◽  
Robert Platt ◽  
David Wood

<p>JBA Risk Management (JBA) uses JFlow®, a two-dimensional hydraulic model, to simulate surface water, fluvial, and dam break flood risk. National flood maps are generated on a computer cluster that parallelises up to 20,000 model simulations, covering an area of up to 320,000 km² and creating up to 10 GB of data per day.</p><p>JBA uses machine-learning models to identify artefacts in the flood simulations. The ability of machine-learning models to quickly process and detect these artefacts, combined with the use of an automated control system, means that hydraulic modelling throughput can be maximised with little user intervention. However, continual retraining of the model and application of software updates introduce the risk of a significant decrease in performance. This necessitates the use of a system to monitor the performance of the machine-learning model to ensure that a sufficient level of quality is maintained, and to allow drops in quality to be investigated.</p><p>We present an approach used to develop performance checks on a machine-learning model that identifies artificial depth differences between hydraulic model simulations. Performance checks are centred on the use of control charts, an approach commonly used in manufacturing processes to monitor the proportion of items produced with defects.
In order to develop this approach for a geoscientific context, JBA has (i) built a database of randomly sampled hydraulic model outputs currently totalling 200 GB of data; (ii) developed metrics to summarise key features across a modelled region, including geomorphology and hydrology; (iii) used a random forest regression model to identify feature dominance to determine the most robust relationships that contribute to depth differences in the flood map; and (iv) developed the performance check in an automated system that tests every nth hydraulic modelling output against data sampled based on common features.</p><p>The implementation of the performance checks allows JBA to assess potential changes in the quality of artificial feature identification following a training cycle in a development environment prior to release in a production environment.</p>
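Control charts of the kind described above monitor a proportion of defective items against statistically derived limits. A minimal sketch (illustrative, not JBA's implementation): for a historical defect proportion p̄ and sample size n, a p-chart flags any sample whose defect rate falls outside p̄ ± 3·sqrt(p̄(1−p̄)/n).

```python
import math

def p_chart_limits(p_bar: float, n: int) -> tuple:
    """3-sigma control limits for a p-chart, clipped to [0, 1].

    p_bar: historical proportion of defective items; n: sample size.
    """
    sigma = math.sqrt(p_bar * (1 - p_bar) / n)
    return max(0.0, p_bar - 3 * sigma), min(1.0, p_bar + 3 * sigma)

def out_of_control(defect_fraction: float, p_bar: float, n: int) -> bool:
    """Flag a sample whose defect rate falls outside the control limits."""
    lower, upper = p_chart_limits(p_bar, n)
    return not (lower <= defect_fraction <= upper)

# Hypothetical numbers: a 2% historical artefact rate, samples of 500 outputs.
print(p_chart_limits(0.02, 500))        # roughly (0.0012, 0.0388)
print(out_of_control(0.06, 0.02, 500))  # True: a quality drop worth investigating
```

After a retraining cycle, each new batch of checked model outputs would contribute one point to the chart, so a genuine regression shows up as points drifting past the upper limit rather than as isolated noise.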


2020 ◽  
Vol 2 (1) ◽  
pp. 3-6
Author(s):  
Eric Holloway

Imagination Sampling is the use of a person as an oracle for generating or improving machine learning models. Previous work demonstrated a general system for using Imagination Sampling to obtain multibox models. Here, the possibility of importing such models as the starting point for further automatic enhancement is explored.


2020 ◽  
Vol 65 (1) ◽  
pp. 96-104
Author(s):  
Tatian-Cristian Mălin

We introduce in this paper an application, developed in the Python programming language, that can be used to generate digital signals with known frequencies and amplitudes. Since these digital signals have known parameters, they can be used to create benchmarks for testing and numerical simulation.
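The paper's actual API is not given; as a sketch of the idea, sampling a sine wave of known frequency and amplitude takes only the standard library (all names below are illustrative):

```python
import math

def generate_signal(freq_hz: float, amplitude: float,
                    sample_rate_hz: float, duration_s: float) -> list:
    """Sample a sine wave with known frequency and amplitude.

    Returns duration_s * sample_rate_hz samples of
    amplitude * sin(2*pi*freq_hz*t) at t = i / sample_rate_hz.
    """
    n_samples = int(sample_rate_hz * duration_s)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate_hz)
            for i in range(n_samples)]

# A 5 Hz tone with amplitude 2.0, sampled at 100 Hz for 1 second.
signal = generate_signal(5.0, 2.0, 100.0, 1.0)
print(len(signal))   # 100 samples
print(max(signal))   # peaks at the known amplitude, 2.0
```

Because the frequency and amplitude are known exactly, the output can serve as ground truth for checking, e.g., an FFT-based analysis routine.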


2021 ◽  
Author(s):  
Norberto Sánchez-Cruz ◽  
Jose L. Medina-Franco

<p>Epigenetic targets are a significant focus for drug discovery research, as demonstrated by the eight approved epigenetic drugs for the treatment of cancer and the increasing availability of chemogenomic data related to epigenetics. These data represent a large number of structure-activity relationships that have not been exploited thus far for the development of predictive models to support medicinal chemistry efforts. Herein, we report the first large-scale study of 26,318 compounds with a quantitative measure of biological activity for 55 protein targets with epigenetic activity. Through a systematic comparison of machine learning models trained on molecular fingerprints of different design, we built predictive models with high accuracy for the epigenetic target profiling of small molecules. The models were thoroughly validated, showing mean precisions up to 0.952 for the epigenetic target prediction task. Our results indicate that the models reported herein have considerable potential to identify small molecules with epigenetic activity. Therefore, our results were implemented as a freely accessible and easy-to-use web application.</p>
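Molecular fingerprints of the kind compared in the study are binary bit vectors, and a standard way to relate two molecules is the Tanimoto coefficient. The sketch below is illustrative only (it does not reproduce the paper's models) and represents each fingerprint as the set of its "on" bit positions:

```python
def tanimoto(fp_a: set, fp_b: set) -> float:
    """Tanimoto (Jaccard) similarity between two binary fingerprints,
    each represented as the set of indices of its set bits."""
    if not fp_a and not fp_b:
        return 0.0                       # convention for two empty fingerprints
    return len(fp_a & fp_b) / len(fp_a | fp_b)

# Two toy fingerprints sharing 2 of 4 distinct "on" bits.
print(tanimoto({1, 5, 9}, {1, 9, 12}))  # 0.5
```

In a fingerprint-based model, similarities like this underpin nearest-neighbour baselines, while the machine learning models in the study consume the bit vectors directly as features.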

