Using GPUs to Speed up a Tomographic Reconstructor Based on Machine Learning

A Survey for Predicting Enzyme Family Classes Using Machine Learning Methods

Current Drug Targets ◽

10.2174/1389450119666181002143355 ◽

2019 ◽

Vol 20 (5) ◽

pp. 540-550 ◽

Cited By ~ 11

Author(s):

Jiu-Xin Tan ◽

Hao Lv ◽

Fang Wang ◽

Fu-Ying Dao ◽

Wei Chen ◽

...

Keyword(s):

Machine Learning ◽

Catalytic Mechanism ◽

Biological Function ◽

Learning Methods ◽

Biochemical Processes ◽

Machine Learning Methods ◽

Enzyme Family ◽

The Family ◽

Speed Up ◽

Family Classification

Enzymes are proteins that act as biological catalysts to speed up cellular biochemical processes. According to their main Enzyme Commission (EC) numbers, enzymes are divided into six categories: EC-1: oxidoreductase; EC-2: transferase; EC-3: hydrolase; EC-4: lyase; EC-5: isomerase; and EC-6: synthetase (ligase). Different enzymes have different biological functions and act on different substrates. Therefore, knowing which family an enzyme belongs to can help infer its catalytic mechanism and provide information about the relevant biological function. With large numbers of protein sequences pouring into databanks in the post-genomics age, annotating the family of an enzyme is very important. Since experimental methods are costly, bioinformatics tools can be a great help in accurately classifying enzyme families. In this review, we summarize the application of machine learning methods to the prediction of enzyme families from different aspects. We hope that this review will provide insights and inspiration for research on enzyme family classification.


Privacy Preserving Machine Learning with Homomorphic Encryption and Federated Learning

Future Internet ◽

10.3390/fi13040094 ◽

2021 ◽

Vol 13 (4) ◽

pp. 94

Author(s):

Haokun Fang ◽

Quan Qian

Keyword(s):

Machine Learning ◽

Homomorphic Encryption ◽

Privacy Preserving ◽

Great Success ◽

Learning Framework ◽

Computational Overhead ◽

Important Concern ◽

Speed Up ◽

Key Length ◽

Core Idea

Privacy protection has become an important concern with the great success of machine learning. This paper proposes a multi-party privacy-preserving machine learning framework, named PFMLP, based on partially homomorphic encryption and federated learning. The core idea is that all learning parties transmit only gradients encrypted with homomorphic encryption. In experiments, the model trained by PFMLP has almost the same accuracy, with a deviation of less than 1%. Considering the computational overhead of homomorphic encryption, we use an improved Paillier algorithm, which can speed up training by 25–28%. Moreover, comparisons of encryption key length, learning network structure, number of learning clients, etc. are also discussed in detail in the paper.
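To illustrate why encrypted gradients can still be aggregated, the following is a minimal sketch of the textbook Paillier scheme and its additive homomorphism. It is not the improved Paillier variant or the PFMLP code from the paper. The tiny primes, the fixed-point `SCALE`, and all function names are assumptions chosen for illustration, and the key size is far too small to be secure. It needs Python 3.8 or later for the modular inverse via `pow`.

```python
import math
import random

SCALE = 1000  # assumed fixed-point scale for encoding float gradients


def keygen(p, q):
    """Build a toy Paillier key pair from two small primes (demo only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    # With generator g = n + 1, L(g^lam mod n^2) equals lam mod n,
    # so mu is simply the inverse of lam modulo n.
    mu = pow(lam, -1, n)
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt integer m (0 <= m < n) as g^m * r^n mod n^2 with g = n + 1."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n, priv, c):
    """Recover m as L(c^lam mod n^2) * mu mod n, where L(x) = (x - 1) // n."""
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts modulo n."""
    return (c1 * c2) % (n * n)


def encode(x, n):
    """Map a float to a fixed-point residue; negatives wrap around modulo n."""
    return round(x * SCALE) % n


def decode(v, n):
    """Invert encode(): residues above n/2 are read as negative values."""
    if v > n // 2:
        v -= n
    return v / SCALE


# Two hypothetical clients each encrypt one gradient value; the aggregator
# combines the ciphertexts without ever seeing the plaintext gradients.
n, priv = keygen(1009, 1013)
grad_a, grad_b = 0.125, -0.5
c_sum = add_encrypted(n,
                      encrypt(n, encode(grad_a, n)),
                      encrypt(n, encode(grad_b, n)))
total = decode(decrypt(n, priv, c_sum), n)
print(total)
```

Only the holder of the private key can read the aggregated value, which is the property a framework of this kind relies on when parties exchange encrypted gradients.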


Machine Learning Speeding Up the Development of Portfolio of New Crop Varieties to Adapt to and Mitigate Climate Change

10.1101/2021.10.06.463347 ◽

2021 ◽

Author(s):

Abdallah Bari ◽

Hassan Ouabbou ◽

Abderrazek Jilal ◽

Hamid Khazaei ◽

Fred Stoddard ◽

...

Keyword(s):

Climate Change ◽

Machine Learning ◽

Crop Varieties ◽

Short Period ◽

Speed Up ◽

Increasing Demand ◽

Mitigate Climate Change ◽

High Yielding Varieties ◽

New Crop ◽

Adaptation And Mitigation

Climate change poses serious challenges to achieving food security at a time when more food must be produced to keep up with the world's increasing demand. There is an urgent need to speed up the development of new high-yielding varieties with traits of adaptation to and mitigation of climate change. Mathematical approaches, including ML approaches, have been used to search for such traits, leading to unprecedented results: some of the traits, including long-sought heat traits, have been found within a short period of time.


OCCAM: prediction of small ORFs in bacterial genomes by means of a target-decoy database approach and machine learning techniques

Database ◽

10.1093/database/baaa067 ◽

2020 ◽

Vol 2020 ◽

Author(s):

Fabio R. Cerqueira ◽

Ana Tereza Ribeiro Vasconcelos

Keyword(s):

Machine Learning ◽

Machine Learning Algorithms ◽

Open Reading Frames ◽

Machine Learning Techniques ◽

Bacterial Genomes ◽

Small Proteins ◽

Learning Techniques ◽

Speed Up ◽

Computational Procedures ◽

Small Orfs

Small open reading frames (ORFs) have been systematically disregarded by automatic genome annotation. The difficulty of finding patterns in tiny sequences is the main reason small ORFs are overlooked by computational procedures. However, advances in experimental methods show that small proteins can play vital roles in cellular activities. Hence, it is urgent to make progress in the development of computational approaches to speed up the identification of potential small ORFs. In this work, our focus is on bacterial genomes. We improve a previous approach to identify small ORFs in bacteria. Our method uses machine learning techniques and decoy subject sequences to filter out spurious ORF alignments. We show that an advanced multivariate analysis can be more effective in terms of sensitivity than applying the simplistic and widely used e-value cutoff. This is particularly important in the case of small ORFs, for which alignments present higher e-values than usual. Experiments with control datasets show that the machine learning algorithms used in our method to curate significant alignments can achieve average sensitivity and specificity of 97.06% and 99.61%, respectively. Therefore, an important step is provided here toward the construction of more accurate computational tools for the identification of small ORFs in bacteria.
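For readers unfamiliar with the two reported metrics, the sketch below shows how sensitivity and specificity are computed from confusion-matrix counts. The function name and the counts are hypothetical and are not taken from the OCCAM experiments. Treating genuine alignments as positives and spurious (for example, decoy) alignments as negatives is an assumption about how such a control dataset would be labeled.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN): share of genuine alignments that are kept.
    specificity = TN / (TN + FP): share of spurious alignments that are rejected.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts, for illustration only.
sens, spec = sensitivity_specificity(tp=90, fn=10, tn=95, fp=5)
print(f"sensitivity={sens:.2%}, specificity={spec:.2%}")
```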


A machine-learning approach to speed-up simulation towards the design of optimum operating profiles of power plants

Proceedings of the 8th International Conference on Informatics, Environment, Energy and Applications - IEEA '19 ◽

10.1145/3323716.3323735 ◽

2019 ◽

Author(s):

Erik Rosado-Tamariz ◽

Miguel A. Zuniga-Garcia ◽

G. Santamaria-Bonfil ◽

Rafael Batres

Keyword(s):

Machine Learning ◽

Power Plants ◽

Optimum Operating ◽

Learning Approach ◽

Machine Learning Approach ◽

Speed Up


Real-Time Minimum Snap Trajectory Generation for Quadcopters: Algorithm Speed-up Through Machine Learning

2019 International Conference on Robotics and Automation (ICRA) ◽

10.1109/icra.2019.8793569 ◽

2019 ◽

Cited By ~ 1

Author(s):

Marcelino M. de Almeida ◽

Rahul Moghe ◽

Maruthi Akella

Keyword(s):

Machine Learning ◽

Real Time ◽

Trajectory Generation ◽

Speed Up


Using machine learning to speed up and improve calorimeter R&D

Journal of Instrumentation ◽

10.1088/1748-0221/15/05/c05032 ◽

2020 ◽

Vol 15 (05) ◽

pp. C05032-C05032

Author(s):

F. Ratnikov

Keyword(s):

Machine Learning ◽

Speed Up


FAST RESISTIVITY LOGS SIMULATION IN TWO-DIMENSIONAL ANISOTROPIC NEAR-WELLBORE SPACE MODELS BASED ON NUMERICAL SIMULATION AND MACHINE LEARNING

Interexpo GEO-Siberia ◽

10.33764/2618-981x-2021-2-2-210-217 ◽

2021 ◽

Vol 2 (2) ◽

pp. 210-217

Author(s):

Aleksei M. Petrov ◽

Kirill N. Danilovskiy ◽

Vasiliy V. Eremenko

Keyword(s):

Machine Learning ◽

Learning Technologies ◽

Oil Well ◽

Geological Environment ◽

Two Dimensional ◽

New Approach ◽

Resistivity Logs ◽

Speed Up ◽

Environment Parameters ◽

Modern Machine

The article presents the results of applying a new approach to simulating galvanic and induction resistivity logs in oil wells, with the aim of evaluating geological environment parameters more efficiently and speeding up interpretation. Modern machine learning technologies allow us to create algorithms for simulating resistivity logs in highly detailed two-dimensional anisotropic geoelectric models. The developed algorithms are characterized by a qualitatively new level of performance compared to the approaches used today.


Performance and scaling behavior of bioinformatic applications in virtualization environments to create awareness for the efficient use of compute resources

PLoS Computational Biology ◽

10.1371/journal.pcbi.1009244 ◽

2021 ◽

Vol 17 (7) ◽

pp. e1009244

Author(s):

Maximilian Hanussek ◽

Felix Bartusch ◽

Jens Krüger

Keyword(s):

Machine Learning ◽

Virtual Environments ◽

High Performance ◽

Biological Data ◽

Scaling Behavior ◽

Bare Metal ◽

Learning Framework ◽

Speed Up ◽

Clustal Omega ◽

Performance Computing

The large amount of biological data available today makes it necessary to use tools and applications based on sophisticated and efficient algorithms developed in the area of bioinformatics. Further, access to high performance computing resources is necessary to achieve results in reasonable time. To speed up applications and utilize available compute resources as efficiently as possible, software developers make use of parallelization mechanisms such as multithreading. Many of the available tools in bioinformatics offer multithreading capabilities, but more compute power is not always helpful. In this study we investigated the behavior of well-known bioinformatics applications with regard to their performance in terms of scaling, different virtual environments, and different datasets, using our benchmarking tool suite BOOTABLE. The tool suite includes the tools BBMap, Bowtie2, BWA, Velvet, IDBA, SPAdes, Clustal Omega, MAFFT, SINA and GROMACS. In addition, we added an application using the machine learning framework TensorFlow. Machine learning is not directly part of bioinformatics but is applied to many biological problems, especially in the context of medical images (X-ray photographs). The tools were analyzed in two different virtual environments: a virtual machine environment based on the OpenStack cloud software and a Docker environment. The measured performance values were compared to a bare-metal setup and to each other. The study reveals that the virtual environments used produce an overhead in the range of seven to twenty-five percent compared to the bare-metal environment. The scaling measurements showed that some of the analyzed tools do not benefit from larger amounts of computing resources, whereas others showed almost linear scaling behavior. The findings of this study have been generalized as far as possible and should help users to find the best amount of resources for their analysis. Further, the results provide valuable information for resource providers to handle their resources as efficiently as possible and raise the user community’s awareness of the efficient usage of computing resources.


Exact Maximum Clique Algorithm for Different Graph Types Using Machine Learning

Mathematics ◽

10.3390/math10010097 ◽

2021 ◽

Vol 10 (1) ◽

pp. 97

Author(s):

Kristjan Reba ◽

Matej Guid ◽

Kati Rozman ◽

Dušanka Janežič ◽

Janez Konc

Keyword(s):

Machine Learning ◽

Maximum Clique ◽

Dynamic Algorithm ◽

Graph Theoretic ◽

Research Areas ◽

Novel Approach ◽

Search Speed ◽

Speed Up ◽

Clique Algorithm ◽

And Function

Finding a maximum clique is important in research areas such as computational chemistry, social network analysis, and bioinformatics. It is possible to compare the maximum clique size between protein graphs to determine their similarity and function. In this paper, improvements based on machine learning (ML) are added to a dynamic algorithm for finding the maximum clique in a protein graph, Maximum Clique Dynamic (MaxCliqueDyn; short: MCQD). This algorithm was published in 2007 and has been widely used in bioinformatics since then. It uses an empirically determined parameter, Tlimit, that determines the algorithm’s flow. We have extended the MCQD algorithm with an initial phase of a machine learning-based prediction of the Tlimit parameter that is best suited for each input graph. Such adaptability to graph types based on state-of-the-art machine learning is a novel approach that has not been used in most graph-theoretic algorithms. We show empirically that the resulting new algorithm MCQD-ML improves search speed on certain types of graphs, in particular molecular docking graphs used in drug design where they determine energetically favorable conformations of small molecules in a protein binding site. In such cases, the speed-up is twofold.
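As background for what an exact maximum clique search does, here is a minimal sketch of a basic branch-and-bound search with a simple size bound. It is not MCQD or MCQD-ML: it uses no vertex coloring, and the Tlimit parameter and its machine-learning-based prediction are not modeled. The function name and the example graph are illustrative assumptions.

```python
def max_clique(adj):
    """Return one maximum clique of an undirected graph.

    adj maps each vertex to the set of its neighbours.
    """
    best = []

    def expand(clique, candidates):
        nonlocal best
        if len(clique) > len(best):
            best = clique[:]
        for v in list(candidates):
            # Bound: even taking every remaining candidate cannot beat best.
            if len(clique) + len(candidates) <= len(best):
                return
            # Exclude v from later branches, then branch on including v.
            candidates = candidates - {v}
            expand(clique + [v], candidates & adj[v])

    expand([], set(adj))
    return best


# Small example graph: a triangle 0-1-2 with a pendant vertex 3 attached to 2.
graph = {
    0: {1, 2},
    1: {0, 2},
    2: {0, 1, 3},
    3: {2},
}
clique = max_clique(graph)
print(sorted(clique))
```

Algorithms in the MCQD family tighten the pruning step with coloring-based upper bounds, which is where a tuning parameter such as Tlimit comes into play.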
