Machine learning-based analysis of multi-omics data on the cloud for investigating gene regulations

Author(s):  
Minsik Oh ◽  
Sungjoon Park ◽  
Sun Kim ◽  
Heejoon Chae

Abstract Gene expression is subtly regulated by quantifiable measures of genetic molecules such as interactions with other genes, methylation, mutations, transcription factors and histone modifications. Integrative analysis of multi-omics data can help scientists understand condition-specific or patient-specific gene regulation mechanisms. However, analysis of multi-omics data is challenging since it requires not only the analysis of multiple omics data sets but also the mining of complex relations among different genetic molecules using state-of-the-art machine learning methods. In addition, analysis of multi-omics data needs a substantial computing infrastructure. Moreover, interpretation of the analysis results requires collaboration among many scientists, often requiring the analysis to be repeated from different perspectives. Many of these technical issues can be handled well when machine learning tools are deployed on the cloud. In this survey article, we first survey machine learning methods that can be used for gene regulation studies, and we categorize them according to five different goals: gene regulatory subnetwork discovery, disease subtype analysis, survival analysis, clinical prediction and visualization. We also summarize the methods in terms of multi-omics input types. Then, we explain why the cloud is potentially a good solution for the analysis of multi-omics data, followed by a survey of two state-of-the-art cloud systems, Galaxy and BioVLAB. Finally, we discuss important issues that arise when the cloud is used for the analysis of multi-omics data in gene regulation studies.

2021 ◽  
Author(s):  
Andreas Sepp

Artificial intelligence and machine learning methods have contributed significantly to the advancement of predictive analytics. This article presents the state of the art in methods and applications of artificial intelligence and machine learning.


2020 ◽  
Vol 36 (2) ◽  
pp. 159-172
Author(s):  
Cong Thanh Bui ◽  
Loi Cao Van ◽  
Minh Hoang ◽  
Quang Uy Nguyen

The rapid development of the Internet and the widespread adoption of its applications have affected many aspects of our lives. However, this development also makes cyberspace more vulnerable to various attacks. Thus, detecting and preventing these attacks is crucial for the further development of the Internet and its services. Recently, machine learning methods have been widely adopted for detecting network attacks. Among many machine learning methods, AutoEncoders (AEs) are known as state-of-the-art techniques for network anomaly detection. Although AEs have been successfully applied to detect many types of attacks, they are often unable to detect some difficult attacks that attempt to mimic normal network traffic. To handle this issue, we propose a new AutoEncoder-based model called Double-Shrink AutoEncoder (DSAE). DSAE puts more shrinkage on the normal data in the middle hidden layer. This helps to pull out some anomalies that are very similar to normal data. DSAE is evaluated on six well-known network attack datasets. The experimental results show that our model performs competitively with the state-of-the-art model, and often outperforms this model on the group of attacks that is difficult for previous methods.
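The abstract gives no formulas, so the following is only a minimal sketch of the general idea of adding a shrinkage penalty on the hidden representation to an autoencoder's reconstruction loss. The linear encoder and decoder, the toy weights, the penalty weight `lam`, and the use of the latent norm as an anomaly score are all assumptions made for illustration; this is not the authors' DSAE.

```python
# Illustrative shrink-style autoencoder objective (NOT the DSAE from the paper).
# All names, weights and the penalty weight `lam` are hypothetical.

def encode(x, w_enc):
    """Linear encoder: each row of w_enc produces one latent value."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in w_enc]


def decode(z, w_dec):
    """Linear decoder: each row of w_dec produces one reconstructed feature."""
    return [sum(w * zi for w, zi in zip(row, z)) for row in w_dec]


def shrink_loss(x, w_enc, w_dec, lam=1.0):
    """Mean squared reconstruction error plus lam * squared latent norm.

    The second term pulls the latent codes of (normal) training data
    toward the origin, which is the assumed meaning of "shrinkage" here.
    """
    z = encode(x, w_enc)
    x_hat = decode(z, w_dec)
    recon = sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)
    shrink = sum(zi ** 2 for zi in z)
    return recon + lam * shrink


def anomaly_score(x, w_enc):
    """Assumed scoring rule: distance of the latent code from the origin."""
    z = encode(x, w_enc)
    return sum(zi ** 2 for zi in z) ** 0.5


# Toy 2 -> 1 -> 2 autoencoder with fixed (untrained) weights.
W_ENC = [[0.5, 0.5]]
W_DEC = [[1.0], [1.0]]

normal_loss = shrink_loss([1.0, 1.0], W_ENC, W_DEC, lam=0.5)
odd_loss = shrink_loss([3.0, 1.0], W_ENC, W_DEC, lam=0.5)
normal_score = anomaly_score([1.0, 1.0], W_ENC)
odd_score = anomaly_score([3.0, 1.0], W_ENC)
```

In a real model the weights would be learned by minimizing this loss over normal traffic only, so that points whose latent codes stay far from the origin at test time can be flagged as suspicious.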


2020 ◽  
Vol 49 (1) ◽  
pp. 129-138
Author(s):  
Martti Juhola ◽  
Kirsi Penttinen ◽  
Henry Joutsijoki ◽  
Katriina Aalto-Setälä

Abstract Patient-specific induced pluripotent stem cell-derived cardiomyocytes (iPSC-CMs) offer an attractive experimental platform to investigate cardiac diseases and therapeutic outcomes. In this study, iPSC-CMs were utilized to study their calcium transient signals and drug effects by means of machine learning, a central part of artificial intelligence. Drug effects were assessed in six iPSC lines carrying different mutations causing catecholaminergic polymorphic ventricular tachycardia (CPVT), a highly malignant inherited arrhythmogenic disorder. The antiarrhythmic effect of dantrolene, an inhibitor of sarcoplasmic calcium release, was studied in iPSC-CMs after stimulation with adrenaline, an adrenergic agonist, by machine learning analysis of calcium transient signals. First, beats of transient signals were identified with our previously developed peak recognition algorithm. Then 12 peak variables were computed for every identified peak of a signal, and using these data, signals were classified into different classes corresponding to those affected by adrenaline or, thereafter, affected by a drug, dantrolene. The best classification accuracy was approximately 79%, indicating that machine learning methods can be utilized in the analysis of iPSC-CM drug effects. In the future, data analysis of iPSC-CM drug effects together with machine learning methods can create a very valuable and efficient platform to individualize medication, in addition to drug screening and cardiotoxicity studies.
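The first two steps of the pipeline described above (identify peaks, then compute per-peak variables) can be sketched roughly as follows. This is a simplified illustration on a synthetic signal, not the authors' peak recognition algorithm: the threshold rule, the function names, and the two variables shown (amplitude and peak-to-peak interval) are hypothetical stand-ins for the 12 peak variables, which the abstract does not list.

```python
import math


def find_peaks(signal, threshold):
    """Return indices of local maxima whose value exceeds `threshold`."""
    return [
        i
        for i in range(1, len(signal) - 1)
        if signal[i] > threshold and signal[i - 1] < signal[i] >= signal[i + 1]
    ]


def peak_variables(signal, peaks, dt):
    """Compute two example variables per peak (hypothetical choices).

    amplitude  : peak height above the signal minimum
    interval_s : time since the previous peak, None for the first peak
    """
    baseline = min(signal)
    features = []
    for k, i in enumerate(peaks):
        interval = (i - peaks[k - 1]) * dt if k > 0 else None
        features.append({"amplitude": signal[i] - baseline, "interval_s": interval})
    return features


# Synthetic stand-in for a calcium transient: a rectified 1 Hz sine wave
# sampled at 100 Hz for 3 seconds.
dt = 0.01
signal = [max(0.0, math.sin(2 * math.pi * n / 100)) for n in range(300)]

peaks = find_peaks(signal, threshold=0.5)
features = peak_variables(signal, peaks, dt)
```

Feature vectors of this kind, one per peak, would then be passed to a classifier to separate the adrenaline and dantrolene conditions; the abstract does not say which classifier achieved the reported accuracy.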


2021 ◽  
Vol 23 ◽  
Author(s):  
Xiong Li ◽  
Yangping Qiu ◽  
Juan Zhou ◽  
Ziruo Xie

Background: Recent developments in neuroimaging and genetic testing technologies have made it possible to measure pathological features associated with Alzheimer's disease (AD) in vivo. Mining potential molecular markers of AD from high-dimensional, multi-modal neuroimaging and omics data will provide a new basis for early diagnosis and intervention in AD. In order to discover the real pathogenic mutations and even understand the pathogenic mechanism of AD, many machine learning methods have been designed and successfully applied to the analysis and processing of large-scale AD biomedical data. Objective: To introduce and summarize the applications and challenges of machine learning methods in Alzheimer's disease multi-source data analysis. Methods: The literature selected for the review was obtained from Google Scholar, PubMed, and Web of Science. The keywords used for literature retrieval include Alzheimer's disease, bioinformatics, image genetics, genome-wide association research, molecular interaction network, multi-omics data integration, and so on. Conclusion: This study comprehensively introduces machine learning-based processing techniques for AD neuroimaging data and then shows the progress of computational analysis methods for omics data, such as the genome, proteome, and so on. Subsequently, machine learning methods for AD imaging analysis are also summarized. Finally, we elaborate on the emerging technologies of multi-modal neuroimaging and multi-omics data joint analysis, and present some outstanding issues and future research directions.


Metals ◽  
2021 ◽  
Vol 11 (8) ◽  
pp. 1164
Author(s):  
Saba Ayub ◽  
Beh Hoe Guan ◽  
Faiz Ahmad ◽  
Muhammad Faisal Javed ◽  
Amir Mosavi ◽  
...  

Advancement of novel electromagnetic interference (EMI) materials is essential in various industries. The purpose of this study is to present a state-of-the-art review of the methods used in the formation of graphene-, metal- and polymer-based composite EMI materials. The study indicates that in graphene- and metal-based composites, the alternating deposition method provides the highest shielding effectiveness. However, in polymer-based composites, the chemical vapor deposition method showed the highest shielding effectiveness. Furthermore, this review reveals that there is a gap in the literature in terms of the application of artificial intelligence and machine learning methods. The results further reveal that within the past half-decade machine learning methods, including artificial neural networks, have brought significant improvement in modelling EMI materials. We identified a research trend toward using advanced forms of machine learning for comparative analysis, and toward research and development employing hybrid and ensemble machine learning methods to deliver higher performance.

