Comparison of the Meta-Active Machine Learning Model Applied to Biological Data-Driven Experiments with Other Models

Currently, many methods for estimating the effects of conditions on a given biological target require either strong modelling assumptions or separate screens. Traditionally, covering many conditions and targets without performing every possible experiment has been attempted through data-driven experimentation or mathematical methods, especially conventional machine learning methods. However, these methods still cannot completely avoid or replace manual labelling. This paper presents a meta-active machine learning (MAML) method to address this problem. Nine traditional machine learning methods are compared in terms of accuracy and running time. In addition, the MAML method is compared with a classical screening method and progressive experiments. The results show that applying this method yields the best experimental results on the current dataset.

24-hour cloud cover calculation using ground-based imager with machine learning

10.5194/amt-2021-179 ◽

2021 ◽

Author(s):

Bu-Yo Kim ◽

Joo Wan Cha ◽

Ki-Ho Chang

Keyword(s):

Machine Learning ◽

Support Vector Regression ◽

Cloud Cover ◽

Image Data ◽

Support Vector ◽

Machine Learning Method ◽

Learning Method ◽

Learning Methods ◽

Human Eye ◽

Machine Learning Methods

Abstract. In this study, image data features and machine learning methods were used to calculate 24-h continuous cloud cover from image data obtained by a camera-based imager on the ground. The image data features were the time (Julian day and hour), solar zenith angle, and statistical characteristics of the red–blue ratio, blue–red difference, and luminance. These features were determined from the red, green, and blue brightness of images subjected to a pre-processing process involving masking removal and distortion correction. The collected image data were divided into training, validation, and test sets and were used to optimize and evaluate the accuracy of each machine learning method. The cloud cover calculated by each machine learning method was verified with human-eye observation data from a manned observatory. Supervised machine learning models suitable for nowcasting, namely, support vector regression, random forest, gradient boosting machine, k-nearest neighbor, artificial neural network, and multiple linear regression methods, were employed and their results were compared. The best learning results were obtained by the support vector regression model, which had an accuracy, recall, and precision of 0.94, 0.70, and 0.76, respectively. Further, bias, root mean square error, and correlation coefficient values of 0.04 tenths, 1.45 tenths, and 0.93, respectively, were obtained for the cloud cover calculated using the test set. When the difference between the calculated and observed cloud cover was allowed to range between 0, 1, and 2 tenths, high agreements of approximately 42 %, 79 %, and 91 %, respectively, were obtained. The proposed system involving a ground-based imager and machine learning methods is expected to be suitable for application as an automated system to replace human-eye observations.
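
As a rough illustration of the colour features named in this abstract, the sketch below computes the red–blue ratio, blue–red difference, and luminance per pixel and summarizes each with a mean and standard deviation. The function name, the choice of statistics, and the Rec. 601 luminance weights are assumptions; the abstract does not say which statistics or luminance formula the authors used.

```python
from statistics import mean, pstdev

def colour_features(pixels):
    """Summarize (R, G, B) pixels with brightness 0-255.

    Assumed statistics (mean, standard deviation) and luminance weights
    (Rec. 601); neither is specified in the abstract.
    """
    # Skip pixels with zero blue to avoid division by zero in the ratio.
    valid = [(r, g, b) for r, g, b in pixels if b > 0]
    rbr = [r / b for r, g, b in valid]   # red-blue ratio
    brd = [b - r for r, g, b in valid]   # blue-red difference
    lum = [0.299 * r + 0.587 * g + 0.114 * b for r, g, b in valid]
    return {
        "rbr_mean": mean(rbr), "rbr_std": pstdev(rbr),
        "brd_mean": mean(brd), "brd_std": pstdev(brd),
        "lum_mean": mean(lum), "lum_std": pstdev(lum),
    }

# Two bluish (clear-sky-like) pixels and one grey (cloud-like) pixel.
features = colour_features([(60, 110, 200), (80, 120, 200), (180, 180, 180)])
```

Grey, cloud-like pixels push the red–blue ratio towards 1 and the blue–red difference towards 0, which is why such features can separate cloud from clear sky.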

A Study of Fall Detection in Assisted Living: Identifying and Improving the Optimal Machine Learning Method

Journal of Sensor and Actuator Networks ◽

10.3390/jsan10030039 ◽

2021 ◽

Vol 10 (3) ◽

pp. 39

Author(s):

Nirmalya Thakur ◽

Chia Y. Han

Keyword(s):

Machine Learning ◽

Assisted Living ◽

Fall Detection ◽

Machine Learning Method ◽

Learning Method ◽

Learning Methods ◽

Detection Systems ◽

Machine Learning Methods ◽

Performance Accuracy ◽

Optimal Machine

This paper makes four scientific contributions to the field of fall detection in the elderly to contribute to their assisted living in the future of internet of things (IoT)-based pervasive living environments, such as smart homes. First, it presents and discusses a comprehensive comparative study, where 19 different machine learning methods were used to develop fall detection systems, to deduce the optimal machine learning method for the development of such systems. This study was conducted on two different datasets, and the results show that out of all the machine learning methods, the k-NN classifier is best suited for the development of fall detection systems in terms of performance accuracy. Second, it presents a framework that overcomes the limitations of binary classifier-based fall detection systems by being able to detect falls and fall-like motions. Third, to increase the trust and reliance on fall detection systems, it introduces a novel methodology based on the usage of k-fold cross-validation and the AdaBoost algorithm that improves the performance accuracy of the k-NN classifier-based fall detection system to the extent that it outperforms all similar works in this field. This approach achieved performance accuracies of 99.87% and 99.66%, respectively, when evaluated on the two datasets. Finally, the proposed approach is also highly accurate in detecting the activity of standing up from a lying position to infer whether a fall was followed by a long lie, which can cause minor to major health-related concerns. The above contributions address multiple research challenges in the field of fall detection, which we identified after conducting a comprehensive review of related works, which is also presented in this paper.
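
To make the evaluation idea concrete, here is a minimal sketch of k-fold cross-validation around a k-NN classifier. It is not the authors' pipeline: the AdaBoost step is omitted, the fold assignment (interleaved indices) is an assumption, and the two toy features and class labels are hypothetical.

```python
from collections import Counter
from math import dist

def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training points nearest to query."""
    nearest = sorted(range(len(train_x)), key=lambda i: dist(train_x[i], query))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]

def kfold_accuracy(xs, ys, folds=5, k=3):
    """Mean accuracy over `folds` interleaved held-out folds."""
    scores = []
    for f in range(folds):
        test_idx = set(range(f, len(xs), folds))
        train_x = [x for i, x in enumerate(xs) if i not in test_idx]
        train_y = [y for i, y in enumerate(ys) if i not in test_idx]
        hits = sum(knn_predict(train_x, train_y, xs[i], k) == ys[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / folds

# Hypothetical toy features (e.g. peak acceleration, tilt); not the paper's data.
xs = [(0.1 * i, 1.0) for i in range(10)] + [(5.0 + 0.1 * i, 3.0) for i in range(10)]
ys = ["no_fall"] * 10 + ["fall"] * 10
accuracy = kfold_accuracy(xs, ys, folds=5, k=3)
```

Every sample is held out exactly once, so the averaged score is less dependent on one particular train/test split than a single hold-out estimate.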

A Review for Detecting Gene-Gene Interactions Using Machine Learning Methods in Genetic Epidemiology

BioMed Research International ◽

10.1155/2013/432375 ◽

2013 ◽

Vol 2013 ◽

pp. 1-13 ◽

Cited By ~ 22

Author(s):

Ching Lee Koo ◽

Mei Jing Liew ◽

Mohd Saberi Mohamad ◽

Abdul Hakim Mohamed Salleh

Keyword(s):

Machine Learning ◽

Genetic Epidemiology ◽

Support Vector ◽

Gene Interactions ◽

Machine Learning Method ◽

Learning Method ◽

Multifactorial Disease ◽

Learning Methods ◽

Machine Learning Methods ◽

Genes And Environment

Recently, the greatest statistical and computational challenge in genetic epidemiology has been to identify and characterize the genes that interact with other genes and environmental factors to influence complex multifactorial diseases. These gene-gene interactions are also known as epistasis, a phenomenon that cannot be resolved by traditional statistical methods because of the high dimensionality of the data and the occurrence of multiple polymorphisms. Hence, several machine learning methods, namely neural networks (NNs), support vector machines (SVMs), and random forests (RFs), have been applied to identify susceptibility genes in common multifactorial diseases. This paper gives an overview of these machine learning methods, describing the methodology of each and its application in detecting gene-gene and gene-environment interactions. Lastly, this paper discusses the strengths and weaknesses of each machine learning method in detecting gene-gene interactions in complex human disease.

A Brief Survey of Machine Learning Methods in Protein Sub-Golgi Localization

Current Bioinformatics ◽

10.2174/1574893613666181113131415 ◽

2019 ◽

Vol 14 (3) ◽

pp. 234-240 ◽

Cited By ~ 68

Author(s):

Wuritu Yang ◽

Xiao-Juan Zhu ◽

Jian Huang ◽

Hui Ding ◽

Hao Lin

Keyword(s):

Machine Learning ◽

Golgi Apparatus ◽

Machine Learning Method ◽

Learning Method ◽

Learning Methods ◽

Advantages And Disadvantages ◽

Machine Learning Methods ◽

Golgi Localization ◽

Protein Location ◽

Localization Prediction

Background: The location of proteins in a cell can provide important clues to their functions in various biological processes. Thus, the application of machine learning methods to the prediction of protein subcellular localization has become a hotspot in bioinformatics. As one of the key organelles, the Golgi apparatus is in charge of protein storage, packaging, and distribution. Objective: The identification of protein location in the Golgi apparatus will provide in-depth insights into protein functions. Thus, machine learning-based methods of predicting protein location in the Golgi apparatus have been extensively explored. The development of protein sub-Golgi apparatus localization prediction should be reviewed to provide a complete background for the field. Method: The benchmark datasets, feature extraction, machine learning methods, and published results were summarized. Results: We briefly introduced the recent progress in protein sub-Golgi apparatus localization prediction using machine learning methods and discussed the advantages and disadvantages of these methods. Conclusion: We pointed out the perspective of machine learning methods in protein sub-Golgi localization prediction.

Twenty-four-hour cloud cover calculation using a ground-based imager with machine learning

Atmospheric Measurement Techniques ◽

10.5194/amt-14-6695-2021 ◽

2021 ◽

Vol 14 (10) ◽

pp. 6695-6710

Author(s):

Bu-Yo Kim ◽

Joo Wan Cha ◽

Ki-Ho Chang

Keyword(s):

Machine Learning ◽

Support Vector Regression ◽

Cloud Cover ◽

Image Data ◽

Support Vector ◽

Machine Learning Method ◽

Learning Method ◽

Learning Methods ◽

Human Eye ◽

Machine Learning Methods

Abstract. In this study, image data features and machine learning methods were used to calculate 24 h continuous cloud cover from image data obtained by a camera-based imager on the ground. The image data features were the time (Julian day and hour), solar zenith angle, and statistical characteristics of the red–blue ratio, blue–red difference, and luminance. These features were determined from the red, green, and blue brightness of images subjected to a pre-processing process involving masking removal and distortion correction. The collected image data were divided into training, validation, and test sets and were used to optimize and evaluate the accuracy of each machine learning method. The cloud cover calculated by each machine learning method was verified with human-eye observation data from a manned observatory. Supervised machine learning models suitable for nowcasting, namely, support vector regression, random forest, gradient boosting machine, k-nearest neighbor, artificial neural network, and multiple linear regression methods, were employed and their results were compared. The best learning results were obtained by the support vector regression model, which had an accuracy, recall, and precision of 0.94, 0.70, and 0.76, respectively. Further, bias, root mean square error, and correlation coefficient values of 0.04 tenths, 1.45 tenths, and 0.93, respectively, were obtained for the cloud cover calculated using the test set. When the difference between the calculated and observed cloud cover was allowed to range between 0, 1, and 2 tenths, high agreements of approximately 42 %, 79 %, and 91 %, respectively, were obtained. The proposed system involving a ground-based imager and machine learning methods is expected to be suitable for application as an automated system to replace human-eye observations.
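
The verification statistics quoted in this abstract (bias, root mean square error, correlation coefficient, and agreement within 0, 1, and 2 tenths) can be sketched as follows. Reading "agreement" as the share of cases whose absolute difference is within the tolerance is an assumption, as are the function name and the toy values.

```python
from math import sqrt

def verify(calculated, observed):
    """Compare calculated and observed cloud cover, both in tenths (0-10)."""
    n = len(observed)
    diffs = [c - o for c, o in zip(calculated, observed)]
    bias = sum(diffs) / n
    rmse = sqrt(sum(d * d for d in diffs) / n)
    mc, mo = sum(calculated) / n, sum(observed) / n
    cov = sum((c - mc) * (o - mo) for c, o in zip(calculated, observed))
    var_c = sum((c - mc) ** 2 for c in calculated)
    var_o = sum((o - mo) ** 2 for o in observed)
    r = cov / sqrt(var_c * var_o)  # Pearson correlation coefficient
    # Share of cases whose absolute difference is within each tolerance.
    agreement = {t: sum(abs(d) <= t for d in diffs) / n for t in (0, 1, 2)}
    return bias, rmse, r, agreement

# Toy values for illustration only; not data from the study.
bias, rmse, r, agreement = verify([0, 3, 5, 8, 10], [0, 2, 5, 10, 10])
```

A small bias alongside a larger root mean square error, as reported above, indicates that over- and underestimates largely cancel on average even though individual cases can differ by more than a tenth.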

Integrating Machine/Deep Learning Methods and Filtering Techniques for Reliable Mineral Phase Segmentation of 3D X-ray Computed Tomography Images

Energies ◽

10.3390/en14154595 ◽

2021 ◽

Vol 14 (15) ◽

pp. 4595

Author(s):

Parisa Asadi ◽

Lauren E. Beckingham

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Random Forest ◽

Ct Images ◽

Ct Imaging ◽

Learning Method ◽

Learning Methods ◽

X Ray ◽

Machine Learning Methods ◽

Filtering Techniques

X-ray CT imaging provides a 3D view of a sample and is a powerful tool for investigating the internal features of porous rock. Reliable phase segmentation in these images is highly necessary but, like any other digital rock imaging technique, is time-consuming, labor-intensive, and subjective. Combining 3D X-ray CT imaging with machine learning methods that can simultaneously consider several extracted features in addition to color attenuation is a promising and powerful method for reliable phase segmentation. Machine learning-based phase segmentation of X-ray CT images enables faster data collection and interpretation than traditional methods. This study investigates the performance of several filtering techniques with three machine learning methods and a deep learning method to assess the potential for reliable feature extraction and pixel-level phase segmentation of X-ray CT images. Features were first extracted from images using well-known filters and from the second convolutional layer of the pre-trained VGG16 architecture. Then, K-means clustering, Random Forest, and Feed Forward Artificial Neural Network methods, as well as the modified U-Net model, were applied to the extracted input features. The models’ performances were then compared and contrasted to determine the influence of the machine learning method and input features on reliable phase segmentation. The results showed that considering more feature dimensions is promising and that all classification algorithms achieve high accuracy, ranging from 0.87 to 0.94. Feature-based Random Forest demonstrated the best performance among the machine learning models, with an accuracy of 0.88 for Mancos and 0.94 for Marcellus. The U-Net model with the linear combination of focal and dice loss also performed well with an accuracy of 0.91 and 0.93 for Mancos and Marcellus, respectively. In general, considering more features provided promising and reliable segmentation results that are valuable for analyzing the composition of dense samples, such as shales, which are significant unconventional reservoirs in oil recovery.
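
The "linear combination of focal and dice loss" mentioned for the U-Net model can be sketched as below. This is a simplified binary, per-pixel version written for illustration; the study segments several mineral phases, and the mixing weight, `gamma`, and smoothing constant here are assumed values that the abstract does not give.

```python
from math import log

def focal_loss(probs, targets, gamma=2.0, eps=1e-7):
    """Mean binary focal loss; gamma down-weights easy pixels."""
    total = 0.0
    for p, t in zip(probs, targets):
        pt = p if t == 1 else 1.0 - p        # probability of the true class
        pt = min(max(pt, eps), 1.0 - eps)    # clamp to keep log() finite
        total += -((1.0 - pt) ** gamma) * log(pt)
    return total / len(probs)

def dice_loss(probs, targets, smooth=1.0):
    """One minus the soft Dice coefficient."""
    inter = sum(p * t for p, t in zip(probs, targets))
    return 1.0 - (2.0 * inter + smooth) / (sum(probs) + sum(targets) + smooth)

def combined_loss(probs, targets, weight=0.5):
    """Linear combination; `weight` is an assumed mixing coefficient."""
    return weight * focal_loss(probs, targets) + (1.0 - weight) * dice_loss(probs, targets)

good = combined_loss([0.9, 0.8, 0.1, 0.2], [1, 1, 0, 0])  # mostly correct
bad = combined_loss([0.2, 0.3, 0.9, 0.7], [1, 1, 0, 0])   # mostly wrong
```

The focal term concentrates on hard, misclassified pixels, while the Dice term scores region overlap, so combining them is a common choice when some phases occupy far fewer pixels than others.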

The rise and fall of machine learning methods in biomedical research

F1000Research ◽

10.12688/f1000research.13016.1 ◽

2017 ◽

Vol 6 ◽

pp. 2012 ◽

Cited By ~ 6

Author(s):

Hashem Koohy

Keyword(s):

Machine Learning ◽

Biomedical Research ◽

Life Sciences ◽

Biological Data ◽

Research Note ◽

Machine Learning Techniques ◽

Learning Methods ◽

The Past ◽

Machine Learning Methods ◽

Learning Techniques

In the era of explosion in biological data, machine learning techniques are becoming more popular in life sciences, including biology and medicine. This research note examines the rise and fall of the most commonly used machine learning techniques in life sciences over the past three decades.

Prediction of Compound-Protein Interactions with Machine Learning Methods

Chemoinformatics and Advanced Machine Learning Perspectives ◽

10.4018/978-1-61520-911-8.ch016 ◽

2011 ◽

pp. 304-317

Author(s):

Yoshihiro Yamanishi ◽

Hisashi Kashima

Keyword(s):

Machine Learning ◽

Protein Interactions ◽

Chemical Structure ◽

Genomic Sequence ◽

Sequence Data ◽

Binary Classification ◽

Biological Data ◽

Supervised Machine Learning ◽

Learning Methods ◽

Machine Learning Methods

In silico prediction of compound-protein interactions from heterogeneous biological data is critical in the process of drug development. In this chapter the authors review several supervised machine learning methods to predict unknown compound-protein interactions from chemical structure and genomic sequence information simultaneously. The authors review several kernel-based algorithms from two different viewpoints: binary classification and dimension reduction. In the results, they demonstrate the usefulness of the methods on the prediction of drug-target interactions and ligand-protein interactions from chemical structure data and genomic sequence data.

Modulation Classification of Underwater Communication with Deep Learning Network

Computational Intelligence and Neuroscience ◽

10.1155/2019/8039632 ◽

2019 ◽

Vol 2019 ◽

pp. 1-12 ◽

Cited By ~ 7

Author(s):

Yan Wang ◽

Hao Zhang ◽

Zhanliang Sang ◽

Lingwei Xu ◽

Conghui Cao ◽

...

Keyword(s):

Machine Learning ◽

Feature Extraction ◽

Deep Learning ◽

Language Processing ◽

Communication Process ◽

Underwater Acoustic Communication ◽

Learning Method ◽

Modulation Recognition ◽

Learning Methods ◽

Machine Learning Methods

Automatic modulation recognition has successfully used various machine learning methods and achieved certain results. As a subarea of machine learning, deep learning has made great progress in recent years, notably in the fields of image and language processing. Deep learning requires a large amount of data. Because communication is a field with large amounts of data, it has an inherent advantage for applying deep learning. However, deep learning has not yet been extensively applied in the field of communication, especially in underwater acoustic communication. In this paper, we mainly discuss using the deep learning method for modulation recognition, which is an important part of the communication process. Unlike common machine learning methods, which require feature extraction, the deep learning method does not require feature extraction and achieves better results than common machine learning.

Deep learning accurately predicts estrogen receptor status in breast cancer metabolomics data

10.1101/214254 ◽

2017 ◽

Author(s):

Fadhl M Alakwaa ◽

Kumardeep Chaudhary ◽

Lana X Garmire

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Estrogen Receptor ◽

Deep Learning ◽

Support Vector ◽

Integrated Analysis ◽

Learning Method ◽

Learning Methods ◽

Metabolomics Data ◽

Machine Learning Methods

Abstract. Metabolomics holds promise as a new technology to diagnose highly heterogeneous diseases. Conventionally, metabolomics data analysis for diagnosis is done using various statistical and machine learning based classification methods. However, it remains unknown whether deep neural networks, a class of increasingly popular machine learning methods, are suitable for classifying metabolomics data. Here we use a cohort of 271 breast cancer tissues, 204 estrogen receptor positive (ER+) and 67 estrogen receptor negative (ER-), to test the accuracies of autoencoder, a deep learning (DL) framework, as well as six widely used machine learning models, namely Random Forest (RF), Support Vector Machines (SVM), Recursive Partitioning and Regression Trees (RPART), Linear Discriminant Analysis (LDA), Prediction Analysis for Microarrays (PAM), and Generalized Boosted Models (GBM). The DL framework has the highest area under the curve (AUC) of 0.93 in classifying ER+/ER- patients, compared to the other six machine learning algorithms. Furthermore, the biological interpretation of the first hidden layer reveals eight commonly enriched significant metabolomics pathways (adjusted P-value < 0.05) that cannot be discovered by other machine learning methods. Among them, protein digestion & absorption and ATP-binding cassette (ABC) transporters pathways are also confirmed in integrated analysis between metabolomics and gene expression data in these samples. In summary, the deep learning method shows advantages for metabolomics based breast cancer ER status classification, with both the highest prediction accuracy (AUC = 0.93) and better revelation of disease biology. We encourage the adoption of the autoencoder based deep learning method in the metabolomics research community for classification.
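
For readers unfamiliar with the AUC figure quoted above, the sketch below computes it from its rank interpretation: the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The function name and toy scores are illustrative assumptions, not values from the study.

```python
def auc(scores, labels):
    """AUC as the chance that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy classifier scores: four positives (label 1) and two negatives (label 0).
example = auc([0.9, 0.8, 0.6, 0.4, 0.7, 0.3], [1, 1, 1, 1, 0, 0])
```

Because it compares positives against negatives pairwise, AUC is not inflated by simply predicting the majority class, which matters for an imbalanced cohort such as 204 ER+ versus 67 ER- samples.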
