scholarly journals Comparison of the Meta-Active Machine Learning Model Applied to Biological Data-Driven Experiments with Other Models

2021 ◽  
Vol 2021 ◽  
pp. 1-8
Author(s):  
Hao Wang

Currently, many methods that could estimate the effects of conditions on a given biological target require either strong modelling assumptions or separate screens. Traditionally, many conditions and targets, without doing all possible experiments, could be achieved by driven experimentation or several mathematical methods, especially conversational machine learning methods. However, these methods still could not avoid and replace manual labels completely. This paper presented a meta-active machine learning method to resolve this problem. This project has used nine traditional machine learning methods to compare their accuracy and running time. In addition, this paper analyzes the meta-active machine learning method (MAML) compared with a classical screening method and progressive experiments. The obtained results show that applying this method yields the best experimental results on the current dataset.

2021 ◽  
Author(s):  
Bu-Yo Kim ◽  
Joo Wan Cha ◽  
Ki-Ho Chang

Abstract. In this study, image data features and machine learning methods were used to calculate 24-h continuous cloud cover from image data obtained by a camera-based imager on the ground. The image data features were the time (Julian day and hour), solar zenith angle, and statistical characteristics of the red-blue ratio, blue–red difference, and luminance. These features were determined from the red, green, and blue brightness of images subjected to a pre-processing process involving masking removal and distortion correction. The collected image data were divided into training, validation, and test sets and were used to optimize and evaluate the accuracy of each machine learning method. The cloud cover calculated by each machine learning method was verified with human-eye observation data from a manned observatory. Supervised machine learning models suitable for nowcasting, namely, support vector regression, random forest, gradient boosting machine, k-nearest neighbor, artificial neural network, and multiple linear regression methods, were employed and their results were compared. The best learning results were obtained by the support vector regression model, which had an accuracy, recall, and precision of 0.94, 0.70, and 0.76, respectively. Further, bias, root mean square error, and correlation coefficient values of 0.04 tenth, 1.45 tenths, and 0.93, respectively, were obtained for the cloud cover calculated using the test set. When the difference between the calculated and observed cloud cover was allowed to range between 0, 1, and 2 tenths, high agreement of approximately 42 %, 79 %, and 91 %, respectively, were obtained. The proposed system involving a ground-based imager and machine learning methods is expected to be suitable for application as an automated system to replace human-eye observations.


2021 ◽  
Vol 10 (3) ◽  
pp. 39
Author(s):  
Nirmalya Thakur ◽  
Chia Y. Han

This paper makes four scientific contributions to the field of fall detection in the elderly to contribute to their assisted living in the future of internet of things (IoT)-based pervasive living environments, such as smart homes. First, it presents and discusses a comprehensive comparative study, where 19 different machine learning methods were used to develop fall detection systems, to deduce the optimal machine learning method for the development of such systems. This study was conducted on two different datasets, and the results show that out of all the machine learning methods, the k-NN classifier is best suited for the development of fall detection systems in terms of performance accuracy. Second, it presents a framework that overcomes the limitations of binary classifier-based fall detection systems by being able to detect falls and fall-like motions. Third, to increase the trust and reliance on fall detection systems, it introduces a novel methodology based on the usage of k-folds cross-validation and the AdaBoost algorithm that improves the performance accuracy of the k-NN classifier-based fall detection system to the extent that it outperforms all similar works in this field. This approach achieved performance accuracies of 99.87% and 99.66%, respectively, when evaluated on the two datasets. Finally, the proposed approach is also highly accurate in detecting the activity of standing up from a lying position to infer whether a fall was followed by a long lie, which can cause minor to major health-related concerns. The above contributions address multiple research challenges in the field of fall detection, that we identified after conducting a comprehensive review of related works, which is also presented in this paper.


2013 ◽  
Vol 2013 ◽  
pp. 1-13 ◽  
Author(s):  
Ching Lee Koo ◽  
Mei Jing Liew ◽  
Mohd Saberi Mohamad ◽  
Abdul Hakim Mohamed Salleh

Recently, the greatest statistical computational challenge in genetic epidemiology is to identify and characterize the genes that interact with other genes and environment factors that bring the effect on complex multifactorial disease. These gene-gene interactions are also denoted as epitasis in which this phenomenon cannot be solved by traditional statistical method due to the high dimensionality of the data and the occurrence of multiple polymorphism. Hence, there are several machine learning methods to solve such problems by identifying such susceptibility gene which are neural networks (NNs), support vector machine (SVM), and random forests (RFs) in such common and multifactorial disease. This paper gives an overview on machine learning methods, describing the methodology of each machine learning methods and its application in detecting gene-gene and gene-environment interactions. Lastly, this paper discussed each machine learning method and presents the strengths and weaknesses of each machine learning method in detecting gene-gene interactions in complex human disease.


2019 ◽  
Vol 14 (3) ◽  
pp. 234-240 ◽  
Author(s):  
Wuritu Yang ◽  
Xiao-Juan Zhu ◽  
Jian Huang ◽  
Hui Ding ◽  
Hao Lin

Background:The location of proteins in a cell can provide important clues to their functions in various biological processes. Thus, the application of machine learning method in the prediction of protein subcellular localization has become a hotspot in bioinformatics. As one of key organelles, the Golgi apparatus is in charge of protein storage, package, and distribution.Objective:The identification of protein location in Golgi apparatus will provide in-depth insights into their functions. Thus, the machine learning-based method of predicting protein location in Golgi apparatus has been extensively explored. The development of protein sub-Golgi apparatus localization prediction should be reviewed for providing a whole background for the fields.Method:The benchmark dataset, feature extraction, machine learning method and published results were summarized.Results:We briefly introduced the recent progresses in protein sub-Golgi apparatus localization prediction using machine learning methods and discussed their advantages and disadvantages.Conclusion:We pointed out the perspective of machine learning methods in protein sub-Golgi localization prediction.


2021 ◽  
Vol 14 (10) ◽  
pp. 6695-6710
Author(s):  
Bu-Yo Kim ◽  
Joo Wan Cha ◽  
Ki-Ho Chang

Abstract. In this study, image data features and machine learning methods were used to calculate 24 h continuous cloud cover from image data obtained by a camera-based imager on the ground. The image data features were the time (Julian day and hour), solar zenith angle, and statistical characteristics of the red–blue ratio, blue–red difference, and luminance. These features were determined from the red, green, and blue brightness of images subjected to a pre-processing process involving masking removal and distortion correction. The collected image data were divided into training, validation, and test sets and were used to optimize and evaluate the accuracy of each machine learning method. The cloud cover calculated by each machine learning method was verified with human-eye observation data from a manned observatory. Supervised machine learning models suitable for nowcasting, namely, support vector regression, random forest, gradient boosting machine, k-nearest neighbor, artificial neural network, and multiple linear regression methods, were employed and their results were compared. The best learning results were obtained by the support vector regression model, which had an accuracy, recall, and precision of 0.94, 0.70, and 0.76, respectively. Further, bias, root mean square error, and correlation coefficient values of 0.04 tenths, 1.45 tenths, and 0.93, respectively, were obtained for the cloud cover calculated using the test set. When the difference between the calculated and observed cloud cover was allowed to range between 0, 1, and 2 tenths, high agreements of approximately 42 %, 79 %, and 91 %, respectively, were obtained. The proposed system involving a ground-based imager and machine learning methods is expected to be suitable for application as an automated system to replace human-eye observations.


Energies ◽  
2021 ◽  
Vol 14 (15) ◽  
pp. 4595
Author(s):  
Parisa Asadi ◽  
Lauren E. Beckingham

X-ray CT imaging provides a 3D view of a sample and is a powerful tool for investigating the internal features of porous rock. Reliable phase segmentation in these images is highly necessary but, like any other digital rock imaging technique, is time-consuming, labor-intensive, and subjective. Combining 3D X-ray CT imaging with machine learning methods that can simultaneously consider several extracted features in addition to color attenuation, is a promising and powerful method for reliable phase segmentation. Machine learning-based phase segmentation of X-ray CT images enables faster data collection and interpretation than traditional methods. This study investigates the performance of several filtering techniques with three machine learning methods and a deep learning method to assess the potential for reliable feature extraction and pixel-level phase segmentation of X-ray CT images. Features were first extracted from images using well-known filters and from the second convolutional layer of the pre-trained VGG16 architecture. Then, K-means clustering, Random Forest, and Feed Forward Artificial Neural Network methods, as well as the modified U-Net model, were applied to the extracted input features. The models’ performances were then compared and contrasted to determine the influence of the machine learning method and input features on reliable phase segmentation. The results showed considering more dimensionality has promising results and all classification algorithms result in high accuracy ranging from 0.87 to 0.94. Feature-based Random Forest demonstrated the best performance among the machine learning models, with an accuracy of 0.88 for Mancos and 0.94 for Marcellus. The U-Net model with the linear combination of focal and dice loss also performed well with an accuracy of 0.91 and 0.93 for Mancos and Marcellus, respectively. In general, considering more features provided promising and reliable segmentation results that are valuable for analyzing the composition of dense samples, such as shales, which are significant unconventional reservoirs in oil recovery.


F1000Research ◽  
2017 ◽  
Vol 6 ◽  
pp. 2012 ◽  
Author(s):  
Hashem Koohy

In the era of explosion in biological data, machine learning techniques are becoming more popular in life sciences, including biology and medicine. This research note examines the rise and fall of the most commonly used machine learning techniques in life sciences over the past three decades.


Author(s):  
Yoshihiro Yamanishi ◽  
Hisashi Kashima

In silico prediction of compound-protein interactions from heterogeneous biological data is critical in the process of drug development. In this chapter the authors review several supervised machine learning methods to predict unknown compound-protein interactions from chemical structure and genomic sequence information simultaneously. The authors review several kernel-based algorithms from two different viewpoints: binary classification and dimension reduction. In the results, they demonstrate the usefulness of the methods on the prediction of drug-target interactions and ligand-protein interactions from chemical structure data and genomic sequence data.


2019 ◽  
Vol 2019 ◽  
pp. 1-12 ◽  
Author(s):  
Yan Wang ◽  
Hao Zhang ◽  
Zhanliang Sang ◽  
Lingwei Xu ◽  
Conghui Cao ◽  
...  

Automatic modulation recognition has successfully used various machine learning methods and achieved certain results. As a subarea of machine learning, deep learning has made great progress in recent years and has made remarkable progress in the field of image and language processing. Deep learning requires a large amount of data support. As a communication field with a large amount of data, there is an inherent advantage of applying deep learning. However, the extensive application of deep learning in the field of communication has not yet been fully developed, especially in underwater acoustic communication. In this paper, we mainly discuss the modulation recognition process which is an important part of communication process by using the deep learning method. Different from the common machine learning methods that require feature extraction, the deep learning method does not require feature extraction and obtains more effects than common machine learning.


2017 ◽  
Author(s):  
Fadhl M Alakwaa ◽  
Kumardeep Chaudhary ◽  
Lana X Garmire

ABSTRACTMetabolomics holds the promise as a new technology to diagnose highly heterogeneous diseases. Conventionally, metabolomics data analysis for diagnosis is done using various statistical and machine learning based classification methods. However, it remains unknown if deep neural network, a class of increasingly popular machine learning methods, is suitable to classify metabolomics data. Here we use a cohort of 271 breast cancer tissues, 204 positive estrogen receptor (ER+) and 67 negative estrogen receptor (ER-), to test the accuracies of autoencoder, a deep learning (DL) framework, as well as six widely used machine learning models, namely Random Forest (RF), Support Vector Machines (SVM), Recursive Partitioning and Regression Trees (RPART), Linear Discriminant Analysis (LDA), Prediction Analysis for Microarrays (PAM), and Generalized Boosted Models (GBM). DL framework has the highest area under the curve (AUC) of 0.93 in classifying ER+/ER-patients, compared to the other six machine learning algorithms. Furthermore, the biological interpretation of the first hidden layer reveals eight commonly enriched significant metabolomics pathways (adjusted P-value<0.05) that cannot be discovered by other machine learning methods. Among them, protein digestion & absorption and ATP-binding cassette (ABC) transporters pathways are also confirmed in integrated analysis between metabolomics and gene expression data in these samples. In summary, deep learning method shows advantages for metabolomics based breast cancer ER status classification, with both the highest prediction accurcy (AUC=0.93) and better revelation of disease biology. We encourage the adoption of autoencoder based deep learning method in the metabolomics research community for classification.


Sign in / Sign up

Export Citation Format

Share Document