Deep Learning in Mining Biological Data

Recent technological advancements in data acquisition tools have allowed life scientists to acquire multimodal data from different biological application domains. Categorized into three broad types (images, signals, and sequences), these data are huge in volume and complex in nature. Mining such an enormous amount of data for pattern recognition is a big challenge and requires sophisticated data-intensive machine learning techniques. Artificial neural network-based learning systems are well known for their pattern recognition capabilities, and lately their deep architectures, known as deep learning (DL), have been successfully applied to solve many complex pattern recognition problems. To investigate how DL, and especially its different architectures, has contributed to and been utilized in the mining of biological data pertaining to those three types, a meta-analysis has been performed and the resulting resources have been critically analysed. Focusing on the use of DL to analyse patterns in data from diverse biological domains, this work investigates the applications of different DL architectures to these data. This is followed by an exploration of available open access data sources pertaining to the three data types, along with popular open-source DL tools applicable to these data. Comparative investigations of these tools from qualitative, quantitative, and benchmarking perspectives are also provided. Finally, some open research challenges in using DL to mine biological data are outlined and a number of possible future perspectives are put forward.

KymoButler, a deep learning software for automated kymograph analysis

eLife ◽

10.7554/elife.42288 ◽

2019 ◽

Vol 8 ◽

Cited By ~ 7

Author(s):

Maximilian AH Jakobs ◽

Andrea Dimitracopoulos ◽

Kristian Franze

Keyword(s):

Deep Learning ◽

Data Analysis ◽

Biological Data ◽

Machine Learning Techniques ◽

Unconscious Bias ◽

Web Based ◽

Complex Particle ◽

Learning Techniques ◽

Biological Data Analysis ◽

Learning Software

Kymographs are graphical representations of spatial position over time, which are often used in biology to visualise the motion of fluorescent particles, molecules, vesicles, or organelles moving along a predictable path. Although tracks of individual particles are easily distinguished qualitatively in kymographs, their automated quantitative analysis is much more challenging. Kymographs often exhibit low signal-to-noise ratios (SNRs), and available tools that automate their analysis usually require manual supervision. Here we developed KymoButler, a Deep Learning-based software to automatically track dynamic processes in kymographs. We demonstrate that KymoButler performs as well as expert manual data analysis on kymographs with complex particle trajectories from a variety of different biological systems. The software was packaged in a web-based ‘one-click’ application for use by the wider scientific community (http://kymobutler.deepmirror.ai). Our approach significantly speeds up data analysis, avoids unconscious bias, and represents another step towards the widespread adoption of Machine Learning techniques in biological data analysis.
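
To make the idea of a kymograph concrete, the following is a minimal sketch in plain Python. It is not KymoButler's method: the function names `build_kymograph` and `naive_velocity`, the synthetic movie, and the brightest-pixel tracker are all assumptions made for illustration. The sketch stacks the intensity profile along a fixed path for every frame, then fits a straight line to the brightest position in each row to estimate a velocity.

```python
def build_kymograph(frames, path):
    """Return kymo[t][s]: intensity at path position s in frame t.

    frames: list of 2D lists (rows x cols), one per time point.
    path:   list of (row, col) pixel coordinates along the track.
    """
    return [[frame[r][c] for r, c in path] for frame in frames]


def naive_velocity(kymograph):
    """Least-squares slope (pixels per frame) of the brightest pixel per row.

    Hypothetical baseline tracker: it assumes a single particle and a
    noise-free signal, which real kymographs rarely satisfy.
    """
    positions = [row.index(max(row)) for row in kymograph]
    n = len(positions)
    t_mean = (n - 1) / 2
    x_mean = sum(positions) / n
    slope_num = sum((t - t_mean) * (x - x_mean) for t, x in enumerate(positions))
    slope_den = sum((t - t_mean) ** 2 for t in range(n))
    return slope_num / slope_den


# Synthetic movie (assumed data): one particle moving 1 pixel per frame
# along row 1 of a 3 x 8 image, starting at column 1.
n_frames, height, width = 5, 3, 8
frames = []
for t in range(n_frames):
    frame = [[0.0] * width for _ in range(height)]
    frame[1][1 + t] = 1.0
    frames.append(frame)

path = [(1, c) for c in range(width)]
kymograph = build_kymograph(frames, path)
velocity = naive_velocity(kymograph)
print(velocity)
```

A brightest-pixel tracker of this kind would likely break down with crossing tracks or low SNR, which is the regime the abstract describes as requiring manual supervision.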

KymoButler, a Deep Learning software for automated kymograph analysis

10.1101/405183 ◽

2018 ◽

Author(s):

Maximilian A. H. Jakobs ◽

Andrea Dimitracopoulos ◽

Kristian Franze

Keyword(s):

Deep Learning ◽

Data Analysis ◽

Biological Data ◽

Machine Learning Techniques ◽

Unconscious Bias ◽

Web Based ◽

Complex Particle ◽

Learning Techniques ◽

Biological Data Analysis ◽

Learning Software

Kymographs are graphical representations of spatial position over time, which are often used in biology to visualise the motion of fluorescent particles, molecules, vesicles, or organelles moving along a predictable path. Although tracks of individual particles are easily distinguished qualitatively in kymographs, their automated quantitative analysis is much more challenging. Kymographs often exhibit low signal-to-noise ratios (SNRs), and available tools that automate their analysis usually require manual supervision. Here we developed KymoButler, a Deep Learning-based software to automatically track dynamic processes in kymographs. We demonstrate that KymoButler performs as well as expert manual data analysis on kymographs with complex particle trajectories from a variety of different biological systems. The software was packaged in a web-based “one-click” application for use by the wider scientific community. Our approach significantly speeds up data analysis, avoids unconscious bias, and represents another step towards the widespread adoption of Machine Learning techniques in biological data analysis.

389-P: Ability for Detecting or Predicting Hypoglycemia with the Aid of Machine Learning Techniques: A Meta-analysis

Diabetes ◽

10.2337/db20-389-p ◽

2020 ◽

Vol 69 (Supplement 1) ◽

pp. 389-P

Author(s):

Satoru Kodama ◽

Mayuko H. Yamada ◽

Yuta Yaguchi ◽

Masaru Kitazawa ◽

Masanori Kaneko ◽

...

Keyword(s):

Machine Learning ◽

Meta Analysis ◽

Machine Learning Techniques ◽

Learning Techniques

Breast Cancer Prediction Using Deep Learning and Machine Learning Techniques

SSRN Electronic Journal ◽

10.2139/ssrn.3558786 ◽

2020 ◽

Cited By ~ 1

Author(s):

Monika Tiwari ◽

Rashi Bharuka ◽

Praditi Shah ◽

Reena Lokare

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Deep Learning ◽

Machine Learning Techniques ◽

Cancer Prediction ◽

Learning Techniques

Auto-Colorization of Historical Images Using Deep Convolutional Neural Networks

Mathematics ◽

10.3390/math8122258 ◽

2020 ◽

Vol 8 (12) ◽

pp. 2258

Author(s):

Madhab Raj Joshi ◽

Lewis Nkenyereye ◽

Gyanendra Prasad Joshi ◽

S. M. Riazul Islam ◽

Mohammad Abdullah-Al-Wadud ◽

...

Keyword(s):

Neural Network ◽

Deep Learning ◽

User Study ◽

Mean Squared Error ◽

Color Image ◽

Machine Learning Techniques ◽

Global Features ◽

Black And White ◽

Historical Images ◽

Learning Techniques

Enhancement of cultural heritage such as historical images is crucial to safeguarding the diversity of cultures. Automated colorization of black-and-white images has been the subject of extensive research using computer vision and machine learning techniques. Our research addresses the problem of generating a plausible colored photograph from ancient, historical black-and-white images of Nepal using deep learning techniques without direct human intervention. Motivated by the recent success of deep learning techniques in image processing, a feed-forward, deep Convolutional Neural Network (CNN) in combination with Inception-ResNetV2 is trained on sets of sample images using back-propagation to recognize the pattern in RGB and grayscale values. The trained neural network is then used to predict the two chroma channels, a* and b*, given the grayscale L channel of test images. The CNN vividly colorizes images with the help of a fusion layer that accounts for local features as well as global features. Two objective functions, namely Mean Squared Error (MSE) and Peak Signal-to-Noise Ratio (PSNR), are employed for objective quality assessment between the estimated color image and its ground truth. The model is trained on a dataset created by ourselves with 1.2 K historical images comprising old and ancient photographs of Nepal, each having 256 × 256 resolution. The loss (i.e., MSE), PSNR, and accuracy of the model are found to be 6.08%, 34.65 dB, and 75.23%, respectively. In addition to the training results, the public acceptance or subjective validation of the generated images is assessed by means of a user study, in which the model shows 41.71% naturalness in the evaluation of colorization results.
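
The two objective measures named above can be sketched as follows. This is a minimal illustration in plain Python, not the authors' evaluation code: the function names, the tiny 2 × 2 example arrays, and the assumption of an 8-bit intensity range (`max_value=255.0`) are all illustrative. PSNR is computed from MSE with the standard definition, 10 · log10(MAX² / MSE).

```python
import math


def mse(estimate, truth):
    """Mean squared error between two equally sized 2D arrays (lists of lists)."""
    flat_e = [v for row in estimate for v in row]
    flat_t = [v for row in truth for v in row]
    return sum((e - t) ** 2 for e, t in zip(flat_e, flat_t)) / len(flat_t)


def psnr(estimate, truth, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical.

    max_value is the largest possible pixel value (assumed 8-bit here).
    """
    err = mse(estimate, truth)
    if err == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / err)


# Hypothetical single-channel example: one pixel is off by 4 intensity levels.
truth = [[0, 255], [128, 64]]
estimate = [[0, 255], [128, 60]]

error = mse(estimate, truth)      # 4.0, since 4**2 / 4 pixels
quality = psnr(estimate, truth)   # a little over 42 dB
print(error, round(quality, 2))
```

Because PSNR is a monotone function of MSE for a fixed intensity range, the two measures rank images identically; PSNR mainly puts the error on a logarithmic decibel scale that is easier to compare across papers.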

Detection and Severity Evaluation of Combined Rail Defects Using Deep Learning

Vibration ◽

10.3390/vibration4020022 ◽

2021 ◽

Vol 4 (2) ◽

pp. 341-356

Author(s):

Jessada Sresakoolchai ◽

Sakdirat Kaewunruen

Keyword(s):

Neural Network ◽

Machine Learning ◽

Deep Learning ◽

Mean Absolute Error ◽

Absolute Error ◽

Machine Learning Techniques ◽

Rolling Stock ◽

Raw Data ◽

Learning Techniques ◽

Combined Defects

Various techniques have been developed to detect railway defects. One of the popular techniques is machine learning. This unprecedented study applies deep learning, which is a branch of machine learning, to detect and evaluate the severity of combined rail defects. The combined defects in the study are settlement and dipped joint. The features used to detect and evaluate the severity of combined defects are axle box accelerations simulated using a verified rolling stock dynamic behavior simulation called D-Track. A total of 1650 simulations are run to generate numerical data. The deep learning techniques used in the study are deep neural network (DNN), convolutional neural network (CNN), and recurrent neural network (RNN). Simulated data are used in two ways: simplified data and raw data. Simplified data are used to develop the DNN model, while raw data are used to develop the CNN and RNN models. For simplified data, features are extracted from raw data, which are the weight of rolling stock, the speed of rolling stock, and three peak and bottom accelerations from two wheels of rolling stock. In total, there are 14 features used as simplified data for developing the DNN model. For raw data, time-domain accelerations are used directly to develop the CNN and RNN models without processing and data extraction. Hyperparameter tuning is performed using grid search to ensure that the performance of each model is optimized. To detect the combined defects, the study proposes two approaches. The first approach uses one model to detect settlement and dipped joint, and the second approach uses two models to detect settlement and dipped joint separately. The results show that the CNN models of both approaches provide the same accuracy of 99%, so one model is good enough to detect settlement and dipped joint. To evaluate the severity of the combined defects, the study applies classification and regression concepts. Classification is used to evaluate the severity by categorizing defects into light, medium, and severe classes, and regression is used to estimate the size of defects. From the study, the CNN model is suitable for evaluating dipped joint severity with an accuracy of 84% and mean absolute error (MAE) of 1.25 mm, and the RNN model is suitable for evaluating settlement severity with an accuracy of 99% and MAE of 1.58 mm.
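
The count of 14 simplified features can be checked with a short sketch. The abstract does not say how peaks are picked, so the reading used here is an assumption: per wheel, the three largest and three smallest acceleration samples are taken, giving 2 + 2 × (3 + 3) = 14 values. The function names and sample signals are hypothetical, and the MAE helper simply implements the standard definition.

```python
def simplified_features(weight, speed, wheel_signals, k=3):
    """Build the feature vector: weight, speed, then k peak and k bottom
    accelerations per wheel (assumed: the k largest and k smallest samples)."""
    features = [weight, speed]
    for signal in wheel_signals:
        ordered = sorted(signal)
        features.extend(ordered[-k:][::-1])  # k largest, in descending order
        features.extend(ordered[:k])         # k smallest, in ascending order
    return features


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical axle box acceleration samples for two wheels.
wheel_1 = [0.1, -0.5, 2.0, 1.5, -1.0, 0.7, -2.2, 0.3]
wheel_2 = [0.4, 1.1, -0.3, -1.8, 0.9, 2.5, -0.6, 0.0]

features = simplified_features(weight=40.0, speed=100.0,
                               wheel_signals=[wheel_1, wheel_2])
print(len(features))  # 2 + 2 wheels * (3 peaks + 3 bottoms)

# Hypothetical defect sizes in mm, to show how an MAE figure is obtained.
mae = mean_absolute_error([10.0, 12.0, 8.0], [11.0, 10.0, 8.0])
print(mae)
```

A feature vector of this form suits a fully connected DNN, whereas the CNN and RNN models described above consume the raw time-domain signal directly.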

A pattern recognition model for static gestures in Malaysian Sign Language based on machine learning techniques

Computers & Electrical Engineering ◽

10.1016/j.compeleceng.2021.107383 ◽

2021 ◽

Vol 95 ◽

pp. 107383

Author(s):

Ali H. Alrubayi ◽

M.A. Ahmed ◽

A.A. Zaidan ◽

A.S. Albahri ◽

B.B. Zaidan ◽

...

Keyword(s):

Machine Learning ◽

Pattern Recognition ◽

Sign Language ◽

Machine Learning Techniques ◽

Recognition Model ◽

Learning Techniques

Assessing the Accuracy of Fault Interpretation using Machine Learning Techniques when Risking Faults for CO2 Storage Site Assessment

Interpretation ◽

10.1190/int-2021-0077.1 ◽

2021 ◽

pp. 1-55

Author(s):

Emma A. H. Michie ◽

Behzad Alaei ◽

Alvar Braathen

Keyword(s):

Deep Learning ◽

New Technologies ◽

CO2 Storage ◽

Fault Reactivation ◽

Machine Learning Techniques ◽

Close Similarity ◽

Storage Site ◽

Site Assessment ◽

Learning Techniques ◽

Fault Interpretation

Generating an accurate model of the subsurface is crucial when assessing the feasibility of a CO2 storage site. In particular, how faults are interpreted is likely to influence the predicted capacity and integrity of the reservoir, whether through identifying high-risk areas along the fault, where fluid is likely to flow across the fault, or through assessing the reactivation potential of the fault with increased pressure, which could cause fluid to flow up the fault. New technologies allow users to interpret faults effortlessly, and in much less time, utilizing methods such as Deep Learning. These Deep Learning techniques use knowledge from Neural Networks to allow end-users to compute areas where faults are likely to occur. Although these new technologies may be attractive due to reduced interpretation time, it is important to understand the inherent uncertainties in their ability to predict accurate fault geometries. Here, we compare Deep Learning fault interpretation with manual fault interpretation and see distinct differences for faults where significant ambiguity exists due to poor seismic resolution at the fault; we observe an increased irregularity when Deep Learning methods are used over conventional manual interpretation. This can result in significant differences between the resulting analyses, such as fault reactivation potential. Conversely, we observe that well-imaged faults show a close similarity between the resulting fault surfaces when both Deep Learning and manual fault interpretation methods are employed, and hence we also observe a close similarity between any attributes and fault analyses made.

Sentiment Analysis using various Machine Learning and Deep Learning Techniques

Journal of the Nigerian Society of Physical Sciences ◽

10.46481/jnsps.2021.308 ◽

2021 ◽

pp. 385-394

Author(s):

V Umarani ◽

A Julian ◽

J Deepa

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Sentiment Analysis ◽

Naive Bayes ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Support Vector ◽

Analysis Process ◽

Learning Techniques

Sentiment analysis has gained a lot of attention from researchers in the last year because it has been widely applied to a variety of application domains such as business, government, education, sports, tourism, biomedicine, and telecommunication services. Sentiment analysis is an automated computational method for studying or evaluating sentiments, feelings, and emotions expressed as comments, feedback, or critiques. The sentiment analysis process can be automated using machine learning techniques, which analyse text patterns faster. Supervised machine learning is the most widely used mechanism for sentiment analysis. The proposed work discusses the flow of the sentiment analysis process and investigates common supervised machine learning techniques, such as multinomial naive Bayes, Bernoulli naive Bayes, logistic regression, support vector machine, random forest, K-nearest neighbor, and decision tree, and deep learning techniques, such as Long Short-Term Memory and Convolutional Neural Network. The work examines these learning methods using a standard data set. The experimental results demonstrate the performance of the various classifiers in terms of precision, recall, F1-score, ROC curve, accuracy, running time, and k-fold cross-validation. The results help in appreciating the novelty of the several deep learning techniques and give the user an overview for choosing the right technique for their application.
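
Several of the evaluation measures listed above follow directly from the confusion-matrix counts. The sketch below, in plain Python, shows the standard definitions of precision, recall, F1-score, and accuracy for a binary sentiment task. The function name `binary_metrics` and the label vectors are hypothetical and are not taken from the paper's data set.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Precision, recall, F1-score, and accuracy for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    # Guard against empty denominators (e.g. no positive predictions).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(pairs)
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


# Hypothetical labels: 1 = positive sentiment, 0 = negative sentiment.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Reporting precision and recall alongside accuracy matters when the classes are imbalanced, since a classifier that always predicts the majority sentiment can score a high accuracy while having zero recall on the minority class.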

The rise and fall of machine learning methods in biomedical research

F1000Research ◽

10.12688/f1000research.13016.1 ◽

2017 ◽

Vol 6 ◽

pp. 2012 ◽

Cited By ~ 6

Author(s):

Hashem Koohy

Keyword(s):

Machine Learning ◽

Biomedical Research ◽

Life Sciences ◽

Biological Data ◽

Research Note ◽

Machine Learning Techniques ◽

Learning Methods ◽

The Past ◽

Machine Learning Methods ◽

Learning Techniques

In the era of explosion in biological data, machine learning techniques are becoming more popular in life sciences, including biology and medicine. This research note examines the rise and fall of the most commonly used machine learning techniques in life sciences over the past three decades.
