Deep Learning-based Image Analysis for High-content Screening

2021 ◽  
Author(s):  
Dylon Zeng

High-content screening is an empirical strategy in drug discovery to identify substances capable of altering cellular phenotype (the set of observable characteristics of a cell) in a desired way. Throughout the past two decades, high-content screening has gathered significant attention from academia and the pharmaceutical industry. However, image analysis remains a considerable hindrance to the widespread application of high-content screening. Standard image analysis relies on feature engineering and suffers from inherent drawbacks such as the dependence on annotated inputs. There is an urgent need for reliable and more efficient methods to cope with the increasingly large amounts of data produced.

This thesis centres around the design and implementation of a deep learning-based image analysis pipeline for high-content screening. The end goal is to identify and cluster hit compounds that significantly alter the phenotype of a cell. The proposed pipeline replaces feature engineering with a k-nearest neighbour-based similarity analysis. In addition, feature extraction using convolutional autoencoders is applied to reduce the negative effects of noise on hit selection. As a result, the feature engineering process is circumvented. A novel similarity measure is developed to facilitate similarity analysis. Moreover, we combine deep learning with statistical modelling to achieve optimal results. Preliminary explorations suggest that the choice of hyperparameters has a direct impact on neural network performance. Generalised estimating equation models are used to predict the most suitable neural network architecture for the input data.

Using the proposed pipeline, we analyse an extensive set of images acquired from a series of cell-based assays examining the effect of 282 FDA-approved drugs. The analysis of this data set produces a shortlist of drugs that can significantly alter a cell's phenotype, then further identifies five clusters of the shortlisted drugs. The clustering results present groups of existing drugs that have the potential to be repurposed for new therapeutic uses. Furthermore, our findings align with published studies. Compared with other neural networks, the image analysis pipeline proposed in this thesis provides reliable and superior results in a shorter time frame.
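The two core techniques the abstract names, convolutional-autoencoder feature extraction followed by a k-nearest-neighbour similarity query on the learned embeddings, can be illustrated with a minimal sketch (PyTorch and scikit-learn; the layer sizes, latent dimension, image size and k below are illustrative assumptions, not the thesis configuration):

import torch
import torch.nn as nn
from sklearn.neighbors import NearestNeighbors

class ConvAutoencoder(nn.Module):
    def __init__(self, latent_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1),   # 64x64 -> 32x32
            nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1),  # 32x32 -> 16x16
            nn.ReLU(),
            nn.Flatten(),
            nn.Linear(32 * 16 * 16, latent_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 32 * 16 * 16),
            nn.Unflatten(1, (32, 16, 16)),
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z

model = ConvAutoencoder()
images = torch.rand(100, 1, 64, 64)           # stand-in for assay images
recon, embeddings = model(images)
loss = nn.functional.mse_loss(recon, images)  # reconstruction objective

# Similarity analysis: query the 5 nearest compounds in embedding space.
knn = NearestNeighbors(n_neighbors=5).fit(embeddings.detach().numpy())
dist, idx = knn.kneighbors(embeddings[:1].detach().numpy())

Training the autoencoder on reconstruction alone means no annotated inputs are needed, which is what lets the pipeline sidestep feature engineering.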



Data ◽  
2018 ◽  
Vol 3 (3) ◽  
pp. 28 ◽  
Author(s):  
Kasthurirangan Gopalakrishnan

Deep learning, more specifically deep convolutional neural networks, is fast becoming a popular choice for computer vision-based automated pavement distress detection. While pavement image analysis has been extensively researched over the past three decades or so, recent ground-breaking achievements of deep learning algorithms in the areas of machine translation, speech recognition, and computer vision have sparked interest in the application of deep learning to automated detection of distresses in pavement images. This paper provides a narrative review of recently published studies in this field, highlighting the current achievements and challenges. A comparison of the deep learning software frameworks, network architecture, hyper-parameters employed by each study, and crack detection performance is provided, which is expected to provide a good foundation for driving further research on this important topic in the context of smart pavement or asset management systems. The review concludes with potential avenues for future research, especially in the application of deep learning to not only detect, but also characterize the type, extent, and severity of distresses from 2D and 3D pavement images.
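The pattern most of the surveyed studies follow is transfer learning: fine-tune an ImageNet-pretrained CNN as a binary crack/no-crack classifier on pavement image patches. A hedged sketch in PyTorch follows; the backbone choice (ResNet-18), learning rate and two-class setup are assumptions for illustration, not any particular study's configuration:

import torch
import torch.nn as nn
from torchvision import models

# Load an ImageNet-pretrained backbone and replace the classification head.
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, 2)   # crack vs. no crack

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

patches = torch.rand(8, 3, 224, 224)            # stand-in pavement patches
labels = torch.randint(0, 2, (8,))              # stand-in annotations

# One fine-tuning step on a mini-batch.
model.train()
optimizer.zero_grad()
loss = criterion(model(patches), labels)
loss.backward()
optimizer.step()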


2020 ◽  
Vol 2 (2) ◽  
pp. 32-37
Author(s):  
P. RADIUK ◽  

Over the last decade, a set of machine learning algorithms called deep learning has led to significant improvements in computer vision and in natural language recognition and processing. This has led to the widespread use of a variety of commercial, learning-based products in various fields of human activity. Despite this success, deep neural networks remain a black box. Today, the process of setting hyperparameters and designing a network architecture requires experience and a lot of trial and error, and is based more on chance than on a scientific approach. At the same time, the task of simplifying deep learning is extremely urgent. To date, no simple ways have been invented to establish the optimal values of the training hyperparameters, namely the learning rate, batch size, data set, momentum, and weight decay. Grid search and random search of the hyperparameter space are extremely resource-intensive. The choice of hyperparameters is critical for the training time and the final result. In addition, experts often choose one of the standard architectures (for example, ResNet) and ready-made sets of hyperparameters. However, such sets are usually suboptimal for specific practical tasks. The presented work offers an approach to finding the optimal set of hyperparameters for training a convolutional neural network (CNN). An integrated approach to all hyperparameters is valuable because there is an interdependence between them. The aim of the work is to develop an approach for setting a set of hyperparameters that will reduce the time spent designing a CNN and ensure the efficiency of its training. In recent decades, the introduction of deep learning methods, in particular CNNs, has led to impressive success in image and video processing. However, the training of CNNs has commonly been based on quasi-optimal hyperparameters. Such an approach usually requires huge computational and time costs to train the network and does not guarantee a satisfactory result. Yet hyperparameters play a crucial role in the effectiveness of a CNN, as different hyperparameters lead to models with significantly different characteristics. Poorly selected hyperparameters generally lead to low model performance. The issue of choosing optimal hyperparameters for CNNs has not been resolved yet. The presented work proposes several practical approaches to setting hyperparameters that reduce training time and increase the accuracy of the model. The article examines the behaviour of the training and validation loss during underfitting and overfitting, and closes with guidelines for reaching the point of optimal fit. The paper also considers the regulation of the learning rate and momentum to accelerate network training. All experiments are based on the widely used CIFAR-10 and CIFAR-100 datasets.
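One widely used form of the learning-rate and momentum regulation discussed here is a one-cycle schedule, where the learning rate rises then decays while momentum moves inversely. A sketch using PyTorch's built-in scheduler follows; the peak rate, momentum bounds and step count are assumptions, and the article's exact policy may differ:

import torch

model = torch.nn.Linear(32, 10)   # placeholder network for illustration
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.95)
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer,
    max_lr=0.1,            # peak learning rate at the top of the cycle
    total_steps=1000,
    base_momentum=0.85,    # momentum falls while the LR rises...
    max_momentum=0.95,     # ...and climbs back as the LR decays
)

for step in range(1000):
    x, y = torch.rand(16, 32), torch.randint(0, 10, (16,))
    loss = torch.nn.functional.cross_entropy(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()       # advance both LR and momentum each step

Coupling the two hyperparameters in one schedule reflects the article's point that hyperparameters are interdependent and benefit from being set jointly.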


2019 ◽  
Author(s):  
Jean-Baptiste Lugagne ◽  
Haonan Lin ◽  
Mary J. Dunlop

Abstract
Microscopy image analysis is a major bottleneck in quantification of single-cell microscopy data, typically requiring human supervision and curation, which limit both accuracy and throughput. To address this, we developed a deep learning-based image analysis pipeline that performs segmentation, tracking, and lineage reconstruction. Our analysis focuses on time-lapse movies of Escherichia coli cells trapped in a “mother machine” microfluidic device, a scalable platform for long-term single-cell analysis that is widely used in the field. While deep learning has been applied to cell segmentation problems before, our approach is fundamentally innovative in that it also uses machine learning to perform cell tracking and lineage reconstruction. With this framework we are able to get high-fidelity results (1% error rate) without human supervision. Further, the algorithm is fast, with complete analysis of a typical frame containing ∼150 cells taking <700 ms. The framework is not constrained to a particular experimental set-up and has the potential to generalize to time-lapse images of other organisms or different experimental configurations. These advances open the door to a myriad of applications including real-time tracking of gene expression and high-throughput analysis of strain libraries at single-cell resolution.

Author Summary
Automated microscopy experiments can generate massive data sets, allowing for detailed analysis of cell physiology and properties such as gene expression. In particular, dynamic measurements of gene expression with time-lapse microscopy have proved invaluable for understanding how gene regulatory networks operate. However, image analysis remains a key bottleneck in the analysis pipeline, typically requiring human supervision and a posteriori processing. Recently, machine learning-based approaches have ushered in a new era of rapid, unsupervised image analysis. In this work, we use and repurpose the U-Net deep learning algorithm to develop an image processing pipeline that can not only accurately identify the location of cells in an image, but also track them over time as they grow and divide. As an application, we focus on multi-hour time-lapse movies of bacteria growing in a microfluidic device. Our algorithm is accurate and fast, with error rates near 1% and requiring less than a second to analyze a typical movie frame. This increase in speed and fidelity has the potential to open new experimental avenues, e.g. where images are analyzed on-the-fly so that experimental conditions can be updated in real time.
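The U-Net idea the authors build on, an encoder-decoder with skip connections that outputs a per-pixel mask, can be sketched in miniature. Below is a deliberately shallow, single-skip toy network in PyTorch; the authors' actual architecture, depth and training procedure are in the paper, and every size here is an illustrative assumption:

import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU())
        self.down = nn.MaxPool2d(2)
        self.mid = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)
        # After upsampling, encoder features are concatenated back in
        # (the skip connection), so the decoder sees 16 + 16 channels.
        self.dec = nn.Sequential(nn.Conv2d(32, 16, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(16, 1, 1))  # 1-channel logit mask

    def forward(self, x):
        e = self.enc(x)
        m = self.mid(self.down(e))
        u = self.up(m)
        return self.dec(torch.cat([u, e], dim=1))

frame = torch.rand(1, 1, 256, 256)   # stand-in for one movie frame
mask_logits = TinyUNet()(frame)      # per-pixel segmentation logits

The skip connection is what lets the network keep fine spatial detail (cell boundaries) while still using pooled context, which is why U-Nets suit dense cell segmentation.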


2020 ◽  
pp. bjophthalmol-2020-317327
Author(s):  
Zhongwen Li ◽  
Chong Guo ◽  
Duoru Lin ◽  
Danyao Nie ◽  
Yi Zhu ◽  
...  

Background/Aims
To develop a deep learning system for automated glaucomatous optic neuropathy (GON) detection using ultra-widefield fundus (UWF) images.

Methods
We trained, validated and externally evaluated a deep learning system for GON detection based on 22 972 UWF images from 10 590 subjects that were collected at 4 different institutions in China and Japan. The InceptionResNetV2 neural network architecture was used to develop the system. The area under the receiver operating characteristic curve (AUC), sensitivity and specificity were used to assess the system's performance in detecting GON. The data set from the Zhongshan Ophthalmic Center (ZOC) was selected to compare the performance of the system to that of ophthalmologists who mainly conducted UWF image analysis in clinics.

Results
The system for GON detection achieved AUCs of 0.983–0.999 with sensitivities of 97.5–98.2% and specificities of 94.3–98.4% in four independent data sets. The most common reasons for false-negative results were confounding optic disc characteristics caused by high myopia or pathological myopia (n=39 (53%)). The leading cause of false-positive results was the presence of other fundus lesions (n=401 (96%)). The performance of the system on the ZOC data set was comparable to that of an experienced ophthalmologist (p>0.05).

Conclusion
Our deep learning system can accurately detect GON from UWF images in an automated fashion. It may be used as a screening tool to improve the accessibility of screening and promote the early diagnosis and management of glaucoma.
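The three reported metrics can be reproduced on any set of predictions with scikit-learn. A self-contained sketch on synthetic scores follows; the labels, scores and the 0.5 decision threshold are placeholders, not the study's data:

import numpy as np
from sklearn.metrics import roc_auc_score, confusion_matrix

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, 500)     # 1 = GON, 0 = non-GON (synthetic)
y_score = np.clip(0.7 * y_true + rng.normal(0.3, 0.2, 500), 0, 1)

auc = roc_auc_score(y_true, y_score)             # threshold-free
y_pred = (y_score >= 0.5).astype(int)            # assumed threshold
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)                     # true-positive rate
specificity = tn / (tn + fp)                     # true-negative rate
print(f"AUC={auc:.3f}  Se={sensitivity:.1%}  Sp={specificity:.1%}")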


2020 ◽  
Vol 2020 ◽  
pp. 1-7
Author(s):  
Mohd Zulfaezal Che Azemin ◽  
Radhiana Hassan ◽  
Mohd Izzuddin Mohd Tamrin ◽  
Mohd Adli Md Ali

The key component in deep learning research is the availability of training data sets. With a limited number of publicly available COVID-19 chest X-ray images, the generalization and robustness of deep learning models to detect COVID-19 cases developed based on these images are questionable. We aimed to use thousands of readily available chest radiograph images with clinical findings associated with COVID-19 as a training data set, mutually exclusive from the images with confirmed COVID-19 cases, which will be used as the testing data set. We used a deep learning model based on the ResNet-101 convolutional neural network architecture, which was pretrained to recognize objects from a million images and then retrained to detect abnormality in chest X-ray images. The performance of the model in terms of area under the receiver operating characteristic curve, sensitivity, specificity, and accuracy was 0.82, 77.3%, 71.8%, and 71.9%, respectively. The strength of this study lies in the use of labels that have a strong clinical association with COVID-19 cases and the use of mutually exclusive publicly available data for training, validation, and testing.
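A hedged sketch of this pretrain-then-retrain setup with an ImageNet-pretrained ResNet-101 in torchvision follows; freezing the backbone, the two-class head and the learning rate are assumptions for illustration, not the study's settings:

import torch
import torch.nn as nn
from torchvision import models

# Backbone pretrained on ~1M ImageNet images.
model = models.resnet101(weights=models.ResNet101_Weights.IMAGENET1K_V1)
for p in model.parameters():
    p.requires_grad = False                    # keep pretrained features
model.fc = nn.Linear(model.fc.in_features, 2)  # normal vs. abnormal head

# Only the new head is retrained on chest X-ray labels.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(trainable, lr=1e-3)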


GigaScience ◽  
2019 ◽  
Vol 8 (11) ◽  
Author(s):  
Robail Yasrab ◽  
Jonathan A Atkinson ◽  
Darren M Wells ◽  
Andrew P French ◽  
Tony P Pridmore ◽  
...  

Abstract

Background
In recent years quantitative analysis of root growth has become increasingly important as a way to explore the influence of abiotic stress such as high temperature and drought on a plant's ability to take up water and nutrients. Segmentation and feature extraction of plant roots from images presents a significant computer vision challenge. Root images contain complicated structures, variations in size, background, occlusion, clutter and variation in lighting conditions. We present a new image analysis approach that provides fully automatic extraction of complex root system architectures from a range of plant species in varied imaging set-ups. Driven by modern deep-learning approaches, RootNav 2.0 replaces previously manual and semi-automatic feature extraction with an extremely deep multi-task convolutional neural network architecture. The network also locates seeds and first- and second-order root tips to drive a search algorithm seeking optimal paths throughout the image, extracting accurate architectures without user interaction.

Results
We develop and train a novel deep network architecture to explicitly combine local pixel information with global scene information in order to accurately segment small root features across high-resolution images. The proposed method was evaluated on images of wheat (Triticum aestivum L.) from a seedling assay. Compared with semi-automatic analysis via the original RootNav tool, the proposed method demonstrated comparable accuracy, with a 10-fold increase in speed. The network was able to adapt to different plant species via transfer learning, offering similar accuracy when transferred to an Arabidopsis thaliana plate assay. A final instance of transfer learning, to images of Brassica napus from a hydroponic assay, still demonstrated good accuracy despite many fewer training images.

Conclusions
We present RootNav 2.0, a new approach to root image analysis driven by a deep neural network. The tool can be adapted to new image domains with a reduced number of images, and offers substantial speed improvements over semi-automatic and manual approaches. The tool outputs root architectures in the widely accepted RSML standard, for which numerous analysis packages exist (http://rootsystemml.github.io/), as well as segmentation masks compatible with other automated measurement tools. The tool will provide researchers with the ability to analyse root systems at larger scales than ever before, at a time when large scale genomic studies have made this more important than ever.
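The multi-task design described, a shared encoder feeding both a segmentation head and a seed/tip localisation head, looks schematically like this. It is a deliberately shallow PyTorch sketch under stated assumptions; RootNav 2.0's real network is far deeper, and its loss weighting and decoder are not shown:

import torch
import torch.nn as nn

class MultiTaskRootNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.shared = nn.Sequential(            # shared feature encoder
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
        )
        self.seg_head = nn.Conv2d(32, 2, 1)     # background / root classes
        self.tip_head = nn.Conv2d(32, 3, 1)     # seed, 1st-, 2nd-order tips

    def forward(self, x):
        f = self.shared(x)
        return self.seg_head(f), self.tip_head(f)

seg, tips = MultiTaskRootNet()(torch.rand(1, 3, 256, 256))
# Training would minimise a weighted sum of per-task losses, e.g.
# loss = ce(seg, seg_gt) + mse(tips, tip_heatmaps)

Sharing one encoder across tasks is what makes the localisation outputs cheap to add, and the predicted tip locations then seed the path-search stage described above.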


2021 ◽  
Author(s):  
Damian J. Matuszewski ◽  
Petter Ranefall

Creating manual annotations in a large number of images is a tedious bottleneck that limits deep learning use in many applications. Here, we present a study in which we used the output of a classical image analysis pipeline as labels when training a convolutional neural network (CNN). This may not only reduce the time experts spend annotating images but may also lead to an improvement of results compared to the output from the classical pipeline used in training. In our application, i.e., cell nuclei segmentation, we generated the annotations using CellProfiler (a tool for developing classical image analysis pipelines for biomedical applications) and trained a U-Net-based CNN model on them. The best model achieved a Dice coefficient of 0.96 for the segmented nuclei and an object-wise Jaccard index of 0.84, exceeding the classical method used for generating the annotations by 0.02 and 0.34, respectively. Our experimental results show that in this application not only is such training feasible, but also that the deep learning segmentations are a clear improvement compared to the output from the classical pipeline used for generating the annotations.
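The two reported metrics, sketched for binary masks. Note this simplified version is pixel-wise only; the paper's object-wise Jaccard first matches each predicted nucleus to a ground-truth nucleus, which is omitted here:

import numpy as np

def dice(pred: np.ndarray, gt: np.ndarray) -> float:
    """Pixel-wise Dice coefficient of two boolean masks."""
    inter = np.logical_and(pred, gt).sum()
    return 2 * inter / (pred.sum() + gt.sum())

def jaccard(pred: np.ndarray, gt: np.ndarray) -> float:
    """Pixel-wise Jaccard index (intersection over union)."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union

# Toy overlapping squares standing in for a predicted and a true nucleus.
pred = np.zeros((64, 64), bool); pred[10:40, 10:40] = True
gt = np.zeros((64, 64), bool); gt[15:45, 15:45] = True
print(dice(pred, gt), jaccard(pred, gt))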


2019 ◽  
Vol 11 (3) ◽  
pp. 65-89 ◽  
Author(s):  
Vinayakumar R ◽  
Soman KP ◽  
Prabaharan Poornachandran

Recently, due to the advances and impressive results of deep learning techniques in the fields of image recognition, natural language processing and speech recognition for various long-standing artificial intelligence (AI) tasks, there has been great interest in applying them to security tasks as well. This article focuses on applying these deep learning techniques to network intrusion detection systems (N-IDS) with the aim of enhancing performance in classifying network connections as either good or bad. To substantiate this for N-IDS, this article models network traffic as time-series data, specifically transmission control protocol / internet protocol (TCP/IP) packets in a predefined time window, with supervised deep learning methods such as the recurrent neural network (RNN), the RNN initialized with an identity matrix, typically termed the identity recurrent neural network (IRNN), long short-term memory (LSTM), the clockwork RNN (CWRNN) and the gated recurrent unit (GRU), utilizing connection records of the KDDCup-99 challenge data set. The main interest lies in evaluating the performance of the RNN against newly introduced methods such as LSTM and IRNN, which alleviate the vanishing and exploding gradient problems when memorizing long-term dependencies. The efficient network architecture for all deep models is chosen by comparing the performance of various network topologies and network parameters. Experiments with the chosen configurations of the deep models were run for up to 1,000 epochs with learning rates varied between 0.01 and 0.5. The observed results of IRNN are relatively close to the performance of LSTM on the KDDCup-99 NIDS data set. In addition to KDDCup-99, the effectiveness of the deep model architectures is evaluated on a refined version of KDDCup-99 (NSL-KDD) and the more recent UNSW-NB15 NIDS data set.
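A minimal sketch of the modelling setup described, a window of connection records fed to an LSTM that classifies the traffic as good or bad, is given below in PyTorch. The 41 input features match the KDDCup-99 connection-record format; the window length, hidden size and batch size are illustrative assumptions, not the article's configuration:

import torch
import torch.nn as nn

class LSTMIDS(nn.Module):
    def __init__(self, n_features=41, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, 2)      # normal vs. attack

    def forward(self, x):                   # x: (batch, time, features)
        _, (h, _) = self.lstm(x)            # last hidden state summarises
        return self.fc(h[-1])               # the whole time window

window = torch.rand(16, 10, 41)   # 16 windows of 10 connection records
logits = LSTMIDS()(window)        # per-window good/bad classification

Swapping nn.LSTM for nn.RNN or nn.GRU in the same skeleton reproduces the family of models the article compares.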

