Tens of images can suffice to train neural networks for malignant leukocyte detection

AbstractConvolutional neural networks (CNNs) excel as powerful tools for biomedical image classification. It is commonly assumed that training CNNs requires large amounts of annotated data. This is a bottleneck in many medical applications where annotation relies on expert knowledge. Here, we analyze the binary classification performance of a CNN on two independent cytomorphology datasets as a function of training set size. Specifically, we train a sequential model to discriminate non-malignant leukocytes from blast cells, whose appearance in the peripheral blood is a hallmark of leukemia. We systematically vary training set size, finding that tens of training images suffice for a binary classification with an ROC-AUC over 90%. Saliency maps and layer-wise relevance propagation visualizations suggest that the network learns to increasingly focus on nuclear structures of leukocytes as the number of training images is increased. A low dimensional tSNE representation reveals that while the two classes are separated already for a few training images, the distinction between the classes becomes clearer when more training images are used. To evaluate the performance in a multi-class problem, we annotated single-cell images from a acute lymphoblastic leukemia dataset into six different hematopoietic classes. Multi-class prediction suggests that also here few single-cell images suffice if differences between morphological classes are large enough. The incorporation of deep learning algorithms into clinical practice has the potential to reduce variability and cost, democratize usage of expertise, and allow for early detection of disease onset and relapse. Our approach evaluates the performance of a deep learning based cytology classifier with respect to size and complexity of the training data and the classification task.

Download Full-text

Automatic Detection of Arrhythmia Based on Multi-Resolution Representation of ECG Signal

Sensors ◽

10.3390/s20061579 ◽

2020 ◽

Vol 20 (6) ◽

pp. 1579

Author(s):

Dongqi Wang ◽

Qinghua Meng ◽

Dongming Chen ◽

Hupo Zhang ◽

Lisheng Xu

Keyword(s):

Neural Networks ◽

Deep Learning ◽

Deep Neural Networks ◽

Channel Model ◽

Expert Knowledge ◽

Automatic Detection ◽

Data Representation ◽

Learning Technology ◽

Arrhythmia Detection ◽

Automatic Feature Extraction

Automatic detection of arrhythmia is of great significance for early prevention and diagnosis of cardiovascular disease. Traditional feature engineering methods based on expert knowledge lack multidimensional and multi-view information abstraction and data representation ability, so the traditional research on pattern recognition of arrhythmia detection cannot achieve satisfactory results. Recently, with the increase of deep learning technology, automatic feature extraction of ECG data based on deep neural networks has been widely discussed. In order to utilize the complementary strength between different schemes, in this paper, we propose an arrhythmia detection method based on the multi-resolution representation (MRR) of ECG signals. This method utilizes four different up to date deep neural networks as four channel models for ECG vector representations learning. The deep learning based representations, together with hand-crafted features of ECG, forms the MRR, which is the input of the downstream classification strategy. The experimental results of big ECG dataset multi-label classification confirm that the F1 score of the proposed method is 0.9238, which is 1.31%, 0.62%, 1.18% and 0.6% higher than that of each channel model. From the perspective of architecture, this proposed method is highly scalable and can be employed as an example for arrhythmia recognition.

Download Full-text

A Green Prospective for Learned Post-processing in Sparse-view Tomographic Reconstruction

10.20944/preprints202107.0265.v1 ◽

2021 ◽

Author(s):

Elena Morotti ◽

Davide Evangelista ◽

Elena Loli Piccolomini

Keyword(s):

Neural Network ◽

Neural Networks ◽

Deep Learning ◽

Tomographic Reconstruction ◽

Research Field ◽

Filtered Backprojection ◽

Post Processing ◽

Training Set ◽

Imaging Reconstruction ◽

Active Research

Deep Learning is developing interesting tools which are of great interest for inverse imaging applications. In this work, we consider a medical imaging reconstruction task from subsampled measurements, which is an active research field where Convolutional Neural Networks have already revealed their great potential. However, the commonly used architectures are very deep and, hence, prone to overfitting and unfeasible for clinical usages. Inspired by the ideas of the green-AI literature, we here propose a shallow neural network to perform an efficient Learned Post-Processing on images roughly reconstructed by the filtered backprojection algorithm. The results obtained on images from the training set and on unseen images, using both the non-expensive network and the widely used very deep ResUNet show that the proposed network computes images of comparable or higher quality in about one fourth of time.

Download Full-text

Evaluation of Different Machine Learning Methods and Deep-Learning Convolutional Neural Networks for Landslide Detection

Remote Sensing ◽

10.3390/rs11020196 ◽

2019 ◽

Vol 11 (2) ◽

pp. 196 ◽

Cited By ~ 101

Author(s):

Omid Ghorbanzadeh ◽

Thomas Blaschke ◽

Khalil Gholamnia ◽

Sansar Meena ◽

Dirk Tiede ◽

...

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Deep Learning ◽

Expert Knowledge ◽

Accuracy Assessment ◽

Window Size ◽

Optical Data ◽

Support Vector ◽

Learning Methods ◽

Machine Learning Methods

There is a growing demand for detailed and accurate landslide maps and inventories around the globe, but particularly in hazard-prone regions such as the Himalayas. Most standard mapping methods require expert knowledge, supervision and fieldwork. In this study, we use optical data from the Rapid Eye satellite and topographic factors to analyze the potential of machine learning methods, i.e., artificial neural network (ANN), support vector machines (SVM) and random forest (RF), and different deep-learning convolution neural networks (CNNs) for landslide detection. We use two training zones and one test zone to independently evaluate the performance of different methods in the highly landslide-prone Rasuwa district in Nepal. Twenty different maps are created using ANN, SVM and RF and different CNN instantiations and are compared against the results of extensive fieldwork through a mean intersection-over-union (mIOU) and other common metrics. This accuracy assessment yields the best result of 78.26% mIOU for a small window size CNN, which uses spectral information only. The additional information from a 5 m digital elevation model helps to discriminate between human settlements and landslides but does not improve the overall classification accuracy. CNNs do not automatically outperform ANN, SVM and RF, although this is sometimes claimed. Rather, the performance of CNNs strongly depends on their design, i.e., layer depth, input window sizes and training strategies. Here, we conclude that the CNN method is still in its infancy as most researchers will either use predefined parameters in solutions like Google TensorFlow or will apply different settings in a trial-and-error manner. Nevertheless, deep-learning can improve landslide mapping in the future if the effects of the different designs are better understood, enough training samples exist, and the effects of augmentation strategies to artificially increase the number of existing samples are better understood.

Download Full-text

Knowledge-primed neural networks enable biologically interpretable deep learning on single-cell sequencing data

Genome Biology ◽

10.1186/s13059-020-02100-5 ◽

2020 ◽

Vol 21 (1) ◽

Author(s):

Nikolaus Fortelny ◽

Christoph Bock

Keyword(s):

Neural Networks ◽

Deep Learning ◽

Single Cell ◽

Sequencing Data ◽

Single Cell Sequencing

Download Full-text

Single-Cell Phenotype Classification Using Deep Convolutional Neural Networks

CrossRef Listing of Deleted DOIs ◽

10.1177/1087057116631284 ◽

2016 ◽

Vol 21 (9) ◽

pp. 998-1003 ◽

Cited By ~ 42

Author(s):

Oliver Dürr ◽

Beate Sick

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Deep Learning ◽

Single Cell ◽

Convolutional Neural Networks ◽

State Of The Art ◽

Misclassification Rate ◽

Support Vector ◽

Learning Methods ◽

Phenotype Classification

Deep learning methods are currently outperforming traditional state-of-the-art computer vision algorithms in diverse applications and recently even surpassed human performance in object recognition. Here we demonstrate the potential of deep learning methods to high-content screening–based phenotype classification. We trained a deep learning classifier in the form of convolutional neural networks with approximately 40,000 publicly available single-cell images from samples treated with compounds from four classes known to lead to different phenotypes. The input data consisted of multichannel images. The construction of appropriate feature definitions was part of the training and carried out by the convolutional network, without the need for expert knowledge or handcrafted features. We compare our results against the recent state-of-the-art pipeline in which predefined features are extracted from each cell using specialized software and then fed into various machine learning algorithms (support vector machine, Fisher linear discriminant, random forest) for classification. The performance of all classification approaches is evaluated on an untouched test image set with known phenotype classes. Compared to the best reference machine learning algorithm, the misclassification rate is reduced from 8.9% to 6.6%.

Download Full-text

Modern Approaches to Detect and Classify Comment Toxicity Using Neural Networks

Modeling and Analysis of Information Systems ◽

10.18255/1818-1015-2020-1-48-61 ◽

2020 ◽

Vol 27 (1) ◽

pp. 48-61 ◽

Cited By ~ 1

Author(s):

Sergey V. Morzhov

Keyword(s):

Neural Networks ◽

Deep Learning ◽

Language Processing ◽

Rapid Evolution ◽

Learning Technologies ◽

Training Set ◽

Online Platforms ◽

Processing Algorithms ◽

New Algorithms

The growth of popularity of online platforms which allow users to communicate with each other, share opinions about various events, and leave comments boosted the development of natural language processing algorithms. Tens of millions of messages per day are published by users of a particular social network need to be analyzed in real time for moderation in order to prevent the spread of various illegal or offensive information, threats and other types of toxic comments. Of course, such a large amount of information can be processed quite quickly only automatically. that is why there is a need to and a way to teach computers to “understand” a text written by humans. It is a non-trivial task even if the word “understand” here means only “to classify”. the rapid evolution of machine learning technologies has led to ubiquitous implementation of new algorithms. A lot of tasks, which for many years were considered almost impossible to solve, are now quite successfully solved using deep learning technologies. this article considers algorithms built using deep learning technologies and neural networks which can successfully solve the problem of detection and classification of toxic comments. In addition, the article presents the results of the developed algorithms, as well as the results of the ensemble of all considered algorithms on a large training set collected and tagged by Google and Jigsaw.

Download Full-text

A new strategy for curriculum learning using model distillation

Global Journal of Computer Sciences Theory and Research ◽

10.18844/gjcs.v10i2.5810 ◽

2020 ◽

Vol 10 (2) ◽

pp. 57-65

Author(s):

Kaan Karakose ◽

Metin Bilgin

Keyword(s):

Neural Networks ◽

Computer Vision ◽

Artificial Neural Networks ◽

Deep Learning ◽

Deep Neural Networks ◽

Mixed Data ◽

Training Set ◽

Complex Samples ◽

New Strategy ◽

Sample Sorting

In recent years, deep neural networks have been successful in both industry and academia, especially for computer vision tasks. Humans and animals learn much better when gradually presented in a meaningful order showing more concepts and complex samples rather than randomly presenting the information. The use of such training strategies in the context of artificial neural networks is called curriculum learning. In this study, a strategy was developed for curriculum learning. Using the CIFAR-10 and CIFAR-100 training sets, the last few layers of the pre-trained on ImageNet Xception model were trained to keep the training set knowledge in the model’s weight. Finally, a much smaller model was trained with the sample sorting methods presented using these difficulty levels. The findings obtained in this study show that the accuracy value generated when trained by the method we provided with the accuracy value trained with randomly mixed data was more than 1% for each epoch. Keywords: Curriculum learning, model distillation, deep learning, academia, neural networks.

Download Full-text

Choosing an appropriate training set size when using existing data to train neural networks for land cover segmentation

Annals of GIS ◽

10.1080/19475683.2020.1803402 ◽

2020 ◽

Vol 26 (4) ◽

pp. 329-342

Author(s):

Huan Ning ◽

Zhenlong Li ◽

Cuizhen Wang ◽

Lina Yang

Keyword(s):

Neural Networks ◽

Land Cover ◽

Training Set ◽

Set Size ◽

Existing Data

Download Full-text

Automated sleep stage scoring of the Sleep Heart Health Study using deep neural networks

SLEEP ◽

10.1093/sleep/zsz159 ◽

2019 ◽

Vol 42 (11) ◽

Cited By ~ 16

Author(s):

Linda Zhang ◽

Daniel Fabbri ◽

Raghu Upender ◽

David Kent

Keyword(s):

Neural Network ◽

Neural Networks ◽

Deep Learning ◽

Sleep Stage ◽

Health Study ◽

Sleep Staging ◽

Heart Health ◽

Training Set ◽

Cohen's Kappa ◽

Sleep Stage Scoring

Abstract Study Objectives Polysomnography (PSG) scoring is labor intensive and suffers from variability in inter- and intra-rater reliability. Automated PSG scoring has the potential to reduce the human labor costs and the variability inherent to this task. Deep learning is a form of machine learning that uses neural networks to recognize data patterns by inspecting many examples rather than by following explicit programming. Methods A sleep staging classifier trained using deep learning methods scored PSG data from the Sleep Heart Health Study (SHHS). The training set was composed of 42 560 hours of PSG data from 5213 patients. To capture higher-order data, spectrograms were generated from electroencephalography, electrooculography, and electromyography data and then passed to the neural network. A holdout set of 580 PSGs not included in the training set was used to assess model accuracy and discrimination via weighted F1-score, per-stage accuracy, and Cohen’s kappa (K). Results The optimal neural network model was composed of spectrograms in the input layer feeding into convolutional neural network layers and a long short-term memory layer to achieve a weighted F1-score of 0.87 and K = 0.82. Conclusions The deep learning sleep stage classifier demonstrates excellent accuracy and agreement with expert sleep stage scoring, outperforming human agreement on sleep staging. It achieves comparable or better F1-scores, accuracy, and Cohen’s kappa compared to literature for automated sleep stage scoring of PSG epochs. Accurate automated scoring of other PSG events may eventually allow for fully automated PSG scoring.

Download Full-text

Investigating the Impact of the Training Set Size on Deep Learning-Powered Hyperspectral Unmixing

10.1109/igarss47720.2021.9553477 ◽

2021 ◽

Author(s):

Lukasz Tulczyjew ◽

Jakub Nalepa

Keyword(s):

Deep Learning ◽

Training Set ◽

Hyperspectral Unmixing ◽

Set Size ◽

The Impact

Download Full-text