PathML: A unified framework for whole-slide image analysis with deep learning

Author(s):  
Adam G Berman ◽  
William R Orchard ◽  
Marcel Gehrung ◽  
Florian Markowetz

The inspection of stained tissue slides by pathologists is essential for the early detection, diagnosis and monitoring of disease. Recently, deep learning methods for the analysis of whole-slide images (WSIs) have shown excellent performance on these tasks, and have the potential to substantially reduce the workload of pathologists. However, successful implementation of deep learning for WSI analysis is complex and requires careful consideration of model hyperparameters, slide and image artefacts, and data augmentation. Here we introduce PathML, a Python library for performing pre- and post-processing of WSIs, which has been designed to interact with the most widely used deep learning libraries, PyTorch and TensorFlow, thus allowing seamless integration into deep learning workflows. We present the current best practices in deep learning for WSI analysis, and give a step-by-step guide using the PathML framework: from annotating and pre-processing of slides, to implementing neural network architectures, to training and post-processing. PathML provides a unified framework in which deep learning methods for WSI analysis can be developed and applied, thus increasing the accessibility of an important new application of deep learning.
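For orientation, below is a minimal tiling loop over a whole-slide image of the kind described above. It uses the OpenSlide library rather than PathML's own API, and the file name, tile size, and background threshold are illustrative assumptions.

```python
# Illustrative sketch only: tiling a WSI with OpenSlide (not PathML's API).
import numpy as np
import openslide

def iter_tiles(wsi_path, tile_size=256):
    """Yield (x, y, tile) for non-overlapping tiles at full resolution."""
    slide = openslide.OpenSlide(wsi_path)
    width, height = slide.dimensions  # level-0 (full-resolution) size
    for y in range(0, height - tile_size + 1, tile_size):
        for x in range(0, width - tile_size + 1, tile_size):
            region = slide.read_region((x, y), 0, (tile_size, tile_size))
            yield x, y, np.asarray(region.convert("RGB"))

# Keep only tiles that look like tissue; H&E background is near-white.
for x, y, tile in iter_tiles("slide.svs"):  # hypothetical slide file
    if tile.mean() < 220:                   # assumed whiteness cutoff
        pass  # hand the tile to a PyTorch/TensorFlow model here
```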

Author(s):  
J. Venton ◽  
P. M. Harris ◽  
A. Sundar ◽  
N. A. S. Smith ◽  
P. J. Aston

The electrocardiogram (ECG) is a widespread diagnostic tool in healthcare and supports the diagnosis of cardiovascular disorders. Deep learning methods are successful and popular techniques for detecting indications of disorders from an ECG signal. However, there are open questions around the robustness of these methods to various factors, including physiological ECG noise. In this study, we generate clean and noisy versions of an ECG dataset before applying symmetric projection attractor reconstruction (SPAR) and scalogram image transformations. A convolutional neural network is used to classify these image transforms. For the clean ECG dataset, F1 scores for SPAR attractor and scalogram transforms were 0.70 and 0.79, respectively. Scores decreased by less than 0.05 for the noisy ECG datasets. Notably, when the network trained on clean data was used to classify the noisy datasets, performance decreases of up to 0.18 in F1 scores were seen. However, when the network trained on the noisy data was used to classify the clean dataset, the decrease was less than 0.05. We conclude that physiological ECG noise impacts classification using deep learning methods, and careful consideration should be given to the inclusion of noisy ECG signals in the training data when developing supervised networks for ECG classification. This article is part of the theme issue ‘Advanced computation in cardiovascular physiology: new challenges and opportunities’.
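As a concrete illustration of the scalogram transform described above, the sketch below converts a short synthetic ECG stand-in into a wavelet scalogram with PyWavelets; the sampling rate, wavelet, and scale range are assumptions, not the study's exact settings.

```python
# A minimal sketch: 1-D signal -> scalogram image via a continuous wavelet transform.
import numpy as np
import pywt

fs = 360                                        # assumed sampling rate (Hz)
t = np.arange(0, 2, 1 / fs)
ecg = np.sin(2 * np.pi * 1.2 * t)               # stand-in for a real ECG segment
noisy = ecg + 0.1 * np.random.randn(len(ecg))   # additive noise, as in the noisy dataset

scales = np.arange(1, 64)                       # assumed scale range
coeffs, _ = pywt.cwt(noisy, scales, "morl")     # shape: (len(scales), len(signal))
scalogram = np.abs(coeffs)                      # magnitude image fed to the CNN
```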


Author(s):  
Qingyi Pan ◽  
Wenbo Hu ◽  
Ning Chen

It is important yet challenging to perform accurate and interpretable time series forecasting. Though deep learning methods can boost forecasting accuracy, they often sacrifice interpretability. In this paper, we present a new scheme of series saliency to boost both accuracy and interpretability. By extracting series images from sliding windows of the time series, we design series saliency as a mixup strategy with a learnable mask between the series images and their perturbed versions. Series saliency is model-agnostic and performs as an adaptive data augmentation method for training deep models. Moreover, by slightly changing the objective, we optimize series saliency to find a mask for interpretable forecasting in both feature and time dimensions. Experimental results on several real datasets demonstrate that series saliency is effective in producing accurate time-series forecasts as well as generating temporal interpretations.
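A conceptual PyTorch sketch of the mixup-with-learnable-mask idea follows; the shapes, the Gaussian perturbation, and the frozen stand-in forecaster are all illustrative assumptions rather than the authors' implementation.

```python
# Sketch: mix a window's "series image" with a perturbed copy via a learnable mask.
import torch

window, features = 48, 8
x = torch.randn(features, window)        # series image from a sliding window
x_pert = x + 0.1 * torch.randn_like(x)   # perturbed copy (assumed Gaussian noise)
target = torch.randn(1)                  # stand-in forecast target

W = torch.randn(1, features * window) / (features * window) ** 0.5  # frozen stand-in model
mask_logits = torch.zeros(features, window, requires_grad=True)
opt = torch.optim.Adam([mask_logits], lr=0.05)

for _ in range(200):
    mask = torch.sigmoid(mask_logits)          # mask values stay in (0, 1)
    mixed = mask * x + (1 - mask) * x_pert     # mixup through the learnable mask
    pred = mixed.reshape(1, -1) @ W.t()        # forecast from the mixed image
    loss = (pred - target).pow(2).mean() + 0.01 * mask.mean()  # fit + sparsity
    opt.zero_grad()
    loss.backward()
    opt.step()

# After optimization, `mask` indicates which features and time steps mattered,
# giving the interpretation in both feature and time dimensions described above.
```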


2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Yixiang Deng ◽  
Lu Lu ◽  
Laura Aponte ◽  
Angeliki M. Angelidi ◽  
Vera Novak ◽  
...  

Accurate prediction of blood glucose variations in type 2 diabetes (T2D) will facilitate better glycemic control, decrease the occurrence of hypoglycemic episodes, and reduce the morbidity and mortality associated with T2D, hence increasing patients' quality of life. Owing to the complexity of blood glucose dynamics, it is difficult to design predictive models that are accurate in every circumstance, i.e., for hypo-, normo-, and hyperglycemic events. We developed deep-learning methods that use 30-min-long windows of patient-specific continuous glucose monitoring (CGM) measurements to predict future glucose levels over horizons of 5 min to 1 h. In general, the major challenges to address are (1) the dataset of each patient is often too small to train a patient-specific deep-learning model, and (2) the dataset is usually highly imbalanced, given that hypo- and hyperglycemic episodes are much less common than normoglycemia. We tackle these two challenges using transfer learning and data augmentation, respectively. We systematically examined three neural network architectures, different loss functions, four transfer-learning strategies, and four data augmentation techniques, including mixup and generative models. Taken together, these methodologies achieved over 95% prediction accuracy and 90% sensitivity within the clinically useful 1 h prediction horizon, which would allow a patient to react and correct hypoglycemia and/or hyperglycemia. We also demonstrated that the same network architecture and transfer-learning methods perform well on the type 1 diabetes OhioT1DM public dataset.
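As one example of the augmentation techniques listed above, here is a minimal mixup sketch for CGM windows; the window length, Beta parameter, and glucose values are illustrative assumptions.

```python
# Sketch of mixup augmentation on CGM windows and their prediction targets.
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.4):
    """Convexly combine two (window, target) pairs with a Beta-sampled weight."""
    lam = np.random.beta(alpha, alpha)
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2

# Example: two 30-min CGM histories (6 readings at 5-min intervals, mg/dL)
# with their 30-min-ahead targets; values are made up for illustration.
xa, ya = np.array([110, 112, 115, 117, 120, 124.0]), 130.0
xb, yb = np.array([80, 78, 75, 73, 70, 68.0]), 62.0
x_new, y_new = mixup(xa, ya, xb, yb)  # a synthetic training pair
```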


Author(s):  
Byron Smith ◽  
Meyke Hermsen ◽  
Elizabeth Lesser ◽  
Deepak Ravichandar ◽  
Walter Kremers

Deep learning has pushed the scope of digital pathology beyond simple digitization and telemedicine. The incorporation of these algorithms into routine workflow is on the horizon and may be a disruptive technology, reducing processing time and increasing the detection of anomalies. While the newest computational methods enjoy much of the press, incorporating deep learning into standard laboratory workflow requires many more steps than simply training and testing a model. Image analysis using deep learning methods often requires substantial pre- and post-processing in order to improve interpretation and prediction. As in any data processing pipeline, images must be prepared for modeling and the resultant predictions need further processing for interpretation. Examples include artifact detection, color normalization, image subsampling or tiling, removal of errant predictions, etc. Once processed, predictions are complicated by image file size – typically several gigabytes when unpacked. This forces images to be tiled, meaning that a series of subsamples from the whole-slide image (WSI) are used in modeling. Herein, we review many of these methods as they pertain to the analysis of biopsy slides and discuss the multitude of unique issues that are part of the analysis of very large images.
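A minimal sketch of one such pre-processing step, filtering out mostly-background tiles before modeling, is shown below; the saturation threshold and minimum tissue fraction are assumptions, not prescribed values.

```python
# Sketch: keep a tile only if enough of its pixels look stained rather than blank.
import numpy as np
from PIL import Image

def is_tissue(tile: Image.Image, sat_thresh=20, min_fraction=0.25):
    """Return True if enough pixels are saturated (i.e., likely tissue)."""
    hsv = np.asarray(tile.convert("HSV"))
    tissue_fraction = (hsv[..., 1] > sat_thresh).mean()
    return tissue_fraction >= min_fraction

tile = Image.open("tile_0012.png")  # hypothetical tile from a WSI
if is_tissue(tile):
    pass  # include this tile in the modeling set
```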


2018 ◽  
Vol 37 (6) ◽  
pp. 545-557 ◽  
Author(s):  
Xavier Roynard ◽  
Jean-Emmanuel Deschaud ◽  
François Goulette

This paper introduces a new urban point cloud dataset for automatic segmentation and classification, acquired by mobile laser scanning (MLS). We describe how the dataset was obtained, from acquisition to post-processing and labeling. The dataset can be used to train pointwise classification algorithms; moreover, because great attention has been paid to the split between the different objects, it can also be used to train object detection and segmentation. The dataset consists of around [Formula: see text] of MLS point cloud acquired in two cities. The number of points and the range of classes mean that it can be used to train deep-learning methods. In addition, we show some results of automatic segmentation and classification. The dataset is available at: http://caor-mines-paristech.fr/fr/paris-lille-3d-dataset/.
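A hedged sketch of consuming such a dataset for pointwise classification follows, assuming a simple "x y z label" text export; the actual distribution format of the dataset may differ.

```python
# Sketch: load labeled points and split them for a pointwise classifier.
import numpy as np

data = np.loadtxt("paris_lille_3d_sample.txt")  # hypothetical "x y z label" export
points, labels = data[:, :3], data[:, 3].astype(int)

# Random train/validation split over points.
rng = np.random.default_rng(0)
idx = rng.permutation(len(points))
split = int(0.8 * len(points))
train_idx, val_idx = idx[:split], idx[split:]
train_points, train_labels = points[train_idx], labels[train_idx]
```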


2021 ◽  
Vol 11 (24) ◽  
pp. 12051
Author(s):  
Gang-soo Jin ◽  
Sang-jin Oh ◽  
Yeon-seung Lee ◽  
Sung-chul Shin

Metal created by melting the base metal and welding rod during welding operations is referred to as a weld bead. The weld bead shape allows the observation of pores and defects such as cracks in the weld zone. Radiographic testing images are used to determine the quality of the weld zone. Extracting only the weld bead to determine the generative pattern of the bead can help efficiently locate defects in the weld zone. However, manual extraction of the weld bead from weld images is neither time- nor cost-effective. Efficient and rapid welding quality inspection can be conducted by automating weld bead extraction through deep learning, which also brings objectivity to the quality inspection and assessment of the weld zone in the shipbuilding and offshore plant industry. This study presents a method for detecting the weld bead shape and location from the weld zone image using image preprocessing and deep learning models, and extracting the weld bead through image post-processing. In addition, to diversify the data and improve deep learning performance, data augmentation was performed to artificially expand the image data. Contrast limited adaptive histogram equalization (CLAHE) is used as the image preprocessing method, and the bead is extracted using U-Net, a pixel-based deep learning model. Consequently, the mean intersection over union (mIoU) values are found to be 90.58% and 85.44% in the train and test experiments, respectively. The bead is successfully extracted from the radiographic testing image through post-processing.
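For reference, the CLAHE pre-processing step can be sketched in a few lines of OpenCV; the clip limit, tile grid size, and file names are illustrative defaults rather than the study's exact settings.

```python
# Sketch: contrast-limited adaptive histogram equalization on a radiograph.
import cv2

gray = cv2.imread("radiograph.png", cv2.IMREAD_GRAYSCALE)   # hypothetical file
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8)) # assumed parameters
enhanced = clahe.apply(gray)   # contrast-enhanced input for the U-Net
cv2.imwrite("radiograph_clahe.png", enhanced)
```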


2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Huanyu Liu ◽  
Jiaqi Liu ◽  
Junbao Li ◽  
Jeng-Shyang Pan ◽  
Xiaqiong Yu

Magnetic resonance imaging (MRI) is widely used in the detection and diagnosis of diseases. High-resolution MR images help doctors locate lesions and diagnose diseases. However, the acquisition of high-resolution MR images requires high magnetic field intensity and long scanning times, which bring discomfort to patients and easily introduce motion artifacts, resulting in image quality degradation. The resolution achievable through hardware imaging has therefore reached its limit. Given this situation, we propose a unified framework based on deep-learning super resolution that transfers state-of-the-art deep learning methods for natural images to MRI super resolution. Compared with traditional image super-resolution methods, deep-learning super-resolution methods have stronger feature extraction and characterization abilities, can learn prior knowledge from a large number of samples, and deliver a more stable and excellent image reconstruction effect. The proposed framework incorporates five current deep learning methods with the best super-resolution performance. In addition, a paired high/low-resolution MR image dataset at scales of ×2, ×3, and ×4 was constructed, covering four anatomical regions: skull, knee, breast, and head and neck. Experimental results show that the proposed unified framework achieves a better reconstruction effect on these data than traditional methods and provides a standard dataset and experimental benchmark for the application of deep learning super resolution to MR images.
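A minimal sketch of constructing paired low-/high-resolution training data at the stated scales is given below, assuming bicubic downsampling as the degradation model and a hypothetical image file.

```python
# Sketch: build LR/HR training pairs at x2, x3, and x4 scales.
from PIL import Image

def make_lr_hr_pair(hr: Image.Image, scale: int):
    """Downsample a high-resolution slice to create its low-resolution input."""
    w, h = hr.size
    lr = hr.resize((w // scale, h // scale), Image.BICUBIC)  # assumed degradation
    return lr, hr

hr_slice = Image.open("knee_slice.png")   # hypothetical high-resolution slice
pairs = {scale: make_lr_hr_pair(hr_slice, scale) for scale in (2, 3, 4)}
```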


2020 ◽  
Vol 71 (7) ◽  
pp. 868-880
Author(s):  
Nguyen Hong-Quan ◽  
Nguyen Thuy-Binh ◽  
Tran Duc-Long ◽  
Le Thi-Lan

Along with the strong development of camera networks, video analysis systems have become more and more popular and have been applied in various practical applications. In this paper, we focus on the person re-identification (person ReID) task, a crucial step in video analysis systems. The purpose of person ReID is to associate multiple images of a given person as they move through a non-overlapping camera network. Many efforts have been devoted to person ReID. However, most studies of person ReID deal only with well-aligned bounding boxes, which are detected manually and considered perfect inputs for person ReID. In fact, when building a fully automated person ReID system, the quality of the two preceding steps, person detection and tracking, may strongly affect person ReID performance. The contributions of this paper are twofold. First, a unified framework for person ReID based on deep learning models is proposed. In this framework, a deep neural network for person detection is coupled with a deep-learning-based tracking method. Besides, features extracted from an improved ResNet architecture are proposed for person representation to achieve higher ReID accuracy. Second, our self-built dataset is introduced and employed to evaluate all three steps of the fully automated person ReID framework.
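To illustrate the final matching step of such a pipeline, the sketch below ranks gallery embeddings against a query embedding by cosine similarity; the feature dimension and data are stand-ins, not the paper's actual model outputs.

```python
# Sketch: rank gallery identities by cosine similarity to a query embedding.
import numpy as np

def cosine_rank(query: np.ndarray, gallery: np.ndarray):
    """Return gallery indices sorted from most to least similar."""
    q = query / np.linalg.norm(query)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    return np.argsort(-(g @ q))

query = np.random.rand(2048)           # one detection's feature vector (assumed size)
gallery = np.random.rand(500, 2048)    # features of previously seen persons
ranking = cosine_rank(query, gallery)  # ranking[0] is the best match
```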


2020 ◽  
Author(s):  
Dean Sumner ◽  
Jiazhen He ◽  
Amol Thakkar ◽  
Ola Engkvist ◽  
Esben Jannik Bjerrum

SMILES randomization, a form of data augmentation, has previously been shown to increase the performance of deep learning models compared to non-augmented baselines. Here, we propose a novel data augmentation method we call “Levenshtein augmentation”, which considers local SMILES sub-sequence similarity between reactants and their respective products when creating training pairs. The performance of Levenshtein augmentation was tested using two state-of-the-art models: a transformer and a sequence-to-sequence recurrent neural network with attention. Levenshtein augmentation demonstrated increased performance over both non-augmented data and data augmented by conventional SMILES randomization when used to train the baseline models. Furthermore, Levenshtein augmentation seemingly results in what we define as ‘attentional gain’: an enhancement in the pattern recognition capabilities of the underlying network for molecular motifs.
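Since the method is named after the Levenshtein distance, a minimal implementation of that metric on SMILES strings is sketched below; how the authors use this similarity to construct training pairs is only summarized above, so the sketch shows the metric itself.

```python
# Sketch: classic dynamic-programming edit distance between two SMILES strings.
def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions, and substitutions to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]

# Example: a reactant/product pair differing by one small local edit.
print(levenshtein("CCO", "CC=O"))  # -> 1
```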


2020 ◽  
Vol 17 (3) ◽  
pp. 299-305 ◽  
Author(s):  
Riaz Ahmad ◽  
Saeeda Naz ◽  
Muhammad Afzal ◽  
Sheikh Rashid ◽  
Marcus Liwicki ◽  
...  

This paper presents a deep learning benchmark on a complex dataset known as KFUPM Handwritten Arabic TexT (KHATT). The KHATT dataset consists of complex patterns of handwritten Arabic text-lines. This paper contributes mainly in three aspects: (1) pre-processing, (2) a deep learning based approach, and (3) data augmentation. The pre-processing step includes pruning of extra white space and de-skewing of skewed text-lines. We deploy a deep learning approach based on Multi-Dimensional Long Short-Term Memory (MDLSTM) networks and Connectionist Temporal Classification (CTC). MDLSTM has the advantage of scanning the Arabic text-lines in all directions (horizontal and vertical) to cover dots, diacritics, strokes, and fine inflections. Combining data augmentation with the deep learning approach yields a promising improvement in results, raising the Character Recognition (CR) rate from a baseline of 75.08% to 80.02%.
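An illustrative sketch of the de-skewing step follows, using OpenCV's minimum-area rectangle over the ink pixels; the Otsu binarization and angle-unwrapping heuristic are assumptions, not the paper's exact procedure.

```python
# Sketch: estimate a text-line's skew angle and rotate it back to horizontal.
import cv2
import numpy as np

def deskew(gray: np.ndarray) -> np.ndarray:
    """Rotate a grayscale text-line image so its dominant axis is horizontal."""
    _, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    coords = cv2.findNonZero(binary)           # ink pixel coordinates
    angle = cv2.minAreaRect(coords)[-1]        # rectangle orientation
    if angle > 45:                             # unwrap OpenCV's angle convention
        angle -= 90
    h, w = gray.shape
    M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    return cv2.warpAffine(gray, M, (w, h), borderValue=255)
```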

