Automatic “Ground Truth” Annotation and Industrial Workpiece Dataset Generation for Deep Learning

2020 ◽  
Vol 17 (4) ◽  
pp. 539-550
Author(s):  
Fu-Qiang Liu ◽  
Zong-Yi Wang
2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Christian Crouzet ◽  
Gwangjin Jeong ◽  
Rachel H. Chae ◽  
Krystal T. LoPresti ◽  
Cody E. Dunn ◽  
...  

Abstract: Cerebral microhemorrhages (CMHs) are associated with cerebrovascular disease, cognitive impairment, and normal aging. One method to study CMHs is to analyze histological sections (5–40 μm) stained with Prussian blue. Currently, users manually and subjectively identify and quantify Prussian blue-stained regions of interest, a process that is prone to inter-individual variability and can lead to significant delays in data analysis. To improve this labor-intensive process, we developed and compared three digital pathology approaches to identify and quantify CMHs from Prussian blue-stained brain sections: (1) ratiometric analysis of RGB pixel values, (2) phasor analysis of RGB images, and (3) deep learning using a mask region-based convolutional neural network. We applied these approaches to a preclinical mouse model of inflammation-induced CMHs. One hundred CMHs were imaged using a 20× objective and an RGB color camera. To determine the ground truth, four users independently annotated Prussian blue-labeled CMHs. Compared to the ground truth, the deep learning and ratiometric approaches performed better than the phasor analysis approach. The deep learning approach was the most precise of the three methods, while the ratiometric approach was the most versatile and maintained accuracy, albeit with less precision. Our data suggest that implementing these methods to analyze CMH images can drastically increase processing speed while maintaining precision and accuracy.
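
As an illustration of the ratiometric idea, here is a minimal sketch of blue-dominance thresholding on RGB pixel values; the specific ratio and threshold are placeholder assumptions, not the values used in the study:

```python
import numpy as np

def ratiometric_cmh_mask(rgb: np.ndarray, ratio_thresh: float = 1.4) -> np.ndarray:
    """Flag pixels where the blue channel dominates red and green,
    a rough proxy for Prussian blue staining.

    rgb: H x W x 3 uint8 image; ratio_thresh is a placeholder value,
    not a threshold from the study.
    """
    rgb = rgb.astype(np.float64) + 1e-6          # keep denominators nonzero
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    ratio = b / ((r + g) / 2.0)                  # blue vs. mean of red/green
    return ratio > ratio_thresh                  # boolean CMH-candidate mask

# Example: fraction of candidate pixels in a synthetic image
img = np.random.randint(0, 256, (512, 512, 3), dtype=np.uint8)
mask = ratiometric_cmh_mask(img)
print(f"stained fraction: {mask.mean():.4f}")
```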


2020 ◽  
Vol 22 (Supplement_3) ◽  
pp. iii359-iii359
Author(s):  
Lydia Tam ◽  
Edward Lee ◽  
Michelle Han ◽  
Jason Wright ◽  
Leo Chen ◽  
...  

Abstract BACKGROUND: Brain tumors are the most common solid malignancies in childhood, many of which develop in the posterior fossa (PF). Manual tumor measurements are frequently required to optimize registration into surgical navigation systems or for surveillance of nonresectable tumors after therapy. With recent advances in artificial intelligence (AI), automated MRI-based tumor segmentation is now feasible without requiring manual measurements. Our goal was to create a deep learning model for automated PF tumor segmentation that can register into navigation systems and provide volume output. METHODS: 720 pre-surgical MRI scans from five pediatric centers were divided into training, validation, and testing datasets. The study cohort comprised four PF tumor types: medulloblastoma, diffuse midline glioma, ependymoma, and brainstem or cerebellar pilocytic astrocytoma. Manual segmentation of the tumors by an attending neuroradiologist served as “ground truth” labels for model training and evaluation. We used a 2D U-Net, an encoder-decoder convolutional neural network architecture, with a pre-trained ResNet50 encoder. We assessed tumor segmentation accuracy on a held-out test set using the Dice similarity coefficient (0–1) and compared tumor volume calculations between manual and model-derived segmentations using linear regression. RESULTS: Compared to the ground truth expert human segmentation, the overall Dice score for model performance was 0.83 for automatic delineation of the 4 tumor types. CONCLUSIONS: In this multi-institutional study, we present a deep learning algorithm that automatically delineates PF tumors and outputs volumetric information. Our results demonstrate applied AI that is clinically applicable, potentially augmenting radiologists, neuro-oncologists, and neurosurgeons for tumor evaluation, surveillance, and surgical planning.
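
The Dice similarity coefficient used for evaluation is straightforward to compute from binary masks. A minimal sketch follows; the voxel spacing in the volume helper is a placeholder, not a value from the study:

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice similarity coefficient between two binary masks
    (0 = no overlap, 1 = perfect agreement)."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    denom = pred.sum() + truth.sum()
    return 2.0 * intersection / denom if denom else 1.0

def volume_ml(mask: np.ndarray, spacing_mm=(1.0, 1.0, 1.0)) -> float:
    """Tumor volume in ml from a 3-D binary mask, given voxel spacing in mm
    (placeholder spacing shown)."""
    return mask.astype(bool).sum() * np.prod(spacing_mm) / 1000.0
```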


2021 ◽  
pp. 1-14
Author(s):  
Waqas Yousaf ◽  
Arif Umar ◽  
Syed Hamad Shirazi ◽  
Zakir Khan ◽  
Imran Razzak ◽  
...  

Automatic logo detection and recognition is growing in importance due to the increasing requirements of intelligent document analysis and retrieval. The main challenge in logo detection is intra-class variation, generated by differences in image quality and degradation. Misclassification also occurs when a tiny logo appears in a large image alongside other objects. To address this problem, we propose Patch-CNN for logo recognition, which is trained on small patches of logos to reduce misclassification. Classification is accomplished by dividing the logo images into small patches, and a threshold is applied to drop non-logo areas according to the ground truth. The AlexNet and ResNet architectures are also used for logo detection. We propose a segmentation-free architecture for logo detection and recognition. In the literature, region proposal generation has been used to solve logo detection, but these techniques suffer in the case of tiny logos. The proposed CNN is specifically designed to extract detailed features from logo patches. The technique attains an accuracy of 0.9901 with acceptable training and testing loss on the dataset used in this work.
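
A minimal sketch of the patch-dropping step described above, assuming a binary ground-truth logo mask; the patch size, stride, and coverage threshold are illustrative choices, not the paper's:

```python
import numpy as np

def extract_logo_patches(image, gt_mask, patch=32, stride=32, min_logo_frac=0.5):
    """Slide a window over the image and keep a patch only if the
    ground-truth mask says at least min_logo_frac of its pixels belong
    to a logo; everything below the threshold is dropped as "no logo"
    area. All parameter values here are placeholders."""
    kept = []
    h, w = gt_mask.shape
    for y in range(0, h - patch + 1, stride):
        for x in range(0, w - patch + 1, stride):
            if gt_mask[y:y + patch, x:x + patch].mean() >= min_logo_frac:
                kept.append(image[y:y + patch, x:x + patch])
    return np.stack(kept) if kept else np.empty((0, patch, patch))
```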


Algorithms ◽  
2021 ◽  
Vol 14 (7) ◽  
pp. 212
Author(s):  
Youssef Skandarani ◽  
Pierre-Marc Jodoin ◽  
Alain Lalande

Deep learning methods are the de facto solutions to a multitude of medical image analysis tasks. Cardiac MRI segmentation is one such application, which, like many others, requires a large amount of annotated data so that a trained network can generalize well. Unfortunately, having a large number of images manually curated by medical experts is both slow and extremely expensive. In this paper, we set out to explore whether expert knowledge is a strict requirement for the creation of annotated data sets on which machine learning can successfully be trained. To do so, we gauged the performance of three segmentation models, namely U-Net, Attention U-Net, and ENet, trained with different loss functions on expert and non-expert ground truth for cardiac cine-MRI segmentation. Evaluation was done with classic segmentation metrics (Dice index and Hausdorff distance) as well as clinical measurements, such as the ventricular ejection fraction and the myocardial mass. The results reveal that the generalization performance of a segmentation neural network trained on non-expert ground truth data is, for all practical purposes, as good as that of one trained on expert ground truth data, particularly when the non-expert receives a decent level of training. This highlights an opportunity for the efficient and cost-effective creation of annotations for cardiac data sets.
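
To make the clinical measurements concrete, here is a minimal sketch of deriving the ejection fraction from end-diastolic and end-systolic segmentation masks; the voxel spacing is a placeholder, not a value from the study:

```python
import numpy as np

def lv_volume_ml(mask, spacing_mm):
    """Left-ventricle cavity volume (ml) from a binary cine-MRI mask."""
    return mask.astype(bool).sum() * np.prod(spacing_mm) / 1000.0

def ejection_fraction(ed_mask, es_mask, spacing_mm=(1.25, 1.25, 8.0)):
    """Ejection fraction (%) from end-diastolic and end-systolic
    segmentations; spacing_mm is an illustrative voxel size."""
    edv = lv_volume_ml(ed_mask, spacing_mm)   # end-diastolic volume
    esv = lv_volume_ml(es_mask, spacing_mm)   # end-systolic volume
    return 100.0 * (edv - esv) / edv
```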


Author(s):  
Mohamed Estai ◽  
Marc Tennant ◽  
Dieter Gebauer ◽  
Andrew Brostek ◽  
Janardhan Vignarajan ◽  
...  

Objective: This study aimed to evaluate an automated detection system to detect and classify permanent teeth on orthopantomogram (OPG) images using convolutional neural networks (CNNs). Methods: In total, 591 digital OPGs were collected from patients older than 18 years. Three qualified dentists performed individual tooth labelling on the images to generate the ground truth annotations. A three-step procedure, relying upon CNNs, was proposed for automated detection and classification of teeth. First, U-Net, a type of CNN, performed preliminary segmentation of tooth regions, detecting regions of interest (ROIs) on the panoramic images. Second, Faster R-CNN, an advanced object detection architecture, identified each tooth within the ROIs determined by the U-Net. Third, a VGG-16 architecture classified each tooth into 32 categories, and a tooth number was assigned. A total of 17,135 teeth cropped from the 591 radiographs were used to train and validate the tooth detection and tooth numbering modules. 90% of the OPG images were used for training and the remaining 10% for validation, with 10-fold cross-validation performed to measure performance. The intersection over union (IoU), F1 score, precision, and recall (i.e. sensitivity) were used as metrics to evaluate the performance of the resultant CNNs. Results: The ROI detection module had an IoU of 0.70. The tooth detection module achieved a recall of 0.99 and a precision of 0.99. The tooth numbering module had a recall, precision, and F1 score of 0.98. Conclusion: The resultant automated method achieved high performance for automated tooth detection and numbering from OPG images. Deep learning can be helpful in the automatic filing of dental charts in general dentistry and forensic medicine.
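
The IoU metric used to score the ROI detection module is simple to compute for axis-aligned boxes; a minimal sketch:

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # ~0.143
```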


Author(s):  
Ryan Lagerquist ◽  
Jebb Q. Stewart ◽  
Imme Ebert-Uphoff ◽  
Christina Kumler

Abstract: Predicting the timing and location of thunderstorms (“convection”) allows for preventive actions that can save both lives and property. We have applied U-nets, a deep-learning-based type of neural network, to forecast convection on a grid at lead times up to 120 minutes. The goal is to make skillful forecasts with only present and past satellite data as predictors. Specifically, predictors are multispectral brightness-temperature images from the Himawari-8 satellite, while targets (ground truth) are provided by weather radars in Taiwan. U-nets are becoming popular in atmospheric science due to their advantages for gridded prediction. Furthermore, we use three novel approaches to advance U-nets in atmospheric science. First, we compare three architectures (vanilla, temporal, and U-net++) and find that vanilla U-nets are best for this task. Second, we train U-nets with the fractions skill score, which is spatially aware, as the loss function. Third, because we do not have adequate ground truth over the full Himawari-8 domain, we train the U-nets with small radar-centered patches, then apply the trained U-nets to the full domain. We also find that the best predictions are given by U-nets trained with satellite data from multiple lag times, not only the present. We evaluate the U-nets in detail (by time of day, month, and geographic location) and compare them to persistence models. The U-nets outperform persistence at lead times ≥ 60 minutes, and at all lead times the U-nets provide a more realistic climatology than persistence. Our code is publicly available.
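
A minimal, non-differentiable sketch of a fractions-skill-score loss over gridded fields; in a real training loop it would be written with the framework's tensor ops so gradients flow, and the neighborhood size here is a placeholder:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def fss_loss(forecast_prob, observed, window=9):
    """Fractions-skill-score loss (1 - FSS) over a square neighborhood.
    forecast_prob and observed are 2-D arrays in [0, 1]; window is the
    neighborhood size in grid cells (an illustrative value)."""
    pf = uniform_filter(forecast_prob.astype(np.float64), size=window)
    po = uniform_filter(observed.astype(np.float64), size=window)
    mse = np.mean((pf - po) ** 2)                 # neighborhood-fraction error
    ref = np.mean(pf ** 2) + np.mean(po ** 2)     # worst-case reference
    fss = 1.0 - mse / ref if ref > 0 else 1.0
    return 1.0 - fss                              # perfect forecast -> loss 0
```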


Author(s):  
Artem Nikonorov ◽  
Maksim Petrov ◽  
Sergey Bibikov ◽  
Viktoria Kutikova ◽  
Pavel Yakimov ◽  
...  

eLife ◽  
2020 ◽  
Vol 9 ◽  
Author(s):  
Dennis Segebarth ◽  
Matthias Griebel ◽  
Nikolai Stein ◽  
Cora R von Collenberg ◽  
Corinna Martin ◽  
...  

Bioimage analysis of fluorescent labels is widely used in the life sciences. Recent advances in deep learning (DL) allow automating time-consuming manual image analysis processes based on annotated training data. However, manual annotation of fluorescent features with a low signal-to-noise ratio is somewhat subjective. Training DL models on subjective annotations may be unstable or may yield biased models. In turn, these models may be unable to reliably detect biological effects. An analysis pipeline integrating data annotation, ground truth estimation, and model training can mitigate this risk. To evaluate this integrated process, we compared different DL-based analysis approaches. With data from two model organisms (mice, zebrafish) and five laboratories, we show that ground truth estimation from multiple human annotators helps to establish objectivity in fluorescent feature annotations. Furthermore, ensembles of multiple models trained on the estimated ground truth establish reliability and validity. Our research provides guidelines for reproducible DL-based bioimage analyses.
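
A minimal sketch of ground truth estimation from multiple annotators by pixel-wise majority vote; the authors' pipeline may use a more elaborate estimator (e.g. STAPLE-like weighting), so this is only a simple stand-in:

```python
import numpy as np

def estimate_ground_truth(annotations, quorum=0.5):
    """Pixel-wise majority vote across N binary annotations (N x H x W).
    quorum is the fraction of annotators that must agree (placeholder)."""
    votes = np.mean(np.asarray(annotations, dtype=np.float64), axis=0)
    return votes > quorum

# Example with four synthetic annotators
anns = [np.random.rand(64, 64) > 0.5 for _ in range(4)]
gt = estimate_ground_truth(anns)
print(gt.shape, gt.dtype)
```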


2021 ◽  
Vol 263 (2) ◽  
pp. 4441-4445
Author(s):  
Hyunsuk Huh ◽  
Seungchul Lee

Audio data acquired at industrial manufacturing sites often include unexpected background noise. Since background noise can degrade the performance of data-driven models, it is important to remove it. Traditionally, there are two main noise-canceling techniques: Active Noise Canceling (ANC), which generates an inverted phase of the sound to be removed, and Passive Noise Canceling (PNC), which physically blocks the noise. However, these methods require large devices and are expensive. We therefore propose a deep learning-based noise canceling method, developed using an audio imaging technique and a deep learning segmentation network. The proposed model needs only the information on whether the audio contains noise or not; in other words, unlike general segmentation techniques, it does not require a pixel-wise ground truth segmentation map. We evaluate the separation using the pump sounds of the open-source MIMII dataset.
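
A minimal sketch of the "audio imaging" step, turning a waveform into a log-magnitude spectrogram that a 2-D segmentation network could consume; the FFT size and hop length are placeholders, not the paper's settings:

```python
import numpy as np

def spectrogram_image(signal, n_fft=1024, hop=256):
    """Short-time Fourier transform of a 1-D signal, returned as a
    log-magnitude array of shape (freq bins, time frames)."""
    window = np.hanning(n_fft)
    frames = [signal[i:i + n_fft] * window
              for i in range(0, len(signal) - n_fft, hop)]
    spec = np.abs(np.fft.rfft(np.array(frames), axis=1))
    return np.log1p(spec).T

audio = np.random.randn(16000)   # 1 s of synthetic noise at 16 kHz
img = spectrogram_image(audio)
print(img.shape)                 # e.g. (513, 58)
```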


Stroke ◽  
2021 ◽  
Vol 52 (Suppl_1) ◽  
Author(s):  
Yannan Yu ◽  
Soren Christensen ◽  
Yuan Xie ◽  
Enhao Gong ◽  
Maarten G Lansberg ◽  
...  

Objective: Ischemic core prediction from CT perfusion (CTP) remains inaccurate compared with the gold standard, diffusion-weighted imaging (DWI). We evaluated whether a deep learning model trained to predict the DWI lesion from MR perfusion (MRP) could facilitate ischemic core prediction on CTP. Methods: Using the multi-center CRISP cohort of acute ischemic stroke patients with CTP before thrombectomy, we included patients with major reperfusion (TICI score ≥ 2b), adequate image quality, and follow-up MRI at 3-7 days. Perfusion parameters, including Tmax, mean transit time, cerebral blood flow (CBF), and cerebral blood volume, were reconstructed by RAPID software. Core lab experts outlined the stroke lesion on the follow-up MRI. A previously trained MRI model from a separate group of patients, which used MRP parameters as input and the RAPID ischemic core on DWI as ground truth, was used as a starting point. We fine-tuned this model using CTP parameters as input and follow-up MRI as ground truth. Another model was trained from scratch with only CTP data. 5-fold cross-validation was used. The performance of the models was compared with the ischemic core (rCBF ≤ 30%) from RAPID software for identifying the presence of a large infarct (volume > 70 ml or > 100 ml). Results: 94 patients in the CRISP trial met the inclusion criteria (mean age 67 ± 15 years, 52% male, median baseline NIHSS 18, median 90-day mRS 2). Without fine-tuning, the MRI model had an agreement of 73% for infarcts > 70 ml and 69% for > 100 ml; the MRI model fine-tuned on CT improved the agreement to 77% and 73%; the CT model trained from scratch had agreements of 73% and 71%. All of the deep learning models outperformed the rCBF segmentation from RAPID, which had agreements of 51% and 64%. See Table and Figure. Conclusions: It is feasible to apply an MRP-based deep learning model to CT. Fine-tuning with CTP data further improves the predictions. All deep learning models predicted the stroke lesion after major recanalization better than thresholding approaches based on rCBF.
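
A minimal sketch of the rCBF ≤ 30% thresholding baseline that the deep learning models were compared against; the voxel volume and contralateral normalization are simplifying assumptions, not details from the study:

```python
import numpy as np

def rcbf_core_volume_ml(cbf, contralateral_median, voxel_ml=0.008, thresh=0.30):
    """Baseline ischemic-core estimate: voxels whose relative CBF is at or
    below 30% of a normal (here, contralateral median) value.
    voxel_ml is a placeholder voxel volume in millilitres."""
    rcbf = cbf / contralateral_median
    core_mask = rcbf <= thresh
    return core_mask.sum() * voxel_ml

def large_infarct(core_ml, cutoff_ml=70.0):
    """Flag a large infarct, e.g. core volume > 70 ml as in the study."""
    return core_ml > cutoff_ml

# Example on a synthetic CBF volume
cbf = np.abs(np.random.randn(64, 64, 32)) * 50.0
core = rcbf_core_volume_ml(cbf, contralateral_median=50.0)
print(core, large_infarct(core))
```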

