Fully automated contrast and non-contrast cardiac view detection in echocardiography: a multi-centre, multi-vendor study

2020 ◽  
Vol 41 (Supplement_2) ◽  
Author(s):  
S Gao ◽  
D Stojanovski ◽  
A Parker ◽  
P Marques ◽  
S Heitner ◽  
...  

Abstract Background Correctly identifying the views acquired in a 2D echocardiographic examination is paramount to the post-processing and quantification steps performed as part of most clinical workflows. In many exams, particularly in stress echocardiography, microbubble contrast is used, which greatly affects the appearance of the cardiac views. Here we present a bespoke, fully automated convolutional neural network (CNN) which identifies apical 2, 3, and 4 chamber, and short axis (SAX) views acquired with and without contrast. The CNN was tested on a completely independent, external dataset acquired in a different country from the data used to train the network. Methods Training data comprising 2D echocardiograms were taken from 1,014 subjects in a prospective multi-site, multi-vendor UK trial, with more than 17,500 frames per view. Prior to view classification model training, images were processed using standard techniques to ensure homogeneous and normalised image inputs to the training pipeline. A bespoke CNN was built using the minimum number of convolutional layers required, with batch normalisation, and with dropout to reduce overfitting. Before training, the data were split into 90% for model training (211,958 frames) and 10% for validation (23,946 frames). Image frames from any given subject were assigned entirely to either the training or the validation dataset. Further, a separate trial dataset of 240 studies acquired in the USA was used as an independent test dataset (39,401 frames). Results Figure 1 shows the confusion matrices for both the validation data (left) and the independent test data (right), with an overall accuracy of 96% and 95% for the validation and test datasets respectively. The accuracy of >99% for the non-contrast cardiac views exceeds that reported in other works. The combined datasets included images acquired across ultrasound manufacturers and models from 12 clinical sites. Conclusion We have developed a CNN capable of automatically and accurately identifying all relevant cardiac views used in “real world” echo exams, including views acquired with contrast. Use of the CNN in a routine clinical workflow could improve the efficiency of quantification steps performed after image acquisition. The CNN was tested on an independent dataset acquired in a different country from the training data and was found to perform similarly, indicating the generalisability of the model. Figure 1. Confusion matrices Funding Acknowledgement Type of funding source: Private company. Main funding source(s): Ultromics Ltd.
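
As a hedged illustration of the kind of compact classifier the abstract describes, the sketch below builds a small Keras CNN with batch normalisation and dropout for eight view classes (four views, each with and without contrast). The layer counts, input size, and class count are assumptions for illustration, not the authors' actual architecture.

```python
# Minimal sketch of a compact view-classification CNN with batch
# normalisation and dropout. Layer sizes, input resolution, and the
# eight class labels are illustrative assumptions, not the paper's model.
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 8  # A2C/A3C/A4C/SAX, each with and without contrast (assumed)

def build_view_classifier(input_shape=(128, 128, 1)):
    model = models.Sequential([
        layers.Input(shape=input_shape),
        # Three small conv blocks: conv -> batch norm -> pool
        layers.Conv2D(16, 3, padding="same", activation="relu"),
        layers.BatchNormalization(),
        layers.MaxPooling2D(),
        layers.Conv2D(32, 3, padding="same", activation="relu"),
        layers.BatchNormalization(),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, padding="same", activation="relu"),
        layers.BatchNormalization(),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dropout(0.5),  # dropout to reduce overfitting
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_view_classifier()
model.summary()
```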

2020 ◽  
pp. 15-21
Author(s):  
R. N. Kvetny ◽  
R. V. Masliy ◽  
A. M. Kyrylenko ◽  
V. V. Shcherba

The article is devoted to the study of object detection in images using neural networks. The structure of convolutional neural networks used for image processing is considered. The formation of the convolutional layer (Fig. 1), the sub-sampling layer (Fig. 2) and the fully connected layer (Fig. 3) are described in detail. An overview is given of popular high-performance convolutional neural network architectures used for object detection: R-FCN, YOLO, Faster R-CNN, SSD, and DetectNet. The basic stages of image processing by the DetectNet neural network, which is designed to detect objects in images, are discussed. NVIDIA DIGITS was used to create and train models, and several DetectNet models were trained using this environment. The parameters of the experiments (Table 1) and a comparison of the quality of the trained models (Table 2) are presented. As training and validation data, we used images from the KITTI database, which was created to support the development of self-driving systems built around embedded devices such as the Jetson TX2. KITTI's images feature several object classes, including cars and pedestrians. Model training and testing were performed on a Jetson TX2 embedded computer. Five models were trained that differed in the base learning rate parameter. The results obtained make it possible to find a compromise value of the base learning rate for quickly obtaining a model with a high mAP value. The quality of the best model obtained on the KITTI validation dataset is mAP = 57.8%.
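
The base-learning-rate sweep described above can be expressed as a small harness like the sketch below. Here `train_detectnet` and `evaluate_map` are hypothetical placeholders: in the article these steps are driven through the NVIDIA DIGITS web interface, not a script.

```python
# Hypothetical sketch of the experiment in the article: train several
# DetectNet models that differ only in base learning rate and compare
# validation mAP. train_detectnet/evaluate_map are placeholders for the
# DIGITS training and evaluation steps, which the article drives via GUI.
import random

def train_detectnet(base_lr):
    """Placeholder for a DIGITS/DetectNet training job (hypothetical)."""
    return {"base_lr": base_lr}  # stands in for a trained model handle

def evaluate_map(model):
    """Placeholder for mAP evaluation on the KITTI validation split."""
    return random.uniform(0.3, 0.6)  # dummy score for illustration only

base_learning_rates = [1e-2, 5e-3, 1e-3, 5e-4, 1e-4]  # assumed sweep values

results = {}
for lr in base_learning_rates:
    model = train_detectnet(base_lr=lr)
    results[lr] = evaluate_map(model)

# Pick the base learning rate that yields the highest validation mAP.
best_lr = max(results, key=results.get)
print(f"best base learning rate: {best_lr}, mAP = {results[best_lr]:.1%}")
```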


2021 ◽  
Author(s):  
Yash Chauhan ◽  
Prateek Singh

Coin recognition systems have a vast range of applications, from vending and slot machines to banking and management firms, which translates into a high volume of research on methods for such classification. In recent years, academic research has shifted towards a computer vision approach to sorting coins due to advances in the field of deep learning. However, most of the documented work utilizes what is known as ‘transfer learning’, in which a pre-trained model of a fixed architecture is reused as a starting point for training. While such an approach saves a lot of time and effort, the generic nature of the pre-trained model can become a bottleneck for performance on a specialized problem such as coin classification. This study develops a convolutional neural network (CNN) model from scratch and tests it against a widely used general-purpose architecture known as GoogLeNet. By comparing the performance of our model with that of GoogLeNet (documented in various previous studies), we show that a simpler, specialized architecture is better suited to the coin classification problem than a more complex general-purpose one. The model developed in this study is trained and tested on 720 and 180 images of Indian coins of different denominations, respectively. The final accuracy of the model is 91.62% on the training data and 90.55% on the validation data.
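
A from-scratch coin classifier of the kind compared against GoogLeNet here might look like the PyTorch sketch below. The directory layout, image size, and layer widths are assumptions for illustration; the abstract does not specify the architecture.

```python
# Hedged sketch of a small from-scratch coin classifier in PyTorch.
# Directory layout, image size, and layer widths are illustrative
# assumptions; the abstract does not give the actual architecture.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

tfm = transforms.Compose([transforms.Resize((96, 96)), transforms.ToTensor()])
train_ds = datasets.ImageFolder("coins/train", transform=tfm)  # ~720 images
val_ds = datasets.ImageFolder("coins/val", transform=tfm)      # ~180 images
train_dl = DataLoader(train_ds, batch_size=32, shuffle=True)
val_dl = DataLoader(val_ds, batch_size=32)

model = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(64 * 24 * 24, 128), nn.ReLU(), nn.Dropout(0.3),
    nn.Linear(128, len(train_ds.classes)),  # one logit per denomination
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(30):
    model.train()
    for x, y in train_dl:
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()
    # Report validation accuracy after each epoch.
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for x, y in val_dl:
            correct += (model(x).argmax(1) == y).sum().item()
            total += y.numel()
    print(f"epoch {epoch}: val acc = {correct / total:.2%}")
```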


2021 ◽  
Vol 23 (Supplement_6) ◽  
pp. vi135-vi136
Author(s):  
Ujjwal Baid ◽  
Sarthak Pati ◽  
Siddhesh Thakur ◽  
Brandon Edwards ◽  
Micah Sheller ◽  
...  

Abstract PURPOSE Robustness and generalizability of artificial intelligence (AI) methods are reliant on the size and diversity of the training data, which are currently hindered in multi-institutional healthcare collaborations by data ownership and legal concerns. To address these, we introduce the Federated Tumor Segmentation (FeTS) Initiative, an international consortium using federated learning (FL) for data-private multi-institutional collaborations, in which AI models leverage data at participating institutions without sharing data between them. The initial FeTS use-case focused on detecting brain tumor boundaries in MRI. METHODS The FeTS tool incorporates: 1) MRI pre-processing, including image registration and brain extraction; 2) automatic delineation of tumor sub-regions, by label fusion of pretrained top-performing BraTS methods; 3) tools for manual delineation refinements; 4) model training. 55 international institutions identified local retrospective cohorts of glioblastoma patients. Ground truth was generated using the first 3 FeTS functionality modes mentioned above. Finally, the FL training mode comprises i) an AI model trained on local data, ii) local model updates shared with an aggregator, which iii) combines updates from all collaborators to generate a consensus model, and iv) circulates the consensus model back to all collaborators for iterative performance improvements. RESULTS The first FeTS consensus model, from 23 institutions with data of 2,200 patients, showed an average improvement of 11.1% in the performance of the model on each collaborator's validation data, when compared to a model trained on the publicly available BraTS data (n=231). CONCLUSION Our findings support the notion that an increase in data alone leads to AI performance improvements without any algorithmic development, indicating that model performance would improve further when trained with all 55 collaborating institutions. FL enables AI model training with knowledge from the data of geographically distinct collaborators, without ever having to share any data, hence overcoming hurdles relating to legal, ownership, and technical concerns of data sharing.
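
The aggregation step described in the abstract (steps ii–iv) is, in its simplest form, federated averaging. The NumPy sketch below shows a data-size-weighted average of collaborators' model weights; it is a schematic of the general technique, not the FeTS platform's actual aggregation code.

```python
# Schematic federated-averaging round: each collaborator trains locally,
# shares only weight updates, and an aggregator combines them, weighted
# by local dataset size. This illustrates the general FL technique, not
# the actual FeTS implementation.
import numpy as np

def aggregate(local_weights, local_sizes):
    """Weighted average of collaborators' model weights.

    local_weights: one list of np.ndarray layers per collaborator
    local_sizes:   number of training cases at each collaborator
    """
    total = float(sum(local_sizes))
    consensus = []
    for layer_idx in range(len(local_weights[0])):
        layer = sum(w[layer_idx] * (n / total)
                    for w, n in zip(local_weights, local_sizes))
        consensus.append(layer)
    return consensus

# Toy example: three collaborators, one weight matrix each.
rng = np.random.default_rng(0)
weights = [[rng.normal(size=(4, 4))] for _ in range(3)]
sizes = [120, 300, 80]  # local cohort sizes (toy numbers)
consensus_model = aggregate(weights, sizes)
print(consensus_model[0].shape)  # (4, 4): circulated back to collaborators
```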


2020 ◽  
Vol 500 (2) ◽  
pp. 1633-1644
Author(s):  
Róbert Beck ◽  
István Szapudi ◽  
Heather Flewelling ◽  
Conrad Holmberg ◽  
Eugene Magnier ◽  
...  

ABSTRACT The Pan-STARRS1 (PS1) 3π survey is a comprehensive optical imaging survey of three quarters of the sky in the grizy broad-band photometric filters. We present the methodology used in assembling the source classification and photometric redshift (photo-z) catalogue for PS1 3π Data Release 1, titled Pan-STARRS1 Source Types and Redshifts with Machine learning (PS1-STRM). For both main data products, we use neural network architectures, trained on a compilation of public spectroscopic measurements that has been cross-matched with PS1 sources. We quantify the parameter space coverage of our training data set, and flag extrapolation using self-organizing maps. We perform a Monte Carlo sampling of the photometry to estimate photo-z uncertainty. The final catalogue contains 2,902,054,648 objects. On our validation data set, for non-extrapolated sources, we achieve an overall classification accuracy of 98.1 per cent for galaxies, 97.8 per cent for stars, and 96.6 per cent for quasars. Regarding the galaxy photo-z estimation, we attain an overall bias of ⟨Δz_norm⟩ = 0.0005, a standard deviation of σ(Δz_norm) = 0.0322, a median absolute deviation of MAD(Δz_norm) = 0.0161, and an outlier fraction of P(|Δz_norm| > 0.15) = 1.89 per cent. The catalogue will be made available as a high-level science product via the Mikulski Archive for Space Telescopes.
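
The Monte Carlo photo-z uncertainty estimate mentioned above can be sketched as follows: perturb each source's magnitudes by Gaussian noise scaled to the reported photometric errors, re-run the network, and take spread statistics. Here `predict_photoz` is a hypothetical stand-in for the trained network; this is not the PS1-STRM pipeline code.

```python
# Sketch of Monte Carlo photo-z uncertainty estimation: resample the
# grizy photometry within its errors and look at the spread of the
# network's predictions. predict_photoz is a hypothetical stand-in
# for the trained model; this is not the PS1-STRM pipeline code.
import numpy as np

def predict_photoz(mags):
    """Placeholder for the trained photo-z network (hypothetical)."""
    return 0.1 * mags.sum(axis=-1) / mags.shape[-1]  # dummy response

def photoz_with_uncertainty(mags, mag_errs, n_samples=100, seed=0):
    rng = np.random.default_rng(seed)
    # Draw n_samples noisy realisations of the grizy magnitudes.
    samples = rng.normal(loc=mags, scale=mag_errs,
                         size=(n_samples,) + mags.shape)
    z = predict_photoz(samples)
    return z.mean(axis=0), z.std(axis=0)

mags = np.array([21.3, 20.9, 20.7, 20.6, 20.5])   # toy grizy magnitudes
errs = np.array([0.05, 0.03, 0.03, 0.04, 0.08])   # toy 1-sigma errors
z_mean, z_std = photoz_with_uncertainty(mags, errs)
print(f"photo-z = {z_mean:.3f} +/- {z_std:.3f}")
```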


Geophysics ◽  
2001 ◽  
Vol 66 (1) ◽  
pp. 220-236 ◽  
Author(s):  
Daniel P. Hampson ◽  
James S. Schuelke ◽  
John A. Quirein

We describe a new method for predicting well‐log properties from seismic data. The analysis data consist of a series of target logs from wells which tie a 3-D seismic volume. The target logs theoretically may be of any type; however, the greatest success to date has been in predicting porosity logs. From the 3-D seismic volume a series of sample‐based attributes is calculated. The objective is to derive a multiattribute transform, which is a linear or nonlinear transform between a subset of the attributes and the target log values. The selected subset is determined by a process of forward stepwise regression, which derives increasingly larger subsets of attributes. An extension of conventional crossplotting involves the use of a convolutional operator to resolve frequency differences between the target logs and the seismic data. In the linear mode, the transform consists of a series of weights derived by least‐squares minimization. In the nonlinear mode, a neural network is trained, using the selected attributes as inputs. Two types of neural networks have been evaluated: the multilayer feedforward network (MLFN) and the probabilistic neural network (PNN). Because of its mathematical simplicity, the PNN appears to be the network of choice. To estimate the reliability of the derived multiattribute transform, crossvalidation is used. In this process, each well is systematically removed from the training set, and the transform is rederived from the remaining wells. The prediction error for the hidden well is then calculated. The validation error, which is the average error for all hidden wells, is used as a measure of the likely prediction error when the transform is applied to the seismic volume. The method is applied to two real data sets. In each case, we see a continuous improvement in predictive power as we progress from single‐attribute regression to linear multiattribute prediction to neural network prediction. This improvement is evident not only on the training data but, more importantly, on the validation data. In addition, the neural network shows a significant improvement in resolution over that from linear regression.
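
The forward stepwise attribute selection with leave-one-well-out cross-validation described above can be sketched as below. The attribute matrix, target log, and well labels are toy stand-ins, and scikit-learn's linear regression stands in for the paper's least-squares weight derivation.

```python
# Sketch of forward stepwise attribute selection with leave-one-well-out
# cross-validation, in the spirit of the multiattribute transform the
# paper describes. LinearRegression stands in for the least-squares
# weight derivation; the data below are assumed toy inputs.
import numpy as np
from sklearn.linear_model import LinearRegression

def loo_well_error(X, y, wells, attrs):
    """Average prediction error when each well is hidden in turn."""
    errors = []
    for well in np.unique(wells):
        train, test = wells != well, wells == well
        model = LinearRegression().fit(X[train][:, attrs], y[train])
        pred = model.predict(X[test][:, attrs])
        errors.append(np.sqrt(np.mean((pred - y[test]) ** 2)))
    return np.mean(errors)

def forward_stepwise(X, y, wells, max_attrs=4):
    selected = []
    for _ in range(max_attrs):
        remaining = [a for a in range(X.shape[1]) if a not in selected]
        # Add whichever attribute most reduces the validation error.
        best = min(remaining,
                   key=lambda a: loo_well_error(X, y, wells, selected + [a]))
        selected.append(best)
        print(f"attributes {selected}: "
              f"validation error {loo_well_error(X, y, wells, selected):.3f}")
    return selected

# Toy data: 8 wells, 200 samples, 10 candidate seismic attributes.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
y = 2 * X[:, 3] - X[:, 7] + 0.1 * rng.normal(size=200)  # porosity proxy
wells = rng.integers(0, 8, size=200)
forward_stepwise(X, y, wells)
```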


2020 ◽  
Vol 41 (Supplement_2) ◽  
Author(s):  
I Korsakov ◽  
D Gavrilov ◽  
L Serova ◽  
A Gusev ◽  
R Novitskiy ◽  
...  

Abstract Background Tools for predicting the individual risk of developing cardiovascular diseases (CVDs) and their complications using machine learning methods have proven to have better prognostic value than commonly used scales (e.g., Framingham, SCORE). To create such methods, long-term accumulation of a large amount of high-quality data is required. Moreover, to improve the accuracy of models, it is necessary to take into account regional characteristics that affect health: ethnicity, nutrition, climatic conditions, living standards, and medical care. These regional characteristics could significantly affect the development and outcomes of CVDs. However, the amount of regional data is not enough to build a high-quality model. Therefore, it is proposed to create models based on publicly available data and to validate and calibrate them on regional medical data. Methods Two models were trained using data from the Framingham study. Model 1 was trained on data from 2,588 patients and predicts a 10-year CVD probability from the following risk factors: age, gender, cholesterol, HDL, smoking, SBP, and BP medications. Model 2 was trained on data from 4,363 patients and predicts a 10-year probability of death from CVD from the following criteria: age, gender, cholesterol, smoking, SBP, BMI, and heart rate. To retrain the obtained models, we used a dataset created from data on patients in the northwestern part of Russia. The dataset consists of 438 patients and includes the features used in the trained models, as well as CVD events and deaths from CVD during a 10-year follow-up. Evaluation We used randomized data splitting, dividing the dataset into a training and a test set in an 80/20 proportion. The models were implemented with a Keras convolutional neural network (CNN) using 3 hidden layers. For validation, 10-fold cross-validation was used. Results We compared the initial model metrics and those obtained after retraining on local data. The accuracy of model 1 before retraining is 78%, after – 81.3%; the area under the ROC curve (AUC) before retraining: 0.77 (95% CI: 0.72–0.82), after – 0.803. The accuracy of model 2 before retraining is 79%, after – 85.6%; the area under the ROC curve (AUC) before retraining: 0.78 (95% CI: 0.72–0.82), after – 0.828. Conclusion Using this method of retraining predictive models, we can take into account local characteristics of the population and significantly increase the accuracy of predicting events, and we can expand the population in which the model can be used according to local characteristics. Funding Acknowledgement Type of funding source: Private company. Main funding source(s): OOO K-SkAI
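
A hedged sketch of the retraining setup: a small Keras network with 3 hidden layers is first fit on Framingham-style features, then fine-tuned on a local cohort with an 80/20 split. The layer widths, learning rates, and toy data are assumptions; the abstract describes the network as convolutional, whereas a dense network is used here as a stand-in for the tabular risk factors.

```python
# Hedged sketch of training on public data, then retraining (fine-tuning)
# on a smaller local cohort with an 80/20 split. Layer widths, learning
# rates, and the random data are illustrative assumptions; a dense
# network stands in for the abstract's Keras model.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models
from sklearn.model_selection import train_test_split

N_FEATURES = 7  # e.g. age, gender, cholesterol, HDL, smoking, SBP, BP meds

def build_model():
    model = models.Sequential([
        layers.Input(shape=(N_FEATURES,)),
        layers.Dense(32, activation="relu"),    # 3 hidden layers
        layers.Dense(16, activation="relu"),
        layers.Dense(8, activation="relu"),
        layers.Dense(1, activation="sigmoid"),  # 10-year event probability
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
                  loss="binary_crossentropy",
                  metrics=[tf.keras.metrics.AUC(name="auc"), "accuracy"])
    return model

rng = np.random.default_rng(0)
# Toy stand-ins for the public (Framingham) and local (regional) cohorts.
X_pub, y_pub = rng.normal(size=(2588, N_FEATURES)), rng.integers(0, 2, 2588)
X_loc, y_loc = rng.normal(size=(438, N_FEATURES)), rng.integers(0, 2, 438)

model = build_model()
model.fit(X_pub, y_pub, epochs=20, batch_size=64, verbose=0)

# Retrain on local data: 80/20 split, lower learning rate for fine-tuning.
X_tr, X_te, y_tr, y_te = train_test_split(X_loc, y_loc, test_size=0.2,
                                          random_state=0)
model.optimizer.learning_rate.assign(1e-4)
model.fit(X_tr, y_tr, epochs=20, batch_size=32, verbose=0)
print(model.evaluate(X_te, y_te, return_dict=True))
```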


2021 ◽  
Author(s):  
Vahid Gholami ◽  
Hossein Sahour

Abstract Groundwater drawdown is typically measured using pumping tests and field experiments; however, these traditional methods are time-consuming and costly when applied to extensive areas. In this research, a methodology based on artificial neural networks (ANNs) and field measurements is introduced for an alluvial aquifer in the north of Iran. First, the annual drawdown, the output of the ANN models, was measured in 250 piezometric wells, and the data were divided into three categories: training data, cross-validation data, and test data. Then, the factors affecting groundwater drawdown, including groundwater depth, annual precipitation, annual evaporation, the transmissivity of the aquifer formation, elevation, distance from the sea, distance from water sources (recharge), population density, and groundwater extraction within the influence radius of each well (1,000 m), were identified and used as the inputs of the ANN models. Several ANN methods were evaluated, and the predictions were compared with the observations. Results show that the modular neural network (MNN) had the highest performance in modeling groundwater drawdown (training R² = 0.96, test R² = 0.81). The optimum network was fitted to the available input data to map the annual drawdown across the entire aquifer. The accuracy assessment of the final map yielded favorable results (R² = 0.8). The adopted methodology can be applied for the prediction of groundwater drawdown in the study site and in similar settings elsewhere.
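
The workflow (nine input factors, train/validation/test split, R² scoring) can be sketched with scikit-learn as below. An MLP regressor stands in for the modular neural network named in the abstract, and the data are random stand-ins for the 250 wells.

```python
# Sketch of the drawdown-modelling workflow: nine input factors, a
# train/validation/test split, and R-squared scoring. MLPRegressor
# stands in for the modular neural network used in the study; the
# data below are random stand-ins for the 250 piezometric wells.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

FACTORS = ["gw_depth", "precip", "evap", "transmissivity", "elevation",
           "dist_sea", "dist_recharge", "pop_density", "extraction"]

rng = np.random.default_rng(0)
X = rng.normal(size=(250, len(FACTORS)))            # 250 piezometric wells
y = X @ rng.normal(size=len(FACTORS)) + 0.1 * rng.normal(size=250)

# Hold out a test set, then carve a validation set from the remainder.
X_tmp, X_test, y_tmp, y_test = train_test_split(X, y, test_size=0.2,
                                                random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_tmp, y_tmp,
                                                  test_size=0.25,
                                                  random_state=0)

model = make_pipeline(StandardScaler(),
                      MLPRegressor(hidden_layer_sizes=(32, 16),
                                   max_iter=2000, random_state=0))
model.fit(X_train, y_train)
print(f"validation R^2: {model.score(X_val, y_val):.2f}")
print(f"test R^2:       {model.score(X_test, y_test):.2f}")
```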


Author(s):  
D. Griffiths ◽  
J. Boehm

Abstract. Recent developments in the field of deep learning for 3D data have demonstrated promising potential for end-to-end learning directly from point clouds. However, many real-world point clouds contain a large class imbalance, reflecting the imbalance naturally observed in the world. For example, a 3D scan of an urban environment will consist mostly of road and façade, whereas other objects such as poles will be under-represented. In this paper we address this issue by employing a weighted augmentation to increase classes that contain fewer points. By mitigating the class imbalance present in the data, we demonstrate that a standard PointNet++ deep neural network can achieve higher performance at inference on validation data. This was observed as an increase in F1 score of 19% and 25% on two benchmark test datasets, ScanNet and Semantic3D respectively, relative to training with no class-imbalance pre-processing. Our networks performed better on both highly represented and under-represented classes, which indicates that the network learns more robust and meaningful features when the loss function is not overly exposed to only a few classes.
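
A minimal sketch of the weighted-augmentation idea: compute per-class point counts, then oversample (with small jitter) points of under-represented classes toward the size of the largest class. The sampling scheme and jitter scale are assumptions, not the paper's exact procedure.

```python
# Minimal sketch of weighted augmentation for class-imbalanced point
# clouds: under-represented classes are oversampled with small random
# jitter so the loss sees them more often. The balancing target and
# jitter scale are illustrative assumptions.
import numpy as np

def weighted_augment(points, labels, jitter=0.01, seed=0):
    """points: (N, 3) xyz array; labels: (N,) integer class per point."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(labels, return_counts=True)
    target = counts.max()  # duplicate rare classes up to the largest class
    extra_pts, extra_lbl = [], []
    for cls, count in zip(classes, counts):
        n_extra = target - count
        if n_extra <= 0:
            continue  # already well represented (e.g. road, facade)
        idx = rng.choice(np.flatnonzero(labels == cls), size=n_extra)
        extra_pts.append(points[idx] + rng.normal(0, jitter, (n_extra, 3)))
        extra_lbl.append(np.full(n_extra, cls))
    if extra_pts:
        points = np.vstack([points] + extra_pts)
        labels = np.concatenate([labels] + extra_lbl)
    return points, labels

# Toy scene: 10,000 road points, 60 pole points.
pts = np.random.rand(10060, 3)
lbl = np.array([0] * 10000 + [1] * 60)
aug_pts, aug_lbl = weighted_augment(pts, lbl)
print(np.bincount(aug_lbl))  # both classes now at ~10,000 points
```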


2020 ◽  
Vol 13 (6) ◽  
pp. 2631-2644 ◽  
Author(s):  
Georgy Ayzel ◽  
Tobias Scheffer ◽  
Maik Heistermann

Abstract. In this study, we present RainNet, a deep convolutional neural network for radar-based precipitation nowcasting. Its design was inspired by the U-Net and SegNet families of deep learning models, which were originally designed for binary segmentation tasks. RainNet was trained to predict continuous precipitation intensities at a lead time of 5 min, using several years of quality-controlled weather radar composites provided by the German Weather Service (DWD). That data set covers Germany with a spatial domain of 900 km × 900 km and has a resolution of 1 km in space and 5 min in time. Independent verification experiments were carried out on 11 summer precipitation events from 2016 to 2017. In order to achieve a lead time of 1 h, a recursive approach was implemented by using RainNet predictions at 5 min lead times as model inputs for longer lead times. In the verification experiments, trivial Eulerian persistence and a conventional model based on optical flow served as benchmarks. The latter is available in the rainymotion library and had previously been shown to outperform DWD's operational nowcasting model for the same set of verification events. RainNet significantly outperforms the benchmark models at all lead times up to 60 min for the routine verification metrics mean absolute error (MAE) and the critical success index (CSI) at intensity thresholds of 0.125, 1, and 5 mm h⁻¹. However, rainymotion turned out to be superior in predicting the exceedance of higher intensity thresholds (here 10 and 15 mm h⁻¹). The limited ability of RainNet to predict heavy rainfall intensities is an undesirable property which we attribute to a high level of spatial smoothing introduced by the model. At a lead time of 5 min, an analysis of power spectral density confirmed a significant loss of spectral power at length scales of 16 km and below. Obviously, RainNet had learned an optimal level of smoothing to produce a nowcast at 5 min lead time. In that sense, the loss of spectral power at small scales is informative, too, as it reflects the limits of predictability as a function of spatial scale. Beyond the lead time of 5 min, however, the increasing level of smoothing is a mere artifact – an analogue to numerical diffusion – that is not a property of RainNet itself but of its recursive application. In the context of early warning, the smoothing is particularly unfavorable since pronounced features of intense precipitation tend to get lost over longer lead times. Hence, we propose several options to address this issue in prospective research, including an adjustment of the loss function for model training, model training for longer lead times, and the prediction of threshold exceedance in terms of a binary segmentation task. Furthermore, we suggest additional input data that could help to better identify situations with imminent precipitation dynamics. The model code, pretrained weights, and training data are provided in open repositories as an input for such future studies.
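
The recursive approach to reaching a 60 min lead time can be sketched as follows: each 5 min prediction is appended to the input history and fed back in. Here `rainnet_predict` is a hypothetical stand-in for the trained network, and the assumption that it maps the four most recent composites to the next one is illustrative (the abstract does not specify the input depth).

```python
# Sketch of the recursive application of a 5-minute nowcasting model to
# reach a 60-minute lead time: each prediction is appended to the input
# history and fed back in. rainnet_predict is a hypothetical stand-in
# for the trained network; the 4-frame input history is an assumption.
import numpy as np

def rainnet_predict(history):
    """Placeholder for the trained model: (T, H, W) -> (H, W)."""
    return history[-1]  # dummy: persistence of the latest frame

def recursive_nowcast(history, lead_steps=12):
    """Roll the model forward lead_steps x 5 min (12 steps -> 60 min)."""
    frames = list(history)
    nowcasts = []
    for _ in range(lead_steps):
        pred = rainnet_predict(np.stack(frames[-4:]))  # last 4 frames
        nowcasts.append(pred)
        frames.append(pred)  # feed the prediction back as input
    return np.stack(nowcasts)

radar_history = np.random.rand(4, 900, 900)  # toy 1 km grid over Germany
forecast = recursive_nowcast(radar_history)
print(forecast.shape)  # (12, 900, 900): lead times +5 ... +60 min
```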


2022 ◽  
pp. 1-17
Author(s):  
Saleh Albahli ◽  
Ghulam Nabi Ahmad Hassan Yar

Diabetic retinopathy is an eye disorder in which high blood sugar levels from diabetes mellitus damage the retina, and it may eventually lead to macular edema. The objective of this study is to design and compare several deep learning models that detect the severity of diabetic retinopathy, determine the risk of progression to macular edema, and segment different types of disease patterns using retina images. The Indian Diabetic Retinopathy Image Dataset (IDRiD) was used for disease grading and segmentation. Since the images of the dataset vary in brightness and contrast, we employed three techniques for generating processed images from the originals: brightness, color, and contrast (BCC) enhancement; color jitter (CJ); and contrast limited adaptive histogram equalization (CLAHE). After image preprocessing, we applied pre-trained ResNet50, VGG16, and VGG19 models to these differently preprocessed images, both to determine the severity of the retinopathy and to assess the chances of macular edema. UNet was also applied to segment the different types of disease patterns. To train and test these models, the image dataset was divided into training, testing, and validation data at 70%, 20%, and 10% ratios, respectively. During model training, data augmentation was also applied to increase the number of training images. Study results show that, for detecting the severity of retinopathy and macular edema, ResNet50 showed the best accuracy using BCC and original images, with accuracies of 60.2% and 82.5%, respectively, on the validation dataset. In segmenting the different types of disease patterns, UNet yielded the highest testing accuracies of 65.22% and 91.09% for microaneurysms and hard exudates using BCC images, 84.83% for the optic disc using CJ images, and 59.35% and 89.69% for hemorrhages and soft exudates using CLAHE images, respectively. Thus, image preprocessing can play an important role in improving the efficacy and performance of deep learning models.
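
Of the three preprocessing variants, CLAHE is the most code-specific. A common recipe, sketched below, applies OpenCV's CLAHE to the lightness channel in LAB colour space; the clip limit, tile grid, and file paths are assumptions here, since the paper does not give its parameters.

```python
# Sketch of CLAHE preprocessing for retina images with OpenCV. Applying
# CLAHE to the L channel of LAB colour space is a common recipe; the
# clip limit, tile grid, and file paths are illustrative assumptions.
import cv2

def clahe_enhance(path, clip_limit=2.0, tile_grid=(8, 8)):
    bgr = cv2.imread(path)                        # OpenCV loads as BGR
    lab = cv2.cvtColor(bgr, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    clahe = cv2.createCLAHE(clipLimit=clip_limit, tileGridSize=tile_grid)
    l_eq = clahe.apply(l)                         # equalise lightness only
    return cv2.cvtColor(cv2.merge((l_eq, a, b)), cv2.COLOR_LAB2BGR)

enhanced = clahe_enhance("idrid/images/IDRiD_01.jpg")  # hypothetical path
cv2.imwrite("idrid/clahe/IDRiD_01.jpg", enhanced)
```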

