Machine Learning Driven Contouring of High-Frequency Four-Dimensional Cardiac Ultrasound Data

2021 ◽  
Vol 11 (4) ◽  
pp. 1690
Author(s):  
Frederick W. Damen ◽  
David T. Newton ◽  
Guang Lin ◽  
Craig J. Goergen

Automatic boundary detection of 4D ultrasound (4DUS) cardiac data is a promising yet challenging application at the intersection of machine learning and medicine. Using recently developed murine 4DUS cardiac imaging data, we demonstrate here a set of three machine learning models that predict left ventricular wall kinematics along both the endo- and epicardial boundaries. Each model is fundamentally built on three key features: (1) the projection of raw US data onto a lower-dimensional subspace, (2) a smoothing spline basis across time, and (3) a strategic parameterization of the left ventricular boundaries. Model 1 is constructed such that boundary predictions are based on individual short-axis images, regardless of their relative position in the ventricle. Model 2 simultaneously incorporates parallel short-axis image data into its predictions. Model 3 builds on the multi-slice approach of model 2, but assists predictions with a single ground-truth position at end-diastole. Monte Carlo cross-validation was used to assess the performance of each model on unseen data. For predicting the radial distance of the endocardium, models 1, 2, and 3 yielded average R2 values of 0.41, 0.49, and 0.71, respectively. Monte Carlo simulations of the endocardial wall showed significantly closer predictions when using model 2 versus model 1 at a rate of 48.67%, and when using model 3 versus model 2 at a rate of 83.50%. These findings suggest that a machine learning approach in which multi-slice data are used simultaneously as input and predictions are aided by a single user input yields the most robust performance. Subsequently, we explore how metrics of cardiac kinematics compare between ground-truth contours and predicted boundaries. We observed negligible deviations from ground truth when using predicted boundaries alone, except in the case of early diastolic strain rate, providing confidence in the use of such machine learning models for rapid and reliable assessments of murine cardiac function. To our knowledge, this is the first application of machine learning to murine left ventricular 4DUS data. Future work will be needed to strengthen both model performance and applicability to different cardiac disease models.
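A minimal sketch of how those three ingredients could fit together, assuming hypothetical inputs (flattened short-axis frames, ground-truth radial distances sampled at fixed angles about the LV center, and frame timestamps); the PCA/ridge/spline choices are illustrative stand-ins, not the paper's exact models:

```python
# Sketch of the three ingredients: (1) projection of raw ultrasound frames
# onto a lower-dimensional subspace, (2) a smoothing spline across the
# cardiac cycle, (3) a radial parameterization of the LV boundary.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import Ridge
from scipy.interpolate import UnivariateSpline

def fit_boundary_model(frames, radii, n_components=50):
    """frames: (n_frames, n_pixels) flattened short-axis images.
    radii: (n_frames, n_angles) ground-truth radial distances of the
    endocardial boundary, sampled at fixed angles about the LV center."""
    pca = PCA(n_components=n_components).fit(frames)
    X = pca.transform(frames)                # low-dimensional subspace
    reg = Ridge(alpha=1.0).fit(X, radii)     # one output per boundary angle
    return pca, reg

def predict_smooth(pca, reg, frames, times, smooth=1.0):
    """Predict radii per frame, then smooth each angle's trajectory in time.
    `times` must be strictly increasing frame timestamps."""
    raw = reg.predict(pca.transform(frames))  # (n_frames, n_angles)
    return np.column_stack([
        UnivariateSpline(times, raw[:, j], s=smooth)(times)
        for j in range(raw.shape[1])
    ])
```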

2019 ◽  
Vol 40 (Supplement_1) ◽  
Author(s):  
G Sng ◽  
D Y Z Lim ◽  
C H Sia ◽  
J S W Lee ◽  
X Y Shen ◽  
...  

Abstract
Background/Introduction: Classic electrocardiographic (ECG) criteria for left ventricular hypertrophy (LVH) have been well studied in Western populations, particularly in hypertensive patients. However, their utility in Asian populations is not well studied, and their applicability to young pre-participation cohorts is unclear.
Aims: We sought to evaluate the performance of classical criteria against that of novel machine learning models in the identification of LVH.
Methodology: Between November 2009 and December 2014, pre-participation screening ECGs and subsequent echocardiographic data were collected from 13,954 males aged 16 to 22 who reported for medical screening prior to military conscription. The final diagnosis of LVH was made on echocardiography, with LVH defined as a left ventricular mass index >115 g/m2. The continuous and binary forms of the classical criteria were compared against machine learning models using receiver-operating characteristic (ROC) curve analysis. An 80:20 split was used to divide the data into training and test sets for the machine learning models, and three-fold cross-validation was used in training the models. We also compared the important variables identified by the machine learning models with the input variables of the classical criteria.
Results: The prevalence of echocardiographic LVH in this population was 0.91% (127 cases). Classical ECG criteria had poor performance in predicting LVH, with the best predictions achieved by the continuous Sokolow-Lyon (AUC = 0.63, 95% CI = 0.58–0.68) and the continuous Modified Cornell (AUC = 0.63, 95% CI = 0.58–0.68). Machine learning methods achieved superior performance: Random Forest (AUC = 0.74, 95% CI = 0.66–0.82), Gradient Boosting Machines (AUC = 0.70, 95% CI = 0.61–0.79), and GLMNet (AUC = 0.78, 95% CI = 0.70–0.86). Novel and less recognized ECG parameters identified by the machine learning models as being predictive of LVH included mean QT interval, mean QRS interval, R in V4, and R in I.
[Figure: ROC curves of the models studied]
Conclusion: The prevalence of LVH in our population is lower than that previously reported in other similar populations. Classical ECG criteria perform poorly in this context. Machine learning methods show superior predictive performance and identify non-traditional predictors of LVH from ECG data. Further research is required to improve the predictive ability of machine learning models, and to understand the underlying pathology of the novel ECG predictors identified.
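For illustration, a hedged sketch of the evaluation pipeline described above, using scikit-learn's elastic-net logistic regression as an analogue of GLMNet; the file name and ECG column names (S_V1, R_V5, lvh) are hypothetical, not the study's schema:

```python
# Compare a continuous classical criterion against an elastic-net model
# on a stratified 80:20 split, tuning with three-fold CV as in the text.
import pandas as pd
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

df = pd.read_csv("ecg_echo.csv")               # hypothetical dataset
X, y = df.drop(columns="lvh"), df["lvh"]

# Stratify because LVH prevalence is under 1%
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Continuous Sokolow-Lyon voltage: S in V1 + R in V5 (or V6)
sokolow = X_te["S_V1"] + X_te["R_V5"]
print("Sokolow-Lyon AUC:", roc_auc_score(y_te, sokolow))

# Elastic-net logistic regression (GLMNet analogue), three-fold CV tuning
enet = GridSearchCV(
    LogisticRegression(penalty="elasticnet", solver="saga", max_iter=5000),
    param_grid={"C": [0.01, 0.1, 1.0], "l1_ratio": [0.2, 0.5, 0.8]},
    cv=3, scoring="roc_auc").fit(X_tr, y_tr)
print("Elastic-net AUC:",
      roc_auc_score(y_te, enet.predict_proba(X_te)[:, 1]))
```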


2021 ◽  
Author(s):  
Michael Tarasiou

This paper presents DeepSatData, a pipeline for automatically generating satellite imagery datasets for training machine learning models. We also discuss design considerations, with emphasis on dense classification tasks such as semantic segmentation. The implementation presented makes use of freely available Sentinel-2 data, which allows the generation of the large-scale datasets required for training deep neural networks (DNNs). We discuss issues faced from the point of view of DNN training and evaluation, such as checking the quality of ground-truth data, and comment on the scalability of the approach.
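As a rough illustration of the kind of dataset generation such a pipeline automates, the sketch below cuts a Sentinel-2 scene into fixed-size chips for segmentation training; it assumes co-registered image and label rasters and the rasterio library, and is not DeepSatData's actual implementation:

```python
# Tile a Sentinel-2 scene and its ground-truth raster into training chips.
import numpy as np
import rasterio
from rasterio.windows import Window

def tile_scene(scene_path, label_path, size=256):
    """Yield (image_chip, label_chip) pairs from co-registered rasters."""
    with rasterio.open(scene_path) as img, rasterio.open(label_path) as lab:
        for row in range(0, img.height - size + 1, size):
            for col in range(0, img.width - size + 1, size):
                win = Window(col, row, size, size)
                chip = img.read(window=win)     # (bands, size, size)
                mask = lab.read(1, window=win)  # ground-truth class map
                if np.any(mask):                # skip chips with no labels
                    yield chip, mask
```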


Micromachines ◽  
2020 ◽  
Vol 11 (12) ◽  
pp. 1092
Author(s):  
Jae Hyuk Cho ◽  
Hayoun Lee

A computational framework using artificial intelligence (AI) has been suggested in numerous fields, such as medicine, robotics, meteorology, and chemistry. However, guidance on the specificity of each AI model, and on the relationship between data characteristics and ground truth that would allow a model to be chosen for each situation, has not been given. Since TVOCs (total volatile organic compounds) cause serious harm to human health and plants, preventing such damage by reducing their frequency of occurrence is not an optional process but an essential one in manufacturing, as well as in chemical industries and laboratories. In this study, taking into account the characteristics of machine learning techniques and ICT (information and communications technology), TVOC sensors are explored through grounded data analysis and the selection of machine learning models, and their performance is determined in real situations. For representative scenarios, considering features from an ICT semiconductor sensor and one targeting TVOC gas, we investigated suitable analysis methods and machine learning models such as LSTM (long short-term memory), GRU (gated recurrent unit), and RNN (recurrent neural network). Detailed factors for these machine learning models with respect to the concentration of TVOC gas in the atmosphere are compared with the original sensor data to obtain their accuracy. From this work, we expect to significantly minimize risk in empirical applications, i.e., maintaining homeostasis or predicting abnormal situations, so as to construct an opportune response.
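A minimal sketch of one of the compared sequence models (a GRU; swapping the layer for LSTM or SimpleRNN gives the other two) applied to windowed TVOC readings; the window length, layer sizes, and the tvoc_ppb series are illustrative assumptions, not the paper's configuration:

```python
# Predict the next TVOC concentration from a sliding window of readings.
import numpy as np
import tensorflow as tf

def make_windows(series, length=60):
    """Turn a 1-D sensor series into (window, next-value) training pairs."""
    X = np.stack([series[i:i + length] for i in range(len(series) - length)])
    y = series[length:]
    return X[..., np.newaxis], y        # shape: (samples, length, 1)

model = tf.keras.Sequential([
    tf.keras.layers.GRU(32, input_shape=(60, 1)),  # or LSTM / SimpleRNN
    tf.keras.layers.Dense(1),                      # next TVOC concentration
])
model.compile(optimizer="adam", loss="mse")

# X, y = make_windows(tvoc_ppb)       # tvoc_ppb: calibrated sensor series
# model.fit(X, y, epochs=20, validation_split=0.2)
```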


2020 ◽  
Vol 12 (7) ◽  
pp. 1225 ◽  
Author(s):  
Abdul-Lateef Balogun ◽  
Shamsudeen Temitope Yekeen ◽  
Biswajeet Pradhan ◽  
Omar F. Althuwaynee

Oil spills are a global phenomenon with impacts that cut across the socio-economic, health, and environmental dimensions of the coastal ecosystem. However, comprehensive assessment of oil spill impacts and selection of appropriate remediation approaches have been restricted by reliance on laboratory experiments, which offer limited area coverage and classification accuracy. Thus, this study utilizes multispectral Landsat 8-OLI remote sensing imagery and machine learning models to assess the impacts of oil spills on coastal vegetation and wetland, and to monitor the recovery pattern of polluted vegetation and wetland in a coastal city. The spatial extent of polluted areas was also precisely quantified for effective management of the coastal ecosystem. Using Johor, a coastal city in Malaysia, as a case study, a total of 49 oil spill (ground truth) locations, 54 non-oil-spill locations, and Landsat 8-OLI data were utilized. The ground truth points were divided into 70% training and 30% validation sets for the classification of polluted vegetation and wetland. Sixteen different indices that have been used in the literature to monitor vegetation and wetland stress were adopted for the impact and recovery analysis. To eliminate similarities in the spectral appearance of oil-spill-affected vegetation and wetland and other elements like burnt and dead vegetation, Support Vector Machine (SVM) and Random Forest (RF) machine learning models were used to classify polluted and non-polluted vegetation and wetlands. Model optimization was performed using a random search method to improve the models' performance, and accuracy assessments confirmed the effectiveness of the two machine learning models in identifying, classifying, and quantifying the areal extent of oil pollution on coastal vegetation and wetland. Considering the harmonic mean (F1), overall accuracy (OA), user's accuracy (UA), and producer's accuracy (PA), both models have high accuracies. However, the RF outperformed the SVM, with F1, OA, PA, and UA values of 95.32%, 96.80%, 98.82%, and 95.11%, respectively, while the SVM recorded values of F1 (80.83%), OA (92.87%), PA (95.18%), and UA (93.81%), highlighting 1205.98 hectares of polluted vegetation and 1205.98 hectares of polluted wetland. Analysis of the vegetation indices revealed that spilled oil had a significant impact on the vegetation and wetland, although steady recovery was observed between 2015 and 2018. This study concludes that the Chlorophyll Vegetation Index, Modified Difference Water Index, Normalized Difference Vegetation Index, and Green Chlorophyll Index are more sensitive for impact and recovery assessment of both vegetation and wetland, in addition to the Modified Normalized Difference Vegetation Index for wetlands. Thus, remote sensing and machine learning models are essential tools capable of providing accurate information for coastal oil spill impact assessment and recovery analysis for appropriate remediation initiatives.
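A hedged sketch of the classification-plus-random-search step described above, using scikit-learn; the synthetic 16-feature matrix merely echoes the sixteen indices and is not the study's Landsat data:

```python
# Random Forest pollution classifier tuned with random search, evaluated
# on a 70:30 split as in the text; the data here are synthetic placeholders.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.metrics import classification_report

X, y = make_classification(n_samples=500, n_features=16, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

param_dist = {
    "n_estimators": [100, 300, 500],
    "max_depth": [None, 10, 20, 40],
    "min_samples_leaf": [1, 2, 5],
}
search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions=param_dist,
    n_iter=20, cv=3, scoring="f1_macro", random_state=0)
search.fit(X_tr, y_tr)         # random search over the hyperparameter space
print(classification_report(y_te, search.predict(X_te)))
```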


2019 ◽  
Author(s):  
Akshay Agarwal ◽  
Gowri Nayar ◽  
James Kaufman

ABSTRACT
Computational learning methods allow researchers to make predictions, draw inferences, and automate the generation of mathematical models. These models are crucial to solving real-world problems, such as antimicrobial resistance, pathogen detection, and protein evolution. Machine learning methods depend upon ground truth data to achieve specificity and sensitivity. Since the data are limited in this case, as we will show over the course of this paper, and since the size of available data increases super-linearly, it is of paramount importance to understand the distribution of ground truth data, the analyses for which it is suited, and where it may have limitations that bias downstream learning methods. In this paper, we focus on the training data required to model antimicrobial resistance (AR). We report an analysis of bacterial biochemical assay data associated with whole genome sequencing (WGS) from the National Center for Biotechnology Information (NCBI), and discuss important implications of making use of assay data and genetic features as training data for machine learning models. A complete discussion of machine learning model implementation is outside the scope of this paper and is the subject of a later publication.
The antimicrobial assay data were obtained from NCBI BioSample, which contains descriptive information about the physical biological specimens from which experimental data are obtained, together with the results of those experiments themselves [1]. Assay data include minimum inhibitory concentrations (MIC) of antibiotics, links to associated microbial WGS data, and the treatment of a particular microorganism with antibiotics.
We observe that there is minimal microbial data available for many antibiotics and for targeted taxonomic groups. The antibiotics with the highest number of assays have fewer than 1500 measurements each. The corresponding bias in available assays makes machine learning problematic for some important microbes and for building more advanced models that can work across microbial genera. In this study we therefore focus on the antibiotic with the most assay data (tetracycline) and the corresponding genus with the most available sequence data (Acinetobacter, with 14,000 measurements across 49 antibiotic compounds). Using this data for training and testing, we observed contradictions in the distribution of assay outcomes and report methods to identify and resolve such conflicts. Per antibiotic, we find that up to 30% of measurements can be (resolvably) conflicting. As more data become available, automated training data curation will be an important part of creating useful machine learning models to predict antibiotic resistance.
CCS CONCEPTS: • Applied computing → Computational biology; Computational genomics; Bioinformatics
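One plausible way to identify and resolve such conflicting assay outcomes, sketched with pandas; the file and column names are hypothetical rather than the BioSample schema, and the majority-vote rule is an assumption, not necessarily the paper's exact method:

```python
# For each (isolate, antibiotic) pair with contradictory resistant /
# susceptible calls, keep the majority label and discard exact ties.
import pandas as pd

assays = pd.read_csv("biosample_assays.csv")  # isolate, antibiotic, phenotype

def resolve(group):
    counts = group["phenotype"].value_counts()     # sorted descending
    if len(counts) > 1 and counts.iloc[0] == counts.iloc[1]:
        return None                                # unresolvable tie
    return counts.idxmax()                         # majority outcome

resolved = (assays.groupby(["isolate", "antibiotic"])
                  .apply(resolve)
                  .dropna()
                  .rename("phenotype")
                  .reset_index())

# Fraction of measurement groups containing conflicting outcomes
conflicts = (assays.groupby(["isolate", "antibiotic"])["phenotype"]
                   .nunique() > 1).mean()
print(f"conflicting measurement groups: {conflicts:.1%}")
```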


Electronics ◽  
2021 ◽  
Vol 10 (23) ◽  
pp. 3045
Author(s):  
Mudabbir Ali ◽  
Asad Masood Khattak ◽  
Zain Ali ◽  
Bashir Hayat ◽  
Muhammad Idrees ◽  
...  

Machine learning has the potential to predict unseen data and thus improve the productivity and processes of daily life activities. Notwithstanding its adaptiveness, several sensitive applications based on this technology cannot afford any compromise in our trust in them; highly accurate machine learning models therefore need to provide reasons for their predictions. Such models are black boxes for end-users. The concept of interpretability therefore plays the role of assisting users in a couple of ways. Interpretable models are models that possess the quality of explaining their predictions. Different strategies have been proposed for this purpose, but some of them require an excessive amount of effort, lack generalization, are not model-agnostic, and are computationally expensive. Thus, in this work, we propose a strategy that can tackle these issues. A surrogate model assisted us in building interpretable models. Moreover, it helped us achieve results with accuracy close to that of the black-box model but with less processing time, making the proposed technique computationally cheaper than traditional methods. The significance of such a technique is that data science developers will not have to perform strenuous hands-on activities to undertake feature engineering tasks, and end-users will have a graphical explanation of complex models in a comprehensive way, consequently building trust in the machine.
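A minimal sketch of the surrogate idea, assuming a gradient-boosting black box and a shallow decision tree as the interpretable stand-in (both illustrative choices, not the paper's exact setup):

```python
# Fit an interpretable tree to the *predictions* of a black-box model,
# then read human-readable explanations off the tree.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=2000, n_features=10, random_state=0)

black_box = GradientBoostingClassifier(random_state=0).fit(X, y)

# Train the surrogate on the black box's outputs, not the ground truth
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, black_box.predict(X))

print("fidelity:", surrogate.score(X, black_box.predict(X)))
print(export_text(surrogate))    # decision rules a user can inspect
```

The fidelity score here measures how closely the surrogate mimics the black box, which is the quantity a surrogate explanation should be judged on.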


Circulation ◽  
2020 ◽  
Vol 142 (Suppl_3) ◽  
Author(s):  
Yuji Sakashita ◽  
Hidetsugu Asanoi ◽  
Shigeru Miyagawa ◽  
Satoshi Kainuma ◽  
Ai Kawamura ◽  
...  

Background: Although transplantation of patch-shaped autologous skeletal muscle-derived cells has been introduced for ischemic cardiomyopathy (ICM) and non-ICM patients, we found that both responders and non-responders to this treatment exist, and predicting responsiveness is crucial for improving its effectiveness. In this study, we identified the clinical features associated with response using a machine learning-based model that discriminated between responders and non-responders to this treatment. Methods and Results: We used retrospective databases of 23 ICM patients and 23 non-ICM patients undergoing autologous myoblast patch transplantation to develop machine learning models discriminating 3-year VAD-free survival. Sixty-nine pre-transplantation clinical features were selected to train the models. In the ICM models there were 4 VADs or deaths, and in the non-ICM models there were 10 VADs or deaths during the 3-year follow-up period. Using these databases, we trained multiple machine learning models and evaluated them with the leave-one-out method. In ICM, k-nearest neighbor demonstrated the best performance, with an accuracy of 95.7% and an AUC of 0.95. The features associated with 3-year VAD-free survival in ICM were NYHA classification, cardiac index, and left ventricular ejection fraction. In non-ICM, k-nearest neighbor again demonstrated the best performance among the trained classifiers, with an accuracy of 95.7% and an AUC of 0.93. The features associated with 3-year VAD-free survival in non-ICM were pulmonary capillary wedge pressure, pulmonary vascular resistance, and albumin. Conclusion: We identified the features associated with 3-year VAD-free survival after autologous myoblast patch transplantation in ICM and non-ICM. Focusing on these features may facilitate optimal candidate selection in ICM and non-ICM for regenerative therapy.
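A sketch of the evaluation scheme described above (a k-nearest-neighbor classifier scored with leave-one-out cross-validation); the 23x69 feature matrix and outcome labels are placeholders, not the patient data:

```python
# Leave-one-out evaluation of a kNN classifier on a small cohort.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score, roc_auc_score

rng = np.random.default_rng(0)
X = rng.normal(size=(23, 69))       # placeholder: 23 patients, 69 features
y = np.array([0] * 13 + [1] * 10)   # placeholder: 3-year VAD-free survival

model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=3))
proba = cross_val_predict(model, X, y, cv=LeaveOneOut(),
                          method="predict_proba")

print("accuracy:", accuracy_score(y, proba.argmax(axis=1)))
print("AUC:", roc_auc_score(y, proba[:, 1]))
```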


2021 ◽  
Author(s):  
_ _

Abstract For the past century, the optimization of drilling has caught the eyes of many researchers. The main areas center on ROP, fluid treatment, and bit selection, all sharing the same goal of maximizing ROP and reducing NPT. In order to develop an optimal control system, ROP must be predicted accurately; unfortunately, it is a complex parameter that is affected by multiple drilling parameters, rock properties, fluid properties, and bit selection. Models used for prediction have evolved from empirical models like Bourgoyne and Young's to more intelligent models such as SVM and ANN. With the continuous increase in data obtained from sensors while drilling, there is still much work to be done in this field. In this research, the improvement of an empirical model and the development of an intelligent model are presented. The Bourgoyne and Young's model uses multiple linear regression to estimate coefficients, which it then inserts into an empirical formula to predict ROP. This model was modified by using non-linear curve fitting to estimate the coefficients, reducing bias so that it generalizes better. Machine learning models such as Gradient Boosting, Random Forest, ANN, and DNN were used to develop a predictive model for ROP. These models were easier to develop than the empirical model since they rely on data rather than statistical formulas. The data used in this research include drilling data from 3 wells drilled in 2 fields within the Niger Delta region of Nigeria. The models were developed and trained on one of the wells, while the remaining two were used to test the performance of the models. The modified empirical model improved the efficiency of the base model by 14% during validation but performed poorly on unseen data from the other two wells. The machine learning models outperform the empirical models and perform accurately on unseen data from the other wells. The DNN was the best-performing model, achieving an average accuracy of 0.987 across the 3 wells.
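A hedged sketch of the empirical-model modification described above: estimating Bourgoyne and Young-style coefficients by non-linear least squares on ROP itself, rather than linear regression on log(ROP); the drilling-parameter matrix and ROP values here are synthetic placeholders:

```python
# Non-linear curve fitting of a Bourgoyne & Young-style ROP model,
# ROP = exp(a1 + a2*x2 + ... + a8*x8), with scipy.optimize.curve_fit.
import numpy as np
from scipy.optimize import curve_fit

def bourgoyne_young(X, a1, a2, a3, a4, a5, a6, a7, a8):
    """X: (n_samples, 7) normalized drilling-parameter terms x2..x8."""
    coeffs = np.array([a2, a3, a4, a5, a6, a7, a8])
    return np.exp(a1 + X @ coeffs)

# Synthetic stand-in data: 200 depth intervals, 7 normalized parameters
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(200, 7))
true = np.array([1.0, 0.8, -0.3, 0.5, 0.2, -0.1, 0.4, 0.6])
rop = bourgoyne_young(X, *true) * rng.lognormal(0, 0.05, 200)  # noisy ROP

# Fit directly on ROP (non-linear) instead of on log(ROP) (linear)
popt, _ = curve_fit(bourgoyne_young, X, rop, p0=np.ones(8), maxfev=10000)
print("estimated coefficients:", np.round(popt, 3))
```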

