Development of an Artificial Intelligence System for the Automatic Evaluation of Cervical Vertebral Maturation Status

Diagnostics ◽  
2021 ◽  
Vol 11 (12) ◽  
pp. 2200
Author(s):  
Jing Zhou ◽  
Hong Zhou ◽  
Lingling Pu ◽  
Yanzi Gao ◽  
Ziwei Tang ◽  
...  

Background: Cervical vertebral maturation (CVM) is widely used to evaluate growth potential in the field of orthodontics. This study aimed to develop an artificial intelligence (AI) system to automatically determine CVM status and to evaluate the AI's performance. Methods: A total of 1080 cephalometric radiographs from patients aged 6 to 22 years were included in the dataset (980 in the training dataset and 100 in the testing dataset). Two reference points and thirteen anatomical points were labelled, and the cervical vertebral maturation stage (CS) was assessed by human examiners as the gold standard. A convolutional neural network (CNN) model was trained on the 980 training images and tested on the 100 testing images. Statistical analysis was conducted to detect labelling differences between the AI and human examiners, and the AI's performance was evaluated. Results: The mean labelling error between human examiners was 0.48 ± 0.12 mm; the mean labelling error between the AI and human examiners was 0.36 ± 0.09 mm. Overall, the agreement between the AI results and the gold standard was good, with an intraclass correlation coefficient (ICC) of up to 98%. The accuracy of CVM staging was 71%, and the CS6 stage achieved the highest F1 score (85%). Conclusions: The AI system showed good agreement with human examiners and is a useful and reliable tool for assessing cervical vertebral maturation.
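
As a rough illustration of the evaluation described above, the sketch below computes a mean landmark labelling error (Euclidean distance in mm) and per-stage F1 scores with scikit-learn; all arrays are hypothetical placeholders, not the authors' data or code.

```python
# Sketch of the evaluation metrics described above (hypothetical data);
# the paper's actual pipeline is not reproduced here.
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

# Hypothetical landmark coordinates: (n_images, n_points, 2), in mm.
ai_points = np.random.rand(100, 13, 2) * 10
human_points = np.random.rand(100, 13, 2) * 10

# Mean labelling error = mean Euclidean distance per landmark pair.
errors = np.linalg.norm(ai_points - human_points, axis=2)
print(f"mean labelling error: {errors.mean():.2f} mm")

# Staging agreement: overall accuracy and per-stage F1 (stages CS1-CS6).
ai_stage = np.random.randint(1, 7, size=100)    # placeholder predictions
gold_stage = np.random.randint(1, 7, size=100)  # placeholder gold standard
print("accuracy:", accuracy_score(gold_stage, ai_stage))
print("per-stage F1:", f1_score(gold_stage, ai_stage, average=None))
```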

Author(s):  
Peikai Yan ◽  
Shaohua Li ◽  
Zhou Zhou ◽  
Qian Liu ◽  
Jiahui Wu ◽  
...  

Objective: Little is known about the efficacy of using artificial intelligence to identify laryngeal carcinoma from images of vocal lesions taken in different hospitals with multiple laryngoscope systems. This multicenter study aimed to establish an artificial intelligence system as a reliable auxiliary tool for laryngeal carcinoma screening. Study Design: Multicenter case-control study. Setting: Six tertiary care centers. Participants: Laryngoscopy images were collected from 2179 patients with vocal lesions. Outcome Measures: An automatic laryngeal carcinoma detection system based on Faster R-CNN was used to distinguish malignant from benign vocal lesions in 2179 laryngoscopy images acquired from 6 hospitals using 5 types of laryngoscopy systems. Pathology was the gold standard for identifying malignant and benign lesions. Results: The classifier correctly identified laryngeal carcinoma in 66 of the 89 malignant cases (sensitivity, 74.16%) and correctly classified 503 of the 640 benign cases (specificity, 78.59%). Overall, the CNN-based classifier achieved 78.05% accuracy and a 95.63% negative predictive value on the testing dataset. Conclusion: This automatic diagnostic system has the potential to assist clinical laryngeal carcinoma diagnosis, which may improve and standardize the diagnostic capacity of endoscopists using different laryngoscopes.
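
The reported metrics follow directly from the counts given in the abstract; the quick check below reproduces them (this arithmetic is ours, not the authors' code).

```python
# Reproducing the reported metrics from the counts in the abstract.
tp, fn = 66, 89 - 66          # malignant: 66 of 89 detected
tn, fp = 503, 640 - 503       # benign: 503 of 640 correctly classified

sensitivity = tp / (tp + fn)                # 66/89   = 74.16%
specificity = tn / (tn + fp)                # 503/640 = 78.59%
accuracy = (tp + tn) / (tp + tn + fp + fn)  # 569/729 = 78.05%
npv = tn / (tn + fn)                        # 503/526 = 95.63%
print(f"{sensitivity:.2%} {specificity:.2%} {accuracy:.2%} {npv:.2%}")
```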


2021 ◽  
Vol 11 ◽  
Author(s):  
Dehua Tang ◽  
Jie Zhou ◽  
Lei Wang ◽  
Muhan Ni ◽  
Min Chen ◽  
...  

Background and Aims: Prediction of intramucosal gastric cancer (GC) is a major challenge, and it is unclear whether artificial intelligence can assist endoscopists in the diagnosis. Methods: A deep convolutional neural network (DCNN) model was developed from 3407 endoscopic images retrospectively collected from 666 gastric cancer patients at two endoscopy centers (training dataset). The DCNN model's performance was tested on 228 images from 62 independent patients (testing dataset). Endoscopists evaluated the image and video testing datasets with and without the DCNN model's assistance; their diagnostic performance in the two conditions was compared, and the effects of assistance were investigated using correlation and linear regression analyses. Results: The DCNN model discriminated intramucosal GC from advanced GC with an AUC of 0.942 (95% CI, 0.915–0.970), a sensitivity of 90.5% (95% CI, 84.1%–95.4%), and a specificity of 85.3% (95% CI, 77.1%–90.9%) in the testing dataset. With the DCNN model's assistance, the diagnostic performance of novice endoscopists was comparable to that of expert endoscopists (accuracy: 84.6% vs. 85.5%; sensitivity: 85.7% vs. 87.4%; specificity: 83.3% vs. 83.0%). The range of pairwise kappa values among endoscopists increased significantly with the DCNN model's assistance (from 0.430–0.629 to 0.660–0.861), and the diagnostic duration decreased considerably, from 4.35 s to 3.01 s. The correlation between endoscopists' perseverance of effort and their diagnostic accuracy was diminished when using the DCNN model (r: 0.470 vs. 0.076). Conclusions: An AI-assisted system was established and found useful for novice endoscopists, enabling diagnostic performance comparable to that of experts.
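
Inter-rater agreement of the kind quoted above is typically summarized with pairwise Cohen's kappa; the sketch below shows one way to compute a mean pairwise kappa with scikit-learn, using hypothetical ratings rather than the study's data.

```python
# Sketch: mean pairwise Cohen's kappa across raters (hypothetical labels),
# the agreement statistic quoted in the abstract.
from itertools import combinations
import numpy as np
from sklearn.metrics import cohen_kappa_score

rng = np.random.default_rng(0)
# Hypothetical: 6 endoscopists each labelling 228 images as 0/1
# (intramucosal vs. advanced GC).
ratings = rng.integers(0, 2, size=(6, 228))

kappas = [cohen_kappa_score(ratings[i], ratings[j])
          for i, j in combinations(range(len(ratings)), 2)]
print(f"mean pairwise kappa: {np.mean(kappas):.3f}")
```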


2021 ◽  
Vol 2021 ◽  
pp. 1-8
Author(s):  
Ziming Liu ◽  
Emmanuel Eric Pazo ◽  
Hong Ye ◽  
Cui Yu ◽  
Ling Xu ◽  
...  

Purpose. To assess the repeatability of refractive measurements obtained with 2WIN-S photoscreening and their agreement with gold-standard cycloplegic retinoscopic refraction. Design. Single-centre, cross-sectional study. Methods. Sphere, cylinder, axis, and spherical equivalent were assessed in 194 eyes of 97 children using retinoscopy and 2WIN-S; one week later, another operator repeated the 2WIN-S measurements. The primary outcome measures were the repeatability of, and agreement between, the spherical equivalent (SE), J0, and J45 readings of 2WIN-S. Repeatability was assessed by the within-subject standard deviation (2.77 Sw) and the intraclass correlation coefficient (ICC); agreement between devices was assessed using the 95% limits of agreement, and the agreement between cycloplegic retinoscopy and noncycloplegic 2WIN-S measurements was evaluated with Bland–Altman analysis. Results. The mean age ± SD was 10.3 ± 2.46 years (range, 4–14 years). Sphere, cylinder, and spherical equivalent measurements were consistent between the two instruments (r > 0.86). The ICC for SE, J0, and J45 was 0.900, 0.666, and 0.639, respectively, and the Sw was 0.61 D, 0.30 D, and 0.31 D, respectively. In the Bland–Altman analysis of cycloplegic retinoscopy versus 2WIN-S, 184/194 eyes (95%) fell within the 95% limits of agreement for SE, with a mean difference of 0.46 D; for J0, 184/194 (95%), with a mean difference of −0.04 D; and for J45, 181/194 (93%), with a mean difference of −0.15 D. Conclusion. Objective refractive measurement with 2WIN-S showed good reliability and high agreement with gold-standard retinoscopic refraction in children and adolescents. Although its measurements were consistent, it should be borne in mind that 2WIN-S is a screening tool.
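
For readers unfamiliar with the J0/J45 notation, the sketch below converts sphere/cylinder/axis to the standard power-vector components (SE, J0, J45) and computes the Bland–Altman bias and 95% limits of agreement; the paired readings are hypothetical, and this is not the study's analysis code.

```python
# Standard power-vector conversion (SE, J0, J45) and Bland–Altman
# limits of agreement; the paired readings below are hypothetical.
import numpy as np

def power_vectors(sphere, cyl, axis_deg):
    """Sphere/cylinder/axis (dioptres, degrees) -> SE, J0, J45."""
    theta = np.radians(axis_deg)
    se = sphere + cyl / 2.0
    j0 = -(cyl / 2.0) * np.cos(2 * theta)
    j45 = -(cyl / 2.0) * np.sin(2 * theta)
    return se, j0, j45

def bland_altman(a, b):
    """Mean difference (bias) and 95% limits of agreement."""
    diff = np.asarray(a) - np.asarray(b)
    bias = diff.mean()
    half_width = 1.96 * diff.std(ddof=1)
    return bias, bias - half_width, bias + half_width

se_ret, _, _ = power_vectors(np.array([-2.00, -0.50, 0.75]),
                             np.array([-0.50, -0.75, -0.25]),
                             np.array([90, 180, 45]))
se_win, _, _ = power_vectors(np.array([-1.75, -0.25, 1.25]),
                             np.array([-0.50, -0.50, -0.25]),
                             np.array([85, 175, 40]))
print(bland_altman(se_ret, se_win))
```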


Author(s):  
Yang Zhang ◽  
Siwa Chan ◽  
Jeon-Hor Chen ◽  
Kai-Ting Chang ◽  
Chin-Yao Lin ◽  
...  

Abstract: To develop a U-net deep learning method for breast tissue segmentation on fat-saturated (fat-sat) T1-weighted (T1W) MRI using transfer learning (TL) from a model developed for non-fat-sat images. The training dataset (N = 126) was imaged on a 1.5 T MR scanner and the independent testing dataset (N = 40) on a 3 T scanner, both using a fat-sat T1W pulse sequence. Pre-contrast images acquired in the dynamic contrast-enhanced (DCE) MRI sequence were used for analysis. All patients had unilateral cancer, and segmentation was performed on the contralateral normal breast. The ground truth for breast and fibroglandular tissue (FGT) segmentation was generated using a template-based segmentation method with a clustering algorithm. Deep learning segmentation was performed using U-net models trained with and without TL; TL was implemented by initializing the trainable parameters with values from the previous model for non-fat-sat images. The ground truth of each case was used to evaluate segmentation performance by calculating the Dice similarity coefficient (DSC) and the overall accuracy based on all pixels, and Pearson's correlation was used to evaluate the agreement in breast volume and FGT volume between the U-net prediction and the ground truth. In the training dataset, evaluated with tenfold cross-validation, the mean DSC with and without TL was 0.97 vs. 0.95 for breast and 0.86 vs. 0.80 for FGT. When the final models developed with and without TL were applied to the testing dataset, the mean DSC was 0.89 vs. 0.83 for breast and 0.81 vs. 0.81 for FGT, respectively. TL not only improved the DSC but also decreased the number of training cases required. Lastly, there was a high correlation (R2 > 0.90) between the U-net prediction and the ground truth for breast volume and FGT volume in both the training and testing datasets. U-net can be applied to breast tissue segmentation on fat-sat images, and TL is an efficient strategy for developing a specific model for each dataset.
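
The Dice similarity coefficient used to score the segmentations has a simple closed form; a minimal sketch on hypothetical binary masks follows (the authors' actual pipeline is not reproduced here).

```python
# Dice similarity coefficient between predicted and ground-truth binary
# masks, as used to score the U-net output (masks here are hypothetical).
import numpy as np

def dice(pred, truth, eps=1e-8):
    """DSC = 2|A ∩ B| / (|A| + |B|) for binary masks A, B."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    inter = np.logical_and(pred, truth).sum()
    return 2.0 * inter / (pred.sum() + truth.sum() + eps)

pred = np.zeros((64, 64), dtype=np.uint8);  pred[10:40, 10:40] = 1
truth = np.zeros((64, 64), dtype=np.uint8); truth[15:45, 12:42] = 1
print(f"DSC = {dice(pred, truth):.3f}")
```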


2018 ◽  
Vol 2018 ◽  
pp. 1-6 ◽  
Author(s):  
Munenori Uemura ◽  
Morimasa Tomikawa ◽  
Tiejun Miao ◽  
Ryota Souzaki ◽  
Satoshi Ieiri ◽  
...  

This study investigated whether parameters derived from the hand motions of expert and novice surgeons accurately and objectively reflect laparoscopic surgical skill levels, using an artificial intelligence system consisting of a three-layer chaos neural network. Sixty-seven surgeons (23 experts and 44 novices) performed a laparoscopic skill assessment task while their hand motions were recorded with a magnetic tracking sensor. Eight parameters validated as measures of skill in a previous study were used as inputs to the neural network. The network was optimized after seven trials on a training dataset of 38 surgeons, reaching a correct judgment ratio of 0.99. Applied prospectively to the remaining 29 surgeons, it distinguished expert from novice surgeons with a correct judgment rate of 79%. In conclusion, our artificial intelligence system distinguished between expert and novice surgeons among surgeons with unknown skill levels.
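
Standard libraries do not provide the chaos neural network the authors used, so the stand-in below substitutes a conventional three-layer MLP (scikit-learn's MLPClassifier) merely to illustrate the 8-feature, 38/29 train/test setup; data and hyperparameters are hypothetical.

```python
# Stand-in only: a conventional three-layer MLP on eight hand-motion
# features; the authors' chaos neural network is not available in
# standard libraries. Data and split are hypothetical placeholders.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(67, 8))            # 67 surgeons x 8 motion parameters
y = np.r_[np.ones(23), np.zeros(44)]    # 23 experts (1), 44 novices (0)
order = rng.permutation(67)             # mix experts and novices
X, y = X[order], y[order]

clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=1)
clf.fit(X[:38], y[:38])                 # train on 38 surgeons
print("prospective accuracy:", clf.score(X[38:], y[38:]))  # test on 29
```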


Author(s):  
James P. Howard ◽  
Catherine C. Stowell ◽  
Graham D. Cole ◽  
Kajaluxy Ananthan ◽  
Camelia D. Demetrescu ◽  
...  

Background: Artificial intelligence (AI) for echocardiography requires training and validation to the standards expected of humans. We developed an online platform and established the Unity Collaborative to build a dataset of expertise from 17 hospitals for the training, validation, and standardization of such techniques. Methods: The training dataset consisted of 2056 individual frames drawn at random from 1265 parasternal long-axis video-loops of patients undergoing clinical echocardiography in 2015 to 2016. Nine experts labeled these images using our online platform, and from this we trained a convolutional neural network to identify keypoints. Subsequently, 13 experts labeled a validation dataset of the end-systolic and end-diastolic frames from 100 new video-loops, twice each, and the 26-opinion consensus was used as the reference standard. The primary outcome was precision SD, the SD of the differences between the AI measurement and the expert consensus. Results: In the validation dataset, the AI's precision SD for left ventricular internal dimension was 3.5 mm; for context, the precision SD of individual expert measurements against the expert consensus was 4.4 mm. The intraclass correlation coefficient between AI and expert consensus was 0.926 (95% CI, 0.904–0.944), compared with 0.817 (0.778–0.954) between individual experts and the expert consensus. For interventricular septum thickness, the precision SD was 1.8 mm for AI (intraclass correlation coefficient, 0.809; 0.729–0.967) versus 2.0 mm for individuals (intraclass correlation coefficient, 0.641; 0.568–0.716). For posterior wall thickness, the precision SD was 1.4 mm for AI (intraclass correlation coefficient, 0.535 [95% CI, 0.379–0.661]) versus 2.2 mm for individuals (0.366 [0.288–0.462]). We present all images and annotations; these highlight challenging cases, including poor image quality and tapered ventricles. Conclusions: Experts at multiple institutions successfully cooperated to build a collaborative AI that performed as well as individual experts. Future echocardiographic AI research should use a consensus of experts as the reference standard. Our collaborative welcomes new partners who share our commitment to publishing all methods, code, annotations, and results openly.
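
Precision SD, as defined in the abstract, is simply the standard deviation of the AI-minus-consensus differences; a minimal sketch with hypothetical measurements follows.

```python
# Precision SD as defined above: the SD of the differences between the
# AI's measurement and the expert-consensus value (hypothetical data, mm).
import numpy as np

rng = np.random.default_rng(2)
consensus = rng.normal(46.0, 5.0, size=100)      # e.g. LV internal dim., mm
ai = consensus + rng.normal(0.0, 3.5, size=100)  # AI with ~3.5 mm spread

precision_sd = np.std(ai - consensus, ddof=1)
print(f"precision SD: {precision_sd:.1f} mm")
```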


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Kaori Ishii ◽  
Ryo Asaoka ◽  
Takashi Omoto ◽  
Shingo Mitaki ◽  
Yuri Fujino ◽  
...  

Abstract: The purpose of the current study was to predict intraocular pressure (IOP) from color fundus photographs with a deep learning (DL) model, and from systemic variables with a multivariate linear regression model (MLM), least absolute shrinkage and selection operator regression (LASSO), a support vector machine (SVM), and a random forest (RF). The training dataset included 3883 examinations from 3883 eyes of 1945 subjects, and the testing dataset 289 examinations from 289 eyes of 146 subjects. With the training dataset, the MLM was constructed to predict IOP from 35 systemic variables and 25 blood measurements, and a DL model was developed to predict IOP from color fundus photographs. The prediction accuracy of each model was evaluated on the testing dataset using the absolute error and the marginal R-squared (mR2). The mean absolute error with the MLM was 2.29 mmHg, significantly smaller than that with DL (2.70 mmHg). The mR2 with the MLM was 0.15, whereas that with DL was 0.0066. The mean absolute error (between 2.24 and 2.30 mmHg) and mR2 (between 0.11 and 0.15) with LASSO, SVM, and RF were similar to or poorer than those of the MLM. A DL model predicting IOP from color fundus photographs proved far less accurate than an MLM using systemic variables.
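
A hedged sketch of the tabular-model comparison follows, fitting OLS (standing in for the MLM), LASSO, SVM, and RF regressors with scikit-learn and scoring by mean absolute error; the data are synthetic and the hyperparameters are not taken from the paper.

```python
# Sketch of the tabular-model comparison: fit each regressor on systemic
# variables and compare mean absolute error (synthetic data; the paper's
# feature set and hyperparameters are not reproduced here).
import numpy as np
from sklearn.linear_model import LinearRegression, Lasso
from sklearn.svm import SVR
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(3)
X_tr, X_te = rng.normal(size=(3883, 60)), rng.normal(size=(289, 60))
y_tr = 15 + X_tr[:, 0] + rng.normal(0, 2, 3883)   # synthetic IOP, mmHg
y_te = 15 + X_te[:, 0] + rng.normal(0, 2, 289)

models = [("MLM", LinearRegression()), ("LASSO", Lasso(alpha=0.1)),
          ("SVM", SVR()), ("RF", RandomForestRegressor(random_state=3))]
for name, model in models:
    model.fit(X_tr, y_tr)
    mae = mean_absolute_error(y_te, model.predict(X_te))
    print(f"{name}: MAE = {mae:.2f} mmHg")
```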


2021 ◽  
Author(s):  
Hung-Chang Chen ◽  
Shin-Shi Tzeng ◽  
Yen-Chang Hsiao ◽  
Ruei-Feng Chen ◽  
Erh-Chien Hung ◽  
...  

Background: Margin reflex distance 1 (MRD1), margin reflex distance 2 (MRD2), and levator muscle function (LF) are crucial for ptosis evaluation and management. Manual measurement of MRD1, MRD2, and LF is time-consuming, subjective, and prone to human error; smartphone-based artificial intelligence (AI) image processing is a potential solution to these limitations. Objective: We propose the first smartphone-based AI-assisted image processing algorithm for MRD1, MRD2, and LF measurement. Methods: This observational study included 822 eyes of 411 volunteers aged over 18 years, enrolled from August 1, 2020, to April 30, 2021. Six orbital photographs (bilateral primary gaze, up-gaze, and down-gaze) were taken with a smartphone (iPhone 11 Pro Max). Gold standard measurements and normalized eye photographs were obtained from these orbital photographs and used with AI-assisted software to create the MRD1, MRD2, and LF models. Results: The Pearson correlation coefficients between the gold standard measurements and the values predicted by the MRD1 and MRD2 models were excellent (r = 0.91 and 0.88, respectively) and good for the LF model (r = 0.73). The intraclass correlation coefficients showed excellent agreement between the gold standard measurements and the values predicted by the MRD1 and MRD2 models (0.90 and 0.84, respectively) and substantial agreement for the LF model (0.69). The mean absolute errors were 0.35 mm, 0.37 mm, and 1.06 mm for the MRD1, MRD2, and LF models, respectively. The 95% limits of agreement were −0.94 to 0.94 mm for the MRD1 model, −0.92 to 1.03 mm for the MRD2 model, and −0.63 to 2.53 mm for the LF model. Conclusions: We propose the first smartphone-based AI-assisted image processing algorithm for eyelid measurements. MRD1, MRD2, and LF can be measured quickly, objectively, and conveniently, and with a smartphone the examiner can take these measurements anywhere and at any time, which facilitates data collection.
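
The agreement statistics quoted above (Pearson r, mean absolute error, and 95% limits of agreement) can be computed as below; the paired measurements are hypothetical stand-ins, not the study's data.

```python
# Agreement statistics on hypothetical paired measurements
# (gold standard vs. model prediction, mm).
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(4)
gold = rng.normal(3.5, 1.0, size=822)           # e.g. MRD1, mm
pred = gold + rng.normal(0.0, 0.48, size=822)   # model output

r, _ = pearsonr(gold, pred)
diff = pred - gold
mae = np.abs(diff).mean()
loa = (diff.mean() - 1.96 * diff.std(ddof=1),
       diff.mean() + 1.96 * diff.std(ddof=1))
print(f"r = {r:.2f}, MAE = {mae:.2f} mm, "
      f"95% LoA = ({loa[0]:.2f}, {loa[1]:.2f}) mm")
```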


2017 ◽  
Vol 2 (5) ◽  

In this study, we evaluate Shonit™, an artificial intelligence (AI) system for the automated analysis of images captured from peripheral blood smears, consisting of an automated digital microscope and a cloud-based analysis platform. Shonit™'s performance in WBC classification was evaluated by comparing its results with those of haematology analysers and manual microscopy on manually stained smears. The study was carried out on 100 samples, including both normal and abnormal cases, where the abnormal cases came from patients with one or more quantitative or qualitative flags. All smears were created using a Hemaprep auto-smearer, stained with May-Grünwald Giemsa stain, and scanned and analysed by Shonit™ for WBC differentials at 40X magnification. Shonit™'s morphological classification of WBCs was verified by an experienced haemato-pathologist. Quantitative parameters were analysed by computing the mean absolute difference of the WBC differential count (DC) values between Shonit™ and the Sysmex XN3000, between Shonit™ and manual microscopy, and between Shonit™ and the Horiba ES 60. The mean absolute differences between the WBC differential values of manual microscopy and Shonit™ were 7.67%, 5.93%, 4.58%, 2.69%, and 0.44% for neutrophils, lymphocytes, monocytes, eosinophils, and basophils, respectively; between the Sysmex XN3000 and Shonit™ they were 8.73%, 5.55%, 3.63%, 2.12%, and 0.45%, respectively. Shonit™ proved effective in locating and examining WBCs. It saves time, accelerates turnaround time, and increases the productivity of pathologists, helping to overcome the time-consuming effort associated with traditional microscopy.
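
The headline statistic here is a mean absolute difference per WBC class between two differential counts; a minimal sketch with hypothetical percentages follows.

```python
# Mean absolute difference per WBC class between two differential counts
# (hypothetical percentages for three samples).
import numpy as np

classes = ["neutrophil", "lymphocyte", "monocyte", "eosinophil", "basophil"]
manual = np.array([[60, 28, 7, 4, 1], [55, 33, 8, 3, 1], [65, 24, 7, 3, 1]])
shonit = np.array([[63, 26, 6, 4, 1], [50, 36, 9, 4, 1], [70, 21, 6, 2, 1]])

mad = np.abs(manual - shonit).mean(axis=0)
for cls, d in zip(classes, mad):
    print(f"{cls}: {d:.2f}%")
```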


2020 ◽  
Vol 29 (2) ◽  
pp. 259-264 ◽  
Author(s):  
Hasan K. Saleh ◽  
Paula Folkeard ◽  
Ewan Macpherson ◽  
Susan Scollie

Purpose: The original Connected Speech Test (CST; Cox et al., 1987) is a well-regarded and often utilized speech perception test. The aim of this study was to develop a new version of the CST using a neutral North American accent and to assess this updated CST with participants with normal hearing. Method: A female English speaker was recruited to read the original CST passages, which were recorded as the new CST stimuli. A study was designed to assess the equivalence of the newly recorded CST passages and to conduct normalization. The study included 19 Western University students (11 females and 8 males) with normal hearing and English as a first language. Results: Raw scores for the 48 tested passages were converted to rationalized arcsine units (RAU), and passages whose average scores fell more than 1 RAU standard deviation from the mean were excluded. The internal reliability of the 32 remaining passages was assessed; the two-way random-effects intraclass correlation was .944. Conclusion: The aim of our study was to create new CST stimuli with a more general North American accent in order to minimize accent effects on speech perception scores. The study resulted in 32 passages of equivalent difficulty for listeners with normal hearing.
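
For reference, one common formulation of Studebaker's (1985) rationalized arcsine transform, which the study used to linearize percent-correct passage scores, is sketched below; treat the scaling constants as an assumption to verify against the original paper.

```python
# One common formulation of Studebaker's (1985) rationalized arcsine
# transform (constants here are our assumption, not taken from this study).
import numpy as np

def rau(correct, total):
    """Rationalized arcsine units from X correct out of N items."""
    x, n = float(correct), float(total)
    t = np.arcsin(np.sqrt(x / (n + 1))) + np.arcsin(np.sqrt((x + 1) / (n + 1)))
    return (146.0 / np.pi) * t - 23.0

print(f"RAU = {rau(45, 50):.1f}")   # e.g. 45 of 50 key words correct
```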

