ACCV: automatic classification algorithm of cataract video based on deep learning

2021 ◽  
Vol 20 (1) ◽  
Author(s):  
Shenming Hu ◽  
Xinze Luan ◽  
Hong Wu ◽  
Xiaoting Wang ◽  
Chunhong Yan ◽  
...  

Abstract Purpose A real-time automatic cataract-grading algorithm based on cataract video is proposed. Materials and methods In this retrospective study, we set video of the eye lens section as the research target. We propose a method that uses YOLOv3 to assist in positioning, automatically identifying the position of the lens and classifying the cataract after color space conversion. The data set consists of slit-lamp video recordings of 76 eyes of 38 people. Data were collected in five randomized ways to reduce the influence of the collection procedure on algorithm accuracy. Each video is at most 10 s long, and labeled still images are extracted from the video files. A total of 1520 images were extracted, and the data set was divided into training, validation, and test sets at a ratio of 7:2:1. Results We verified the algorithm on the test set of 76 clinical video segments, achieving an accuracy of 0.9400, an AUC of 0.9880, and an F1 score of 0.9388. In addition, owing to the color space recognition method, each frame is processed within 29 microseconds, which significantly improves detection efficiency. Conclusion By taking the lens scan video as the research object, this efficient and effective algorithm improves screening accuracy, comes closer to the actual cataract diagnosis and treatment process, and can effectively improve the cataract screening ability of non-ophthalmologists. For cataract screening in underserved areas, it also increases the accessibility of ophthalmic care.
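As an illustration of the frame-extraction and 7:2:1 split described above, here is a minimal Python sketch; the file paths, extraction rate, and random seed are assumptions for illustration, not the authors' actual pipeline.

```python
# Minimal sketch: extract stills from slit-lamp videos, then split 7:2:1.
# Paths, every_n, and the seed are hypothetical.
import glob
import random

import cv2

def extract_frames(video_path, out_dir, every_n=5):
    """Save every n-th frame of a slit-lamp video as a still image."""
    cap = cv2.VideoCapture(video_path)
    idx, saved = 0, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % every_n == 0:
            cv2.imwrite(f"{out_dir}/frame_{idx:05d}.png", frame)
            saved += 1
        idx += 1
    cap.release()
    return saved

# Split the extracted images 7:2:1 into train/validation/test sets.
images = sorted(glob.glob("frames/*.png"))
random.seed(42)
random.shuffle(images)
n = len(images)
train = images[: int(0.7 * n)]
val = images[int(0.7 * n): int(0.9 * n)]
test = images[int(0.9 * n):]
```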

2021 ◽  
Author(s):  
Xiaobo Wen ◽  
Biao Zhao ◽  
Meifang Yuan ◽  
Jinzhi Li ◽  
Mengzhen Sun ◽  
...  

Abstract Objectives: To explore the performance of Multi-scale Fusion Attention U-net (MSFA-U-net) in thyroid gland segmentation on CT localization images for radiotherapy. Methods: CT localization images for radiotherapy of 80 patients with breast cancer or head and neck tumors were selected; label images were manually delineated by experienced radiologists. The data set was randomly divided into a training set (n=60), a validation set (n=10), and a test set (n=10). Data augmentation was performed on the training set, and the performance of the MSFA-U-net model was evaluated using the Dice similarity coefficient (DSC), Jaccard similarity coefficient (JSC), positive predictive value (PPV), sensitivity (SE), and Hausdorff distance (HD). Results: With the MSFA-U-net model, the DSC, JSC, PPV, SE, and HD indexes of the segmented thyroid gland in the test set were 0.8967±0.0935, 0.8219±0.1115, 0.9065±0.0940, 0.8979±0.1104, and 2.3922±0.5423, respectively. Compared with U-net, HR-net, and Attention U-net, MSFA-U-net increased DSC by 0.052, 0.0376, and 0.0346, respectively; increased JSC by 0.0569, 0.0805, and 0.0433, respectively; increased SE by 0.0361, 0.1091, and 0.0831, respectively; and decreased HD by 0.208, 0.1952, and 0.0548, respectively. The test set images showed that the thyroid edges segmented by the MSFA-U-net model were closer to the standard thyroid delineated by the experts than those segmented by the other three models. Moreover, the edges were smoother, resistance to noise interference was stronger, and oversegmentation and undersegmentation were reduced. Conclusion: The MSFA-U-net model can meet basic clinical requirements and improve the efficiency of physicians' clinical work.
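For readers unfamiliar with the evaluation indicators named above, the sketch below computes DSC, JSC, PPV, SE, and a symmetric Hausdorff distance from a pair of binary masks; `pred` and `gt` are hypothetical arrays, and HD here is in pixels rather than a clinical unit.

```python
# Segmentation metrics on binary masks (a sketch, not the authors' code).
import numpy as np
from scipy.spatial.distance import directed_hausdorff

def segmentation_metrics(pred, gt):
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()
    fp = np.logical_and(pred, ~gt).sum()
    fn = np.logical_and(~pred, gt).sum()
    dsc = 2 * tp / (2 * tp + fp + fn)        # Dice similarity coefficient
    jsc = tp / (tp + fp + fn)                # Jaccard similarity coefficient
    ppv = tp / (tp + fp)                     # positive predictive value
    se = tp / (tp + fn)                      # sensitivity
    p, g = np.argwhere(pred), np.argwhere(gt)
    hd = max(directed_hausdorff(p, g)[0],    # symmetric Hausdorff distance
             directed_hausdorff(g, p)[0])
    return dsc, jsc, ppv, se, hd
```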


2018 ◽  
pp. 1-8 ◽  
Author(s):  
Okyaz Eminaga ◽  
Nurettin Eminaga ◽  
Axel Semjonow ◽  
Bernhard Breil

Purpose The recognition of cystoscopic findings remains challenging for young colleagues and depends on the examiner's skills. Computer-aided diagnosis tools using feature extraction and deep learning show promise as instruments to perform diagnostic classification. Materials and Methods Our study considered 479 patient cases that represented 44 urologic findings. Image color was linearly normalized and equalized by applying contrast-limited adaptive histogram equalization. Because these findings can be viewed via cystoscopy from every possible angle and side, we generated images rotated in 10-degree steps and flipped them vertically or horizontally, which resulted in 18,681 images. After image preprocessing, we developed deep convolutional neural network (CNN) models (ResNet50, VGG-19, VGG-16, InceptionV3, and Xception) and evaluated these models using F1 scores. Furthermore, we proposed two CNN concepts: 90%-previous-layer filter size and harmonic-series filter size. A training set (60%), a validation set (10%), and a test set (30%) were randomly generated from the study data set. All models were trained on the training set, validated on the validation set, and evaluated on the test set. Results The Xception-based model achieved the highest F1 score (99.52%), followed by models based on ResNet50 (99.48%) and the harmonic-series concept (99.45%). All images with cancer lesions were correctly classified by these models. Among the images misclassified by the best-performing model, 7.86% of images showing bladder stones with an indwelling catheter and 1.43% of images showing bladder diverticula were falsely classified. Conclusion The results of this study show the potential of deep learning for the diagnostic classification of cystoscopic images. Future work will focus on integrating artificial intelligence-aided cystoscopy into clinical routines and possibly expanding it to other clinical endoscopy applications.
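The preprocessing described (CLAHE equalization, 10-degree rotations, vertical/horizontal flips) can be sketched with OpenCV as follows; the clip limit and tile size are assumed values, not necessarily those used in the study.

```python
# Sketch of CLAHE equalization plus rotation/flip augmentation.
import cv2
import numpy as np

def clahe_equalize(bgr):
    """Equalize lightness with CLAHE in LAB space."""
    lab = cv2.cvtColor(bgr, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    l = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8)).apply(l)
    return cv2.cvtColor(cv2.merge((l, a, b)), cv2.COLOR_LAB2BGR)

def augment(image):
    """Yield variants rotated in 10-degree steps, plus flipped variants."""
    h, w = image.shape[:2]
    for angle in range(0, 360, 10):
        m = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
        yield cv2.warpAffine(image, m, (w, h))
    yield cv2.flip(image, 0)   # vertical flip
    yield cv2.flip(image, 1)   # horizontal flip
```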


2021 ◽  
Vol 19 (1) ◽  
Author(s):  
Bo Huang ◽  
Wei Tan ◽  
Zhou Li ◽  
Lei Jin

Abstract Background The association between time-lapse technology (TLT) and embryo ploidy status is not yet fully understood. TLT is non-invasive and generates large amounts of data, so artificial intelligence (AI) is a natural choice for accurately predicting embryo ploidy status from TLT; however, work on AI in this field remains limited. Methods A total of 469 preimplantation genetic testing (PGT) cycles and 1803 blastocysts from April 2018 to November 2019 were included in the study. All embryo images were captured by a time-lapse microscope system 5 or 6 days after fertilization, before biopsy. All euploid and aneuploid embryos were used as the data set, which was divided into training, validation, and test sets. The training set was used for model training, the validation set for tuning the hyperparameters and preliminary evaluation of the model, and the test set for evaluating the generalization ability of the model. For external validation, we used data outside the training data: a total of 155 PGT cycles from December 2019 to December 2020 and 523 blastocysts were included in the verification process. Results The euploid prediction algorithm (EPA) was able to predict euploidy on the testing dataset with an area under the curve (AUC) of 0.80. Conclusions The TLT incubator has gradually become the choice of reproductive centers. Our AI model, EPA, can predict embryo ploidy well from TLT data. We hope that this system can serve all in vitro fertilization and embryo transfer (IVF-ET) patients in the future, giving embryologists more non-invasive aids when selecting the best embryo to transfer.
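As a sketch of how the reported AUC could be computed on the test or external-validation set: the function below assumes a fitted scikit-learn-style classifier, and the label convention (1 = euploid) is a hypothetical placeholder.

```python
# AUC evaluation sketch; `model`, `X_test`, and `y_test` are hypothetical.
from sklearn.metrics import roc_auc_score

def evaluate_epa(model, X_test, y_test):
    """Report AUC for a fitted binary classifier on a held-out set."""
    y_score = model.predict_proba(X_test)[:, 1]   # predicted euploidy probability
    return roc_auc_score(y_test, y_score)         # e.g. the 0.80 reported above
```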


2019 ◽  
Vol 2019 (1) ◽  
pp. 360-368
Author(s):  
Mekides Assefa Abebe ◽  
Jon Yngve Hardeberg

Different whiteboard image degradations severely reduce the legibility of pen-stroke content as well as the overall quality of the images. Consequently, researchers have addressed the problem with various image enhancement techniques. Most state-of-the-art approaches apply common image processing techniques such as background-foreground segmentation, text extraction, contrast and color enhancement, and white balancing. However, such conventional enhancement methods are incapable of recovering severely degraded pen-stroke content and produce artifacts in the presence of complex pen-stroke illustrations. To surmount these problems, the authors propose a deep learning based solution. They contribute a new whiteboard image data set and adopt two deep convolutional neural network architectures for whiteboard image quality enhancement. Evaluations of the trained models demonstrated their superior performance over the conventional methods.
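As one example of the kind of conventional baseline the deep models are compared against, the sketch below whitens the board by estimating the background with morphological closing and dividing it out; the kernel sizes are assumptions, and this is a generic technique, not the paper's method.

```python
# Conventional whiteboard enhancement baseline (a sketch).
import cv2

def enhance_whiteboard(bgr):
    # Estimate the (bright) board background with a large morphological closing.
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (31, 31))
    background = cv2.morphologyEx(bgr, cv2.MORPH_CLOSE, kernel)
    background = cv2.medianBlur(background, 21)
    # Divide out the background; scale=255 restores the full intensity range,
    # whitening the board and boosting pen strokes.
    return cv2.divide(bgr, background, scale=255)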


2011 ◽  
Vol 2 (1) ◽  
Author(s):  
Vina Chovan Epifania ◽  
Eko Sediyono

Abstract. Image File Searching Based on Color Dominance. One characteristic of an image that can be used in the search process is its color composition. Color is a feature that humans readily perceive in a picture. Using color as a search parameter can make it easier to find images stored in computer memory. Color images have RGB values that can be computed and converted into the HSL color space. The HSL model is convenient because its components can be expressed as percentages, so the pixels of an image can be grouped and labeled, yielding the dominant values of the colors contained in the image. With these values, image search can be performed quickly by passing them to an image file retrieval system. This article discusses the use of the HSL color space model to facilitate searching for a digital image in a digital image data warehouse. Tests of the application showed that search is faster when colors are specified by the user. One remaining limitation is that, when searching among the 15 available basic colors with a 33% dominance threshold, some images were not found; this is because the most dominant color in most images falls below 33% dominance. Keywords: RGB, HSL, image searching
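A minimal sketch of the dominant-color idea: convert RGB pixels to HSL (Python's colorsys uses the HLS ordering), bucket the hues, and report each bucket's share in percent. The paper's 15-color palette is simplified here to six hue sextants, so the palette and function names are assumptions.

```python
# Dominant-color indexing sketch: RGB -> HSL, hue buckets, percent shares.
import colorsys
import numpy as np
from PIL import Image

def dominant_colors(path):
    rgb = np.asarray(Image.open(path).convert("RGB"), dtype=float) / 255.0
    counts = {}
    for r, g, b in rgb.reshape(-1, 3):
        h, l, s = colorsys.rgb_to_hls(r, g, b)   # note: colorsys returns HLS order
        bucket = int(h * 6) % 6                  # six hue sextants as a toy palette
        counts[bucket] = counts.get(bucket, 0) + 1
    total = rgb.shape[0] * rgb.shape[1]
    return {k: 100.0 * v / total for k, v in counts.items()}

# An image matches a query color only if that bucket's share exceeds the
# 33% dominance threshold used by the retrieval system described above.
```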


2020 ◽  
Vol 33 (6) ◽  
pp. 838-844
Author(s):  
Jan-Helge Klingler ◽  
Ulrich Hubbe ◽  
Christoph Scholz ◽  
Florian Volz ◽  
Marc Hohenhaus ◽  
...  

OBJECTIVE Intraoperative 3D imaging and navigation is increasingly used for minimally invasive spine surgery. A novel, noninvasive patient tracker that is adhered as a mask on the skin for 3D navigation necessitates a larger intraoperative 3D image set for appropriate referencing. This enlarged 3D image data set can be acquired by a state-of-the-art 3D C-arm device equipped with a large flat-panel detector. However, the presumably higher radiation exposure to the patient has essentially not yet been investigated and is therefore the objective of this study. METHODS Patients were retrospectively included if a thoracolumbar 3D scan was performed intraoperatively between 2016 and 2019 using a 3D C-arm with a large 30 × 30-cm flat-panel detector (3D scan volume 4096 cm3) or a 3D C-arm with a smaller 20 × 20-cm flat-panel detector (3D scan volume 2097 cm3), and the dose area product was available for the 3D scan. Additionally, the fluoroscopy time and the number of fluoroscopic images per 3D scan, as well as the BMI of the patients, were recorded. RESULTS The authors compared 62 intraoperative thoracolumbar 3D scans using the 3D C-arm with a large flat-panel detector and 12 3D scans using the 3D C-arm with a small flat-panel detector. Overall, the 3D C-arm with a large flat-panel detector required more fluoroscopic images per scan (mean 389.0 ± 8.4 vs 117.0 ± 4.6, p < 0.0001), leading to a significantly higher dose area product (mean 1028.6 ± 767.9 vs 457.1 ± 118.9 cGy × cm2, p = 0.0044). CONCLUSIONS The novel, noninvasive patient tracker mask facilitates intraoperative 3D navigation while eliminating the need for an additional skin incision with detachment of the autochthonous muscles. However, the use of this patient tracker mask requires a larger intraoperative 3D image data set for accurate registration, resulting in an approximately 2.25 times higher radiation exposure to the patient (1028.6/457.1 ≈ 2.25). The use of the patient tracker mask should thus be based on an individual decision, especially taking into consideration the radiation exposure and the extent of instrumentation.


2020 ◽  
Vol 16 (8) ◽  
pp. 1088-1105
Author(s):  
Nafiseh Vahedi ◽  
Majid Mohammadhosseini ◽  
Mehdi Nekoei

Background: The poly(ADP-ribose) polymerases (PARPs) are a nuclear enzyme superfamily present in eukaryotes. Methods: In the present report, some efficient linear and non-linear methods, including multiple linear regression (MLR), support vector machine (SVM), and artificial neural networks (ANN), were successfully used to develop and establish quantitative structure-activity relationship (QSAR) models capable of predicting pEC50 values of tetrahydropyridopyridazinone derivatives as effective PARP inhibitors. Principal component analysis (PCA) was used for a rational division of the whole data set and selection of the training and test sets. A genetic algorithm (GA) variable selection method was employed to select, from the large pool of calculated descriptors, the optimal subset of descriptors with the most significant contributions to the overall inhibitory activity. Results: The accuracy and predictability of the proposed models were further confirmed using cross-validation, validation through an external test set, and Y-randomization (chance correlation) approaches. Moreover, an exhaustive statistical comparison was performed on the outputs of the proposed models. The results revealed that non-linear modeling approaches, including SVM and ANN, provide much better predictive capability. Conclusion: Among the constructed models, in terms of root mean square error of prediction (RMSEP), cross-validation coefficients (Q2(LOO) and Q2(LGO)), and R2 and F-statistic value for the training set, the predictive power of the GA-SVM approach was better. However, compared with MLR and SVM, the statistical parameters for the test set were better with the GA-ANN model.
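To make the named validation statistics concrete, the sketch below computes Q2 from leave-one-out cross-validation and RMSEP for an SVM regressor; the descriptor matrix X and activity vector y (pEC50 values) are hypothetical placeholders, and the GA descriptor-selection step is omitted.

```python
# Q2(LOO) and RMSEP for an SVM regressor (a sketch, not the authors' pipeline).
import numpy as np
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.svm import SVR

def q2_loo_and_rmsep(X, y):
    y_pred = cross_val_predict(SVR(kernel="rbf"), X, y, cv=LeaveOneOut())
    press = np.sum((y - y_pred) ** 2)              # predictive residual sum of squares
    q2 = 1 - press / np.sum((y - y.mean()) ** 2)   # cross-validated Q2
    rmsep = np.sqrt(press / len(y))                # root mean square error of prediction
    return q2, rmsep
```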


2019 ◽  
Vol 11 (10) ◽  
pp. 1157 ◽  
Author(s):  
Jorge Fuentes-Pacheco ◽  
Juan Torres-Olivares ◽  
Edgar Roman-Rangel ◽  
Salvador Cervantes ◽  
Porfirio Juarez-Lopez ◽  
...  

Crop segmentation is an important task in Precision Agriculture, where the use of aerial robots with an on-board camera has contributed to the development of new solution alternatives. We address the problem of fig plant segmentation in top-view RGB (Red-Green-Blue) images of a crop grown in open fields under difficult circumstances of complex lighting and non-ideal crop maintenance practices defined by local farmers. We present a Convolutional Neural Network (CNN) with an encoder-decoder architecture that classifies each pixel as crop or non-crop using only raw colour images as input. Our approach achieves a mean accuracy of 93.85% despite the complexity of the background and the highly variable visual appearance of the leaves. We make our CNN code available to the research community, as well as the aerial image data set and a hand-made ground truth segmentation with pixel precision, to facilitate comparison among different algorithms.
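A minimal encoder-decoder of the general kind described, classifying each pixel as crop or non-crop, might look like the PyTorch sketch below; the layer sizes are illustrative and not the authors' architecture (their actual code is available as stated above).

```python
# Toy encoder-decoder for per-pixel crop/non-crop classification.
import torch
import torch.nn as nn

class TinyEncoderDecoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                          # downsample by 2
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                          # downsample by 4
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 2, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(16, 8, 2, stride=2), nn.ReLU(),
            nn.Conv2d(8, 1, 1),                       # per-pixel crop logit
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

# Per-pixel binary cross-entropy against a {0,1} crop mask (dummy data):
model = TinyEncoderDecoder()
x = torch.randn(1, 3, 128, 128)                       # dummy RGB input
mask = torch.randint(0, 2, (1, 1, 128, 128)).float()  # dummy ground truth
loss = nn.BCEWithLogitsLoss()(model(x), mask)
```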


Processes ◽  
2021 ◽  
Vol 9 (7) ◽  
pp. 1128
Author(s):  
Chern-Sheng Lin ◽  
Yu-Ching Pan ◽  
Yu-Xin Kuo ◽  
Ching-Kun Chen ◽  
Chuen-Lin Tien

In this study, machine vision and artificial intelligence algorithms were used to rapidly check the degree of cooking of food and avoid overcooking. Using a smart induction cooker for heating, the image processing program automatically recognizes the color of the food before and after cooking. New cooking parameters were used to identify the cooking conditions of food when it is undercooked, cooked, and overcooked. A camera was used in combination with custom software, and real-time image processing was used to obtain color information about the food; calculated parameters then allowed the cooking status to be monitored. In the second year, using color space conversion, a novel algorithm, and artificial intelligence, foreground segmentation was used to separate the vegetables from the background, and cooking ripeness, cooking unevenness, oil glossiness, and sauce absorption were calculated. Image color difference and its distribution were used to judge the cooking condition of the food, so that the cooking system can decide whether to adopt partial tumbling or to end a cooking operation. With the novel artificial intelligence algorithm, the error rate can be reduced to 3%. This work will significantly help researchers working on advanced cooking devices.
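A hedged sketch of a color-based doneness check of this kind: segment the food from the background in HSV space and compare its mean color with a cooked-state reference. The threshold values and the plain Euclidean BGR distance are assumptions, not the study's parameters.

```python
# Color-based doneness check sketch: HSV foreground mask + mean-color distance.
import cv2
import numpy as np

def doneness_score(frame_bgr, cooked_reference_bgr):
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    # Foreground segmentation: keep sufficiently saturated, bright pixels.
    mask = cv2.inRange(hsv, (0, 60, 40), (180, 255, 255))
    mean_color = cv2.mean(frame_bgr, mask=mask)[:3]
    ref_color = cv2.mean(cooked_reference_bgr)[:3]
    # Euclidean distance in BGR as a simple color-difference measure;
    # a small distance suggests the food has reached the cooked state.
    return float(np.linalg.norm(np.subtract(mean_color, ref_color)))
```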

