Automatic Pharyngeal Phase Recognition in Untrimmed Videofluoroscopic Swallowing Study Using Transfer Learning with Deep Convolutional Neural Networks

Diagnostics ◽  
2021 ◽  
Vol 11 (2) ◽  
pp. 300
Author(s):  
Ki-Sun Lee ◽  
Eunyoung Lee ◽  
Bareun Choi ◽  
Sung-Bom Pyun

Background: The videofluoroscopic swallowing study (VFSS) is considered the gold-standard diagnostic tool for evaluating dysphagia. However, it is time consuming and labor intensive for the clinician to manually search the long recorded video frame by frame to identify instantaneous swallowing abnormalities in VFSS images. Therefore, this study presents a deep learning-based approach using transfer learning with a convolutional neural network (CNN) that automatically annotates pharyngeal phase frames in untrimmed VFSS videos, so that frames need not be searched manually. Methods: To determine whether an image frame in the VFSS video belongs to the pharyngeal phase, a single-frame baseline architecture based on a deep CNN framework is used, and a transfer learning technique with fine-tuning is applied. Results: Among all experimental CNN models, the model fine-tuned with two blocks of VGG-16 (VGG16-FT5) achieved the highest performance in recognizing pharyngeal phase frames: an accuracy of 93.20 (±1.25)%, sensitivity of 84.57 (±5.19)%, specificity of 94.36 (±1.21)%, AUC of 0.8947 (±0.0269), and Kappa of 0.7093 (±0.0488). Conclusions: Using appropriate transfer learning and fine-tuning techniques, together with explainable deep learning techniques such as Grad-CAM, this study shows that the proposed single-frame-baseline-architecture-based deep CNN framework can achieve high performance in the full automation of VFSS video analysis.

2020 ◽  
Vol 9 (2) ◽  
pp. 392 ◽  
Author(s):  
Ki-Sun Lee ◽  
Seok-Ki Jung ◽  
Jae-Jun Ryu ◽  
Sang-Wan Shin ◽  
Jinwook Choi

Dental panoramic radiographs (DPRs) provide information that can potentially be used to evaluate bone density changes through a textural and morphological feature analysis of the mandible. This study aims to evaluate the discriminating performance of deep convolutional neural networks (CNNs), employed with various transfer learning strategies, in the classification of specific features of osteoporosis in DPRs. For objective labeling, we collected a dataset containing 680 images from different patients who underwent both skeletal bone mineral density and digital panoramic radiographic examinations at the Korea University Ansan Hospital between 2009 and 2018. Four study groups were used to evaluate the impact of various transfer learning strategies on deep CNN models, as follows: a basic CNN model with three convolutional layers (CNN3), the Visual Geometry Group deep CNN model (VGG-16), a transfer learning model from VGG-16 (VGG-16_TF), and fine-tuning with the transfer learning model (VGG-16_TF_FT). The best-performing model achieved an overall area under the receiver operating characteristic curve of 0.858. In this study, transfer learning and fine-tuning improved the performance of a deep CNN for screening osteoporosis in DPR images. In addition, using the gradient-weighted class activation mapping (Grad-CAM) technique, a visual interpretation of the best-performing deep CNN model indicated that the model relied on image features in the lower left and right borders of the mandible. This result suggests that deep learning-based assessment of DPR images could be useful and reliable in the automated screening of osteoporosis patients.


2019 ◽  
Vol 9 (8) ◽  
pp. 1717-1724 ◽  
Author(s):  
Li Zhu ◽  
Weike Chang

Attention-deficit/hyperactivity disorder (ADHD) is one of the most common and controversial disorders in paediatric psychiatry. Recently, computer-aided diagnosis methods have become increasingly popular in the clinical diagnosis of ADHD. In this paper, we introduce a powerful recent method: deep convolutional neural networks (CNNs). Data augmentation methods and a CNN transfer learning technique were used to address the problem of applying deep CNNs to the ADHD classification task, given the limited annotated data. In addition, we first encoded all gray-scale images into 3-channel images via two image enhancement methods to leverage pre-trained CNN models designed for 3-channel inputs. All CNN models were evaluated on the published testing dataset from the ADHD-200 sample. Evaluation results show that our proposed deep CNN method achieves a state-of-the-art accuracy of 66.67% using data augmentation and CNN transfer learning, and outperforms existing methods in the literature. The result could be further improved by designing a task-specific CNN structure. Furthermore, the trained deep CNN model can be used to clinically diagnose ADHD in real time. We suggest that CNN transfer learning and data augmentation will be an effective solution to the problem of applying deep CNNs to medical image analysis.
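The gray-scale-to-3-channel encoding idea can be illustrated as follows. The two enhancement functions here (a contrast stretch and a gamma-style boost) are stand-ins for the paper's unspecified image enhancement methods, chosen only to show the channel-stacking mechanism.

```python
import numpy as np

# Hypothetical 3-channel encoding: stack the normalized gray-scale image with
# two enhanced versions so that pre-trained RGB CNNs can consume it.
def to_three_channels(gray):
    gray = gray.astype(np.float32)
    lo, hi = gray.min(), gray.max()
    norm = (gray - lo) / (hi - lo + 1e-8)            # channel 1: normalized
    stretched = np.clip(norm * 1.5 - 0.25, 0.0, 1.0)  # channel 2: contrast stretch
    boosted = np.sqrt(norm)                           # channel 3: gamma-style boost
    return np.stack([norm, stretched, boosted], axis=-1)

img = np.random.rand(64, 64) * 255  # stand-in for a gray-scale brain image
rgb = to_three_channels(img)
print(rgb.shape)  # (64, 64, 3)
```

The payoff is that a network pre-trained on ImageNet's 3-channel inputs can be fine-tuned without modifying its first convolutional layer.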


Sensors ◽  
2019 ◽  
Vol 19 (22) ◽  
pp. 4850 ◽  
Author(s):  
Carlos S. Pereira ◽  
Raul Morais ◽  
Manuel J. C. S. Reis

Frequently, the vineyards in the Douro Region present multiple grape varieties per parcel and even per row. An automatic algorithm for grape variety identification is proposed as an integrated software component that can be applied, for example, to a robotic harvesting system. However, several issues and constraints in its development were highlighted, namely, images captured in a natural environment, a low volume of images, high similarity of images among different grape varieties, leaf senescence, and significant changes in grapevine leaf and bunch images across harvest seasons, mainly due to adverse climatic conditions, diseases, and the presence of pesticides. In this paper, the performance of transfer learning and fine-tuning techniques based on the AlexNet architecture was evaluated when applied to the identification of grape varieties. Two natural vineyard image datasets were captured in different geographical locations and harvest seasons. To generate different datasets for training and classification, several image processing methods, including a proposed four-corners-in-one image warping algorithm, were used. The experimental results, obtained from an AlexNet-based transfer learning scheme trained on the image dataset pre-processed with the four-corners-in-one method, achieved a test accuracy of 77.30%. Applying this classifier model, an accuracy of 89.75% was reached on the popular Flavia leaf dataset. The results obtained by the proposed approach are promising and encouraging for helping Douro wine growers in the automatic task of identifying grape varieties.


Author(s):  
Ali Fakhry

The applications of Deep Q-Networks are seen throughout the field of reinforcement learning, a large subfield of machine learning. Using a classic environment from OpenAI, CarRacing-v0, a 2D car racing environment, alongside a custom modification of that environment, a DQN (Deep Q-Network) was created to solve both the classic and custom environments. The environments were tested using custom-made CNN architectures and transfer learning from ResNet18. While DQNs were state of the art years ago, using one for CarRacing-v0 appears somewhat unappealing and not as effective as other reinforcement learning techniques. Overall, while the model did train and the agent learned various parts of the environment, reaching the reward threshold of the environment with this reinforcement learning technique proved problematic and difficult; other techniques would likely be more effective.


2020 ◽  
Vol 10 (4) ◽  
pp. 213 ◽  
Author(s):  
Ki-Sun Lee ◽  
Jae Young Kim ◽  
Eun-tae Jeon ◽  
Won Suk Choi ◽  
Nan Hee Kim ◽  
...  

According to recent studies, patients with COVID-19 have different feature characteristics on chest X-ray (CXR) than those with other lung diseases. This study aimed to evaluate the effect of layer depth and degree of fine-tuning on transfer learning with deep convolutional neural network (CNN)-based COVID-19 screening in CXR, in order to identify efficient transfer learning strategies. The CXR images used in this study were collected from publicly available repositories, and the collected images were classified into three classes: COVID-19, pneumonia, and normal. To evaluate the effect of layer depth within the same CNN architecture, the VGG-16 and VGG-19 CNNs were used as backbone networks. Each backbone network was then trained with different degrees of fine-tuning and comparatively evaluated. The experimental results showed the highest AUC value of 0.950 for COVID-19 classification in the experimental group fine-tuned with only two of the five blocks of the VGG-16 backbone network. In conclusion, in the classification of medical images with a limited amount of data, a deeper network may not guarantee better results. In addition, even when the same pre-trained CNN architecture is used, an appropriate degree of fine-tuning can help to build an efficient deep learning model.


Sensors ◽  
2020 ◽  
Vol 20 (9) ◽  
pp. 2639
Author(s):  
Quan T. Ngo ◽  
Seokhoon Yoon

Facial expression recognition (FER) is a challenging problem in the fields of pattern recognition and computer vision. The recent success of convolutional neural networks (CNNs) in object detection and object segmentation tasks has shown promise in building an automatic deep CNN-based FER model. However, in real-world scenarios, performance degrades dramatically owing to the great diversity of factors unrelated to facial expressions, and due to a lack of training data and an intrinsic imbalance in the existing facial emotion datasets. To tackle these problems, this paper not only applies deep transfer learning techniques, but also proposes a novel loss function called weighted-cluster loss, which is used during the fine-tuning phase. Specifically, the weighted-cluster loss function simultaneously improves the intra-class compactness and the inter-class separability by learning a class center for each emotion class. It also takes the imbalance in a facial expression dataset into account by giving each emotion class a weight based on its proportion of the total number of images. In addition, a recent, successful deep CNN architecture, pre-trained on the task of face identification with the VGGFace2 database from the Visual Geometry Group at Oxford University, is employed and fine-tuned using the proposed loss function to recognize eight basic facial emotions from the AffectNet database of facial expression, valence, and arousal computing in the wild. Experiments on the AffectNet real-world facial dataset demonstrate that our method outperforms baseline CNN models that use either weighted-softmax loss or center loss.
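A center-loss-style objective with per-class weights, in the spirit of the weighted-cluster loss described above, can be sketched as follows. This is our reading of the idea, not the authors' exact formulation: each embedding is pulled toward a learned class center, scaled by a weight derived from inverse class frequency to counter dataset imbalance.

```python
import torch
import torch.nn as nn

# Rough sketch of a weighted center-loss objective (assumed form, not the
# paper's exact weighted-cluster loss): pull features toward their class
# center, with rarer classes weighted more heavily.
class WeightedCenterLoss(nn.Module):
    def __init__(self, class_counts, feat_dim):
        super().__init__()
        counts = torch.tensor(class_counts, dtype=torch.float)
        self.weights = counts.sum() / (len(counts) * counts)  # rarer => larger
        self.centers = nn.Parameter(torch.randn(len(counts), feat_dim))

    def forward(self, feats, labels):
        diff = feats - self.centers[labels]       # distance to own class center
        per_sample = self.weights[labels] * diff.pow(2).sum(dim=1)
        return per_sample.mean()

loss_fn = WeightedCenterLoss(class_counts=[500, 100, 50], feat_dim=8)
feats, labels = torch.randn(4, 8), torch.tensor([0, 1, 2, 0])
loss = loss_fn(feats, labels)
print(loss.item() >= 0)  # True
```

In training this term would be added to a standard classification loss, with the centers updated by the same optimizer as the network.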


Algorithms ◽  
2021 ◽  
Vol 14 (11) ◽  
pp. 334
Author(s):  
Nicola Landro ◽  
Ignazio Gallo ◽  
Riccardo La Grassa

Nowadays, transfer learning can be successfully applied in deep learning by taking a CNN pre-trained on a huge dataset such as ImageNet as a starting point and fine-tuning it on a target dataset to achieve better performance. In this paper, we designed a transfer learning methodology that transfers the learned features of different teachers to a student network in an end-to-end model, improving the performance of the student network on classification tasks over different datasets. In addition, we address the following questions, which are directly related to the transfer learning problem considered here. Is it possible to improve the performance of a small neural network by using the knowledge gained from a more powerful neural network? Can a deep neural network outperform its teacher using transfer learning? Experimental results suggest that neural networks can transfer their learning to student networks using our proposed architecture, designed to bring to light a new and interesting approach to transfer learning techniques. Finally, we provide details of the code and the experimental settings.
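The generic teacher-student mechanism underlying this kind of work can be sketched as classic knowledge distillation (the paper's multi-teacher, end-to-end design is more elaborate): the student matches the teacher's softened outputs via KL divergence in addition to the usual label loss. Temperature and mixing weight below are conventional illustrative values.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Generic distillation loss sketch (not the paper's exact method): blend a
# softened teacher-matching term with the ordinary cross-entropy term.
def distill_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)  # rescale so gradients keep a comparable magnitude
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

teacher = nn.Linear(16, 10)  # stand-ins for real teacher/student networks
student = nn.Linear(16, 10)
x, y = torch.randn(8, 16), torch.randint(0, 10, (8,))
loss = distill_loss(student(x), teacher(x).detach(), y)
print(loss.item() > 0)  # True
```

Detaching the teacher logits is what keeps the teacher fixed while the student learns.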


2020 ◽  
Author(s):  
Bo Hu ◽  
Lin-Feng Yan ◽  
Yang Yang ◽  
Ying-Zhi Sun ◽  
Cui Yue ◽  
...  

Abstract Background: The diagnosis of prostate transition zone cancers (PTZC) remains a clinical challenge due to their similarity to benign prostatic hyperplasia (BPH) on MRI. Deep convolutional neural networks (DCNNs) have shown high efficacy in medical imaging but are limited by small data sizes. A transfer learning method was combined with deep learning to overcome this challenge. Methods: A retrospective investigation was conducted on 217 patients enrolled from our hospital database (208 patients) and The Cancer Imaging Archive (9 patients). Based on the T2-weighted images (T2WIs) and apparent diffusion coefficient (ADC) maps of these patients, DCNN models were trained and compared between different transfer learning databases (ImageNet vs. disease-related images) and protocols (training from scratch, fine-tuning, or transductive transferring). Results: PTZC and BPH can be classified with a traditional DCNN. The efficacy of transfer learning from ImageNet was limited but was improved by transferring knowledge from disease-related images. Furthermore, transductive transfer learning from disease-related images had efficacy comparable to that of the fine-tuning method. Limitations include the retrospective design and relatively small sample size. Conclusion: For PTZC with a small sample size, an accurate diagnosis can be achieved via deep transfer learning from disease-related images.


Author(s):  
Jaisakthi Seetharani Murugaiyan ◽  
Mirunalini Palaniappan ◽  
Thenmozhi Durairaj ◽  
Vigneshkumar Muthukumar

Marine species recognition is the process of identifying various species, which helps in population estimation and in identifying endangered types so that further remedies and actions can be taken. The superior performance of deep learning for classification relies on estimating millions of parameters, which requires large annotated datasets. However, many fish species are becoming extinct, which can reduce the number of available samples. The unavailability of a large dataset is a significant hurdle for applying deep neural networks, one that can be overcome using transfer learning techniques. To this end, we propose a transfer learning approach that takes underwater fish images as input and detects the fish species using a pre-trained Google Inception-v3 model. We evaluated our proposed method on the Fish4Knowledge (F4K) dataset and obtained an accuracy of 95.37%. This research can help marine biologists identify fish existence and quantity, understand the underwater environment, encourage its preservation, and study the behavior and interactions of marine animals.


Electronics ◽  
2019 ◽  
Vol 8 (7) ◽  
pp. 783 ◽  
Author(s):  
Edoardo Ragusa ◽  
Erik Cambria ◽  
Rodolfo Zunino ◽  
Paolo Gastaldo

Deep convolutional neural networks (CNNs) provide an effective tool to extract complex information from images. In the area of image polarity detection, CNNs are customarily utilized in combination with transfer learning techniques to tackle a major problem: the unavailability of large sets of labeled data. Thus, polarity predictors in general exploit a pre-trained CNN as the feature extractor that in turn feeds a classification unit. While the latter unit is trained from scratch, the pre-trained CNN is subject to fine-tuning. As a result, the specific CNN architecture employed as the feature extractor strongly affects the overall performance of the model. This paper analyses state-of-the-art literature on image polarity detection and identifies the most reliable CNN architectures. Moreover, the paper provides an experimental protocol that should allow assessing the role played by the baseline architecture in the polarity detection task. Performance is evaluated in terms of both generalization abilities and computational complexity. The latter attribute becomes critical as polarity predictors, in the era of social networks, might need to be updated within hours or even minutes. In this regard, the paper gives practical hints on the advantages and disadvantages of the examined architectures both in terms of generalization and computational cost.

