New polyp image classification technique using transfer learning of network-in-network structure in endoscopic images

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Young Jae Kim ◽  
Jang Pyo Bae ◽  
Jun-Won Chung ◽  
Dong Kyun Park ◽  
Kwang Gi Kim ◽  
...  

Colorectal cancer is known to occur in the gastrointestinal tract. It is the third most common of 27 major types of cancer in South Korea and worldwide. Colorectal polyps are known to increase the potential of developing colorectal cancer. Detected polyps need to be resected to reduce the risk of developing cancer. This research improved the performance of polyp classification through the fine-tuning of Network-in-Network (NIN) after applying a model pre-trained on the ImageNet database. Random shuffling was performed 20 times on 1000 colonoscopy images. Each shuffled set was divided into 800 training images and 200 test images, and accuracy was evaluated on the 200 test images in each of the 20 experiments. Three comparison methods were built on AlexNet by transferring weights trained on three different state-of-the-art databases. A standard AlexNet-based method without transfer learning was also compared. The accuracy of the proposed method was statistically significantly higher than that of the four other state-of-the-art methods, and showed an 18.9% improvement over the standard AlexNet-based method. The area under the curve was approximately 0.930 ± 0.020, and the recall rate was 0.929 ± 0.029. With its high recall rate and accuracy, an automatic algorithm can assist endoscopists in identifying polyps that are adenomatous. This system can enable the timely resection of polyps at an early stage.
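The evaluation protocol described above (20 random shuffles of 1000 images, each split 800/200) can be sketched roughly as follows. This is a minimal illustration and not the authors' code; the function name `repeated_splits` and the fixed seed are assumptions added for reproducibility.

```python
import random

def repeated_splits(n_images=1000, n_train=800, n_repeats=20, seed=0):
    """Yield (train_indices, test_indices) for each random shuffle."""
    rng = random.Random(seed)  # the fixed seed is an assumption, not from the abstract
    indices = list(range(n_images))
    for _ in range(n_repeats):
        shuffled = indices[:]          # copy so each repeat starts from the full set
        rng.shuffle(shuffled)
        yield shuffled[:n_train], shuffled[n_train:]

splits = list(repeated_splits())
```

A model would be trained and scored once per split, and the 20 test accuracies would then be summarized (for example as mean ± standard deviation).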

Electronics ◽  
2021 ◽  
Vol 10 (15) ◽  
pp. 1807
Author(s):  
Sascha Grollmisch ◽  
Estefanía Cano

Including unlabeled data in the training process of neural networks using Semi-Supervised Learning (SSL) has shown impressive results in the image domain, where state-of-the-art results were obtained with only a fraction of the labeled data. The commonality between recent SSL methods is that they strongly rely on the augmentation of unannotated data. This is vastly unexplored for audio data. In this work, SSL using the state-of-the-art FixMatch approach is evaluated on three audio classification tasks, including music, industrial sounds, and acoustic scenes. The performance of FixMatch is compared to Convolutional Neural Networks (CNN) trained from scratch, Transfer Learning, and SSL using the Mean Teacher approach. Additionally, a simple yet effective approach for selecting suitable augmentation methods for FixMatch is introduced. FixMatch with the proposed modifications always outperformed Mean Teacher and the CNNs trained from scratch. For the industrial sounds and music datasets, the CNN baseline performance using the full dataset was reached with less than 5% of the initial training data, demonstrating the potential of recent SSL methods for audio data. Transfer Learning outperformed FixMatch only for the most challenging dataset from acoustic scene classification, showing that there is still room for improvement.
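As a rough illustration of the FixMatch idea evaluated above: a pseudo-label is taken from the prediction on a weakly augmented input, kept only if it is confident enough, and then used as the target for the prediction on a strongly augmented version of the same input. The sketch below works on already-computed probability vectors. The function name and the 0.95 threshold are assumptions for illustration; the abstract does not state the settings used.

```python
import math

def fixmatch_unlabeled_loss(weak_probs, strong_probs, threshold=0.95):
    """Mean masked cross-entropy over a batch of unlabeled examples.

    weak_probs / strong_probs: one class-probability list per example,
    predicted on the weakly / strongly augmented version of that example.
    """
    total = 0.0
    for weak, strong in zip(weak_probs, strong_probs):
        confidence = max(weak)
        if confidence >= threshold:              # keep only confident pseudo-labels
            pseudo_label = weak.index(confidence)
            total += -math.log(strong[pseudo_label])
    # Averaging over the whole batch means rejected examples contribute zero.
    return total / len(weak_probs)

weak = [[0.97, 0.02, 0.01], [0.50, 0.30, 0.20]]    # made-up predictions
strong = [[0.80, 0.10, 0.10], [0.40, 0.40, 0.20]]
loss = fixmatch_unlabeled_loss(weak, strong)
```

Only the first example passes the threshold here, so the second contributes nothing to the loss.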


2019 ◽  
Vol 2019 ◽  
pp. 1-14 ◽  
Author(s):  
Yikui Zhai ◽  
He Cao ◽  
Wenbo Deng ◽  
Junying Gan ◽  
Vincenzo Piuri ◽  
...  

Because of the lack of discriminative face representations and the scarcity of labeled training data, facial beauty prediction (FBP), which aims at assessing facial attractiveness automatically, has become a challenging pattern recognition problem. Inspired by recent promising work on fine-grained image classification that uses a multiscale architecture to extend the diversity of deep features, this paper proposes BeautyNet for unconstrained facial beauty prediction. Firstly, a multiscale network is adopted to improve the discriminative power of face features. Secondly, to alleviate the computational burden of the multiscale architecture, MFM (max-feature-map) is used as an activation function, which not only lightens the network and speeds up convergence but also benefits performance. Finally, a transfer learning strategy is introduced to mitigate the overfitting caused by the scarcity of labeled facial beauty samples and to improve the proposed BeautyNet's performance. Extensive experiments on LSFBD demonstrate that the proposed scheme outperforms state-of-the-art methods, achieving 67.48% classification accuracy.
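To illustrate why a max-feature-map activation lightens a network, the sketch below implements the common 2-to-1 form of MFM: the channels are split into two halves and the element-wise maximum is kept, halving the channel count. The abstract does not say which MFM variant BeautyNet uses, so this form and the helper name `max_feature_map` are assumptions.

```python
def max_feature_map(channels):
    """Element-wise max over the two halves of the channel axis.

    channels: list of feature maps, each flattened to a list of floats.
    Returns half as many feature maps as it receives.
    """
    if len(channels) % 2 != 0:
        raise ValueError("MFM needs an even number of channels")
    half = len(channels) // 2
    return [
        [max(a, b) for a, b in zip(channels[i], channels[i + half])]
        for i in range(half)
    ]

# Four made-up channels with two values each.
features = [[1.0, -2.0], [0.5, 3.0], [-1.0, 4.0], [2.0, 0.0]]
reduced = max_feature_map(features)   # two channels remain
```

Unlike ReLU, which zeroes negative values but keeps every channel, this operation selects between competing channels, so the following layer sees half as many inputs.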


2021 ◽  
pp. 1-10
Author(s):  
Gayatri Pattnaik ◽  
Vimal K. Shrivastava ◽  
K. Parvathi

Pests are a major threat to the economic growth of a country. Applying pesticide is the easiest way to control pest infection. However, excessive use of pesticide is hazardous to the environment. Recent advances in deep learning have paved the way for early detection and improved classification of pests in tomato plants, which will benefit farmers. This paper presents a comprehensive analysis of 11 state-of-the-art deep convolutional neural network (CNN) models in three configurations: transfer learning, fine-tuning, and scratch learning. Training in transfer learning and fine-tuning starts from pre-trained weights, whereas random weights are used for scratch learning. In addition, data augmentation has been explored to improve performance. Our dataset consists of 859 tomato pest images from 10 categories. The results demonstrate that the highest classification accuracy, 94.87%, was achieved in the transfer learning approach by the DenseNet201 model with data augmentation.


Sensors ◽  
2020 ◽  
Vol 20 (9) ◽  
pp. 2639
Author(s):  
Quan T. Ngo ◽  
Seokhoon Yoon

Facial expression recognition (FER) is a challenging problem in the fields of pattern recognition and computer vision. The recent success of convolutional neural networks (CNNs) in object detection and object segmentation tasks has shown promise in building an automatic deep CNN-based FER model. However, in real-world scenarios, performance degrades dramatically owing to the great diversity of factors unrelated to facial expressions, and due to a lack of training data and an intrinsic imbalance in the existing facial emotion datasets. To tackle these problems, this paper not only applies deep transfer learning techniques, but also proposes a novel loss function called weighted-cluster loss, which is used during the fine-tuning phase. Specifically, the weighted-cluster loss function simultaneously improves the intra-class compactness and the inter-class separability by learning a class center for each emotion class. It also takes the imbalance in a facial expression dataset into account by giving each emotion class a weight based on its proportion of the total number of images. In addition, a recent, successful deep CNN architecture, pre-trained in the task of face identification with the VGGFace2 database from the Visual Geometry Group at Oxford University, is employed and fine-tuned using the proposed loss function to recognize eight basic facial emotions from the AffectNet database of facial expression, valence, and arousal computing in the wild. Experiments on an AffectNet real-world facial dataset demonstrate that our method outperforms the baseline CNN models that use either weighted-softmax loss or center loss.
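The abstract says each emotion class receives a weight based on its proportion of the total number of images, but it does not give the formula. The sketch below shows one plausible choice, inverse-proportion weights rescaled to average 1. It is an assumption for illustration only and is not the paper's weighted-cluster loss; the class counts are made up.

```python
def class_weights(counts):
    """Inverse-proportion class weights, scaled so the weights average to 1.

    counts: mapping from class label to number of training images.
    """
    total = sum(counts.values())
    raw = {label: total / n for label, n in counts.items()}   # 1 / proportion
    scale = len(raw) / sum(raw.values())
    return {label: w * scale for label, w in raw.items()}

counts = {"happy": 600, "neutral": 300, "fear": 100}   # hypothetical imbalance
weights = class_weights(counts)
```

With these counts, the rarest class receives the largest weight, so its errors count more during fine-tuning.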


Author(s):  
Michael Schrempf ◽  
Diether Kramer ◽  
Stefanie Jauk ◽  
Sai P. K. Veeranki ◽  
Werner Leodolter ◽  
...  

Background: Patients with major adverse cardiovascular events (MACE) such as myocardial infarction or stroke suffer from frequent hospitalizations and have high mortality rates. By identifying patients at risk at an early stage, MACE can be prevented with the right interventions. Objectives: The aim of this study was to develop machine learning-based models for the 5-year risk prediction of MACE. Methods: The data used for modelling included electronic medical records of more than 128,000 patients including 29,262 patients with MACE. A feature selection based on filter and embedded methods resulted in 826 features for modelling. Different machine learning methods were used for modelling on the training data. Results: A random forest model achieved the best calibration and discriminative performance on a separate test data set with an AUROC of 0.88. Conclusion: The developed risk prediction models achieved an excellent performance in the test data. Future research is needed to determine the performance of these models and their clinical benefit in prospective settings.


This research aims to achieve high accuracy for a face recognition system. The Convolutional Neural Network (CNN) is one of the Deep Learning approaches and has demonstrated excellent performance in many fields, including image recognition with large amounts of training data (such as ImageNet). In practice, hardware limitations and insufficient training data sets make high performance difficult to achieve. Therefore, this work proposes a Deep Transfer Learning method using the AlexNet pre-trained CNN to improve the performance of the face recognition system even with a small number of images. The transfer learning method is used to fine-tune the last layer of the AlexNet CNN model for new classification tasks. A data augmentation (DA) technique is also proposed to minimize over-fitting during deep transfer learning training and to improve accuracy. The results showed reduced over-fitting and improved performance after applying data augmentation. All experiments were run on the small UTeMFD, GTFD, and CASIA-Face V5 data sets. The proposed system achieved accuracy as high as 100% on UTeMFD, 96.67% on GTFD, and 95.60% on CASIA-Face V5, with a recognition time of less than 0.05 seconds.


2021 ◽  
Author(s):  
Geoffrey F. Schau ◽  
Hassan Ghani ◽  
Erik A. Burlingame ◽  
Guillaume Thibault ◽  
Joe W. Gray ◽  
...  

Accurate diagnosis of metastatic cancer is essential for prescribing optimal control strategies to halt further spread of metastasizing disease. While pathological inspection aided by immunohistochemistry staining provides a valuable gold standard for clinical diagnostics, deep learning methods have emerged as powerful tools for identifying clinically relevant features of whole slide histology relevant to a tumor’s metastatic origin. Although deep learning models require significant training data to learn effectively, transfer learning paradigms provide mechanisms to circumvent limited training data by first training a model on related data prior to fine-tuning on smaller data sets of interest. In this work we propose a transfer learning approach that trains a convolutional neural network to infer the metastatic origin of tumor tissue from whole slide images of hematoxylin and eosin (H&E) stained tissue sections, and we illustrate the advantages of pre-training the network on whole slide images of primary tumor morphology. We further characterize the statistical dissimilarity between primary and metastatic tumors of various indications on patch-level images to highlight limitations of our indication-specific transfer learning approach. Using a primary-to-metastatic transfer learning approach, we achieved a mean class-specific area under the receiver operating characteristic curve (AUROC) of 0.779, which outperformed comparable models trained only on images of primary tumor (mean AUROC of 0.691) or only on images of metastatic tumor (mean AUROC of 0.675), supporting the use of large-scale primary tumor imaging data in developing computer vision models to characterize the metastatic origin of tumor lesions.
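A mean class-specific AUROC like the one reported above is typically computed one-vs-rest: each class in turn is treated as the positive class, an AUROC is computed for it, and the per-class values are averaged. The sketch below uses the rank interpretation of AUROC (the probability that a positive example scores higher than a negative one, with ties counted as half). The helper names and the toy scores are assumptions, and the paper's exact evaluation procedure may differ.

```python
def auroc(scores, labels):
    """AUROC as P(positive score > negative score), ties counted as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def mean_class_auroc(score_rows, true_classes, n_classes):
    """Average the one-vs-rest AUROC over all classes."""
    per_class = []
    for c in range(n_classes):
        scores = [row[c] for row in score_rows]
        labels = [1 if t == c else 0 for t in true_classes]
        per_class.append(auroc(scores, labels))
    return sum(per_class) / n_classes

# Made-up class scores for four samples and three classes.
score_rows = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.2, 0.3, 0.5], [0.6, 0.3, 0.1]]
true_classes = [0, 1, 2, 1]
result = mean_class_auroc(score_rows, true_classes, n_classes=3)
```

The pairwise comparison is quadratic in the number of samples, which is fine for a sketch but slow for large evaluation sets.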


Sensors ◽  
2021 ◽  
Vol 21 (24) ◽  
pp. 8219
Author(s):  
Amin Ul Haq ◽  
Jian Ping Li ◽  
Sultan Ahmad ◽  
Shakir Khan ◽  
Mohammed Ali Alshara ◽  
...  

COVID-19 is a transmissible disease that is also a leading cause of death for a large number of people worldwide. The disease, caused by SARS-CoV-2, spreads very rapidly and quickly affects the human respiratory system. It is therefore necessary to diagnose the disease at an early stage for proper treatment, recovery, and control of its spread. An automatic diagnosis system is needed for COVID-19 detection. Artificial-intelligence-based methods for diagnosing COVID-19 from chest X-ray images are more effective and could diagnose it correctly. Existing COVID-19 diagnosis methods suffer from a lack of accuracy. To address this problem, we propose an efficient and accurate diagnosis model for COVID-19. In the proposed method, a two-dimensional Convolutional Neural Network (2DCNN) is designed for COVID-19 recognition from chest X-ray images. Using transfer learning (TL), the weights of a pre-trained ResNet-50 model are transferred to the 2DCNN model to enhance its training, and the model is fine-tuned with chest X-ray image data for final multi-class classification to diagnose COVID-19. In addition, a data augmentation transformation (rotation) is used to increase the data set size for effective training of the R2DCNNMC model. The experimental results demonstrate that the proposed R2DCNNMC model achieved high accuracy compared with baseline methods: 98.12% classification accuracy on the CRD data set and 99.45% on the CXI data set. This approach has high performance and could be used for COVID-19 diagnosis in E-Healthcare systems.


Author(s):  
Jonathan Boigne ◽  
Biman Liyanage ◽  
Ted Östrem

We propose a novel transfer learning method for speech emotion recognition that allows us to obtain promising results when only a small amount of training data is available. With as few as 125 examples per emotion class, we were able to reach a higher accuracy than a strong baseline trained on 8 times more data. Our method leverages knowledge contained in pre-trained speech representations extracted from models, such as wav2vec, that are trained on a more general self-supervised task which doesn’t require human annotations. We provide detailed insights on the benefits of our approach by varying the training data size, which can help labeling teams to work more efficiently. We compare performance with other popular methods on the IEMOCAP dataset, a well-benchmarked dataset among the Speech Emotion Recognition (SER) research community. Furthermore, we demonstrate that results can be greatly improved by combining acoustic and linguistic knowledge from transfer learning. We align acoustic pre-trained representations with semantic representations from the BERT model through an attention-based recurrent neural network. Performance improves significantly when combining both modalities and scales with the amount of data. When trained on the full IEMOCAP dataset, we reach a new state-of-the-art of 73.9% unweighted accuracy (UA).
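Unweighted accuracy (UA), as the term is commonly used in SER work, is the mean of the per-class recalls, so every emotion class counts equally regardless of how many utterances it has. The sketch below shows how it differs from plain accuracy on an imbalanced toy example. The function name and labels are assumptions, and the paper's exact evaluation script is not given in the abstract.

```python
from collections import defaultdict

def unweighted_accuracy(y_true, y_pred):
    """Mean of per-class recalls (each class counts equally)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        correct[t] += int(t == p)
    return sum(correct[c] / total[c] for c in total) / len(total)

# Made-up labels: four "neutral" utterances and two "sad" ones.
y_true = ["neutral", "neutral", "neutral", "neutral", "sad", "sad"]
y_pred = ["neutral", "neutral", "neutral", "sad", "sad", "neutral"]
ua = unweighted_accuracy(y_true, y_pred)
plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

Here the recalls are 0.75 for "neutral" and 0.5 for "sad", so UA is lower than plain accuracy because the minority class is recognized less reliably.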


Author(s):  
Kasikrit Damkliang ◽  
Thakerng Wongsirichot ◽  
Paramee Thongsuksai

Since the introduction of image pattern recognition and computer vision processing, the classification of cancer tissues has been a challenge at pixel-level, slide-level, and patient-level. Conventional machine learning techniques have given way to Deep Learning (DL), a contemporary, state-of-the-art approach to texture classification and localization of cancer tissues. Colorectal Cancer (CRC) is the third-ranked cause of death from cancer worldwide. This paper proposes image-level texture classification of a CRC dataset by deep convolutional neural networks (CNN). Simple DL techniques consisting of transfer learning and fine-tuning were exploited. VGG-16, a Keras pre-trained model with initial weights from ImageNet, was applied. The transfer learning architecture and methods corresponding to VGG-16 are proposed. The training, validation, and testing sets included 5000 images of 150 × 150 pixels. The application set for detection and localization contained 10 large original images of 5000 × 5000 pixels. The model achieved an F1-score and accuracy of 0.96 and 0.99, respectively, and produced a false positive rate of 0.01. AUC-based evaluation was also measured. The model classified ten large previously unseen images from the application set, represented as false color maps. The reported results show the satisfactory performance of the model. The simplicity of the architecture, configuration, and implementation also contributes to the outcome of this work.
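The three reported metrics (F1-score, accuracy, false positive rate) all derive from confusion counts. The sketch below shows the standard definitions for a single binary (one-vs-rest) case. The counts are invented for illustration and are not taken from the paper, which evaluates a multi-class texture problem whose averaging scheme is not stated in the abstract.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, F1-score and false positive rate from confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "f1": 2 * precision * recall / (precision + recall),
        "false_positive_rate": fp / (fp + tn),
    }

# Hypothetical counts for one tissue class versus all others.
m = binary_metrics(tp=90, fp=10, fn=10, tn=890)
```

When negatives heavily outnumber positives, as in this example, accuracy can sit well above the F1-score.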

