Analysis of Encrypted Image Data with Deep Learning Models

Author(s):  
Durmus Ozdemir ◽  
Dilek Celik


Sensors ◽
2021 ◽  
Vol 21 (8) ◽  
pp. 2611
Author(s):  
Andrew Shepley ◽  
Greg Falzon ◽  
Christopher Lawson ◽  
Paul Meek ◽  
Paul Kwan

Image data is one of the primary sources of ecological data used in biodiversity conservation and management worldwide. However, classifying and interpreting large numbers of images is time- and resource-intensive, particularly in the context of camera trapping. Deep learning models have been used to achieve this task but are often not suited to specific applications due to their inability to generalise to new environments and inconsistent performance. Models need to be developed for specific species cohorts and environments, but the technical skills required to achieve this are a key barrier to the accessibility of this technology to ecologists. Thus, there is a strong need to democratise access to deep learning technologies by providing an easy-to-use software application allowing non-technical users to train custom object detectors. U-Infuse addresses this issue by providing ecologists with the ability to train customised models using publicly available images and/or their own images without specific technical expertise. Auto-annotation and annotation editing functionalities minimise the constraints of manually annotating and pre-processing large numbers of images. U-Infuse is a free and open-source software solution that supports both multiclass and single-class training and object detection, allowing ecologists to access deep learning technologies usually only available to computer scientists, on their own device, customised for their application, without sharing intellectual property or sensitive data. It provides ecological practitioners with the ability to (i) easily achieve object detection within a user-friendly GUI, generating a species distribution report and other useful statistics, (ii) custom-train deep learning models using publicly available and custom training data, and (iii) achieve supervised auto-annotation of images for further training, with the benefit of editing annotations to ensure quality datasets. Broad adoption of U-Infuse by ecological practitioners will improve ecological image analysis and processing by allowing significantly more image data to be processed with minimal expenditure of time and resources, particularly for camera trap images. Ease of training and use of transfer learning mean that domain-specific models can be trained rapidly and frequently updated without the need for computer science expertise or data sharing, protecting intellectual property and privacy.
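
For illustration only (this is not U-Infuse's own code), the transfer-learning workflow described above can be sketched with an off-the-shelf detector: a model pretrained on generic imagery has its classification head replaced and is briefly fine-tuned on a small set of annotated camera trap images. The class count, placeholder image and training step below are assumptions.

import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

# Hypothetical example: background plus two species cohorts.
num_classes = 3

# Start from a detector pretrained on COCO and swap in a new classification head.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)

# One fine-tuning step on a dummy annotated camera trap image (placeholder data).
images = [torch.rand(3, 480, 640)]
targets = [{"boxes": torch.tensor([[60.0, 40.0, 300.0, 250.0]]),
            "labels": torch.tensor([1])}]
optimizer = torch.optim.SGD(model.parameters(), lr=0.005, momentum=0.9)
model.train()
loss = sum(model(images, targets).values())   # detection losses returned in train mode
optimizer.zero_grad()
loss.backward()
optimizer.step()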


2021 ◽  
Author(s):  
Nithin G R ◽  
Nitish Kumar M ◽  
Venkateswaran Narasimhan ◽  
Rajanikanth Kakani ◽  
Ujjwal Gupta ◽  
...  

Pansharpening is the task of creating a High-Resolution Multi-Spectral (HRMS) image by extracting and infusing pixel details from the High-Resolution Panchromatic image into the Low-Resolution Multi-Spectral (LRMS) image. With the boom in the amount of satellite image data, researchers have replaced traditional approaches with deep learning models. However, existing deep learning models are not built to capture intricate pixel-level relationships. Motivated by the recent success of self-attention mechanisms in computer vision tasks, we propose Pansformers, a transformer-based self-attention architecture that computes band-wise attention. A further improvement is proposed in the attention network by introducing a Multi-Patch Attention mechanism, which operates on non-overlapping, local patches of the image. Our model is successful in infusing relevant local details from the Panchromatic image while preserving the spectral integrity of the MS image. We show that our Pansformer model significantly improves the performance metrics and the output image quality on imagery from two satellite distributions, IKONOS and LANDSAT-8.
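
As a rough, hedged sketch of the multi-patch idea (not the authors' Pansformer code), self-attention can be restricted to non-overlapping local patches by reshaping the feature map so that each patch becomes its own attention sequence; the patch size, channel count and head count below are assumptions.

import torch
import torch.nn as nn

class MultiPatchSelfAttention(nn.Module):
    def __init__(self, channels: int, patch: int = 8, heads: int = 4):
        super().__init__()
        self.patch = patch
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)

    def forward(self, x):  # x: (B, C, H, W), with H and W divisible by patch
        B, C, H, W = x.shape
        p = self.patch
        # Split the feature map into non-overlapping p x p patches.
        x = x.view(B, C, H // p, p, W // p, p)
        x = x.permute(0, 2, 4, 3, 5, 1).reshape(-1, p * p, C)  # (B*patches, p*p, C)
        # Self-attention is computed independently within each local patch.
        out, _ = self.attn(x, x, x)
        out = out.reshape(B, H // p, W // p, p, p, C)
        out = out.permute(0, 5, 1, 3, 2, 4).reshape(B, C, H, W)
        return out

# Example: attend over a 32-channel feature map of size 64x64.
feats = torch.rand(1, 32, 64, 64)
print(MultiPatchSelfAttention(32)(feats).shape)  # torch.Size([1, 32, 64, 64])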


2021 ◽  
Author(s):  
Janghoon Ahn ◽  
Thong Phi Nguyen ◽  
Yoon-Ji Kim ◽  
Taeyong Kim ◽  
Jonghun Yoon

Analysing cephalometric X-rays, which is mostly performed by orthodontists or dentists, is an indispensable procedure for diagnosis and treatment planning with orthodontic patients. Artificial intelligence, especially deep-learning techniques for analysing image data, shows great potential for medical and dental image analysis and diagnosis. To explore the feasibility of automating the measurement of 13 geometric parameters from three-dimensional cone beam computed tomography (CBCT) images taken in a natural head position, we here describe a smart system that combines a facial profile analysis algorithm with deep-learning models. Using multiple views extracted from the CBCT data as the dataset, our proposed method partitions and detects regions of interest by extracting the facial profile and applying Mask-RCNN, a trained decentralized convolutional neural network (CNN) that positions the key parameters. All the techniques are integrated into a software application with a graphical user interface designed for user convenience. To demonstrate the system’s ability to replace human experts, we validated the performance of the proposed method by comparing it with measurements made by two orthodontists and one advanced general dentist using a commercial dental program. The time savings compared with the traditional approach were remarkable, reducing the processing time from about 30 minutes to about 30 seconds.
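
A minimal sketch, assuming torchvision's generic Mask R-CNN rather than the authors' trained network, of how regions of interest might be detected on a 2D view extracted from the CBCT volume; the input and confidence threshold are placeholders.

import torch
import torchvision

model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

view = torch.rand(3, 512, 512)            # stand-in for an extracted facial-profile view
with torch.no_grad():
    pred = model([view])[0]               # dict with boxes, labels, scores, masks

# Keep only confident detections as candidate regions for landmark positioning.
keep = pred["scores"] > 0.7
rois = pred["boxes"][keep]
print(rois.shape)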


Cancers ◽  
2021 ◽  
Vol 13 (7) ◽  
pp. 1590
Author(s):  
Laith Alzubaidi ◽  
Muthana Al-Amidie ◽  
Ahmed Al-Asadi ◽  
Amjad J. Humaidi ◽  
Omran Al-Shamma ◽  
...  

Deep learning requires a large amount of data to perform well. However, the field of medical image analysis suffers from a lack of sufficient data for training deep learning models. Moreover, medical images require manual labeling, usually provided by human annotators coming from various backgrounds. More importantly, the annotation process is time-consuming, expensive, and prone to errors. Transfer learning was introduced to reduce the need for annotation by transferring knowledge learned by deep learning models on a previous task and then fine-tuning them on a relatively small dataset for the current task. Most methods of medical image classification employ transfer learning from models pretrained on natural image datasets, e.g., ImageNet, which has been proven to be ineffective. This is due to the mismatch in learned features between natural images, e.g., ImageNet, and medical images. Additionally, it results in the use of unnecessarily deep and elaborate models. In this paper, we propose a novel transfer learning approach to overcome these drawbacks by first training the deep learning model on large unlabeled medical image datasets and then transferring the knowledge to train the deep learning model on the small amount of labeled medical images. Additionally, we propose a new deep convolutional neural network (DCNN) model that combines recent advancements in the field. We conducted several experiments on two challenging medical imaging scenarios dealing with skin and breast cancer classification tasks. According to the reported results, it has been empirically shown that the proposed approach can significantly improve the performance of both classification scenarios. For skin cancer, the proposed model achieved an F1-score of 89.09% when trained from scratch and 98.53% with the proposed approach. For the breast cancer scenario, it achieved an accuracy of 85.29% when trained from scratch and 97.51% with the proposed approach. Finally, we concluded that our method can be applied to many medical imaging problems in which a substantial amount of unlabeled image data is available and the labeled image data is limited. Moreover, it can be utilized to improve the performance of medical imaging tasks in the same domain. To demonstrate this, we used the pretrained skin cancer model to train on foot skin images, classifying them into two classes: either normal or abnormal (diabetic foot ulcer, DFU). It achieved an F1-score of 86.0% when trained from scratch, 96.25% using transfer learning, and 99.25% using double-transfer learning.
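
The two-stage idea can be sketched as follows, under the assumption of a simple reconstruction pretext task; this is an illustration, not the proposed DCNN: a backbone is first trained on unlabeled in-domain images, then transferred and fine-tuned with a small classification head on the limited labeled set.

import torch
import torch.nn as nn

backbone = nn.Sequential(                       # small convolutional encoder (assumed)
    nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
)
decoder = nn.Sequential(                        # used only for unsupervised pretraining
    nn.ConvTranspose2d(64, 32, 2, stride=2), nn.ReLU(),
    nn.ConvTranspose2d(32, 3, 2, stride=2),
)

# Stage 1: unsupervised pretraining on unlabeled medical images (dummy batch here).
unlabeled = torch.rand(8, 3, 64, 64)
recon = decoder(backbone(unlabeled))
pretrain_loss = nn.functional.mse_loss(recon, unlabeled)
pretrain_loss.backward()                        # optimizer steps omitted for brevity

# Stage 2: transfer the pretrained backbone and fine-tune a small classifier head
# on the limited labeled set (e.g., skin or breast cancer classes).
head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 2))
labeled, labels = torch.rand(4, 3, 64, 64), torch.tensor([0, 1, 0, 1])
finetune_loss = nn.functional.cross_entropy(head(backbone(labeled)), labels)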


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Joachim Krois ◽  
Anselmo Garcia Cantu ◽  
Akhilanand Chaurasia ◽  
Ranjitkumar Patil ◽  
Prabhat Kumar Chaudhari ◽  
...  

We assessed the generalizability of deep learning models and how to improve it. Our exemplary use-case was the detection of apical lesions on panoramic radiographs. We employed two datasets of panoramic radiographs from two centers, one in Germany (Charité, Berlin, n = 650) and one in India (KGMU, Lucknow, n = 650). First, U-Net type models were trained on images from Charité (n = 500) and assessed on test sets from Charité and KGMU (each n = 150). Second, the relevance of image characteristics was explored using pixel-value transformations, aligning the image characteristics of the two datasets. Third, the effect of cross-center training on generalizability was evaluated by stepwise replacement of Charité with KGMU images. Last, we assessed the impact of the dental status (presence of root-canal fillings or restorations). Models trained only on Charité images showed a (mean ± SD) F1-score of 54.1 ± 0.8% on Charité and 32.7 ± 0.8% on KGMU data (p < 0.001/t-test). Alignment of image data characteristics between the centers did not improve generalizability. However, by gradually increasing the fraction of KGMU images in the training set (from 0 to 100%) the F1-score on KGMU images improved (46.1 ± 0.9%) at a moderate decrease on Charité images (50.9 ± 0.9%, p < 0.01). Model performance was good on KGMU images showing root-canal fillings and/or restorations, but much lower on KGMU images without root-canal fillings and/or restorations. Our deep learning models were not generalizable across centers. Cross-center training improved generalizability. Notably, the dental status, but not the image characteristics, was relevant. Understanding the reasons behind limits in generalizability helps to mitigate generalizability problems.
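
Purely as an illustration of the stepwise cross-center protocol (not the study's code), a fixed-size training set can be assembled with an increasing fraction of KGMU images replacing Charité images; the identifiers and set size below are placeholders.

import random

def mixed_training_set(charite_ids, kgmu_ids, kgmu_fraction, size=500, seed=0):
    """Return a training set of `size` image IDs with the given KGMU fraction."""
    rng = random.Random(seed)
    n_kgmu = round(size * kgmu_fraction)
    return rng.sample(kgmu_ids, n_kgmu) + rng.sample(charite_ids, size - n_kgmu)

charite_ids = [f"charite_{i:04d}" for i in range(500)]
kgmu_ids = [f"kgmu_{i:04d}" for i in range(500)]
for frac in [0.0, 0.25, 0.5, 0.75, 1.0]:
    train_ids = mixed_training_set(charite_ids, kgmu_ids, frac)
    print(frac, len(train_ids))  # each mix would be used to train and evaluate a model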


2021 ◽  
Author(s):  
Daniel Padilla ◽  
Hatem A. Rashwan ◽  
Domènec Savi Puig

Deep learning (DL) networks have proven to be crucial in commercial solutions to computer vision challenges due to their ability to extract high-level abstractions from image data and their capability to be easily adapted to many applications. As a result, DL methodologies have become a de facto standard for computer vision problems, yielding many new kinds of research, approaches and applications. Recently, the commercial sector has also been driving the use of embedded systems able to execute DL models, which has caused an important change in the DL panorama and in embedded systems themselves. Consequently, in this paper, we study the state of the art of embedded systems that are able to run DL techniques, such as GPUs, FPGAs and mobile SoCs, to bring stakeholders up to date with the new systems available on the market. In addition, we aim at helping them determine which of these systems can be beneficial and suitable for their applications in terms of upgradeability, price, deployment and performance.


2020 ◽  
Vol 10 (5) ◽  
pp. 1234-1241
Author(s):  
Yongliang Zhang ◽  
Ling Li ◽  
Jia Gu ◽  
Tiexiang Wen ◽  
Qiang Xu

With the rapid development of deep learning, automatic lesion detection is widely used in clinical screening. In this paper, we make use of a convolutional neural network (CNN) algorithm to help medical experts detect cervical precancerous lesions during colposcopic screening, especially in the classification of cervical intraepithelial neoplasia (CIN). Firstly, the original image data is classified into six categories: normal, cervical cancer, mild (CIN1), moderate (CIN2), severe (CIN3) and cervicitis; the images are further augmented to address the small number of endoscopic image samples and the non-uniformity across categories. Then, a CNN-based model is built and trained for the multi-class classification of the six categories, and optimization algorithms are added to this CNN model to make the training parameters more effective. For the test dataset, the accuracy of the proposed CNN model is 89.36%, and the area under the receiver operating characteristic (ROC) curve is 0.954. This accuracy is 18%–32% higher than that of traditional learning methods and 9%–20% higher than that of several commonly used deep learning models. At the same number of iterations, the time consumption of the proposed algorithm is only one quarter of that of other deep learning models. Our study demonstrates that cervical colposcopic image classification based on artificial intelligence has high clinical applicability and can facilitate the early diagnosis of cervical cancer.
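
A minimal sketch, not the authors' model, of the augmentation-plus-CNN setup for the six categories; the augmentation choices, network size and input resolution are assumptions.

import torch
import torch.nn as nn
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(15),
    transforms.ColorJitter(brightness=0.2, contrast=0.2),
])

classifier = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(32, 6),                               # six diagnostic categories
)

# Dummy augmented batch of colposcopic images with placeholder labels.
images = torch.stack([augment(torch.rand(3, 224, 224)) for _ in range(4)])
labels = torch.tensor([0, 3, 5, 1])
loss = nn.functional.cross_entropy(classifier(images), labels)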


2020 ◽  
Vol 28 (5) ◽  
pp. 841-850
Author(s):  
Saleh Albahli ◽  
Waleed Albattah

OBJECTIVE: This study aims to employ the advantages of computer vision and medical image analysis to develop an automated model that has the clinical potential for early detection of novel coronavirus (COVID-19) infection. METHOD: This study applied transfer learning to develop deep learning models for detecting COVID-19 disease. Three existing state-of-the-art deep learning models, namely Inception-ResNetV2, InceptionV3 and NASNetLarge, were selected and fine-tuned to automatically detect and diagnose COVID-19 disease using chest X-ray images. A dataset involving 850 images with confirmed COVID-19 disease, 500 images of community-acquired (non-COVID-19) pneumonia cases and 915 normal chest X-ray images was used in this study. RESULTS: Among the three models, InceptionV3 yielded the best performance, with accuracy levels of 98.63% and 99.02% with and without using data augmentation in model training, respectively. All of the networks tended to overfit (with high training accuracy) when data augmentation was not used; this is due to the limited amount of image data used for training and validation. CONCLUSION: This study demonstrated that deep transfer learning is feasible for detecting COVID-19 disease automatically from chest X-rays by training the learning model with chest X-ray images from COVID-19 patients, patients with other pneumonias and people with healthy lungs, which may help doctors make their clinical decisions more effectively. The study also gives an insight into how transfer learning was used to automatically detect COVID-19 disease. In future studies, as the amount of available data increases, different convolutional neural network models could be designed to achieve the goal more efficiently.
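
As a hedged sketch of the fine-tuning setup (not the study's code), the same pattern applies to each of the three networks; it is shown here for InceptionV3 with an assumed three-class head (COVID-19, other pneumonia, normal) and illustrative augmentation settings.

import tensorflow as tf

base = tf.keras.applications.InceptionV3(
    weights="imagenet", include_top=False, input_shape=(299, 299, 3))
base.trainable = False                      # freeze pretrained features before fine-tuning

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(3, activation="softmax"),  # COVID-19 / pneumonia / normal
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Data augmentation of the kind used to reduce overfitting on the small X-ray dataset.
datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    rotation_range=10, zoom_range=0.1, horizontal_flip=True, rescale=1.0 / 255)
# model.fit(datagen.flow_from_directory("xrays/", target_size=(299, 299),
#                                       class_mode="sparse"), epochs=10)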


2021 ◽  
Vol 15 (2) ◽  
pp. 1-21
Author(s):  
Yi Zhu ◽  
Lei Li ◽  
Xindong Wu

Deep learning seeks to achieve excellent performance for representation learning in image datasets. However, supervised deep learning models such as convolutional neural networks require a large amount of labeled image data, which is intractable in many applications, while unsupervised deep learning models like the stacked denoising auto-encoder cannot employ label information. Meanwhile, the redundancy of image data degrades representation learning for the aforementioned models. To address these problems, we propose a semi-supervised deep learning framework called the stacked convolutional sparse auto-encoder, which can learn robust and sparse representations from image data with fewer labeled data records. More specifically, the framework is constructed by stacking layers. In each layer, higher-layer feature representations are generated from the features of lower layers in a convolutional way, with kernels learned by a sparse auto-encoder. Meanwhile, to solve the data redundancy problem, the Reconstruction Independent Component Analysis algorithm is used to train on patches for sphering the input data. The label information is encoded using a Softmax Regression model for semi-supervised learning. With this framework, higher-level representations are learned by layers mapping from image data. It can boost the performance of subsequent base classifiers such as support vector machines. Extensive experiments demonstrate the superior classification performance of our framework compared to several state-of-the-art representation learning methods.
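
One building block of the framework can be sketched as a sparse auto-encoder trained on small image patches, whose learned weights could then serve as convolution kernels for the next layer; patch size, hidden width and the sparsity weight below are assumptions, and the RICA sphering step is omitted.

import torch
import torch.nn as nn

patch_dim, hidden = 8 * 8 * 3, 64            # 8x8 RGB patches, 64 hidden units (assumed)
encoder = nn.Linear(patch_dim, hidden)
decoder = nn.Linear(hidden, patch_dim)
optimizer = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()),
                             lr=1e-3)

patches = torch.rand(256, patch_dim)          # dummy patches sampled from images
for _ in range(100):
    code = torch.sigmoid(encoder(patches))
    recon = decoder(code)
    # Reconstruction error plus an L1 penalty that encourages sparse activations.
    loss = nn.functional.mse_loss(recon, patches) + 1e-3 * code.abs().mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# The encoder weights can be reshaped into 8x8 kernels for convolutional feature maps.
kernels = encoder.weight.view(hidden, 3, 8, 8)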

