Robust Cylindrical Panorama Stitching for Low-Texture Scenes Based on Image Alignment Using Deep Learning and Iterative Optimization

Cylindrical panorama stitching can generate high-resolution images of a scene with a wide field of view (FOV), making it a useful scene representation for applications such as environmental sensing and robot localization. Traditional image stitching methods based on hand-crafted features are effective for constructing a cylindrical panorama from a sequence of images when the scene contains sufficient reliable features. However, these methods cannot handle low-texture environments in which no reliable feature correspondences can be established. This paper proposes a novel two-step image alignment method based on deep learning and iterative optimization to address this issue. In particular, a lightweight, end-to-end trainable convolutional neural network (CNN) architecture called ShiftNet is proposed to estimate the initial shifts between images, which are then refined in a sub-pixel refinement procedure based on a specified camera motion model. Extensive experiments on a synthetic dataset, rendered photo-realistic images, and real images were carried out to evaluate the proposed method. Both qualitative and quantitative results demonstrate that cylindrical panorama stitching based on the proposed image alignment method yields significant improvements over traditional feature-based methods and recent deep-learning-based methods in challenging low-texture environments.
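The two-step idea in this abstract (a coarse shift estimate followed by sub-pixel refinement) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: ShiftNet is not reproduced, so an exhaustive integer search stands in for the learned estimator; a parabola fit to the matching cost stands in for the paper's refinement based on a camera motion model; and 1-D signals stand in for images. The function names are hypothetical.

```python
import math

def overlap_cost(a, b, shift):
    """Mean squared difference between a[i] and b[i + shift] over their overlap."""
    total, count = 0.0, 0
    for i, value in enumerate(a):
        j = i + shift
        if 0 <= j < len(b):
            total += (value - b[j]) ** 2
            count += 1
    return total / count if count else math.inf

def coarse_shift(a, b, max_shift):
    """Step 1 (stand-in for a learned estimator): best integer shift by exhaustive search."""
    return min(range(-max_shift, max_shift + 1), key=lambda s: overlap_cost(a, b, s))

def refine_shift(a, b, s):
    """Step 2: sub-pixel refinement by fitting a parabola to the cost at s-1, s, s+1."""
    c_prev, c_mid, c_next = (overlap_cost(a, b, s + d) for d in (-1, 0, 1))
    curvature = c_prev - 2.0 * c_mid + c_next
    if not math.isfinite(curvature) or curvature <= 0.0:
        return float(s)  # no usable minimum; keep the integer estimate
    return s + 0.5 * (c_prev - c_next) / curvature

def signal(x):
    """Smooth synthetic 1-D 'scanline' used only for this demonstration."""
    return math.sin(2 * math.pi * x / 40) + 0.5 * math.sin(2 * math.pi * x / 23)

true_shift = 3.3
a = [signal(i) for i in range(200)]
b = [signal(j - true_shift) for j in range(200)]  # b is a displaced by true_shift
s0 = coarse_shift(a, b, max_shift=10)
s1 = refine_shift(a, b, s0)
print(s0, round(s1, 2))
```

The split mirrors the abstract's structure: the first step only needs to land near the correct alignment, and the second step recovers the fractional part from the local shape of the cost.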


A Heterogeneous Feature-based Image Alignment Method

18th International Conference on Pattern Recognition (ICPR'06), 2006
DOI: 10.1109/icpr.2006.77

Author(s): Cen Rao, Yanlin Guo, H. Sawhney, R. Kumar

Keyword(s): Image Alignment, Alignment Method, Feature Based, Heterogeneous Feature


Human emotion recognition based on facial expressions via deep learning on high-resolution images

Multimedia Tools and Applications, 2021
DOI: 10.1007/s11042-021-10918-9

Author(s): Yahia Said, Mohammad Barr

Keyword(s): Deep Learning, High Resolution, Emotion Recognition, Facial Expressions, Human Emotion, High Resolution Images


Interpretable deep learning for the remote characterisation of ambulation in multiple sclerosis using smartphones

Scientific Reports, 2021, Vol 11 (1)
DOI: 10.1038/s41598-021-92776-x

Author(s): Andrew P. Creagh, Florian Lipsmeier, Michael Lindemann, Maarten De Vos

Keyword(s): Multiple Sclerosis, Deep Learning, Inertial Sensor, Heterogeneous Data, Fine Tuning, Sensor Data, Support Vector, Deep Convolutional Neural Networks, Healthcare Applications, Feature Based

The emergence of digital technologies such as smartphones in healthcare applications has demonstrated the possibility of developing rich, continuous, and objective measures of multiple sclerosis (MS) disability that can be administered remotely and out-of-clinic. Deep Convolutional Neural Networks (DCNN) may capture a richer representation of healthy and MS-related ambulatory characteristics from the raw smartphone-based inertial sensor data than standard feature-based methodologies. To overcome the typical limitations associated with remotely generated health data, such as low subject numbers, sparsity, and heterogeneous data, a transfer learning (TL) model from similar large open-source datasets was proposed. Our TL framework leveraged the ambulatory information learned on human activity recognition (HAR) tasks collected from wearable smartphone sensor data. It was demonstrated that fine-tuning TL DCNN HAR models towards MS disease recognition tasks outperformed previous Support Vector Machine (SVM) feature-based methods, as well as DCNN models trained end-to-end, by upwards of 8–15%. A lack of transparency of “black-box” deep networks remains one of the largest stumbling blocks to the wider acceptance of deep learning for clinical applications. Ensuing work therefore aimed to visualise DCNN decisions attributed by relevance heatmaps using Layer-Wise Relevance Propagation (LRP). Through the LRP framework, the patterns captured from smartphone-based inertial sensor data that were reflective of those who are healthy versus people with MS (PwMS) could begin to be established and understood. Interpretations suggested that cadence-based measures, gait speed, and ambulation-related signal perturbations were distinct characteristics that distinguished MS disability from healthy participants. Robust and interpretable outcomes, generated from high-frequency out-of-clinic assessments, could greatly augment the current in-clinic assessment picture for PwMS, inform better disease management techniques, and enable the development of better therapeutic interventions.


Two Ensemble-CNN Approaches for Colorectal Cancer Tissue Type Classification

Journal of Imaging, 2021, Vol 7 (3), pp. 51
DOI: 10.3390/jimaging7030051

Author(s): Emanuela Paladini, Edoardo Vantaggiato, Fares Bougourzi, Cosimo Distante, Abdenour Hadid, ...

Keyword(s): Colorectal Cancer, Deep Learning, Digital Pathology, Texture Features, Tissue Type, Cancer Tissue, Learning Methods, Feature Based, Type Classification, Whole Slide Images

In recent years, automatic tissue phenotyping has attracted increasing interest in the Digital Pathology (DP) field. For Colorectal Cancer (CRC), tissue phenotyping can diagnose the cancer and differentiate between cancer grades. The development of Whole Slide Images (WSIs) has provided the data required for creating automatic tissue phenotyping systems. In this paper, we study different hand-crafted feature-based and deep learning methods using two popular multi-class CRC-tissue-type databases: Kather-CRC-2016 and CRC-TP. For the hand-crafted features, we use two texture descriptors (LPQ and BSIF) and their combination. In addition, two classifiers (SVM and NN) are used to classify the texture features into distinct CRC tissue types. For the deep learning methods, we evaluate four Convolutional Neural Network (CNN) architectures (ResNet-101, ResNeXt-50, Inception-v3, and DenseNet-161). Moreover, we propose two Ensemble CNN approaches: Mean-Ensemble-CNN and NN-Ensemble-CNN. The experimental results show that the proposed approaches outperformed the hand-crafted feature-based methods, the CNN architectures, and the state-of-the-art methods on both databases.
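As a rough illustration of what a mean ensemble can look like, the sketch below averages per-class probabilities from several classifiers and picks the class with the highest average. This is an assumption about how Mean-Ensemble-CNN combines its models, since the abstract does not spell out the rule; the function name and the probability values are hypothetical, and no CNN is trained here.

```python
def mean_ensemble(prob_lists):
    """Average per-class probabilities across models; return (averages, predicted class)."""
    if not prob_lists:
        raise ValueError("need at least one model's probabilities")
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    if any(len(p) != n_classes for p in prob_lists):
        raise ValueError("all models must score the same set of classes")
    averages = [sum(p[c] for p in prob_lists) / n_models for c in range(n_classes)]
    predicted = max(range(n_classes), key=averages.__getitem__)
    return averages, predicted

# Hypothetical softmax outputs of three models for one image and three tissue classes.
model_outputs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.3, 0.4, 0.3],
]
averages, predicted = mean_ensemble(model_outputs)
print(predicted, [round(v, 3) for v in averages])
```

In this made-up example the first model alone would pick class 0, but the averaged scores favor class 1, which is the kind of disagreement an ensemble is meant to smooth out.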


Detection of Diabetic Retinopathy from Ultra-Wide Field Scanning Laser Ophthalmoscope Images: A Multi-Center Deep-Learning Analysis

Ophthalmology Retina, 2021
DOI: 10.1016/j.oret.2021.01.013

Author(s): Fangyao Tang, Phoomraphee Luenam, An Ran Ran, Ahmed Abdul Quadeer, Rajiv Raman, ...

Keyword(s): Diabetic Retinopathy, Deep Learning, Scanning Laser Ophthalmoscope, Scanning Laser, Wide Field, Learning Analysis


Facing Erosion Identification in Railway Lines Using Pixel-Wise Deep-Based Approaches

Remote Sensing, 2020, Vol 12 (4), pp. 739
DOI: 10.3390/rs12040739

Author(s): Keiller Nogueira, Gabriel L. S. Machado, Pedro H. T. Gama, Caio C. V. da Silva, Remis Balaniuk, ...

Keyword(s): Machine Learning, Deep Learning, High Impact, Automatic Machine, Feature Representation, Data Driven, Maintenance Costs, Crucial Step, Machine Learning Methods, High Resolution Images

Soil erosion is considered one of the most expensive natural hazards, with a high impact on several infrastructure assets. Among them, railway lines are among the structures most prone to erosion and, consequently, among the most troublesome because of maintenance costs, risks of derailment, and so on. Therefore, it is fundamental to identify and monitor erosion in railway lines to prevent major consequences. Currently, erosion identification is performed manually by humans using huge image sets, a time-consuming task. Hence, automatic machine learning methods appear as an appealing alternative. A crucial step for automatic erosion identification is to create a good feature representation. Toward this objective, deep learning can learn data-driven features and classifiers. In this paper, we propose a novel deep learning-based framework capable of performing erosion identification in railway lines. Six techniques were evaluated, and the best one, Dynamic Dilated ConvNet, was integrated into this framework, which was then encapsulated into a new ArcGIS plugin to facilitate its use by non-programmers. To analyze these techniques, we also propose a new dataset composed of almost 2000 high-resolution images.


Performance Evaluation of Single-Label and Multi-Label Remote Sensing Image Retrieval Using a Dense Labeling Dataset

Remote Sensing, 2018, Vol 10 (6), pp. 964
DOI: 10.3390/rs10060964
Cited By ~ 34

Author(s): Zhenfeng Shao, Ke Yang, Weixun Zhou

Keyword(s): Remote Sensing, Performance Evaluation, Deep Learning, Image Retrieval, Semantic Segmentation, Semantic Content, Remote Sensing Image, Remote Sensing Images, Benchmark Datasets, Feature Based

Benchmark datasets are essential for developing and evaluating remote sensing image retrieval (RSIR) approaches. However, most of the existing datasets are single-labeled, with each image annotated by a single label representing the most significant semantic content of the image. This is sufficient for simple problems, such as distinguishing between a building and a beach, but multiple labels and sometimes even dense (pixel) labels are required for more complex problems, such as RSIR and semantic segmentation. We therefore extended the existing multi-labeled dataset collected for multi-label RSIR and presented a dense labeling remote sensing dataset termed "DLRSD". DLRSD contained a total of 17 classes, and the pixels of each image were assigned with 17 pre-defined labels. We used DLRSD to evaluate the performance of RSIR methods ranging from traditional handcrafted feature-based methods to deep learning-based ones. More specifically, we evaluated the performance of RSIR methods from both single-label and multi-label perspectives. These results demonstrated the advantages of multiple labels over single labels for interpreting complex remote sensing images. DLRSD provided the literature with a benchmark for RSIR and other pixel-based problems such as semantic segmentation.


Combining Deep Learning and (Structural) Feature-Based Classification Methods for Copyright-Protected PDF Documents

Artificial Neural Networks and Machine Learning – ICANN 2019: Text and Time Series - Lecture Notes in Computer Science, 2019, pp. 69-75
DOI: 10.1007/978-3-030-30490-4_7

Author(s): Renato Garita Figueiredo, Kai-Uwe Kühnberger, Gordon Pipa, Tobias Thelen

Keyword(s): Deep Learning, Structural Feature, Classification Methods, Feature Based, Feature Based Classification


Accelerating Super-Resolution and Visual Task Analysis in Medical Images

Applied Sciences, 2020, Vol 10 (12), pp. 4282
DOI: 10.3390/app10124282

Author(s): Ghada Zamzmi, Sivaramakrishnan Rajaraman, Sameer Antani

Keyword(s): Deep Learning, High Resolution, Task Analysis, Multiple Scales, Medical Images, Computational Cost, Super Resolution, Visual Task, Learning Networks, High Resolution Images

Medical images are acquired at different resolutions based on clinical goals or available technology. In general, however, high-resolution images with fine structural details are preferred for visual task analysis. Recognizing this, several deep learning networks have been proposed to enhance medical images for reliable automated interpretation. These deep networks are often computationally complex and require a massive number of parameters, which restricts them to highly capable computing platforms with large memory banks. In this paper, we propose an efficient deep learning approach, called Hydra, which simultaneously reduces computational complexity and improves performance. Hydra consists of a trunk and several computing heads. The trunk is a super-resolution model that learns the mapping from low-resolution to high-resolution images. It has a simple architecture that is trained using multiple scales at once to minimize a proposed learning-loss function. We also propose appending multiple task-specific heads to the trained Hydra trunk for simultaneous learning of multiple visual tasks in medical images. Hydra is evaluated on publicly available chest X-ray image collections to perform image enhancement, lung segmentation, and abnormality classification. Our experimental results support our claims and demonstrate that the proposed approach can improve the performance of super-resolution and visual task analysis in medical images at a remarkably reduced computational cost.


Comparing word embedding models for Arabic aspect category detection using a deep learning-based approach

E3S Web of Conferences, 2021, Vol 297, pp. 01072
DOI: 10.1051/e3sconf/202129701072

Author(s): Rajae Bensoltane, Taher Zaki

Keyword(s): Deep Learning, Vector Representation, Rule Based, External Resources, Unit Model, Proposed Model, Feature Based, The Impact, Gated Recurrent Unit, Machine Learning Models

Aspect category detection (ACD) is a task of aspect-based sentiment analysis (ABSA) that aims to identify the discussed category in a given review or sentence from a predefined list of categories. ABSA tasks have been widely studied in English; however, studies in low-resource languages such as Arabic are still limited. Moreover, most of the existing Arabic ABSA work is based on rule-based or feature-based machine learning models, which require tedious feature engineering and the use of external resources such as lexicons. Therefore, the aim of this paper is to overcome these shortcomings by handling the ACD task using a deep learning method based on a bidirectional gated recurrent unit model. Additionally, we examine the impact of using different vector representation models on the performance of the proposed model. The experimental results show that our model outperforms the baseline and related work models significantly by achieving an enhanced F1-score of more than 7%.
