From Auto-encoders to Capsule Networks: A Survey

2021 ◽  
Vol 229 ◽  
pp. 01003
Author(s):  
Omaima El Alaoui-Elfels ◽  
Taoufiq Gadi

Convolutional Neural Networks are a very powerful Deep Learning structure used in image processing, object classification and segmentation. They are very robust in extracting features from data and are widely used in several domains. Nonetheless, they require large amounts of training data, and relations between features are lost in the max-pooling step, which can lead to wrong classifications. Capsule Networks (CapsNets) were introduced to overcome these limitations by extracting features together with their pose, using capsules instead of neurons. This technique shows impressive performance on one-dimensional, two-dimensional and three-dimensional datasets, as well as on sparse datasets. In this paper, we present an initial understanding of CapsNets: their concept, structure and learning algorithm. We trace the progress made by CapsNets from their introduction in 2011 until 2020. We compare different CapsNets architectures to demonstrate their strengths and challenges. Finally, we cite different implementations of Capsule Networks and show their robustness in a variety of domains. This survey provides the state-of-the-art of Capsule Networks and allows other researchers to get a clear view of this new field. Besides, we discuss the open issues and promising directions of future research, which may lead to a new generation of CapsNets.
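The central idea of capsules — outputting a vector whose length encodes the probability that an entity exists while its orientation encodes the entity's pose — can be illustrated with the squashing nonlinearity from the original dynamic-routing formulation. This is a minimal sketch; the function name and array shapes are our own.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    """Capsule squashing nonlinearity: shrinks short vectors toward zero
    length and long vectors toward unit length, preserving direction, so
    a capsule's output length can be read as a detection probability."""
    sq_norm = np.sum(s * s, axis=axis, keepdims=True)
    scale = sq_norm / (1.0 + sq_norm) / np.sqrt(sq_norm + eps)
    return scale * s

# A long input vector keeps its direction; its length approaches but
# never reaches 1.
v = squash(np.array([3.0, 4.0]))  # input length 5
assert np.linalg.norm(v) < 1.0
```

Unlike a max-pooling step, this transformation discards no positional information: the pose carried by the vector's direction survives the nonlinearity.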

2021 ◽  
Vol 229 ◽  
pp. 01048
Author(s):  
Omaima El Alaoui-Elfels ◽  
Taoufiq Gadi

Convolutional Neural Networks are a very powerful Deep Learning algorithm used in image processing, object classification and segmentation. They are very robust in extracting features from data and are widely used in several domains. Nonetheless, they require large amounts of training data, and relations between features are lost in the max-pooling step, which can lead to wrong classifications. Capsule Networks (CapsNets) were introduced to overcome these limitations by extracting features together with their pose, using capsules instead of neurons. This technique shows impressive performance on one-dimensional, two-dimensional and three-dimensional datasets, as well as on sparse datasets. In this paper, we present an initial understanding of CapsNets: their concept, structure and learning algorithm. We trace the progress made by CapsNets from their introduction in 2011 until 2020. We compare different CapsNets series to demonstrate their strengths and challenges. Finally, we cite different implementations of Capsule Networks and show their robustness in a variety of domains. This survey provides the state-of-the-art of Capsule Networks and allows other researchers to get a clear view of this new field. Besides, we discuss the open issues and promising directions of future research, which may lead to a new generation of CapsNets.


Algorithms ◽  
2021 ◽  
Vol 14 (3) ◽  
pp. 99
Author(s):  
Yang Zheng ◽  
Jieyu Zhao ◽  
Yu Chen ◽  
Chen Tang ◽  
Shushi Yu

With the widespread success of deep learning in the two-dimensional field, how to extend deep learning methods to the three-dimensional field has become an active research topic. Among three-dimensional representations, the polygon mesh is a complex data structure that provides an effective approximate representation of a three-dimensional object's shape. Although traditional graphics methods can extract the characteristics of a three-dimensional object, they cannot be applied to more complex objects. Moreover, because mesh data are complex and irregular, it is difficult to apply convolutional neural networks directly to 3D mesh processing. Considering this problem, we propose a deep learning method based on a capsule network to effectively classify mesh data. We first design a polynomial convolution template. Through a sliding operation similar to a two-dimensional image convolution window, we sample directly on the mesh surface and use the window-sampled surface patch as the minimum unit of computation. Because a high-order polynomial can effectively represent a surface, we fit the approximate shape of each patch with a polynomial and use the polynomial parameters as the patch's shape feature, adding the center-point coordinates and normal vector of the patch as its pose feature; together these form the patch's feature vector. At the same time, to avoid the large number of pooling layers introduced by traditional convolutional neural networks, a capsule network is used. To handle input mesh models of nonuniform size, the capsule network's pose-parameter learning is improved by sharing the weights of the pose matrix, which reduces the number of model parameters and further improves training efficiency on 3D mesh models.
The method is compared with a traditional method and two recent methods on the SHREC15 dataset. Compared with MeshNet and MeshCNN, the average recognition accuracy on the original test set improves by 3.4% and 2.1%, respectively, and after feature fusion the average accuracy reaches 93.8%. Experiments also verify that the method achieves competitive recognition results with a short training time. The three-dimensional mesh classification method proposed in this paper combines the advantages of graphics and deep learning methods, and effectively improves the classification of 3D mesh models.
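The per-patch descriptor the abstract describes — polynomial coefficients as the shape feature, centroid and normal as the pose feature — can be sketched with an ordinary least-squares fit. The function name, the choice of a quadratic height field z = f(x, y), and the window size are our assumptions, not details from the paper.

```python
import numpy as np

def surface_feature(points):
    """Hypothetical sketch of a patch descriptor: fit a quadratic
    polynomial z = c0 + c1*x + c2*y + c3*x^2 + c4*x*y + c5*y^2 to the
    points sampled in one sliding window, then concatenate the
    polynomial coefficients (shape feature) with the window centroid
    and the fitted surface's normal at the centroid (pose feature)."""
    pts = np.asarray(points, dtype=float)
    x, y, z = pts[:, 0], pts[:, 1], pts[:, 2]
    A = np.column_stack([np.ones_like(x), x, y, x * x, x * y, y * y])
    coeffs, *_ = np.linalg.lstsq(A, z, rcond=None)  # shape feature
    centroid = pts.mean(axis=0)                     # pose: position
    # Normal of z = f(x, y) at the centroid is (-dz/dx, -dz/dy, 1).
    cx, cy = centroid[0], centroid[1]
    dzdx = coeffs[1] + 2 * coeffs[3] * cx + coeffs[4] * cy
    dzdy = coeffs[2] + coeffs[4] * cx + 2 * coeffs[5] * cy
    n = np.array([-dzdx, -dzdy, 1.0])
    n /= np.linalg.norm(n)                          # pose: orientation
    return np.concatenate([coeffs, centroid, n])    # 12-dim feature
```

For a patch sampled from the plane z = 2x + 3y, the fitted linear coefficients recover 2 and 3 and the quadratic terms vanish, which is the sense in which the polynomial parameters encode the local shape.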


2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Kenneth W. Dunn ◽  
Chichen Fu ◽  
David Joon Ho ◽  
Soonam Lee ◽  
Shuo Han ◽  
...  

The scale of biological microscopy has increased dramatically over the past ten years, with the development of new modalities supporting collection of high-resolution fluorescence image volumes spanning hundreds of microns if not millimeters. The size and complexity of these volumes is such that quantitative analysis requires automated methods of image processing to identify and characterize individual cells. For many workflows, this process starts with segmentation of nuclei that, due to their ubiquity, ease-of-labeling and relatively simple structure, make them appealing targets for automated detection of individual cells. However, in the context of large, three-dimensional image volumes, nuclei present many challenges to automated segmentation, such that conventional approaches are seldom effective and/or robust. Techniques based upon deep-learning have shown great promise, but enthusiasm for applying these techniques is tempered by the need to generate training data, an arduous task, particularly in three dimensions. Here we present results of a new technique of nuclear segmentation using neural networks trained on synthetic data. Comparisons with results obtained using commonly-used image processing packages demonstrate that DeepSynth provides the superior results associated with deep-learning techniques without the need for manual annotation.
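The synthetic-training-data idea — generating ground-truth volumes by construction rather than by manual annotation — can be illustrated by painting random ellipsoidal "nuclei" into an empty volume. This is a loose sketch of the concept only; the actual pipeline additionally gives such masks realistic microscope texture, and all names and parameters here are our own.

```python
import numpy as np

rng = np.random.default_rng(0)

def synthetic_nuclei_volume(shape=(32, 32, 32), n=5, radii=(3, 6)):
    """Paint n randomly placed, randomly sized ellipsoids into a zero
    volume. The result is a perfect ground-truth segmentation mask,
    obtained with no manual labeling effort."""
    vol = np.zeros(shape, dtype=np.uint8)
    zz, yy, xx = np.indices(shape)
    for _ in range(n):
        c = [rng.integers(0, s) for s in shape]       # ellipsoid center
        r = rng.uniform(*radii, size=3)               # per-axis radii
        inside = (((zz - c[0]) / r[0]) ** 2
                  + ((yy - c[1]) / r[1]) ** 2
                  + ((xx - c[2]) / r[2]) ** 2) <= 1.0
        vol[inside] = 1
    return vol
```

A network trained on pairs of such masks and their rendered images never needs a human-drawn annotation, which is the labor savings the abstract emphasizes.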


2022 ◽  
Vol 15 ◽  
Author(s):  
Meera Srikrishna ◽  
Rolf A. Heckemann ◽  
Joana B. Pereira ◽  
Giovanni Volpe ◽  
Anna Zettergren ◽  
...  

Brain tissue segmentation plays a crucial role in feature extraction, volumetric quantification, and morphometric analysis of brain scans. For the assessment of brain structure and integrity, CT is a non-invasive, cheaper, faster, and more widely available modality than MRI. However, the clinical application of CT is mostly limited to the visual assessment of brain integrity and exclusion of copathologies. We have previously developed two-dimensional (2D) deep learning-based segmentation networks that successfully classified brain tissue in head CT. Recently, deep learning-based MRI segmentation models have successfully used patch-based three-dimensional (3D) segmentation networks. In this study, we aimed to develop patch-based 3D segmentation networks for CT brain tissue classification. Furthermore, we aimed to compare the performance of 2D- and 3D-based segmentation networks for brain tissue classification in anisotropic CT scans. For this purpose, we developed 2D and 3D U-Net-based deep learning models that were trained and validated on MR-derived segmentations from scans of 744 participants of the Gothenburg H70 Cohort with both CT and T1-weighted MRI scans acquired close in time to each other. Segmentation performance of both 2D and 3D models was evaluated on 234 unseen datasets using measures of distance, spatial similarity, and tissue volume. Single-task, slice-wise 2D U-Nets performed better than multitask patch-based 3D U-Nets in CT brain tissue classification. These findings support the use of 2D U-Nets to segment brain tissue in anisotropic CT. This could increase the application of CT to detect brain abnormalities in clinical settings.
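Of the evaluation measures the study mentions, spatial similarity between a predicted and a reference mask is typically quantified with the Dice similarity coefficient. A minimal sketch (the function name and epsilon handling are our own; the study's exact metric set is not specified beyond the categories named):

```python
import numpy as np

def dice(pred, ref, eps=1e-8):
    """Dice similarity coefficient between two binary masks: twice the
    overlap volume divided by the sum of the two mask volumes. Ranges
    from 0 (disjoint) to 1 (identical)."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    intersection = np.logical_and(pred, ref).sum()
    return 2.0 * intersection / (pred.sum() + ref.sum() + eps)
```

Because Dice is computed per tissue class, a 2D slice-wise model and a 3D patch-based model can be compared on exactly the same footing by stacking the 2D predictions back into a volume first.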


1998 ◽  
Vol 10 (1-3) ◽  
pp. 100-108 ◽  
Author(s):  
Alicia Colson ◽  
Ross Parry

This article argues that the analysis of a three-dimensional image demands a three-dimensional approach. The authors observe that discussions of images and image processing inveterately conceptualise representation as being flat, static, and finite, and they recognise the need for a fresh acuteness to three-dimensionality as a meaningful – although problematic – element of visual sources. Two dramatically different examples are used to expose the shortcomings of an ingrained two-dimensional approach and to demonstrate how modern (digital) techniques can enable new historical/anthropological perspectives on subjects that have become all too familiar. The examples could not be more different in their temporal and geographical location, their cultural resonance, and their historiography. In both these visual spectacles, however, meaning is polysemic: it depends upon the viewer's spatial relationship to the artifice as well as the viewer's spirito-intellectual position within the community. The authors postulate that the multi-faceted and multi-layered arrangement of meaning in a complex image can be assessed by working beyond the limitations of the two-dimensional methodological paradigm and by using methods and media that accommodate this type of interconnectivity and representation.


2021 ◽  
Vol 11 (15) ◽  
pp. 7016
Author(s):  
Pawel S. Dabrowski ◽  
Cezary Specht ◽  
Mariusz Specht ◽  
Artur Makar

The theory of cartographic projections is a tool for presenting the convex surface of the Earth on a plane. Of the many types of maps, thematic maps perform an important function due to the wide possibilities of adapting their content to current needs. The limitation of classic maps is their two-dimensional nature. In the era of rapidly growing methods of mass acquisition of spatial data, flat images are often not enough to reveal the level of complexity of certain objects. In such cases, it is necessary to use visualization in three-dimensional space. The motivation for the study was to combine cartographic projection methods, spatial transformations, and the possibilities offered by thematic maps to create thematic three-dimensional map imaging (T3DMI). The authors present a practical verification of the adopted methodology by creating a T3DMI visualization of the marina of the National Sailing Centre of the Gdańsk University of Physical Education and Sport (Poland). The profiled characteristics of the object were used to emphasize the key elements of its function. The results confirmed the increase in the interpretative capabilities of the T3DMI method relative to classic two-dimensional maps. Additionally, the study suggests future research directions for the presented solution.


2021 ◽  
Vol 26 (1) ◽  
pp. 200-215
Author(s):  
Muhammad Alam ◽  
Jian-Feng Wang ◽  
Cong Guangpei ◽  
LV Yunrong ◽  
Yuanfang Chen

In recent years, the success of deep learning in natural scene image processing has boosted its application in the analysis of remote sensing images. In this paper, we apply Convolutional Neural Networks (CNN) to the semantic segmentation of remote sensing images. We improve the Encoder-Decoder CNN structures SegNet (with index pooling) and U-Net to make them suitable for multi-target semantic segmentation of remote sensing images. The results show that the two models have their own advantages and disadvantages in segmenting different objects. In addition, we propose an integrated algorithm that combines the two models. Experimental results show that the integrated algorithm can exploit the advantages of both models for multi-target segmentation and achieves better segmentation than either model alone.
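One simple way to integrate two semantic-segmentation models is to average their per-pixel class-probability maps and take the argmax. The abstract does not specify the paper's fusion rule, so the following is purely an illustrative sketch with names of our own choosing.

```python
import numpy as np

def fuse_probabilities(p_segnet, p_unet, w=0.5):
    """Hypothetical fusion of two segmentation models: take a weighted
    average of their per-pixel class-probability maps (each of shape
    (H, W, num_classes), rows summing to 1) and return the per-pixel
    argmax label map of shape (H, W)."""
    fused = w * np.asarray(p_segnet) + (1.0 - w) * np.asarray(p_unet)
    return fused.argmax(axis=-1)
```

The weight w can be tuned per class on a validation set, which is one way an ensemble can favor whichever model segments a given target better.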

