Improving Registration of Augmented Reality by Incorporating DCNNs into Visual SLAM

Author(s):  
Yongbin Chen ◽  
Hanwu He ◽  
Heen Chen ◽  
Teng Zhu

Augmented reality (AR) analyzes the characteristics of a scene and adds computer-generated geometric information to the real environment through visual fusion, reinforcing the user's perception of the world. Three-dimensional (3D) registration is one of the core issues in AR. The key problem is to estimate the visual sensor's pose in the 3D environment and identify the objects in the scene. Computer vision has recently made significant progress, but registration based on natural feature points in 3D space remains a severe problem for AR systems. Estimating the mobile camera's pose in a 3D scene precisely is difficult because of unstable factors such as image noise, changing illumination, and complex background patterns. Designing a stable, reliable, and efficient scene recognition algorithm therefore remains very challenging. In this paper, we propose an algorithm that combines Visual Simultaneous Localization and Mapping (SLAM) and Deep Convolutional Neural Networks (DCNNs) to boost the performance of AR registration. Semantic segmentation is a dense prediction task that assigns a category to each pixel of an image; when applied to AR registration, it narrows the search range for feature points between two frames and thus enhances the stability of the system. Comparative experiments in this paper show that semantic scene information substantially improves AR interaction.
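
To make the idea concrete, here is a minimal sketch (not the authors' implementation) of semantic-label-filtered feature matching: assuming a DCNN wrapper `segment` that maps an image to a per-pixel class-label array, candidate ORB matches are kept only when both endpoints carry the same semantic class, which narrows the search range between frames.

```python
import cv2
import numpy as np

def semantic_filtered_matches(img1, img2, segment):
    """Match ORB features between two frames, keeping only pairs whose
    keypoints fall on pixels with the same semantic class.

    `segment` is a hypothetical DCNN wrapper: image -> (H, W) label map.
    """
    orb = cv2.ORB_create(nfeatures=2000)
    kp1, des1 = orb.detectAndCompute(img1, None)
    kp2, des2 = orb.detectAndCompute(img2, None)

    labels1, labels2 = segment(img1), segment(img2)

    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des1, des2)

    consistent = []
    for m in matches:
        x1, y1 = map(int, kp1[m.queryIdx].pt)
        x2, y2 = map(int, kp2[m.trainIdx].pt)
        # Discard candidate matches that cross semantic-class boundaries;
        # this shrinks the search space and suppresses spurious matches.
        if labels1[y1, x1] == labels2[y2, x2]:
            consistent.append(m)
    return consistent
```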

2021 ◽  
Vol 15 ◽  
Author(s):  
Xinglong Wu ◽  
Yuhang Tao ◽  
Guangzhi He ◽  
Dun Liu ◽  
Meiling Fan ◽  
...  

Deep convolutional neural networks (DCNNs) are widely utilized for the semantic segmentation of dense nerve tissues from light and electron microscopy (EM) image data; the goal of this technique is to achieve efficient and accurate three-dimensional reconstruction of the vasculature and neural networks in the brain. The success of these tasks heavily depends on the amount, and especially the quality, of the human-annotated labels fed into DCNNs. However, it is often difficult to acquire gold-standard human-annotated labels for dense nerve tissues; human annotations inevitably contain discrepancies or even errors, which substantially impact the performance of DCNNs. Thus, a novel boosting framework was proposed to systematically improve the quality of the annotated labels. It consists of a DCNN for multilabel semantic segmentation with a customized Dice-logarithmic loss function, a fusion module combining the annotated labels with the corresponding DCNN predictions, and a boosting algorithm that sequentially updates the sample weights during network training iterations; this framework eventually improved segmentation task performance. The micro-optical sectioning tomography (MOST) dataset was then employed to assess the effectiveness of the proposed framework. The results indicated that the framework, even trained with a dataset including some poor-quality human-annotated labels, achieved state-of-the-art performance in the segmentation of somata and vessels in the mouse brain. Thus, the proposed artificial intelligence technique could advance neuroscience research.
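
The exact form of the customized Dice-logarithmic loss is not given in this abstract; the PyTorch sketch below shows one plausible combination of a logarithmic soft-Dice term with binary cross-entropy for multilabel segmentation, with the weighting `alpha` an assumed parameter.

```python
import torch
import torch.nn.functional as F

def dice_log_loss(logits, targets, eps=1e-6, alpha=0.5):
    """Hedged sketch of a Dice + logarithmic loss for multilabel segmentation.

    logits:  (N, C, H, W) raw network outputs
    targets: (N, C, H, W) binary masks as float tensors, one channel per class
    alpha balances the two terms; the published loss may differ in detail.
    """
    probs = torch.sigmoid(logits)
    dims = (0, 2, 3)  # reduce over batch and spatial axes, per class
    intersection = (probs * targets).sum(dims)
    cardinality = probs.sum(dims) + targets.sum(dims)
    dice = (2.0 * intersection + eps) / (cardinality + eps)
    dice_term = -torch.log(dice.clamp_min(eps)).mean()  # logarithmic Dice
    bce_term = F.binary_cross_entropy_with_logits(logits, targets)
    return alpha * dice_term + (1.0 - alpha) * bce_term
```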


Author(s):  
B. Vishnyakov ◽  
Y. Blokhinov ◽  
I. Sgibnev ◽  
V. Sheverdin ◽  
A. Sorokin ◽  
...  

Abstract. In this paper we describe a new multi-sensor platform for data collection and algorithm testing, and propose several methods for solving the semantic scene understanding problem for autonomous land vehicles. We describe our approaches to automatic camera and LiDAR calibration; three-dimensional scene reconstruction and odometry calculation; semantic segmentation, which provides obstacle recognition and underlying-surface classification; object detection; and point cloud segmentation. We also describe our virtual simulation complex, based on Unreal Engine, which can be used for both data collection and algorithm testing. We collected a large database of field and virtual data: more than 1,000,000 real images with corresponding LiDAR data and more than 3,500,000 simulated images with corresponding LiDAR data. All proposed methods were implemented and tested on our autonomous platform, and accuracy estimates were obtained on the collected database.
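
As a hedged illustration of the calibration-dependent part of such a pipeline, the sketch below projects LiDAR points into the camera image given already-estimated intrinsics K and LiDAR-to-camera extrinsics R, t; this projection is the core operation for checking calibration and for transferring per-pixel semantic labels onto the point cloud.

```python
import numpy as np

def project_lidar_to_image(points, K, R, t, image_shape):
    """Project 3D LiDAR points into the camera image plane.

    points: (N, 3) coordinates in the LiDAR frame
    K: (3, 3) camera intrinsics; R, t: LiDAR-to-camera extrinsics
    Returns pixel coordinates and a mask of points visible in the image.
    """
    cam = points @ R.T + t          # LiDAR frame -> camera frame
    in_front = cam[:, 2] > 0.0      # keep points in front of the camera
    uvw = cam @ K.T
    uv = uvw[:, :2] / uvw[:, 2:3]   # perspective division
    h, w = image_shape[:2]
    visible = in_front & (uv[:, 0] >= 0) & (uv[:, 0] < w) \
                       & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    return uv, visible
```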


Author(s):  
B. Vishnyakov ◽  
I. Sgibnev ◽  
V. Sheverdin ◽  
A. Sorokin ◽  
P. Masalov ◽  
...  

Abstract. In this paper we present a semantic SLAM method based on a bundle of deep convolutional neural networks. It provides real-time dense semantic scene reconstruction for the autonomous driving system of an off-road robotic vehicle. Most state-of-the-art neural networks require large computing resources that exceed the capabilities of many robotic platforms. Building on recent progress in computer vision, we propose an architecture for 3D semantic scene reconstruction that integrates SuperPoint, SuperGlue, Bi3D, DeepLabV3+, RTM3D, and an additional module whose pre-processing, inference, and post-processing operations are performed on the GPU. We also updated our simulated dataset for semantic segmentation and added disparity images.
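
The integration code itself is not described in the abstract; the following sketch is a purely illustrative per-frame flow, with `superpoint`, `superglue`, `bi3d`, `deeplab`, and `rtm3d` as hypothetical callables wrapping the respective pretrained networks.

```python
import torch

class SemanticSlamFrame:
    """Hedged sketch of one frame step in a DCNN-bundle semantic SLAM pipeline.

    All module arguments are hypothetical stand-ins for the named networks;
    the real glue code of the paper is not reproduced here.
    """
    def __init__(self, superpoint, superglue, bi3d, deeplab, rtm3d, device="cuda"):
        self.superpoint, self.superglue = superpoint, superglue
        self.bi3d, self.deeplab, self.rtm3d = bi3d, deeplab, rtm3d
        self.device = device
        self.prev = None  # features of the previous frame

    @torch.no_grad()
    def step(self, left, right):
        left = left.to(self.device)
        right = right.to(self.device)
        feats = self.superpoint(left)                    # keypoints + descriptors
        matches = (self.superglue(self.prev, feats)
                   if self.prev is not None else None)   # inter-frame matching
        disparity = self.bi3d(left, right)               # stereo depth
        labels = self.deeplab(left)                      # per-pixel semantics
        boxes3d = self.rtm3d(left)                       # monocular 3D boxes
        self.prev = feats
        # Matches drive odometry; disparity + labels give the dense semantic
        # reconstruction; 3D boxes add object-level scene understanding.
        return matches, disparity, labels, boxes3d
```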


Author(s):  
Daniele Gibelli ◽  
Andrea Palamenghi ◽  
Pasquale Poppa ◽  
Chiarella Sforza ◽  
Cristina Cattaneo ◽  
...  

Abstract. Personal identification of the living from video surveillance systems usually involves 2D images. However, the potential of three-dimensional facial models for personal identification through 3D-3D comparison still needs to be verified. This study aims at testing the reliability of a protocol for 3D-3D registration of facial models, potentially useful for personal identification. Fifty male subjects aged between 18 and 45 years were randomly chosen from a database of 3D facial models acquired through stereophotogrammetry. For each subject, two acquisitions were available; the 3D models of faces were then registered onto other models belonging to the same and different individuals according to the least point-to-point distance on the entire facial surface, for a total of 50 matches and 50 mismatches. The RMS (root mean square) value of point-to-point distance between the two models was then calculated with the VAM® software. Intra- and inter-observer errors were assessed through calculation of the relative technical error of measurement (rTEM). Statistically significant differences between matches and mismatches were assessed through the Mann–Whitney test (p < 0.05). Both intra- and inter-observer rTEM were between 2.2% and 5.2%. The average RMS point-to-point distance was 0.50 ± 0.28 mm in matches and 2.62 ± 0.56 mm in mismatches (p < 0.01). An RMS threshold of 1.50 mm distinguished matches from mismatches in 100% of cases. This study improves on existing 3D-3D superimposition methods and confirms the advantages that 3D facial analysis may bring to personal identification of the living.
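
For reference, the RMS point-to-point distance used as the matching criterion can be computed as in the following sketch (a generic nearest-neighbour formulation in SciPy, standing in for the VAM® computation), given two registered facial surfaces sampled as point clouds.

```python
import numpy as np
from scipy.spatial import cKDTree

def rms_point_to_point(model_a, model_b):
    """RMS of nearest-neighbour distances from model_a to model_b.

    model_a, model_b: (N, 3) and (M, 3) arrays of surface points from two
    registered 3D facial models. A threshold on this value (about 1.50 mm
    in the study) separates matches from mismatches.
    """
    tree = cKDTree(model_b)
    distances, _ = tree.query(model_a)  # closest point on the other surface
    return float(np.sqrt(np.mean(distances ** 2)))
```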


2021 ◽  
Vol 45 (5) ◽  
Author(s):  
Yuri Nagayo ◽  
Toki Saito ◽  
Hiroshi Oyama

Abstract. The surgical education environment has been changing significantly due to restricted work hours, limited resources, and increasing public concern for safety and quality, leading to the evolution of simulation-based training in surgery. Of the various simulators, low-fidelity simulators are widely used to practice surgical skills such as suturing because they are portable, inexpensive, and easy to use without requiring complicated settings. However, since low-fidelity simulators do not offer any teaching information, trainees practice with them on their own, referring to textbooks or videos, which are insufficient for learning open surgical procedures. This study aimed to develop a new suture training system for open surgery that provides trainees with three-dimensional information on exemplary procedures performed by experts and allows them to observe and imitate those procedures during self-practice. The proposed system consists of a motion capture system for surgical instruments and a three-dimensional replication system that reproduces captured procedures on the surgical field. Motion capture of surgical instruments was achieved inexpensively by using cylindrical augmented reality (AR) markers, and replication of captured procedures was realized by visualizing them three-dimensionally, at the same position and orientation as captured, using an AR device. For subcuticular interrupted sutures, the proposed system was confirmed to enable users to observe experts' procedures from any angle and imitate them by manipulating the actual surgical instruments during self-practice. We expect that this training system will contribute to a novel surgical training method that enables trainees to learn surgical skills by themselves in the absence of experts.
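
As a simplified sketch of marker-based instrument tracking (using OpenCV's classic ArUco API from opencv-contrib; newer OpenCV versions expose an ArucoDetector class instead, and the actual system's cylindrical markers are reduced here to a single planar marker), per-frame pose estimation might look like this.

```python
import cv2
import numpy as np

def estimate_marker_pose(frame, camera_matrix, dist_coeffs, marker_len=0.02):
    """Estimate the 6-DoF pose of an ArUco marker in one video frame.

    camera_matrix, dist_coeffs: from a prior camera calibration.
    marker_len: marker side length in metres (assumed value).
    Returns rotation and translation vectors, or None if no marker is seen.
    """
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)
    corners, ids, _ = cv2.aruco.detectMarkers(gray, dictionary)
    if ids is None:
        return None
    rvecs, tvecs, _ = cv2.aruco.estimatePoseSingleMarkers(
        corners, marker_len, camera_matrix, dist_coeffs)
    return rvecs[0], tvecs[0]  # pose of the first detected marker
```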


1994 ◽  
Vol 14 (5) ◽  
pp. 749-762 ◽  
Author(s):  
Jean-François Mangin ◽  
Vincent Frouin ◽  
Isabelle Bloch ◽  
Bernard Bendriem ◽  
Jaime Lopez-Krahe

We propose a fully unsupervised methodology dedicated to the fast registration of positron emission tomography (PET) and magnetic resonance images of the brain. First, discrete representations of the surfaces of interest (head or brain surface) are automatically extracted from both images. Then, a shape-independent surface-matching algorithm gives a rigid body transformation, which allows the transfer of information between both modalities. A three-dimensional (3D) extension of the chamfer-matching principle makes up the core of this surface-matching algorithm. The optimal transformation is inferred from the minimization of a quadratic generalized distance between discrete surfaces, taking into account between-modality differences in the localization of the segmented surfaces. The minimization process is efficiently performed via the precomputation of a 3D distance map. Validation studies using a dedicated brain-shaped phantom have shown that the maximum registration error was of the order of the PET pixel size (2 mm) for the wide variety of tested configurations. The software is routinely used today in a clinical context by the physicians of the Service Hospitalier Frédéric Joliot (>150 registrations performed). The entire registration process requires ∼5 min on a conventional workstation.
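
A minimal sketch of the chamfer-matching core (assuming a binary voxel volume of the reference surface, and using a plain sum of squared distances in place of the paper's quadratic generalized distance) shows how the precomputed 3D distance map makes each cost evaluation a simple lookup.

```python
import numpy as np
from scipy import ndimage
from scipy.optimize import minimize
from scipy.spatial.transform import Rotation

def chamfer_cost_factory(reference_volume, surface_points):
    """Build a rigid-registration cost function from a precomputed
    3D distance map, in the spirit of 3D chamfer matching.

    reference_volume: binary 3D array, True on the reference surface
    surface_points: (N, 3) voxel coordinates of the floating surface
    """
    # Distance from every voxel to the nearest reference-surface voxel;
    # computed once, so each cost evaluation is just an array lookup.
    dist_map = ndimage.distance_transform_edt(~reference_volume)

    def cost(params):  # params = [rx, ry, rz, tx, ty, tz]
        Rm = Rotation.from_rotvec(params[:3]).as_matrix()
        moved = surface_points @ Rm.T + params[3:]
        idx = np.clip(np.round(moved).astype(int), 0,
                      np.array(dist_map.shape) - 1).T
        return np.sum(dist_map[tuple(idx)] ** 2)  # sum of squared distances

    return cost

# Usage sketch:
# result = minimize(chamfer_cost_factory(ref_volume, points),
#                   x0=np.zeros(6), method="Powell")
```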


2019 ◽  
Vol 18 (6) ◽  
pp. e2690 ◽  
Author(s):  
F. Porpiglia ◽  
E. Checcucci ◽  
D. Amparore ◽  
F. Piramide ◽  
P. Verri ◽  
...  

Author(s):  
Leonardo Tanzi ◽  
Pietro Piazzolla ◽  
Francesco Porpiglia ◽  
Enrico Vezzetti

Abstract. Purpose: The current study aimed to propose a Deep Learning (DL) and Augmented Reality (AR) based solution for in-vivo robot-assisted radical prostatectomy (RARP), improving on the precision of a previously published work from our group. We implemented a two-step automatic system to align a 3D virtual ad-hoc model of a patient's organ with its 2D endoscopic image, to assist surgeons during the procedure. Methods: This approach used a Convolutional Neural Network (CNN) based structure for semantic segmentation and a subsequent elaboration of the obtained output, which produced the parameters needed for anchoring the 3D model. We used a dataset obtained from 5 endoscopic videos (A, B, C, D, E), selected and tagged by our team's specialists. We then evaluated the best-performing combination of segmentation architecture and neural network and tested the overlay performance. Results: U-Net stood out as the most effective architecture for segmentation. ResNet and MobileNet obtained similar Intersection over Union (IoU) results, but MobileNet processed almost twice as many operations per second. This segmentation technique outperformed the former work, obtaining an average IoU for the catheter of 0.894 (σ = 0.076) compared with 0.339 (σ = 0.195). These modifications also improved the 3D overlay performance, in particular the Euclidean distance between the predicted and actual model's anchor point, from 12.569 (σ = 4.456) to 4.160 (σ = 1.448), and the geodesic distance between the predicted and actual model's rotations, from 0.266 (σ = 0.131) to 0.169 (σ = 0.073). Conclusion: This work is a further step toward the adoption of DL and AR in the surgery domain. In future work, we will overcome the limits of this approach and further improve every step of the surgical procedure.
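
For reference, the IoU metric reported above is computed for binary masks as in this short, generic sketch (not the authors' evaluation code).

```python
import numpy as np

def iou(pred_mask, true_mask, eps=1e-7):
    """Intersection over Union between two binary segmentation masks."""
    pred = pred_mask.astype(bool)
    true = true_mask.astype(bool)
    intersection = np.logical_and(pred, true).sum()
    union = np.logical_or(pred, true).sum()
    return (intersection + eps) / (union + eps)
```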

