Real-to-virtual domain transfer-based depth estimation for real-time 3D annotation in transnasal surgery: a study of annotation accuracy and stability

Author(s):  
Hon-Sing Tong ◽  
Yui-Lun Ng ◽  
Zhiyu Liu ◽  
Justin D. L. Ho ◽  
Po-Ling Chan ◽  
...  

Abstract

Purpose: Surgical annotation promotes effective communication between medical personnel during surgical procedures. However, existing approaches to 2D annotations are mostly static with respect to a display. In this work, we propose a method to achieve 3D annotations that anchor rigidly and stably to target structures upon camera movement in a transnasal endoscopic surgery setting.

Methods: This is accomplished through intra-operative endoscope tracking and monocular depth estimation. A virtual endoscopic environment is utilized to train a supervised depth estimation network. An adversarial network transfers the style from the real endoscopic view to a synthetic-like view for input into the depth estimation network, wherein framewise depth can be obtained in real time.

Results: (1) Accuracy: Framewise depth was predicted from images captured from within a nasal airway phantom and compared with ground truth, achieving an SSIM value of 0.8310 ± 0.0655. (2) Stability: The mean absolute error (MAE) between reference and predicted depth of a target point was 1.1330 ± 0.9957 mm.

Conclusion: Both the accuracy and stability evaluations demonstrated the feasibility and practicality of our proposed method for achieving 3D annotations.
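As a rough illustration of how a tracked camera pose and a per-pixel depth estimate can keep an annotation anchored, the sketch below back-projects an annotated pixel to a 3D point and re-projects it after the camera moves. It assumes an ideal pinhole camera and camera-to-world poses given as a rotation matrix and translation; the function names and the intrinsics used in the example are hypothetical and not taken from the paper.

```python
# Minimal sketch, assuming a pinhole camera model. All names and values here
# are illustrative assumptions, not the authors' implementation.

def backproject(u, v, depth, fx, fy, cx, cy):
    """Pixel (u, v) with depth along the optical axis -> 3D point in the camera frame."""
    return ((u - cx) * depth / fx, (v - cy) * depth / fy, depth)

def transform(R, t, p):
    """Apply a rigid transform: R @ p + t (R is a 3x3 nested list)."""
    return tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))

def invert_pose(R, t):
    """Inverse of a rigid transform: (R^T, -R^T t)."""
    Rt = [[R[j][i] for j in range(3)] for i in range(3)]
    t_inv = tuple(-sum(Rt[i][j] * t[j] for j in range(3)) for i in range(3))
    return Rt, t_inv

def project(p, fx, fy, cx, cy):
    """3D point in the camera frame -> pixel coordinates."""
    x, y, z = p
    return (fx * x / z + cx, fy * y / z + cy)

def reanchor(u, v, depth, pose_a, pose_b, intrinsics):
    """Move an annotation made at pixel (u, v) in frame A to its pixel in frame B.

    pose_a and pose_b are camera-to-world (R, t) pairs from the tracker.
    """
    fx, fy, cx, cy = intrinsics
    point_world = transform(*pose_a, backproject(u, v, depth, fx, fy, cx, cy))
    point_cam_b = transform(*invert_pose(*pose_b), point_world)
    return project(point_cam_b, fx, fy, cx, cy)

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
pose_a = (IDENTITY, (0.0, 0.0, 0.0))
pose_b = (IDENTITY, (2.0, 0.0, 0.0))  # camera shifted 2 mm along +x
new_pixel = reanchor(320, 240, 20.0, pose_a, pose_b, (500.0, 500.0, 320.0, 240.0))
```

With these example values the annotated point lies 20 mm in front of the first camera, so shifting the camera 2 mm to the right moves the annotation 50 pixels to the left in the image.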

Sensors ◽  
2019 ◽  
Vol 19 (20) ◽  
pp. 4434 ◽  
Author(s):  
Sangwon Kim ◽  
Jaeyeal Nam ◽  
Byoungchul Ko

Depth estimation is a crucial and fundamental problem in computer vision. Conventional methods reconstruct scenes using feature points extracted from multiple images; however, these approaches require multiple images and thus are not easily implemented in various real-time applications. Moreover, the special equipment required by hardware-based approaches using 3D sensors is expensive. Therefore, software-based methods for estimating depth from a single image using machine learning or deep learning are emerging as new alternatives. In this paper, we propose an algorithm that generates a depth map in real time from a single image using an optimized lightweight efficient neural network (L-ENet) instead of physical equipment, such as an infrared sensor or multi-view camera. Because depth values are continuous and can produce locally ambiguous results, pixel-wise prediction with ordinal depth range classification was applied in this study. In addition, various convolution techniques are applied to extract a dense feature map, and the number of parameters is greatly reduced by reducing the number of network layers. Using the proposed L-ENet, an accurate depth map can be generated quickly from a single image, with depth values that deviate only slightly from the ground truth. Experiments confirmed that the proposed L-ENet achieves significantly improved estimation performance over state-of-the-art algorithms for single-image depth estimation.
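To make the idea of ordinal depth range classification concrete, the sketch below turns a continuous depth into an ordered bin index and back. The log-spaced bin edges are an assumption chosen for illustration (a common choice in ordinal depth regression); the abstract does not state which discretization L-ENet uses, and all function names are hypothetical.

```python
import math

# Minimal sketch of depth discretization for ordinal classification.
# Assumption: bins are spaced geometrically between d_min and d_max.

def log_spaced_edges(d_min, d_max, num_bins):
    """Bin edges that grow geometrically, so distant ranges get wider bins."""
    ratio = d_max / d_min
    return [d_min * ratio ** (k / num_bins) for k in range(num_bins + 1)]

def depth_to_ordinal(depth, edges):
    """Index of the bin containing depth, clamped to the valid range."""
    num_bins = len(edges) - 1
    for k in range(num_bins):
        if depth < edges[k + 1]:
            return k
    return num_bins - 1

def ordinal_to_depth(k, edges):
    """Representative depth for bin k: the geometric mean of its two edges."""
    return math.sqrt(edges[k] * edges[k + 1])

edges = log_spaced_edges(1.0, 16.0, 4)   # roughly [1, 2, 4, 8, 16]
label = depth_to_ordinal(3.0, edges)     # 3.0 falls in the second bin
decoded = ordinal_to_depth(label, edges)
```

A network trained this way predicts the bin index per pixel, and the decoded value is accurate only up to the bin width, which is why the bin layout matters.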


2021 ◽  
Vol 13 (14) ◽  
pp. 2778
Author(s):  
Zhengchao Lai ◽  
Fei Liu ◽  
Shangwei Guo ◽  
Xiantong Meng ◽  
Shaokun Han ◽  
...  

Using unmanned aerial vehicles (UAVs) for remote sensing has the advantages of high flexibility, convenient operation, low cost, and wide application range. It fills the need for rapid acquisition of high-resolution aerial images in modern photogrammetry applications. Due to insufficient parallax and the computation-intensive process, dense real-time reconstruction of large terrain scenes is a considerable challenge. To address these problems, we proposed a novel SLAM-based MVS (Multi-View-Stereo) approach, which can incrementally generate a dense 3D (three-dimensional) model of the terrain by using the continuous image stream during the flight. The pipeline of the proposed methodology starts with pose estimation based on a SLAM algorithm. The tracked frames were then selected by a novel scene-adaptive keyframe selection method to construct a sliding window frame-set. This was followed by depth estimation using a flexible search domain approach, which can improve accuracy without increasing the iteration time or memory consumption. The whole system proposed in this study was implemented on an embedded GPU on a UAV platform. We proposed a highly parallel and memory-efficient CUDA-based depth computing architecture, enabling the system to achieve good real-time performance. The evaluation experiments were carried out in both simulation and real-world environments. A virtual large terrain scene was built using the Gazebo simulator. The simulated UAV equipped with an RGB-D camera was used to obtain synthetic evaluation datasets, which were divided by flight altitude (800, 1000, and 1200 m) and terrain height difference (100, 200, and 300 m). In addition, the system has been extensively tested on various types of real scenes. A comparison with commercial 3D reconstruction software was carried out to evaluate the precision on real-world data. According to the results on the synthetic datasets, over 93.462% of the estimates have an absolute error distance of less than 0.9%. In the real-world dataset captured at 800 m flight height, more than 81.27% of our estimated point cloud differs by less than 5 m from the results of Photoscan. All evaluation experiments show that the proposed approach outperforms the state-of-the-art ones in terms of accuracy and efficiency.
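The accuracy figures in this abstract are of the form "share of estimates within a tolerance of a reference". A minimal sketch of that kind of metric is shown below; the function name and the sample values are hypothetical, and the paper's exact error definition may differ.

```python
# Minimal sketch: fraction of estimates whose absolute difference from the
# reference is below a tolerance. The sample numbers are made up.

def fraction_within(estimates, references, tolerance):
    """Share of paired values with |estimate - reference| < tolerance."""
    if len(estimates) != len(references) or not estimates:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for e, r in zip(estimates, references) if abs(e - r) < tolerance)
    return hits / len(estimates)

estimated_heights = [100.0, 104.0, 110.0, 98.0]   # metres, illustrative only
reference_heights = [101.0, 100.0, 100.0, 99.0]
share = fraction_within(estimated_heights, reference_heights, 5.0)
```

Here three of the four points differ from the reference by less than 5 m, so the share is 0.75.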


2016 ◽  
Vol 2016 (19) ◽  
pp. 1-6 ◽  
Author(s):  
Bart Goossens ◽  
Simon Donné ◽  
Jan Aelterman ◽  
Jonas De Vylder ◽  
Dirk Van Haerenborgh ◽  
...  

2020 ◽  
Author(s):  
Jingbai Li ◽  
Patrick Reiser ◽  
André Eberhard ◽  
Pascal Friederich ◽  
Steven Lopez

Photochemical reactions are being increasingly used to construct complex molecular architectures with mild and straightforward reaction conditions. Computational techniques are increasingly important to understand the reactivities and chemoselectivities of photochemical isomerization reactions because they offer molecular bonding information along the excited-state(s) of photodynamics. These photodynamics simulations are resource-intensive and are typically limited to 1–10 picoseconds and 1,000 trajectories due to high computational cost. Most organic photochemical reactions have excited-state lifetimes exceeding 1 picosecond, which places them outside possible computational studies. Westermayr et al. demonstrated that a machine learning approach could significantly lengthen photodynamics simulation times for a model system, methylenimmonium cation (CH2NH2+).

We have developed a Python-based code, Python Rapid Artificial Intelligence Ab Initio Molecular Dynamics (PyRAI2MD), to accomplish the unprecedented 10 ns cis-trans photodynamics of trans-hexafluoro-2-butene (CF3–CH=CH–CF3) in 3.5 days. The same simulation would take approximately 58 years with ground-truth multiconfigurational dynamics. We proposed an innovative scheme combining Wigner sampling, geometrical interpolations, and short-time quantum chemical trajectories to effectively sample the initial data, facilitating the adaptive sampling to generate an informative and data-efficient training set with 6,232 data points. Our neural networks achieved chemical accuracy (mean absolute error of 0.032 eV). Our 4,814 trajectories reproduced the S1 half-life (60.5 fs), the photochemical product ratio (trans:cis = 2.3:1), and autonomously discovered a pathway towards a carbene. The neural networks have also shown the capability of generalizing the full potential energy surface with chemically incomplete data (trans → cis but not cis → trans pathways) that may offer future automated photochemical reaction discoveries.
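The two wall-clock figures quoted above (3.5 days versus approximately 58 years) imply a speedup that is easy to work out. The short calculation below does so; the only assumption is the use of 365.25 days per year, and the function name is hypothetical.

```python
# Worked arithmetic for the quoted timings. Assumption: 365.25 days per year.

DAYS_PER_YEAR = 365.25

def speedup_factor(reference_years, ml_days):
    """Ratio of the reference wall-clock time to the ML-accelerated time."""
    return reference_years * DAYS_PER_YEAR / ml_days

speedup = speedup_factor(58, 3.5)
print(f"approximate speedup: {speedup:,.0f}x")
```

The result is on the order of 6,000-fold, which is what makes a 10 ns simulation practical.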


Sensors ◽  
2020 ◽  
Vol 21 (1) ◽  
pp. 15
Author(s):  
Filippo Aleotti ◽  
Giulio Zaccaroni ◽  
Luca Bartolomei ◽  
Matteo Poggi ◽  
Fabio Tosi ◽  
...  

Depth perception is paramount for tackling real-world problems, ranging from autonomous driving to consumer applications. For the latter, depth estimation from a single image would represent the most versatile solution since a standard camera is available on almost any handheld device. Nonetheless, two main issues limit the practical deployment of monocular depth estimation methods on such devices: (i) the low reliability when deployed in the wild and (ii) the resources needed to achieve real-time performance, often not compatible with low-power embedded systems. Therefore, in this paper, we investigate both issues in depth, showing how they can be addressed by adopting appropriate network design and training strategies. Moreover, we also outline how to map the resulting networks on handheld devices to achieve real-time performance. Our thorough evaluation highlights the ability of such fast networks to generalize well to new environments, a crucial feature required to tackle the extremely varied contexts faced in real applications. Indeed, to further support this evidence, we report experimental results concerning real-time, depth-aware augmented reality and image blurring with smartphones in the wild.
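One way a per-pixel depth map enables depth-aware augmented reality is occlusion handling: a virtual object is drawn only where it is closer to the camera than the real scene. The sketch below shows that test on tiny nested-list images. It is an illustrative assumption about how such compositing can work, not the pipeline used in the paper, and the function name is hypothetical.

```python
# Minimal sketch of depth-based occlusion for AR compositing.
# Images are nested lists; object_depth holds None where the virtual
# object does not cover a pixel.

def composite_with_occlusion(scene_rgb, scene_depth, object_rgb, object_depth):
    """Per pixel, keep the virtual object only where it is nearer than the scene."""
    out = []
    for y, row in enumerate(scene_rgb):
        out_row = []
        for x, pixel in enumerate(row):
            d_obj = object_depth[y][x]
            if d_obj is not None and d_obj < scene_depth[y][x]:
                out_row.append(object_rgb[y][x])   # object is in front
            else:
                out_row.append(pixel)              # scene occludes or no object
        out.append(out_row)
    return out

scene_rgb = [["wall", "wall", "table"]]
scene_depth = [[3.0, 3.0, 1.0]]        # metres, as predicted by a depth network
object_rgb = [[None, "cube", "cube"]]
object_depth = [[None, 2.0, 2.0]]      # virtual cube placed 2 m away
frame = composite_with_occlusion(scene_rgb, scene_depth, object_rgb, object_depth)
```

In this example the cube is visible in front of the wall but hidden behind the nearer table, which is the effect that requires reliable depth at every pixel.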


Author(s):  
Shreyas S. Shivakumar ◽  
Kartik Mohta ◽  
Bernd Pfrommer ◽  
Vijay Kumar ◽  
Camillo J. Taylor

2021 ◽  
pp. 102164
Author(s):  
Artur Banach ◽  
Franklin King ◽  
Fumitaro Masaki ◽  
Hisashi Tsukada ◽  
Nobuhiko Hata

2011 ◽  
Vol 94-96 ◽  
pp. 38-42
Author(s):  
Qin Liu ◽  
Jian Min Xu

In order to improve the prediction precision of short-term traffic flow, a prediction method based on the cloud model was proposed. The traffic flow was fitted with a cloud model. A history cloud and a present cloud were built from historical and present traffic flow, respectively, and the forecast cloud was produced from both clouds. Then, using the short-term traffic flow volume of an intersection in Guangzhou City, the model was calculated and simulated through programming. Max Absolute Error (MAE) and Mean Absolute Percentage Error (MAPE) were used to evaluate the prediction. The simulation results indicate that this prediction method is effective and advanced. The method takes into account changes in both historical and real-time traffic flow. Because the short-term traffic flow is dealt with as a whole, the error of prediction is avoided. The requirements for prediction precision and real-time prediction are satisfied.
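The two error measures named in this abstract can be sketched as follows, reading "Max Absolute Error" literally as the largest absolute deviation. The function names and the traffic counts are hypothetical, and the MAPE definition assumes no observed value is zero.

```python
# Minimal sketch of the two evaluation metrics. Sample counts are made up.

def max_absolute_error(actual, predicted):
    """Largest absolute deviation between observed and predicted values."""
    return max(abs(a - p) for a, p in zip(actual, predicted))

def mean_absolute_percentage_error(actual, predicted):
    """Mean of |actual - predicted| / |actual|, in percent (actual must be nonzero)."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)

observed_flow = [100, 200, 400]   # vehicles per interval, illustrative only
forecast_flow = [110, 190, 400]
worst = max_absolute_error(observed_flow, forecast_flow)
mape = mean_absolute_percentage_error(observed_flow, forecast_flow)
```

For these values the worst-case error is 10 vehicles and the MAPE is about 5%, since the relative errors are 10%, 5%, and 0%.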

