Convolutional Neural Networks for 3D Vision System Data: A Review

Author(s):  
Niall O'Mahony ◽  
Sean Campbell ◽  
Lenka Krpalkova ◽  
Anderson Carvalho ◽  
Gustavo Adolfo Velasco-Hernandez ◽  
...  
Author(s):  
Melani Sanchez Garcia ◽  
Rubén Martínez Cantín ◽  
José J. Guerrero

We present a new approach to building a schematic representation of indoor environments for phosphene images. The proposed method combines several convolutional neural networks to extract and convey relevant information about the scene, such as structurally informative edges of the environment and silhouettes of segmented objects. Experiments were conducted with normally sighted subjects using a Simulated Prosthetic Vision system.


Author(s):  
Jana Busshoff ◽  
Rabi R. Datta ◽  
Thomas Bruns ◽  
Robert Kleinert ◽  
Bernd Morgenstern ◽  
...  

Background: The use of a 3D display technique compared with a high-resolution 2D 4K display technique has been shown to optimize spatial orientation and surgical performance in laparoscopic surgery. Since women make up an increasing proportion of medical students and surgeons, this study was designed to investigate whether one gender benefits more from using a 3D system than from a 4K display system. Methods: In a randomized cross-over trial, the surgical performance of male and female medical students (MS), non-board certified surgeons (NBCS), and board certified surgeons (BCS) was compared using the 3D vs. the 4K display technique on a minimally invasive training parkour with multiple surgical tasks and repetitions. Results: 128 participants (56 women, 72 men) were included. Overall parkour time in seconds (3D vs. 4K) was 770.7 ± 31.9 vs. 1068.1 ± 50.0 (p < 0.001) for all women and 664.5 ± 19.9 vs. 889.7 ± 31.2 (p < 0.001) for all men. Regarding overall mistakes, participants tended to make fewer mistakes while using the 3D vision system: 10.2 ± 1.1 vs. 13.3 ± 1.3 (p = 0.005) for all women and 9.6 ± 0.7 vs. 12.2 ± 1.0 (p = 0.001) for all men. The benefit of using a 3D system, measured as the difference in seconds, was 297.3 ± 41.8 (27.84%) for women vs. 225.2 ± 23.3 (25.31%) for men (p = 0.005). This was confirmed in the MS group, with 327.6 ± 65.5 (35.82%) vs. 249.8 ± 33.7 (32.12%), p = 0.041, and in the NBCS group, with 359 ± 52.4 (28.25%) vs. 198.2 ± 54.2 (18.62%), p = 0.003. There was no significant difference in the BCS group. Conclusion: The 3D laparoscopic display technique optimizes surgical performance compared with the 2D 4K technique for both women and men. The greatest 3D benefit was found for women with less surgical experience. Possibly as a result of surgical education, this gender-specific difference disappears with a higher level of experience. Using a 3D vision system could facilitate surgical apprenticeship, especially for women.


Author(s):  
I. G. Zubov

Introduction. Computer vision systems are finding widespread application in various life domains. Monocular camera-based systems can be used to solve a wide range of problems. The availability of digital cameras and large sets of annotated data, as well as the power of modern computing technologies, render monocular image analysis a dynamically developing direction in the field of machine vision. In order for any computer vision system to describe objects and predict their actions in the physical space of a scene, the image under analysis should be interpreted from the standpoint of the basic 3D scene. This can be achieved by analysing a rigid object as a set of mutually arranged parts, which represents a powerful framework for reasoning about physical interaction. Objective. Development of an automatic method for detecting interest points of an object in an image. Materials and methods. An automatic method for identifying interest points of vehicles, such as license plates, in an image is proposed. This method allows localization of interest points by analysing the inner layers of convolutional neural networks trained for the classification of images and detection of objects in an image. The proposed method allows identification of interest points without incurring additional costs of data annotation and training. Results. The conducted experiments confirmed the correctness of the proposed method in identifying interest points. Thus, the accuracy of identifying a point on a license plate achieved 97%. Conclusion. A new method for detecting interest points of an object by analysing the inner layers of convolutional neural networks is proposed. This method provides an accuracy similar to or exceeding that of other modern methods.


2017 ◽  
Vol 2645 (1) ◽  
pp. 113-122 ◽  
Author(s):  
Yaw Okyere Adu-Gyamfi ◽  
Sampson Kwasi Asare ◽  
Anuj Sharma ◽  
Tienaah Titus

In recent years there has been growing interest in the use of nonintrusive systems such as radar and infrared systems for vehicle recognition. State-of-the-art nonintrusive systems can report up to eight classes of vehicle types. Video-based systems, which arguably are the most popular nonintrusive detection systems, can report only very coarse classification levels (up to four classes), even with the best-performing vision systems. The present study developed a vision system that can report finer vehicle classifications according to FHWA’s scheme and is also comparable to other nonintrusive recognition systems. The proposed system decoupled object recognition into two main tasks: localization and classification. It began with localization by generating class-independent region proposals for each video frame, then it used deep convolutional neural networks to extract feature descriptors for each proposed region, and, finally, the system scored and classified the proposed regions by using a linear support vector machines template on the feature descriptors. The precision of the system varied by vehicle class. Passenger cars and SUVs were detected at a precision rate of 95%. The precision rates for single-unit, single-trailer, and double-trailer trucks ranged between 92% and 94%. According to receiver operating characteristic curves, the best system performance can be achieved under free flow, daytime or nighttime, and with good video resolution.
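The three-stage flow described in this abstract (class-independent region proposals, CNN feature descriptors, linear SVM scoring) can be sketched roughly as below. This is only an illustrative outline, not the authors' implementation: the function names, the box layout, the two class labels, and the toy width-based "features" are all hypothetical stand-ins, and a real system would plug in an actual proposal generator and a trained network.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical box layout: (x, y, width, height) in pixels.
Box = Tuple[int, int, int, int]


def linear_svm_scores(
    features: Sequence[float],
    weights: Dict[str, Sequence[float]],
    biases: Dict[str, float],
) -> Dict[str, float]:
    """Score one feature vector against one linear SVM per class: w . x + b."""
    return {
        label: sum(w * f for w, f in zip(weights[label], features)) + biases[label]
        for label in weights
    }


def classify_frame(
    frame: object,
    propose_regions: Callable[[object], List[Box]],
    extract_features: Callable[[object, Box], List[float]],
    weights: Dict[str, Sequence[float]],
    biases: Dict[str, float],
) -> List[Tuple[Box, str, float]]:
    """Localize first, then classify: proposals -> descriptors -> SVM scores."""
    results = []
    for box in propose_regions(frame):
        descriptor = extract_features(frame, box)
        scores = linear_svm_scores(descriptor, weights, biases)
        best = max(scores, key=scores.get)  # highest-scoring class wins
        results.append((box, best, scores[best]))
    return results


# --- Toy stand-ins (assumptions for illustration only) ---
def toy_proposals(frame: object) -> List[Box]:
    # A real system would generate class-independent proposals per frame.
    return [(0, 0, 4, 2), (10, 0, 12, 3)]


def toy_features(frame: object, box: Box) -> List[float]:
    # A real system would run a deep CNN on the cropped region.
    _, _, width, height = box
    return [width / 10, height / 10]


# Hypothetical labels and weights; these are not FHWA classes.
toy_weights = {"short_vehicle": [-1.0, 0.0], "long_vehicle": [1.0, 0.0]}
toy_biases = {"short_vehicle": 0.8, "long_vehicle": -0.8}

detections = classify_frame(None, toy_proposals, toy_features, toy_weights, toy_biases)
labels = [label for _, label, _ in detections]
```

Keeping localization and classification behind separate callables mirrors the decoupling the abstract describes: either stage can be swapped without touching the other.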


2021 ◽  
pp. 18-50
Author(s):  
Ahmed A. Elngar ◽  
...  

Computer vision is a field of computer science and one of the most powerful and compelling forms of artificial intelligence. It resembles the human visual system in that it enables computers to recognize and process objects in pictures and videos much as humans do. Computer vision technology has evolved rapidly and has helped solve problems in many fields. In self-driving cars, it lets the vehicle understand its surroundings: cameras record video from different angles around the car, and a computer vision system extracts images from the video and processes them in real time to find road edges, detect other cars, pedestrians, and objects, and read traffic lights. In facial recognition, it enables computers to match images of people's faces to their identities: algorithms detect facial features in images and compare them with databases. Computer vision also plays an important role in healthcare, where algorithms can help automate tasks such as detecting breast cancer, finding symptoms in X-ray and MRI scans, and identifying cancerous moles in skin images. It has also contributed to image classification, object discovery, motion recognition, subject tracking, and medicine. The rapid development of artificial intelligence is making machine learning more important in this field of research. Machine learning uses algorithms to analyze data and predict outcomes, which has become an important key to unlocking the door to AI. Deep learning is a subset of machine learning in which algorithms inspired by the structure and function of the human brain, called artificial neural networks, learn from large amounts of data. A deep learning algorithm performs a task repeatedly, adjusting it a little each time to improve the outcome.
The development of computer vision has thus been driven by deep learning. Convolutional neural networks (abbreviated CNN or ConvNet) are among the most powerful supervised deep learning models. The name "convolutional" is taken from convolution, a linear mathematical operation between matrices. CNNs can be applied to a variety of real-world problems, including computer vision, image recognition, natural language processing (NLP), anomaly detection, video analysis, drug discovery, recommender systems, health risk assessment, and time-series forecasting. CNNs are similar to ordinary artificial neural networks (ANNs); the main difference is that CNNs are used primarily for pattern recognition within images. This allows image-specific features to be encoded into the architecture, making the network better suited to image-focused tasks while reducing the number of parameters required to set up the model. One advantage of CNNs is their excellent performance on machine learning problems. We therefore use a CNN as a classifier for image classification, which the following sections of the paper discuss in detail.
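The convolution operation that gives CNNs their name can be illustrated with a small pure-Python sketch. It assumes a single-channel image, stride 1, and no padding ("valid" mode); the function name and the example values are hypothetical and do not come from the paper. Note that most deep learning frameworks actually compute cross-correlation (no kernel flip) in their convolution layers, which makes no practical difference when the kernel weights are learned.

```python
def conv2d_valid(image, kernel):
    """True 2D convolution (kernel flipped), stride 1, no padding."""
    kh, kw = len(kernel), len(kernel[0])
    ih, iw = len(image), len(image[0])
    # Flip the kernel along both axes; this distinguishes convolution
    # from cross-correlation.
    flipped = [row[::-1] for row in kernel[::-1]]
    output = []
    for r in range(ih - kh + 1):
        out_row = []
        for c in range(iw - kw + 1):
            # Weighted sum of the image patch under the flipped kernel.
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * flipped[i][j]
            out_row.append(acc)
        output.append(out_row)
    return output


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
kernel = [
    [1, 0],
    [0, -1],
]
feature_map = conv2d_valid(image, kernel)  # 2x2 output, every entry is 4
```

The same four kernel weights are reused at every position of the image, which is the weight sharing behind the reduced parameter count mentioned in the abstract.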

