scholarly journals Deep Learning for Generic Object Detection: A Survey

2019 ◽  
Vol 128 (2) ◽  
pp. 261-318 ◽  
Author(s):  
Li Liu ◽  
Wanli Ouyang ◽  
Xiaogang Wang ◽  
Paul Fieguth ◽  
Jie Chen ◽  
...  

Abstract Object detection, one of the most fundamental and challenging problems in computer vision, seeks to locate object instances from a large number of predefined categories in natural images. Deep learning techniques have emerged as a powerful strategy for learning feature representations directly from data and have led to remarkable breakthroughs in the field of generic object detection. Given this period of rapid evolution, the goal of this paper is to provide a comprehensive survey of the recent achievements in this field brought about by deep learning techniques. More than 300 research contributions are included in this survey, covering many aspects of generic object detection: detection frameworks, object feature representation, object proposal generation, context modeling, training strategies, and evaluation metrics. We finish the survey by identifying promising directions for future research.

2019 ◽  
Vol 14 (4) ◽  
pp. 450-469 ◽  
Author(s):  
Jiechao Ma ◽  
Yang Song ◽  
Xi Tian ◽  
Yiting Hua ◽  
Rongguo Zhang ◽  
...  

AbstractAs a promising method in artificial intelligence, deep learning has been proven successful in several domains ranging from acoustics and images to natural language processing. With medical imaging becoming an important part of disease screening and diagnosis, deep learning-based approaches have emerged as powerful techniques in medical image areas. In this process, feature representations are learned directly and automatically from data, leading to remarkable breakthroughs in the medical field. Deep learning has been widely applied in medical imaging for improved image analysis. This paper reviews the major deep learning techniques in this time of rapid evolution and summarizes some of its key contributions and state-of-the-art outcomes. The topics include classification, detection, and segmentation tasks on medical image analysis with respect to pulmonary medical images, datasets, and benchmarks. A comprehensive overview of these methods implemented on various lung diseases consisting of pulmonary nodule diseases, pulmonary embolism, pneumonia, and interstitial lung disease is also provided. Lastly, the application of deep learning techniques to the medical image and an analysis of their future challenges and potential directions are discussed.


2021 ◽  
Vol 22 (15) ◽  
pp. 7911
Author(s):  
Eugene Lin ◽  
Chieh-Hsin Lin ◽  
Hsien-Yuan Lane

A growing body of evidence currently proposes that deep learning approaches can serve as an essential cornerstone for the diagnosis and prediction of Alzheimer’s disease (AD). In light of the latest advancements in neuroimaging and genomics, numerous deep learning models are being exploited to distinguish AD from normal controls and/or to distinguish AD from mild cognitive impairment in recent research studies. In this review, we focus on the latest developments for AD prediction using deep learning techniques in cooperation with the principles of neuroimaging and genomics. First, we narrate various investigations that make use of deep learning algorithms to establish AD prediction using genomics or neuroimaging data. Particularly, we delineate relevant integrative neuroimaging genomics investigations that leverage deep learning methods to forecast AD on the basis of incorporating both neuroimaging and genomics data. Moreover, we outline the limitations as regards to the recent AD investigations of deep learning with neuroimaging and genomics. Finally, we depict a discussion of challenges and directions for future research. The main novelty of this work is that we summarize the major points of these investigations and scrutinize the similarities and differences among these investigations.


Electronics ◽  
2021 ◽  
Vol 10 (4) ◽  
pp. 517
Author(s):  
Seong-heum Kim ◽  
Youngbae Hwang

Owing to recent advancements in deep learning methods and relevant databases, it is becoming increasingly easier to recognize 3D objects using only RGB images from single viewpoints. This study investigates the major breakthroughs and current progress in deep learning-based monocular 3D object detection. For relatively low-cost data acquisition systems without depth sensors or cameras at multiple viewpoints, we first consider existing databases with 2D RGB photos and their relevant attributes. Based on this simple sensor modality for practical applications, deep learning-based monocular 3D object detection methods that overcome significant research challenges are categorized and summarized. We present the key concepts and detailed descriptions of representative single-stage and multiple-stage detection solutions. In addition, we discuss the effectiveness of the detection models on their baseline benchmarks. Finally, we explore several directions for future research on monocular 3D object detection.


2021 ◽  
Vol 54 (4) ◽  
pp. 1-34
Author(s):  
Pengzhen Ren ◽  
Yun Xiao ◽  
Xiaojun Chang ◽  
Po-yao Huang ◽  
Zhihui Li ◽  
...  

Deep learning has made substantial breakthroughs in many fields due to its powerful automatic representation capabilities. It has been proven that neural architecture design is crucial to the feature representation of data and the final performance. However, the design of the neural architecture heavily relies on the researchers’ prior knowledge and experience. And due to the limitations of humans’ inherent knowledge, it is difficult for people to jump out of their original thinking paradigm and design an optimal model. Therefore, an intuitive idea would be to reduce human intervention as much as possible and let the algorithm automatically design the neural architecture. Neural Architecture Search ( NAS ) is just such a revolutionary algorithm, and the related research work is complicated and rich. Therefore, a comprehensive and systematic survey on the NAS is essential. Previously related surveys have begun to classify existing work mainly based on the key components of NAS: search space, search strategy, and evaluation strategy. While this classification method is more intuitive, it is difficult for readers to grasp the challenges and the landmark work involved. Therefore, in this survey, we provide a new perspective: beginning with an overview of the characteristics of the earliest NAS algorithms, summarizing the problems in these early NAS algorithms, and then providing solutions for subsequent related research work. In addition, we conduct a detailed and comprehensive analysis, comparison, and summary of these works. Finally, we provide some possible future research directions.


Electronics ◽  
2021 ◽  
Vol 10 (10) ◽  
pp. 1205
Author(s):  
Zhiyu Wang ◽  
Li Wang ◽  
Bin Dai

Object detection in 3D point clouds is still a challenging task in autonomous driving. Due to the inherent occlusion and density changes of the point cloud, the data distribution of the same object will change dramatically. Especially, the incomplete data with sparsity or occlusion can not represent the complete characteristics of the object. In this paper, we proposed a novel strong–weak feature alignment algorithm between complete and incomplete objects for 3D object detection, which explores the correlations within the data. It is an end-to-end adaptive network that does not require additional data and can be easily applied to other object detection networks. Through a complete object feature extractor, we achieve a robust feature representation of the object. It serves as a guarding feature to help the incomplete object feature generator to generate effective features. The strong–weak feature alignment algorithm reduces the gap between different states of the same object and enhances the ability to represent the incomplete object. The proposed adaptation framework is validated on the KITTI object benchmark and gets about 6% improvement in detection average precision on 3D moderate difficulty compared to the basic model. The results show that our adaptation method improves the detection performance of incomplete 3D objects.


2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Vincent Majanga ◽  
Serestina Viriri

Recent advances in medical imaging analysis, especially the use of deep learning, are helping to identify, detect, classify, and quantify patterns in radiographs. At the center of these advances is the ability to explore hierarchical feature representations learned from data. Deep learning is invaluably becoming the most sought out technique, leading to enhanced performance in analysis of medical applications and systems. Deep learning techniques have achieved great performance results in dental image segmentation. Segmentation of dental radiographs is a crucial step that helps the dentist to diagnose dental caries. The performance of these deep networks is however restrained by various challenging features of dental carious lesions. Segmentation of dental images becomes difficult due to a vast variety in topologies, intricacies of medical structures, and poor image qualities caused by conditions such as low contrast, noise, irregular, and fuzzy edges borders, which result in unsuccessful segmentation. The dental segmentation method used is based on thresholding and connected component analysis. Images are preprocessed using the Gaussian blur filter to remove noise and corrupted pixels. Images are then enhanced using erosion and dilation morphology operations. Finally, segmentation is done through thresholding, and connected components are identified to extract the Region of Interest (ROI) of the teeth. The method was evaluated on an augmented dataset of 11,114 dental images. It was trained with 10 090 training set images and tested on 1024 testing set images. The proposed method gave results of 93 % for both precision and recall values, respectively.


Sensors ◽  
2020 ◽  
Vol 20 (9) ◽  
pp. 2501 ◽  
Author(s):  
Yanan Song ◽  
Liang Gao ◽  
Xinyu Li ◽  
Weiming Shen

Deep learning is robust to the perturbation of a point cloud, which is an important data form in the Internet of Things. However, it cannot effectively capture the local information of the point cloud and recognize the fine-grained features of an object. Different levels of features in the deep learning network are integrated to obtain local information, but this strategy increases network complexity. This paper proposes an effective point cloud encoding method that facilitates the deep learning network to utilize the local information. An axis-aligned cube is used to search for a local region that represents the local information. All of the points in the local region are available to construct the feature representation of each point. These feature representations are then input to a deep learning network. Two well-known datasets, ModelNet40 shape classification benchmark and Stanford 3D Indoor Semantics Dataset, are used to test the performance of the proposed method. Compared with other methods with complicated structures, the proposed method with only a simple deep learning network, can achieve a higher accuracy in 3D object classification and semantic segmentation.


Sign in / Sign up

Export Citation Format

Share Document