scholarly journals A Two-Phase Mitosis Detection Approach Based on U-Shaped Network

2021 ◽  
Vol 2021 ◽  
pp. 1-8
Author(s):  
Wenjing Lu

This paper proposes a deep learning-based method for mitosis detection in breast histopathology images. A main problem in mitosis detection is that most of the datasets only have weak labels, i.e., only the coordinates indicating the center of the mitosis region. This makes most of the existing powerful object detection methods hardly be used in mitosis detection. Aiming at solving this problem, this paper firstly applies a CNN-based algorithm to pixelwisely segment the mitosis regions, based on which bounding boxes of mitosis are generated as strong labels. Based on the generated bounding boxes, an object detection network is trained to accomplish mitosis detection. Experimental results show that the proposed method is effective in detecting mitosis, and the accuracies outperform state-of-the-art literatures.


2019 ◽  
Vol 11 (7) ◽  
pp. 786 ◽  
Author(s):  
Yang-Lang Chang ◽  
Amare Anagaw ◽  
Lena Chang ◽  
Yi Wang ◽  
Chih-Yu Hsiao ◽  
...  

Synthetic aperture radar (SAR) imagery has been used as a promising data source for monitoring maritime activities, and its application for oil and ship detection has been the focus of many previous research studies. Many object detection methods ranging from traditional to deep learning approaches have been proposed. However, majority of them are computationally intensive and have accuracy problems. The huge volume of the remote sensing data also brings a challenge for real time object detection. To mitigate this problem a high performance computing (HPC) method has been proposed to accelerate SAR imagery analysis, utilizing the GPU based computing methods. In this paper, we propose an enhanced GPU based deep learning method to detect ship from the SAR images. The You Only Look Once version 2 (YOLOv2) deep learning framework is proposed to model the architecture and training the model. YOLOv2 is a state-of-the-art real-time object detection system, which outperforms Faster Region-Based Convolutional Network (Faster R-CNN) and Single Shot Multibox Detector (SSD) methods. Additionally, in order to reduce computational time with relatively competitive detection accuracy, we develop a new architecture with less number of layers called YOLOv2-reduced. In the experiment, we use two types of datasets: A SAR ship detection dataset (SSDD) dataset and a Diversified SAR Ship Detection Dataset (DSSDD). These two datasets were used for training and testing purposes. YOLOv2 test results showed an increase in accuracy of ship detection as well as a noticeable reduction in computational time compared to Faster R-CNN. From the experimental results, the proposed YOLOv2 architecture achieves an accuracy of 90.05% and 89.13% on the SSDD and DSSDD datasets respectively. The proposed YOLOv2-reduced architecture has a similarly competent detection performance as YOLOv2, but with less computational time on a NVIDIA TITAN X GPU. The experimental results shows that the deep learning can make a big leap forward in improving the performance of SAR image ship detection.



2018 ◽  
Vol 232 ◽  
pp. 04036
Author(s):  
Jun Yin ◽  
Huadong Pan ◽  
Hui Su ◽  
Zhonggeng Liu ◽  
Zhirong Peng

We propose an object detection method that predicts the orientation bounding boxes (OBB) to estimate objects locations, scales and orientations based on YOLO (You Only Look Once), which is one of the top detection algorithms performing well both in accuracy and speed. Horizontal bounding boxes(HBB), which are not robust to orientation variances, are used in the existing object detection methods to detect targets. The proposed orientation invariant YOLO (OIYOLO) detector can effectively deal with the bird’s eye viewpoint images where the orientation angles of the objects are arbitrary. In order to estimate the rotated angle of objects, we design a new angle loss function. Therefore, the training of OIYOLO forces the network to learn the annotated orientation angle of objects, making OIYOLO orientation invariances. The proposed approach that predicts OBB can be applied in other detection frameworks. In additional, to evaluate the proposed OIYOLO detector, we create an UAV-DAHUA datasets that annotated with objects locations, scales and orientation angles accurately. Extensive experiments conducted on UAV-DAHUA and DOTA datasets demonstrate that OIYOLO achieves state-of-the-art detection performance with high efficiency comparing with the baseline YOLO algorithms.



2020 ◽  
Author(s):  
◽  
Yang Liu

[ACCESS RESTRICTED TO THE UNIVERSITY OF MISSOURI AT REQUEST OF AUTHOR.] With the rapid development of deep learning in computer vision, especially deep convolutional neural networks (CNNs), significant advances have been made in recent years on object recognition and detection in images. Highly accurate detection results have been achieved for large objects, whereas detection accuracy on small objects remains to be low. This dissertation focuses on investigating deep learning methods for small object detection in images and proposing new methods with improved performance. First, we conducted a comprehensive review of existing deep learning methods for small object detections, in which we summarized and categorized major techniques and models, identified major challenges, and listed some future research directions. Existing techniques were categorized into using contextual information, combining multiple feature maps, creating sufficient positive examples, and balancing foreground and background examples. Methods developed in four related areas, generic object detection, face detection, object detection in aerial imagery, and segmentation, were summarized and compared. In addition, the performances of several leading deep learning methods for small object detection, including YOLOv3, Faster R-CNN, and SSD, were evaluated based on three large benchmark image datasets of small objects. Experimental results showed that Faster R-CNN performed the best, while YOLOv3 was a close second. Furthermore, a new deep learning method, called Retina-context Net, was proposed and outperformed state-of-the art one-stage deep learning models, including SSD, YOLOv3 and RetinaNet, on the COCO and SUN benchmark datasets. Secondly, we created a new dataset for bird detection, called Little Birds in Aerial Imagery (LBAI), from real-life aerial imagery. LBAI contains birds with sizes ranging from 10 by 10 pixels to 40 by 40 pixels. We adapted and applied several state-of-the-art deep learning models to LBAI, including object detection models such as YOLOv2, SSH, and Tiny Face, and instance segmentation models such as U-Net and Mask R-CNN. Our empirical results illustrated the strength and weakness of these methods, showing that SSH performed the best for easy cases, whereas Tiny Face performed the best for hard cases with cluttered backgrounds. Among small instance segmentation methods, U-Net achieved slightly better performance than Mask R-CNN. Thirdly, we proposed a new graph neural network-based object detection algorithm, called GODM, to take the spatial information of candidate objects into consideration in small object detection. Instead of detecting small objects independently as the existing deep learning methods do, GODM treats the candidate bounding boxes generated by existing object detectors as nodes and creates edges based on the spatial or semantic relationship between the candidate bounding boxes. GODM contains four major components: node feature generation, graph generation, node class labelling, and graph convolutional neural network model. Several graph generation methods were proposed. Experimental results on the LBDA dataset show that GODM outperformed existing state-of-the-art object detector Faster R-CNN significantly, up to 12% better in accuracy. Finally, we proposed a new computer vision-based grass analysis using machine learning. To deal with the variation of lighting condition, a two-stage segmentation strategy is proposed for grass coverage computation based on a blackboard background. On a real world dataset we collected from natural environments, the proposed method was robust to varying environments, lighting, and colors. For grass detection and coverage computation, the error rate was just 3%.



2018 ◽  
Vol 8 (9) ◽  
pp. 1423 ◽  
Author(s):  
Cong Tang ◽  
Yongshun Ling ◽  
Xing Yang ◽  
Wei Jin ◽  
Chao Zheng

A multi-view object detection approach based on deep learning is proposed in this paper. Classical object detection methods based on regression models are introduced, and the reasons for their weak ability to detect small objects are analyzed. To improve the performance of these methods, a multi-view object detection approach is proposed, and the model structure and working principles of this approach are explained. Additionally, the object retrieval ability and object detection accuracy of both the multi-view methods and the corresponding classical methods are evaluated and compared based on a test on a small object dataset. The experimental results show that in terms of object retrieval capability, Multi-view YOLO (You Only Look Once: Unified, Real-Time Object Detection), Multi-view YOLOv2 (based on an updated version of YOLO), and Multi-view SSD (Single Shot Multibox Detector) achieve AF (average F-measure) scores that are higher than those of their classical counterparts by 0.177, 0.06, and 0.169, respectively. Moreover, in terms of the detection accuracy, when difficult objects are not included, the mAP (mean average precision) scores of the multi-view methods are higher than those of the classical methods by 14.3%, 7.4%, and 13.1%, respectively. Thus, the validity of the approach proposed in this paper has been verified. In addition, compared with state-of-the-art methods based on region proposals, multi-view detection methods are faster while achieving mAPs that are approximately the same in small object detection.



2021 ◽  
Vol 12 (1) ◽  
pp. 31-39
Author(s):  
D. D. Rukhovich ◽  

Deep learning-based detectors usually produce a redundant set of object bounding boxes including many duplicate detections of the same object. These boxes are then filtered using non-maximum suppression (NMS) in order to select exactly one bounding box per object of interest. This greedy scheme is simple and provides sufficient accuracy for isolated objects but often fails in crowded environments, since one needs to both preserve boxes for different objects and suppress duplicate detections. In this work we develop an alternative iterative scheme, where a new subset of objects is detected at each iteration. Detected boxes from the previous iterations are passed to the network at the following iterations to ensure that the same object would not be detected twice. This iterative scheme can be applied to both one-stage and two-stage object detectors with just minor modifications of the training and inference proce­dures. We perform extensive experiments with two different baseline detectors on four datasets and show significant improvement over the baseline, leading to state-of-the-art performance on CrowdHuman and WiderPerson datasets.



Author(s):  
M. N. Favorskaya ◽  
L. C. Jain

Introduction:Saliency detection is a fundamental task of computer vision. Its ultimate aim is to localize the objects of interest that grab human visual attention with respect to the rest of the image. A great variety of saliency models based on different approaches was developed since 1990s. In recent years, the saliency detection has become one of actively studied topic in the theory of Convolutional Neural Network (CNN). Many original decisions using CNNs were proposed for salient object detection and, even, event detection.Purpose:A detailed survey of saliency detection methods in deep learning era allows to understand the current possibilities of CNN approach for visual analysis conducted by the human eyes’ tracking and digital image processing.Results:A survey reflects the recent advances in saliency detection using CNNs. Different models available in literature, such as static and dynamic 2D CNNs for salient object detection and 3D CNNs for salient event detection are discussed in the chronological order. It is worth noting that automatic salient event detection in durable videos became possible using the recently appeared 3D CNN combining with 2D CNN for salient audio detection. Also in this article, we have presented a short description of public image and video datasets with annotated salient objects or events, as well as the often used metrics for the results’ evaluation.Practical relevance:This survey is considered as a contribution in the study of rapidly developed deep learning methods with respect to the saliency detection in the images and videos.



Electronics ◽  
2021 ◽  
Vol 10 (4) ◽  
pp. 517
Author(s):  
Seong-heum Kim ◽  
Youngbae Hwang

Owing to recent advancements in deep learning methods and relevant databases, it is becoming increasingly easier to recognize 3D objects using only RGB images from single viewpoints. This study investigates the major breakthroughs and current progress in deep learning-based monocular 3D object detection. For relatively low-cost data acquisition systems without depth sensors or cameras at multiple viewpoints, we first consider existing databases with 2D RGB photos and their relevant attributes. Based on this simple sensor modality for practical applications, deep learning-based monocular 3D object detection methods that overcome significant research challenges are categorized and summarized. We present the key concepts and detailed descriptions of representative single-stage and multiple-stage detection solutions. In addition, we discuss the effectiveness of the detection models on their baseline benchmarks. Finally, we explore several directions for future research on monocular 3D object detection.



2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Changyong Li ◽  
Yongxian Fan ◽  
Xiaodong Cai

Abstract Background With the development of deep learning (DL), more and more methods based on deep learning are proposed and achieve state-of-the-art performance in biomedical image segmentation. However, these methods are usually complex and require the support of powerful computing resources. According to the actual situation, it is impractical that we use huge computing resources in clinical situations. Thus, it is significant to develop accurate DL based biomedical image segmentation methods which depend on resources-constraint computing. Results A lightweight and multiscale network called PyConvU-Net is proposed to potentially work with low-resources computing. Through strictly controlled experiments, PyConvU-Net predictions have a good performance on three biomedical image segmentation tasks with the fewest parameters. Conclusions Our experimental results preliminarily demonstrate the potential of proposed PyConvU-Net in biomedical image segmentation with resources-constraint computing.



Author(s):  
Jwalin Bhatt ◽  
Khurram Azeem Hashmi ◽  
Muhammad Zeshan Afzal ◽  
Didier Stricker

In any document, graphical elements like tables, figures, and formulas contain essential information. The processing and interpretation of such information require specialized algorithms. Off-the-shelf OCR components cannot process this information reliably. Therefore, an essential step in document analysis pipelines is to detect these graphical components. It leads to a high-level conceptual understanding of the documents that makes digitization of documents viable. Since the advent of deep learning, the performance of deep learning-based object detection has improved many folds. In this work, we outline and summarize the deep learning approaches for detecting graphical page objects in the document images. Therefore, we discuss the most relevant deep learning-based approaches and state-of-the-art graphical page object detection in document images. This work provides a comprehensive understanding of the current state-of-the-art and related challenges. Furthermore, we discuss leading datasets along with the quantitative evaluation. Moreover, it discusses briefly the promising directions that can be utilized for further improvements.



2021 ◽  
Vol 23 (06) ◽  
pp. 47-57
Author(s):  
Aditya Kulkarni ◽  
◽  
Manali Munot ◽  
Sai Salunkhe ◽  
Shubham Mhaske ◽  
...  

With the development in technologies right from serial to parallel computing, GPU, AI, and deep learning models a series of tools to process complex images have been developed. The main focus of this research is to compare various algorithms(pre-trained models) and their contributions to process complex images in terms of performance, accuracy, time, and their limitations. The pre-trained models we are using are CNN, R-CNN, R-FCN, and YOLO. These models are python language-based and use libraries like TensorFlow, OpenCV, and free image databases (Microsoft COCO and PAS-CAL VOC 2007/2012). These not only aim at object detection but also on building bounding boxes around appropriate locations. Thus, by this review, we get a better vision of these models and their performance and a good idea of which models are ideal for various situations.



Sign in / Sign up

Export Citation Format

Share Document