DNS: A multi-scale deconvolution semantic segmentation network for joint detection and segmentation

2019 ◽  
Vol 277 ◽  
pp. 02005
Author(s):  
Ning Feng ◽  
Le Dong ◽  
Qianni Zhang ◽  
Ning Zhang ◽  
Xi Wu ◽  
...  

Real-time semantic segmentation has become crucial in many applications such as medical image analysis and autonomous driving. In this paper, we introduce a single semantic segmentation network, called DNS, for the joint object detection and segmentation task. We take advantage of a multi-scale deconvolution mechanism to perform real-time computation. To this end, down-scale and up-scale streams are utilized to combine multi-scale features for the final detection and segmentation task. The proposed DNS addresses not only the trade-off between accuracy and cost but also the balance between detection and segmentation performance. Experimental results on the PASCAL VOC datasets show competitive performance on the joint object detection and segmentation task.
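The down-scale/up-scale stream idea described above can be illustrated with a minimal encoder-decoder sketch in PyTorch. The layer widths, depths, and the additive fusion of the two streams here are illustrative assumptions, not the DNS paper's actual architecture.

```python
# Minimal sketch of a down-scale / up-scale (multi-scale deconvolution) network.
# Layer widths and the fusion strategy are illustrative assumptions, not DNS itself.
import torch
import torch.nn as nn

class MultiScaleDeconvNet(nn.Module):
    def __init__(self, in_ch=3, num_classes=21):
        super().__init__()
        # Down-scale stream: strided convolutions produce features at 1/2 and 1/4 resolution.
        self.down1 = nn.Sequential(nn.Conv2d(in_ch, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True))
        self.down2 = nn.Sequential(nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True))
        # Up-scale stream: deconvolutions (transposed convolutions) restore resolution.
        self.up2 = nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1)
        self.up1 = nn.ConvTranspose2d(32, 32, 4, stride=2, padding=1)
        # Per-pixel segmentation head on the fused multi-scale features.
        self.seg_head = nn.Conv2d(32, num_classes, 1)

    def forward(self, x):
        f1 = self.down1(x)        # 1/2 resolution
        f2 = self.down2(f1)       # 1/4 resolution
        u2 = self.up2(f2) + f1    # fuse up-scaled deep features with the shallow stream
        u1 = self.up1(u2)         # back to full resolution
        return self.seg_head(u1)  # per-pixel class scores

if __name__ == "__main__":
    logits = MultiScaleDeconvNet()(torch.randn(1, 3, 128, 128))
    print(logits.shape)           # torch.Size([1, 21, 128, 128])
```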

Sensors ◽  
2020 ◽  
Vol 20 (18) ◽  
pp. 5080
Author(s):  
Baohua Qiang ◽  
Ruidong Chen ◽  
Mingliang Zhou ◽  
Yuanchao Pang ◽  
Yijie Zhai ◽  
...  

In recent years, an increasing amount of image data has come from various sensors, and object detection plays a vital role in image understanding. For object detection in complex scenes, more detailed information in the image should be obtained to improve the accuracy of the detection task. In this paper, we propose an object detection algorithm for images that jointly exploits semantic segmentation (SSOD). First, we construct a feature extraction network that integrates an hourglass structure network with an attention mechanism layer to extract and fuse multi-scale features, generating high-level features with rich semantic information. Second, the semantic segmentation task is used as an auxiliary task so that the algorithm performs multi-task learning. Finally, multi-scale features are used to predict the location and category of the object. The experimental results show that our algorithm substantially enhances object detection performance, consistently outperforms the three comparison algorithms, and reaches real-time detection speed, making it suitable for real-time detection.
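The multi-task idea, with a semantic segmentation head acting as an auxiliary task alongside the detection heads, can be sketched as follows. The head shapes and the fixed loss weight are assumptions for illustration, not the SSOD configuration.

```python
# Sketch of multi-task learning with segmentation as an auxiliary task for detection.
# The dummy heads and the 0.5 loss weight are illustrative assumptions, not SSOD itself.
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointDetSegHead(nn.Module):
    def __init__(self, feat_ch=256, num_classes=20, num_anchors=9):
        super().__init__()
        # Detection heads: per-anchor class scores and box offsets.
        self.cls_head = nn.Conv2d(feat_ch, num_anchors * num_classes, 3, padding=1)
        self.box_head = nn.Conv2d(feat_ch, num_anchors * 4, 3, padding=1)
        # Auxiliary semantic segmentation head sharing the same features (+1 for background).
        self.seg_head = nn.Conv2d(feat_ch, num_classes + 1, 1)

    def forward(self, feats):
        return self.cls_head(feats), self.box_head(feats), self.seg_head(feats)

def joint_loss(det_cls_loss, det_box_loss, seg_logits, seg_target, seg_weight=0.5):
    # The auxiliary segmentation loss is added to the detection losses with an assumed fixed weight.
    seg_loss = F.cross_entropy(seg_logits, seg_target)
    return det_cls_loss + det_box_loss + seg_weight * seg_loss
```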


2021 ◽  
Vol 13 (12) ◽  
pp. 307
Author(s):  
Vijayakumar Varadarajan ◽  
Dweepna Garg ◽  
Ketan Kotecha

Deep learning is a relatively new branch of machine learning in which computers are taught to recognize patterns in massive volumes of data. It primarily describes learning at various levels of representation, which aids in understanding data that includes text, voice, and visuals. Convolutional neural networks have been used to solve challenges in computer vision, including object identification, image classification, semantic segmentation, and more. Object detection in videos involves confirming the presence of the object in the image or video and then locating it accurately for recognition. Video modelling techniques suffer from high computation and memory costs, which can reduce accuracy and efficiency when identifying objects in real time. The current object detection technique based on a deep convolutional neural network requires executing multilevel convolution and pooling operations on the entire image to extract deep semantic properties from it. Detection models can provide superior results for large objects; however, they fail on objects of varying sizes that have low resolution and are strongly affected by noise, because the features produced by the repeated convolution operations of existing models do not fully represent the essential characteristics of such objects in real time. With the help of multi-scale anchor boxes, the proposed approach enhances detection accuracy by extracting features of the object at multiple convolution levels. The major contribution of this paper is a model designed to better understand the parameters and hyper-parameters that affect the detection and recognition of objects of varying sizes and shapes, and to achieve real-time object detection and recognition speeds while improving accuracy. The proposed model achieved 84.49 mAP on the test set of the Pascal VOC-2007 dataset at 11 FPS, which compares favourably with other real-time object detection models.
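For context, generating anchor boxes at multiple scales and aspect ratios for each feature-map cell can be sketched as below. The stride, scales, and ratios are placeholders, not the settings used in the paper.

```python
# Sketch: generating multi-scale anchor boxes for each cell of a feature map.
# Scales, aspect ratios, and the feature-map stride are illustrative placeholders.
import itertools

def generate_anchors(feat_h, feat_w, stride=16,
                     scales=(64, 128, 256), ratios=(0.5, 1.0, 2.0)):
    """Return a list of (x1, y1, x2, y2) anchors in image coordinates."""
    anchors = []
    for i, j in itertools.product(range(feat_h), range(feat_w)):
        cx, cy = (j + 0.5) * stride, (i + 0.5) * stride  # cell center in the image
        for s, r in itertools.product(scales, ratios):
            w, h = s * (r ** 0.5), s / (r ** 0.5)        # keep area ~= s*s across ratios
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors

if __name__ == "__main__":
    boxes = generate_anchors(feat_h=2, feat_w=2)
    print(len(boxes))   # 2 * 2 cells * 3 scales * 3 ratios = 36 anchors
```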


2021 ◽  
Vol 13 (9) ◽  
pp. 1670
Author(s):  
Danilo Avola ◽  
Luigi Cinque ◽  
Anxhelo Diko ◽  
Alessio Fagioli ◽  
Gian Luca Foresti ◽  
...  

Tracking objects across multiple video frames is a challenging task due to several difficult issues such as occlusions, background clutter, lighting, and object and camera view-point variations, which directly affect object detection. These aspects are even more pronounced when analyzing images from unmanned aerial vehicles (UAVs), where the vehicle movement can also impact image quality. A common strategy to address these issues is to analyze the input images at different scales to obtain as much information as possible to correctly detect and track objects across video sequences. Following this rationale, in this paper we introduce a simple yet effective multi-stream (MS) architecture, where different kernel sizes are applied to each stream to simulate a multi-scale image analysis. The proposed architecture is then used as the backbone of the well-known Faster R-CNN pipeline, defining an MS-Faster R-CNN object detector that consistently detects objects in video sequences. Subsequently, this detector is used jointly with the Simple Online and Real-time Tracking with a Deep Association Metric (Deep SORT) algorithm to achieve real-time tracking on UAV images. To assess the presented architecture, extensive experiments were performed on the UMCD, UAVDT, UAV20L, and UAV123 datasets. The presented pipeline achieved state-of-the-art performance, confirming that the proposed multi-stream method can correctly emulate the robust multi-scale image analysis paradigm.
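A multi-stream block, where parallel streams with different kernel sizes emulate multi-scale analysis, can be sketched as below. The channel counts, number of streams, and concatenation-based fusion are assumptions rather than the MS-Faster R-CNN backbone specification.

```python
# Sketch of a multi-stream (MS) block: parallel convolutions with different kernel
# sizes emulate multi-scale analysis; fusion by concatenation + 1x1 conv is an assumption.
import torch
import torch.nn as nn

class MultiStreamBlock(nn.Module):
    def __init__(self, in_ch=64, stream_ch=32, kernel_sizes=(3, 5, 7)):
        super().__init__()
        # One stream per kernel size; padding keeps the spatial size unchanged.
        self.streams = nn.ModuleList(
            nn.Sequential(nn.Conv2d(in_ch, stream_ch, k, padding=k // 2), nn.ReLU(inplace=True))
            for k in kernel_sizes
        )
        # 1x1 convolution fuses the concatenated streams back to in_ch channels.
        self.fuse = nn.Conv2d(stream_ch * len(kernel_sizes), in_ch, 1)

    def forward(self, x):
        return self.fuse(torch.cat([s(x) for s in self.streams], dim=1))

if __name__ == "__main__":
    y = MultiStreamBlock()(torch.randn(1, 64, 56, 56))
    print(y.shape)   # torch.Size([1, 64, 56, 56])
```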


2020 ◽  
Author(s):  
Ali Hatamizadeh ◽  
Demetri Terzopoulos ◽  
Andriy Myronenko

Textures and edges contribute different information to image recognition. Edges and boundaries encode shape information, while textures manifest the appearance of regions. Despite the success of Convolutional Neural Networks (CNNs) in computer vision and medical image analysis applications, predominantly only texture abstractions are learned, which often leads to imprecise boundary delineations. In medical imaging, expert manual segmentation often relies on organ boundaries; for example, to manually segment a liver, a medical practitioner usually identifies edges first and subsequently fills in the segmentation mask. Motivated by these observations, we propose a plug-and-play module, dubbed Edge-Gated CNNs (EG-CNNs), that can be used with existing encoder-decoder architectures to process both edge and texture information. The EG-CNN learns to emphasize the edges in the encoder, to predict crisp boundaries via auxiliary edge supervision, and to fuse its output with the original CNN output. We evaluate the effectiveness of the EG-CNN with various mainstream CNNs on two publicly available datasets, BraTS 19 and KiTS 19, for brain tumor and kidney semantic segmentation. We demonstrate that adding the EG-CNN consistently improves segmentation accuracy and generalization performance.
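The auxiliary edge supervision mentioned above can be illustrated with a short sketch. Deriving edge targets from the segmentation mask via a morphological gradient and the fixed loss weight are assumptions for illustration, not the EG-CNN module itself.

```python
# Sketch: auxiliary edge supervision next to a segmentation loss. Deriving edge
# targets via max-pooling and the 0.3 loss weight are illustrative assumptions.
import torch
import torch.nn.functional as F

def mask_to_edges(mask, kernel=3):
    """Binary edge map of an (N, H, W) integer segmentation mask via morphological gradient."""
    m = mask.float().unsqueeze(1)                                 # (N, 1, H, W)
    dilated = F.max_pool2d(m, kernel, stride=1, padding=kernel // 2)
    eroded = -F.max_pool2d(-m, kernel, stride=1, padding=kernel // 2)
    return (dilated != eroded).float()                            # 1 where labels change

def seg_edge_loss(seg_logits, edge_logits, mask, edge_weight=0.3):
    # Main segmentation loss plus an auxiliary binary edge loss on the edge-stream output.
    seg_loss = F.cross_entropy(seg_logits, mask)
    edge_loss = F.binary_cross_entropy_with_logits(edge_logits, mask_to_edges(mask))
    return seg_loss + edge_weight * edge_loss
```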


2020 ◽  
Vol 13 (5) ◽  
pp. 999-1007
Author(s):  
Karthikeyan Periyasami ◽  
Arul Xavier Viswanathan Mariammal ◽  
Iwin Thanakumar Joseph ◽  
Velliangiri Sarveshwaran

Background: Medical image analysis applications have complex resource requirements, and scheduling them on grid resources is a complex task. A new model is needed to improve the breast cancer screening process. The proposed novel meta-scheduler algorithm allocates image analysis applications to local schedulers; each local scheduler submits the job to a grid node, which analyses the medical image and sends the result back to the meta-scheduler. Meta-schedulers are distinct from local schedulers, but both aim at resource allocation and management.
Objective: The main objective of the CDAM meta-scheduler is to maximize the number of jobs accepted.
Methods: Initially, the user sends jobs with deadlines to the global grid resource broker. Resource providers send information about the available resources connected to the network, such as the valuation of each resource and the number of free resources, to the global grid resource broker at fixed intervals. CDAM requests the available resource details and user jobs from the global grid resource broker and, after receiving this information, matches jobs with resources. CDAM sends jobs to the local scheduler, which schedules them on the local grid site. The local grid site executes the jobs and sends the results back to CDAM. On successful completion, the job status and resource status are updated in the auction history database. CDAM collects the results from all local grid sites and returns them to the grid users.
Results: CDAM was simulated using a grid simulator. As the number of jobs increases, the percentage of jobs accepted decreases due to the scarcity of resources. CDAM provides 2% to 5% better results than the Fair Share meta-scheduling algorithm. In CDAM, the bid density value is generated from the user requirement and user history, and the ask value is generated from the resource details. Users with the most significant deadlines generate the highest bid values, and grid resources with the fastest processors generate the lowest ask values. The highest bid is assigned to the lowest ask, meaning the user with the most significant deadline is assigned the grid resource with the fastest processor. The deadline represents the time by which the user requires the result. The user can define the deadline by which the results are needed, and CDAM tries to find the fastest available resource to meet the user-defined deadline. If the scheduler detects that the tasks cannot be completed before the deadline, it abandons the current resource and tries the next fastest resource, repeating until the application can complete within the deadline. CDAM provides 25% better results than the GridWay meta-scheduler because GridWay allocates jobs to resources on a first-come, first-served basis.
Conclusion: The proposed CDAM model was validated through simulation and evaluated based on the number of jobs accepted. The experimental results clearly show that the CDAM model accepts more jobs than a conventional meta-scheduler. We conclude that CDAM is a highly effective meta-scheduling system and can be used in extraordinary situations where jobs have combinatorial requirements.
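The bid/ask matching described in the Results can be illustrated with a small Python sketch: the highest bid (most urgent deadline) is paired with the lowest ask (fastest resource), and a job is accepted only if its deadline can be met. The bid and ask formulas here are simple placeholders, not CDAM's bid-density computation, and the deadline fallback loop is omitted.

```python
# Sketch of the auction-style matching described above: sort user bids descending and
# resource asks ascending, then pair them. The bid/ask formulas are placeholders,
# not CDAM's actual bid-density computation.

def match_jobs_to_resources(jobs, resources):
    """jobs: list of (job_id, deadline_s); resources: list of (res_id, seconds_per_job)."""
    # Tighter deadline -> higher bid; faster processor -> lower ask (assumed mapping).
    bids = sorted(jobs, key=lambda j: j[1])        # most urgent deadline first
    asks = sorted(resources, key=lambda r: r[1])   # fastest resource first
    schedule = []
    for (job_id, deadline), (res_id, runtime) in zip(bids, asks):
        if runtime <= deadline:                    # accept only if the deadline can be met
            schedule.append((job_id, res_id))
    return schedule

if __name__ == "__main__":
    jobs = [("j1", 30), ("j2", 10), ("j3", 40)]
    resources = [("slow", 60), ("fast", 5), ("medium", 20)]
    print(match_jobs_to_resources(jobs, resources))  # [('j2', 'fast'), ('j1', 'medium')]
```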

