Deep Features Homography Transformation Fusion Network—A Universal Foreground Segmentation Algorithm for PTZ Cameras and a Comparative Study

Sensors ◽  
2020 ◽  
Vol 20 (12) ◽  
pp. 3420
Author(s):  
Ye Tao ◽  
Zhihao Ling

Foreground segmentation is a crucial first step for many video analysis methods, such as action recognition and object tracking. In the past five years, foreground segmentation methods based on convolutional neural networks have achieved major breakthroughs. However, most of them focus on stationary cameras and have limited performance on pan–tilt–zoom (PTZ) cameras. In this paper, an end-to-end foreground segmentation method based on a deep features homography transformation and fusion network (HTFnetSeg) is proposed for surveillance videos recorded by PTZ cameras. At the core of HTFnetSeg is the combination of an unsupervised semantic attention homography estimation network (SAHnet) for frame alignment and a spatial transformed deep features fusion network (STDFFnet) for segmentation. The semantic attention mask in SAHnet pushes the network to focus on background alignment by reducing the noise that comes from the foreground. STDFFnet reuses the deep features extracted during the semantic attention mask generation step: it aligns the features, rather than only the frames, with a spatial transformation technique in order to reduce the algorithm complexity. Additionally, a conservative strategy is proposed for the motion map based post-processing step to further reduce the false positives caused by semantic noise. Experiments on both CDnet2014 and Lasiesta show that our method outperforms many state-of-the-art methods, quantitatively and qualitatively.
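
As a rough illustration of the frame-alignment idea mentioned in this abstract, the sketch below maps pixel coordinates through a 3x3 homography in plain Python. It is not SAHnet, which estimates the homography with a network; the function name `apply_homography` and both example matrices are assumptions made up for this example.

```python
def apply_homography(H, point):
    """Map a 2D point through a 3x3 homography H (nested lists, row-major)."""
    x, y = point
    # Lift to homogeneous coordinates (x, y, 1) and multiply by H.
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    # Divide by the last coordinate to return to pixel coordinates.
    return (xh / w, yh / w)


# Hypothetical homography: a pure translation by (5, -3) pixels,
# roughly what a small camera pan could look like between two frames.
H_PAN = [[1.0, 0.0, 5.0],
         [0.0, 1.0, -3.0],
         [0.0, 0.0, 1.0]]

# Hypothetical homography with a projective term in the last row,
# so the scale factor w depends on the x coordinate.
H_PROJ = [[1.0, 0.0, 0.0],
          [0.0, 1.0, 0.0],
          [0.001, 0.0, 1.0]]

if __name__ == "__main__":
    print(apply_homography(H_PAN, (10, 20)))
    print(apply_homography(H_PROJ, (100, 50)))
```

Warping every background pixel of one frame this way brings it into the coordinate system of another frame, so that the remaining differences are more likely to come from foreground motion than from camera motion.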

2012 ◽  
Vol 3 (2) ◽  
pp. 253-255
Author(s):  
Raman Brar

Image segmentation plays a vital role in many medical imaging applications by assisting the delineation of physiological structures and other regions. The objective of this research work is to segment human lung MRI (Magnetic Resonance Imaging) images for early detection of cancer. The Watershed Transform technique is implemented as the segmentation method in this work. Comparative experiments using both the directly applied watershed algorithm and the watershed algorithm applied after marking the foreground and computing the background show improved lung segmentation accuracy in some image cases.


2020 ◽  
Vol 961 (7) ◽  
pp. 47-55
Author(s):  
A.G. Yunusov ◽  
A.J. Jdeed ◽  
N.S. Begliarov ◽  
M.A. Elshewy

Laser scanning is considered one of the most useful and fastest technologies for modelling. However, the size of scan results can vary from hundreds to several million points. As a result, the large volume of the obtained point clouds complicates processing and increases time costs. One way to reduce the volume of a point cloud is segmentation, which reduces the amount of data from several million points to a limited number of segments. In this article, we evaluated the effect of density changes on the performance and accuracy of various segmentation methods and on the geometric accuracy of the obtained models, taking processing time into account. The results of our experiment were compared with reference data in a comparative analysis. In conclusion, some recommendations for choosing the best segmentation method are proposed.


Author(s):  
S. Elavaar Kuzhali ◽  
D. S. Suresh

Image denoising is considered a fundamental pre-processing step when handling digital images for various applications. Diverse image denoising algorithms have been introduced in the past few decades. The main intent of this proposal is to develop an effective image denoising model on the basis of internal and external patches. This model adopts Non-local means (NLM) for denoising, which uses redundant information of the image in the pixel or spatial domain to reduce the noise. While performing the image denoising using NLM, “denoising an image patch using the other noisy patches within the noisy image is done for internal denoising and denoising a patch using the external clean natural patches is done for external denoising”. Here, the selection of the optimal block from the entire dataset, including internal noisy images and external clean natural images, is decided by a new hybrid optimization algorithm. Two well-known optimization algorithms, Chicken Swarm Optimization (CSO) and Dragon Fly Algorithm (DA), are merged, and the resulting hybrid algorithm, Rooster-based Levy Updated DA (RLU-DA), is adopted. The experimental results, in terms of several relevant performance measures, show that the proposed model is promising, with remarkable stability and high accuracy.
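
To make the NLM idea concrete, here is a minimal sketch of plain non-local means on a 1D signal, using only patches from the noisy signal itself (the "internal" case). It does not include external clean patches or the RLU-DA block selection described in the abstract; the function name `nlm_denoise_1d` and the parameters `patch_radius`, `search_radius` and `h` are assumptions chosen for illustration.

```python
import math


def nlm_denoise_1d(signal, patch_radius=1, search_radius=3, h=0.5):
    """Denoise a 1D signal with a basic non-local means filter."""
    n = len(signal)

    def patch(center):
        # Extract a patch around `center`, clamping indices at the borders.
        return [signal[min(max(center + k, 0), n - 1)]
                for k in range(-patch_radius, patch_radius + 1)]

    out = []
    for i in range(n):
        p_i = patch(i)
        weight_sum = 0.0
        value_sum = 0.0
        lo = max(0, i - search_radius)
        hi = min(n, i + search_radius + 1)
        for j in range(lo, hi):
            p_j = patch(j)
            # Mean squared difference between the two patches.
            d2 = sum((a - b) ** 2 for a, b in zip(p_i, p_j)) / len(p_i)
            # Similar patches get weights near 1, dissimilar ones near 0.
            w = math.exp(-d2 / (h * h))
            weight_sum += w
            value_sum += w * signal[j]
        # weight_sum is at least 1 because the patch at i matches itself.
        out.append(value_sum / weight_sum)
    return out


if __name__ == "__main__":
    noisy = [1.0, 1.2, 0.9, 1.1, 5.0, 5.2, 4.9, 5.1]
    print(nlm_denoise_1d(noisy))
```

Each output sample is a weighted average of samples whose surrounding patches look alike, which is why samples on one side of a sharp step contribute almost nothing to the other side.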


2021 ◽  
Author(s):  
Dmitri Ignakov

A vision system is an integral component of many autonomous robots. It enables the robot to perform essential tasks such as mapping, localization, or path planning. A vision system also assists with guiding the robot's grasping and manipulation tasks. As increasing demand is placed on service robots to operate in uncontrolled environments, advanced vision systems must be created that can function effectively in visually complex and cluttered settings. This thesis presents the development of segmentation algorithms to assist in online model acquisition for guiding robotic manipulation tasks. Specifically, the focus is placed on localizing door handles to assist in robotic door opening, and on acquiring partial object models to guide robotic grasping. First, a method for localizing a door handle of unknown geometry based on a proposed 3D segmentation method is presented. Following segmentation, localization is performed by fitting a simple box model to the segmented handle. The proposed method functions without requiring assumptions about the appearance of the handle or the door, and without a geometric model of the handle. Next, an object segmentation algorithm is developed, which combines multiple appearance (intensity and texture) and geometric (depth and curvature) cues. The algorithm is able to segment objects without utilizing any a priori appearance or geometric information in visually complex and cluttered environments. The segmentation method is based on the Conditional Random Fields (CRF) framework, and the graph cuts energy minimization technique. A simple and efficient method for initializing the proposed algorithm which overcomes graph cuts' reliance on user interaction is also developed. Finally, an improved segmentation algorithm is developed which incorporates a distance metric learning (DML) step as a means of weighting various appearance and geometric segmentation cues, allowing the method to better adapt to the available data.
The improved method also models the distribution of 3D points in space as a distribution of algebraic distances from an ellipsoid fitted to the object, improving the method's ability to predict which points are likely to belong to the object or the background. Experimental validation of all methods is performed. Each method is evaluated in a realistic setting, utilizing scenarios of various complexities. Experimental results have demonstrated the effectiveness of the handle localization method, and the object segmentation methods.


2018 ◽  
Vol 7 (2.5) ◽  
pp. 77
Author(s):  
Anis Farihan Mat Raffei ◽  
Rohayanti Hassan ◽  
Shahreen Kasim ◽  
Hishamudin Asmuni ◽  
Asraful Syifaa’ Ahmad ◽  
...  

The quality of eye image data becomes degraded, particularly when the image is taken in a non-cooperative acquisition environment such as under visible wavelength illumination. Consequently, this environmental condition may lead to noisy eye images and incorrect localization of limbic and pupillary boundaries, and eventually degrade the performance of the iris recognition system. Hence, this study compared several segmentation methods to address these issues. The results show that the Circular Hough transform is the best segmentation method, with the best overall accuracy, error rate and decidability index, and is more tolerant of ‘noise’ such as reflection.


Sensors ◽  
2020 ◽  
Vol 20 (17) ◽  
pp. 4979
Author(s):  
Dong Xiao ◽  
Xiwen Liu ◽  
Ba Tuan Le ◽  
Zhiwen Ji ◽  
Xiaoyu Sun

The ore fragment size on the conveyor belt of concentrators is not only the main index for verifying the crushing process, but also affects the production efficiency, operating cost and even production safety of the mine. Image segmentation is a convenient and fast way to obtain the size of ore fragments on the conveyor belt. However, due to the influence of dust, light, and uneven color and texture, traditional ore image segmentation methods are prone to oversegmentation and undersegmentation. To solve these problems, this paper proposes an ore image segmentation model called RDU-Net (R: residual connection; DU: DUNet), which combines the residual structure of convolutional neural networks with the DUNet model, greatly improving the accuracy of image segmentation. RDU-Net can adaptively adjust the receptive field according to the size and shape of different ore fragments, capture ore edges of different shapes and sizes, and achieve accurate segmentation of ore images. The experimental results show that, compared with U-Net and DUNet, RDU-Net has significantly improved segmentation accuracy and better generalization ability, and can fully meet the requirements of ore fragment size detection in the concentrator.


2019 ◽  
Vol 31 (2) ◽  
pp. 163-172
Author(s):  
Maen Qaseem Ghadi ◽  
Árpád Török

In road safety, the process of organizing road infrastructure network data into homogeneous entities is called segmentation. Segmenting a road network is considered the first and most important step in developing a safety performance function (SPF). This article aims to study the benefit of a newly developed network segmentation method which is based on the generation of accident groups using a K-means clustering approach. The K-means algorithm has been used to identify the structure of homogeneous accident groups. According to the main assumption of the proposed clustering method, the risk of accidents is strongly influenced by the spatial interdependence and traffic attributes of the accidents. The performance of K-means clustering was compared with four other segmentation methods: constant average annual daily traffic segments, constant length segments, related curvature characteristics, and a multivariable method suggested by the Highway Safety Manual (HSM). The SPF was used to evaluate the performance of the five segmentation methods in predicting accident frequency. The K-means clustering-based segmentation method proved to be more flexible and accurate than the other methods in identifying homogeneous infrastructure segments with similar safety characteristics.
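
The clustering step in this abstract can be pictured with a minimal K-means sketch. It groups accidents only by their position along a road, whereas the study also uses traffic attributes; the function name `kmeans_1d`, the deterministic initialization and the example positions are assumptions for illustration, not the authors' implementation.

```python
def kmeans_1d(values, k, iterations=50):
    """Cluster 1D values with Lloyd's algorithm; returns (centers, labels)."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and the number of values")
    ordered = sorted(values)
    # Deterministic start: pick k values spread evenly through the sorted data.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value joins its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c]))
                  for v in values]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centers.append(sum(members) / len(members) if members
                               else centers[c])
        if new_centers == centers:
            break  # Converged: centers no longer move.
        centers = new_centers
    return centers, labels


if __name__ == "__main__":
    # Hypothetical accident positions along a road, in kilometres.
    accident_km = [0.2, 0.5, 0.9, 5.1, 5.4, 5.8, 12.0, 12.3]
    centers, labels = kmeans_1d(accident_km, k=3)
    print(centers, labels)
```

Accidents that share a label would then form one candidate road segment, and the segment boundaries could be placed between neighbouring clusters.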


2020 ◽  
Vol 2020 ◽  
pp. 1-13
Author(s):  
Xiaodong Huang ◽  
Hui Zhang ◽  
Li Zhuo ◽  
Xiaoguang Li ◽  
Jing Zhang

Extracting the tongue body accurately from a digital tongue image is a challenge for automated tongue diagnosis because of the blurred edge of the tongue body, the interference of pathological details, and the large variation in the size and shape of the tongue. In this study, an automated tongue image segmentation method using an enhanced fully convolutional network with an encoder-decoder structure is presented. In the proposed network, a deep residual network is adopted as the encoder to obtain dense feature maps, and a Receptive Field Block is placed behind the encoder. The Receptive Field Block can capture an adequate global contextual prior because of its multibranch convolution layers with varying kernels. Moreover, a Feature Pyramid Network is used as the decoder to fuse multiscale feature maps, gathering sufficient positional information to recover a clear contour of the tongue body. The quantitative evaluation of the segmentation results of 300 tongue images from the SIPL-tongue dataset showed that the average Hausdorff Distance, average Symmetric Mean Absolute Surface Distance, average Dice Similarity Coefficient, average precision, average sensitivity, and average specificity were 11.2963, 3.4737, 97.26%, 95.66%, 98.97%, and 98.68%, respectively. The proposed method achieved the best performance compared with four other deep-learning-based segmentation methods (SegNet, FCN, PSPNet, and DeepLab v3+). Similar results were obtained on the HIT-tongue dataset. The experimental results demonstrated that the proposed method can achieve accurate tongue image segmentation and meet the practical requirements of automated tongue diagnosis.
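
Four of the overlap metrics reported in this abstract (Dice Similarity Coefficient, precision, sensitivity and specificity) follow directly from pixel counts. The sketch below shows the standard definitions on flattened binary masks; the function name `segmentation_metrics` and the toy masks are assumptions, and the two distance-based metrics are not covered.

```python
def segmentation_metrics(pred, truth):
    """Compute Dice, precision, sensitivity and specificity for binary masks.

    `pred` and `truth` are equal-length sequences of 0/1 values
    (for example, flattened segmentation masks).
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    tp = sum(1 for p, t in zip(pred, truth) if p and t)          # true positives
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)      # false positives
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)      # false negatives
    tn = sum(1 for p, t in zip(pred, truth) if not p and not t)  # true negatives

    def ratio(num, den):
        # Return 0.0 when a metric is undefined (empty denominator).
        return num / den if den else 0.0

    return {
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "precision": ratio(tp, tp + fp),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


if __name__ == "__main__":
    # Toy 8-pixel masks: 2 true positives, 1 false positive,
    # 1 false negative and 4 true negatives.
    predicted = [1, 1, 1, 0, 0, 0, 0, 0]
    reference = [1, 1, 0, 1, 0, 0, 0, 0]
    print(segmentation_metrics(predicted, reference))
```

Averaging each value over all test images gives per-dataset figures of the kind quoted in the abstract.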


2020 ◽  
Vol 15 (4) ◽  
pp. 536-542
Author(s):  
Ibrahim Rizk Hegazy ◽  
Mansour Rifaat Helmi

Urbanization is a global trend driven primarily by excessive population growth, particularly in developing countries such as Egypt. The configuration and boundaries of urbanization and their patterns can be observed across space and time. In this research, geographic information systems and remote sensing were used to analyze urbanization and its trends over the past 30 years in Mansoura City, which is one of the largest medium-sized cities in Egypt. Four Landsat images, obtained in 1985, 1995, 2005 and 2015, were adjusted and compared using the ArcGIS software. The classified images were analyzed to determine urbanization trends in Mansoura City during the three periods 1985–1995, 1995–2005 and 2005–2015. The results of the change detection showed areas and trends in urbanization. The urban area has grown approximately fivefold over 30 years. The results showed that the eastern direction was predominant during the periods 1985–1995 and 1995–2005, with 53 and 53% of the city's total growth, respectively. During the period 2005–2015, the northern direction was dominant, with 38% of the city's total growth. This research supports future urban planning strategies by evaluating spatio-temporal transformation and urbanization trends.


2018 ◽  
Vol 15 (2) ◽  
pp. 739-743
Author(s):  
Noor Amjed ◽  
Fatimah Khalid ◽  
Rahmita Wirza O. K. Rahmat ◽  
Hizmawati Binit Madzin

Iris segmentation methods assume ideal imaging conditions, under which they produce good results. However, the segmentation accuracy of an iris recognition system significantly influences its performance, especially with data captured in the unconstrained environment of a smartphone. This paper proposes a novel segmentation method for smartphone videos recorded in unconstrained environments. The method chooses the best frames from the videos and enhances the contrast of these frames by applying two fuzzy logic membership functions to the negative image, which separate dark and bright regions in order to make the dark regions darker and the bright regions brighter. This pre-processing step facilitates the work of the Weighted Adaptive Hough Transform, which automatically finds the diameter of the iris region so that OSIRIS v4.1 can be applied. Results on the videos of the Mobile Iris Challenge Evaluation (MICHE)-I iris database indicate that the proposed technique achieves a high level of accuracy and is more computationally efficient.

