A GRAPH-BASED ALGORITHM FOR CLUSTER DETECTION

Author(s):  
PASQUALE FOGGIA ◽  
GENNARO PERCANNELLA ◽  
CARLO SANSONE ◽  
MARIO VENTO

In some computer vision applications only part of the whole dataset needs to be grouped into one or more clusters. This happens, for example, when the samples of interest for the application at hand are mixed with many noisy samples. In this paper we present a graph-based algorithm for cluster detection that is particularly suited to detecting clusters of any size and shape, without requiring the actual number of clusters or other parameters to be specified. The algorithm has been tested on data from two different computer vision applications. A comparison with four other state-of-the-art graph-based algorithms is also provided, demonstrating the effectiveness of the proposed approach.
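
The abstract does not detail the algorithm itself; as a rough illustration of the general graph-based idea only (a k-nearest-neighbour graph with pruned long edges and size-filtered connected components, all of which are assumptions rather than the authors' method), a minimal sketch might look like this:

```python
# Illustrative sketch only, not the paper's algorithm: build a k-NN graph,
# prune edges longer than a data-driven cutoff, and keep only connected
# components above a minimum size; everything else is reported as noise.
import numpy as np
from scipy.spatial.distance import cdist
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components


def detect_clusters(X, k=5, min_size=10):
    """Return a label per sample; -1 marks samples left unclustered as noise."""
    d = cdist(X, X)                              # pairwise Euclidean distances
    nn = np.argsort(d, axis=1)[:, 1:k + 1]       # k nearest neighbours of each sample
    knn_dist = np.take_along_axis(d, nn, axis=1)
    cutoff = knn_dist.mean() + knn_dist.std()    # assumed data-driven edge cutoff

    rows, cols = [], []
    for i in range(X.shape[0]):
        for j in nn[i]:
            if d[i, j] <= cutoff:                # keep only "short" edges
                rows.append(i)
                cols.append(j)
    adj = csr_matrix((np.ones(len(rows)), (rows, cols)), shape=(len(X), len(X)))
    n_comp, comp = connected_components(adj, directed=False)

    labels = np.full(len(X), -1)
    next_id = 0
    for c in range(n_comp):
        members = np.where(comp == c)[0]
        if members.size >= min_size:             # small components treated as noise
            labels[members] = next_id
            next_id += 1
    return labels
```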

2018 ◽  
Vol 28 (05) ◽  
pp. 1750056 ◽  
Author(s):  
Ezequiel López-Rubio ◽  
Miguel A. Molina-Cabello ◽  
Rafael Marcos Luque-Baena ◽  
Enrique Domínguez

One of the most important challenges in computer vision applications is background modeling, especially when the background is dynamic and the input distribution might not be stationary, i.e. the distribution of the input data can change over time (e.g. changing illumination, waving trees, water, etc.). In this work, an unsupervised learning neural network is proposed which is able to cope with progressive changes in the input distribution. It is based on a dual learning mechanism that manages changes in the input distribution separately from cluster detection. The proposal is suited to scenes where the background varies slowly. The performance of the method is tested against several state-of-the-art foreground detectors, both quantitatively and qualitatively, with favorable results.
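
As a point of reference only, a minimal per-pixel running-Gaussian sketch of a dual-rate scheme is shown below; the two learning rates, the threshold and the update rule are illustrative assumptions, not the proposed neural network:

```python
# Minimal sketch of the general dual-rate idea: a slow rate tracks the drifting
# input distribution (e.g. illumination), a separate rate updates the per-pixel
# background statistics used for the foreground decision.  All values assumed.
import numpy as np


class DualRateBackgroundModel:
    def __init__(self, first_frame, rho_dist=0.001, rho_model=0.01, thresh=2.5):
        f = first_frame.astype(np.float64)
        self.mean = f.copy()                 # per-pixel background mean
        self.var = np.full_like(f, 15.0**2)  # per-pixel background variance
        self.rho_dist = rho_dist             # slow rate: input-distribution drift
        self.rho_model = rho_model           # faster rate: background statistics
        self.thresh = thresh                 # Mahalanobis-style decision threshold

    def apply(self, frame):
        f = frame.astype(np.float64)
        dist = np.abs(f - self.mean) / np.sqrt(self.var)
        foreground = dist > self.thresh

        # Background pixels refine the model; every pixel contributes (slowly)
        # to tracking the global drift of the input distribution.
        bg = ~foreground
        self.mean[bg] += self.rho_model * (f[bg] - self.mean[bg])
        self.var[bg] += self.rho_model * ((f[bg] - self.mean[bg]) ** 2 - self.var[bg])
        self.mean += self.rho_dist * (f - self.mean)
        return foreground.astype(np.uint8) * 255
```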


1999 ◽  
Vol 18 (3-4) ◽  
pp. 265-273
Author(s):  
Giovanni B. Garibotto

The paper provides an overview of advanced robotic technologies within the context of Postal Automation services. The main functional requirements of the application are briefly described, as well as the state of the art and newly emerging solutions. Image processing and pattern recognition have always played a fundamental role in address interpretation and mail sorting, and the new challenging objective is now off-line handwritten cursive recognition, in order to handle all kinds of addresses in a uniform way. On the other hand, advanced electromechanical and robotic solutions are extremely important for solving the problems of mail storage, transportation and distribution, as well as for material handling and logistics. Finally, a short description of new Postal Automation services is given, considering emerging services such as hybrid mail and paper-to-electronic conversion.


2020 ◽  
Author(s):  
Jawad Khan

Activity recognition is a topic undergoing intensive research in the field of computer vision. Applications of activity recognition include sports summaries, human-computer interaction, violence detection, surveillance, etc. In this paper, we propose a modification of the standard local binary patterns (LBP) descriptor to obtain a concatenated histogram of lower dimensionality. This helps to encode the spatial and temporal information of the various actions occurring in a frame. The method overcomes the dimensionality problem that arises with LBP, and the results show that the proposed method performs comparably with state-of-the-art methods.
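
The specific dimensionality-reducing modification is not described in the abstract; the sketch below only shows the standard 8-neighbour LBP baseline, with per-cell histograms concatenated over a short frame sequence, as a reference for what such a descriptor builds on:

```python
# Baseline sketch only: standard 8-neighbour LBP per frame, histogrammed per
# spatial cell and concatenated across frames.  The paper's modified,
# lower-dimensional descriptor is not reproduced here.
import numpy as np


def lbp_image(gray):
    """8-neighbour LBP codes for an HxW grayscale array (borders ignored)."""
    c = gray[1:-1, 1:-1]
    neighbours = [gray[0:-2, 0:-2], gray[0:-2, 1:-1], gray[0:-2, 2:],
                  gray[1:-1, 2:],   gray[2:, 2:],     gray[2:, 1:-1],
                  gray[2:, 0:-2],   gray[1:-1, 0:-2]]
    code = np.zeros_like(c, dtype=np.uint8)
    for bit, n in enumerate(neighbours):
        code |= ((n >= c).astype(np.uint8) << bit)
    return code


def spatio_temporal_descriptor(frames, grid=(4, 4)):
    """Concatenate per-cell LBP histograms over a short frame sequence."""
    feats = []
    for gray in frames:
        codes = lbp_image(gray)
        gh, gw = grid
        h, w = codes.shape
        for i in range(gh):
            for j in range(gw):
                cell = codes[i * h // gh:(i + 1) * h // gh,
                             j * w // gw:(j + 1) * w // gw]
                hist, _ = np.histogram(cell, bins=256, range=(0, 256))
                feats.append(hist / max(cell.size, 1))
    return np.concatenate(feats)
```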


Author(s):  
Jingjing Li ◽  
Mengmeng Jing ◽  
Ke Lu ◽  
Lei Zhu ◽  
Yang Yang ◽  
...  

Zero-shot learning (ZSL) and cold-start recommendation (CSR) are two challenging problems in computer vision and recommender systems, respectively. In general, they are investigated independently in different communities. This paper, however, reveals that ZSL and CSR are two extensions of the same intension. Both of them, for instance, attempt to predict unseen classes and involve two spaces, one for direct feature representation and the other for supplementary description. Yet there is no existing approach that addresses CSR from the ZSL perspective. This work, for the first time, formulates CSR as a ZSL problem, and a tailor-made ZSL method is proposed to handle CSR. Specifically, we propose a Low-rank Linear Auto-Encoder (LLAE), which tackles three cruxes: domain shift, spurious correlations and computational efficiency. LLAE consists of two parts: a low-rank encoder that maps user behavior into user attributes, and a symmetric decoder that reconstructs user behavior from user attributes. Extensive experiments on both ZSL and CSR tasks verify that the proposed method is a win-win formulation, i.e., not only can CSR be handled by ZSL models with a significant performance improvement over several conventional state-of-the-art methods, but considering CSR can also benefit ZSL.
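
As a rough illustration of the tied-weight linear auto-encoder idea this family of methods builds on (not the paper's exact low-rank LLAE formulation), the objective min_W ||X - WᵀS||² + λ||WX - S||² admits a closed-form solution via a Sylvester equation:

```python
# Illustrative tied-weight linear auto-encoder, in the spirit of semantic
# auto-encoders (an assumption, not the paper's LLAE): X holds user-behavior
# features (d x n), S holds the corresponding attributes (k x n).
import numpy as np
from scipy.linalg import solve_sylvester


def fit_linear_autoencoder(X, S, lam=1.0):
    """Setting the gradient to zero gives  S S^T W + lam W X X^T = (1+lam) S X^T."""
    A = S @ S.T
    B = lam * (X @ X.T)
    C = (1.0 + lam) * (S @ X.T)
    return solve_sylvester(A, B, C)          # W has shape (k, d)


# Toy usage with random data (shapes only; real data would replace these).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 200))               # 50-dim behavior, 200 users
S = rng.normal(size=(10, 200))               # 10-dim attributes
W = fit_linear_autoencoder(X, S)
predicted_attributes = W @ X                 # encoder: behavior -> attributes
reconstructed_behavior = W.T @ S             # decoder: attributes -> behavior
```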


Author(s):  
Xin Zhong ◽  
Frank Y. Shih

Image pixel clustering or segmentation aims to identify groups of pixels in an image without any preliminary labels. It remains a challenging task in computer vision since the size and shape of object segments vary. Moreover, determining the number of segments in an image without prior knowledge of its content is an NP-hard problem. In this paper, we present an automatic image pixel clustering scheme based on mussels wandering optimization. An activation variable is applied to determine the number of clusters automatically while optimizing the cluster centers. We revise the within- and between-class sum-of-squares ratio for random natural image content and develop a novel fitness function for the image pixel clustering task. The proposed scheme is compared against existing state-of-the-art techniques using both synthetic data and a real ASD dataset. Experimental results show the superior performance of the proposed scheme.
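
The revised ratio and the mussels wandering optimizer are not reproduced here; the sketch below only illustrates how a between- versus within-class sum-of-squares fitness with an activation variable over candidate centers might be evaluated (all details are assumptions):

```python
# Sketch of a fitness evaluation of the general form described above: an
# activation vector selects which candidate centers are in use, and candidate
# solutions are scored by a between- over within-class sum-of-squares ratio.
import numpy as np


def clustering_fitness(pixels, centers, active):
    """pixels: (n, d) features; centers: (m, d); active: (m,) booleans."""
    centers = centers[np.asarray(active, dtype=bool)]
    if len(centers) < 2:
        return -np.inf                        # need at least two active clusters
    d = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
    labels = d.argmin(axis=1)

    overall = pixels.mean(axis=0)
    wcss = bcss = 0.0
    for k in range(len(centers)):
        members = pixels[labels == k]
        if members.size == 0:
            continue
        wcss += ((members - centers[k]) ** 2).sum()
        bcss += len(members) * ((centers[k] - overall) ** 2).sum()
    return bcss / (wcss + 1e-12)              # larger is better
```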


2021 ◽  
Vol 13 (16) ◽  
pp. 3275
Author(s):  
Valerio Marsocci ◽  
Simone Scardapane ◽  
Nikos Komodakis

Scene understanding of satellite and aerial images is a pivotal task in various remote sensing (RS) practices, such as land cover and urban development monitoring. In recent years, neural networks have become a de facto standard in many of these applications. However, semantic segmentation still remains a challenging task. Compared with other computer vision (CV) areas, large labeled datasets are not often available in RS, due to their high cost and the manpower required. On the other hand, self-supervised learning (SSL) is attracting growing interest in CV, reaching state-of-the-art performance in several tasks. In spite of this, most SSL models, pretrained on huge datasets such as ImageNet, do not perform particularly well on RS data. For this reason, we propose a combination of an SSL algorithm (namely, Online Bag of Words) and a semantic segmentation algorithm tailored to aerial images (namely, Multistage Attention ResU-Net), showing new encouraging results (i.e., 81.76% mIoU with a ResNet-18 backbone) on the ISPRS Vaihingen dataset.
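
Neither Online Bag of Words nor the Multistage Attention ResU-Net decoder is reproduced below; the sketch only illustrates the generic pretrain-then-fine-tune pipeline, with a ResNet-18 encoder that can be initialised from self-supervised weights and a deliberately simple upsampling head:

```python
# Generic pretrain-then-fine-tune sketch (architectures are placeholders, not
# the paper's): a ResNet-18 encoder, optionally loaded with SSL-pretrained
# weights, followed by a minimal dense-prediction head.
import torch
import torch.nn as nn
from torchvision.models import resnet18


class SimpleSegmenter(nn.Module):
    def __init__(self, num_classes=6, ssl_state_dict=None):
        super().__init__()
        backbone = resnet18(weights=None)
        if ssl_state_dict is not None:            # e.g. weights from SSL pretraining
            backbone.load_state_dict(ssl_state_dict, strict=False)
        self.encoder = nn.Sequential(*list(backbone.children())[:-2])  # (B, 512, H/32, W/32)
        self.head = nn.Sequential(
            nn.Conv2d(512, 128, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(128, num_classes, kernel_size=1),
        )

    def forward(self, x):
        h, w = x.shape[-2:]
        logits = self.head(self.encoder(x))
        return nn.functional.interpolate(logits, size=(h, w), mode="bilinear",
                                         align_corners=False)


# Toy forward pass; real use would fine-tune on labelled aerial tiles.
model = SimpleSegmenter(num_classes=6)
out = model(torch.randn(1, 3, 256, 256))          # -> (1, 6, 256, 256)
```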


2013 ◽  
pp. 79-100
Author(s):  
Yiqun Hu ◽  
Viswanath Gopalakrishnan ◽  
Deepu Rajan

Visual saliency, which distinguishes “interesting” visual content from the rest, plays an important role in multimedia and computer vision applications. This chapter starts with a brief overview of visual saliency as well as the literature on some popular models for detecting salient regions. We describe two methods to model visual saliency, one for images and the other for videos. Specifically, we introduce a graph-based method to model salient regions in images in a bottom-up manner. For videos, we introduce a factorization-based method to model the attended object in motion, which utilizes the top-down knowledge of the cameraman to model saliency. Finally, future directions for visual saliency modeling and additional reading materials are highlighted to familiarize readers with research on visual saliency modeling for multimedia applications.
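
The chapter's specific graph formulation is not reproduced here; as a minimal bottom-up illustration, image patches can be treated as graph nodes and scored by their feature contrast to the other nodes, weighted by spatial proximity (patch size, features and weighting are all assumptions):

```python
# Illustrative bottom-up sketch only (not the chapter's model): patches are
# nodes, and each node's saliency is its colour contrast to the other nodes
# weighted by how close they are in the image plane.
import numpy as np


def patch_saliency(image, patch=16, sigma=0.25):
    """image: (H, W, 3) float array in [0, 1]; returns a per-patch saliency map."""
    h, w, _ = image.shape
    gh, gw = h // patch, w // patch
    feats, pos = [], []
    for i in range(gh):
        for j in range(gw):
            block = image[i * patch:(i + 1) * patch, j * patch:(j + 1) * patch]
            feats.append(block.reshape(-1, 3).mean(axis=0))   # mean colour feature
            pos.append([i / gh, j / gw])                      # normalised position
    feats, pos = np.array(feats), np.array(pos)

    feat_dist = np.linalg.norm(feats[:, None] - feats[None, :], axis=2)
    pos_dist = np.linalg.norm(pos[:, None] - pos[None, :], axis=2)
    weights = np.exp(-pos_dist**2 / (2 * sigma**2))           # nearby patches count more
    saliency = (weights * feat_dist).sum(axis=1)
    span = saliency.max() - saliency.min()
    return ((saliency - saliency.min()) / (span + 1e-12)).reshape(gh, gw)
```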


2018 ◽  
Vol 2018 ◽  
pp. 1-8 ◽  
Author(s):  
Bin Li ◽  
Hong Fu

An accurate and efficient eye detector is essential for many computer vision applications. In this paper, we present an efficient method for locating eyes in facial images. First, a group of candidate regions with regional extreme points is quickly proposed; then, a set of convolutional neural networks (CNNs) is adopted to determine the most likely eye region and classify it as left or right eye; finally, the center of the eye is located with additional CNNs. In experiments on the GI4E, BioID, and our own datasets, our method attained a detection accuracy comparable to that of existing state-of-the-art methods, while being faster and more robust to variations in the images, including external lighting changes, facial occlusion, and changes in image modality.
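
The paper's network architectures are not given in the abstract; the sketch below only illustrates the overall pipeline shape, with an assumed regional-minimum heuristic for candidate proposal and a tiny stand-in CNN for scoring the candidate patches:

```python
# Pipeline sketch only (heuristics and architecture are assumptions, not the
# paper's design): propose candidate regions around regional intensity minima,
# then let a small CNN score each candidate patch.
import numpy as np
import torch
import torch.nn as nn
from scipy.ndimage import minimum_filter


def candidate_regions(gray, size=15, max_candidates=20, patch=32):
    """Return square patches centred on regional intensity minima (eyes are dark)."""
    minima = (gray == minimum_filter(gray, size=size))
    ys, xs = np.nonzero(minima)
    order = np.argsort(gray[ys, xs])[:max_candidates]       # darkest first
    patches, half = [], patch // 2
    padded = np.pad(gray, half, mode="edge")
    for y, x in zip(ys[order], xs[order]):
        patches.append(padded[y:y + patch, x:x + patch])
    return np.stack(patches), list(zip(ys[order], xs[order]))


class EyeScorer(nn.Module):
    """Tiny stand-in CNN scoring each patch as left eye / right eye / not an eye."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(16, 3),
        )

    def forward(self, x):
        return self.net(x)


# Toy run on a random image; a real system would use trained weights.
gray = np.random.rand(128, 128).astype(np.float32)
patches, centers = candidate_regions(gray)
scores = EyeScorer()(torch.from_numpy(patches[:, None]))     # (N, 3) logits
```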


Author(s):  
Alexander Diederich ◽  
Christophe Bastien ◽  
Karthikeyan Ekambaram ◽  
Alexis Wilson

The introduction of automated L5 driving technologies will revolutionise the design of vehicle interiors and seating configurations, improving occupant comfort and experience. It is foreseen that pre-crash emergency braking and swerving manoeuvres will affect occupant posture, which could lead to an interaction with a deploying airbag. This research addresses the urgent safety need of defining the occupant’s kinematics envelope during that pre-crash phase, considering rotated seat arrangements and different seatbelt configurations. The research used two different sets of volunteer tests of L5 vehicle manoeuvres: the first based on 22 fit 50th-percentile males wearing a lap belt (OM4IS), the other based on 87 volunteers with a BMI range of 19 to 67 kg/m2 wearing a 3-point belt (UMTRI). Unique biomechanics kinematics corridors were then defined, as a function of belt configuration and vehicle manoeuvre, to calibrate an Active Human Model (AHM) using multi-objective optimisation coupled with a CORrelation and Analysis (CORA) rating. The research improved the AHM’s omnidirectional kinematics response over the current state of the art in a generic lap-belted environment. The AHM was then tested in a rotated seating arrangement under extreme braking, highlighting that maximum lateral and frontal motions are comparable, independent of the belt system, while the asymmetry of the 3-point belt increased the occupant’s motion towards the seatbelt buckle. It was observed that the frontal occupant kinematics decreased by 200 mm compared with a lap-belted configuration. This improved omnidirectional AHM is the first step towards designing safer future L5 vehicle interiors.
