Feature fusion using Extended Jaccard Graph and word embedding for robot

2017 ◽  
Vol 37 (3) ◽  
pp. 278-284 ◽  
Author(s):  
Shenglan Liu ◽  
Muxin Sun ◽  
Xiaodong Huang ◽  
Wei Wang ◽  
Feilong Wang

Purpose Robot vision is fundamental to human–robot interaction and complex robot tasks. In this paper, the authors use Kinect and propose feature graph fusion (FGF) for robot recognition. Design/methodology/approach The feature fusion uses red green blue (RGB) and depth information from Kinect to construct a fused feature. FGF uses multi-Jaccard similarity to compute a robust graph and a word embedding method to enhance the recognition results. Findings The authors also collect a DUT RGB-Depth (RGB-D) face data set and a benchmark data set to evaluate the effectiveness and efficiency of this method. The experimental results illustrate that FGF is robust and effective on face and object data sets in robot applications. Originality/value The authors first use Jaccard similarity to construct a graph of RGB and depth images, which indicates the pair-wise similarity of images. Then, the fused feature of the RGB and depth images is computed from the Extended Jaccard Graph using a word embedding method. FGF achieves better performance and efficiency with RGB-D sensors for robots.
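
The idea of a Jaccard-based similarity graph can be illustrated with a minimal sketch. It assumes each image is described by the set of indices of its nearest neighbours, which the abstract does not state; the element-wise averaging in `fuse_graphs` is likewise a hypothetical stand-in for the paper's fusion step, not the authors' method.

```python
def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two sets (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def jaccard_graph(neighbor_sets):
    """Dense pair-wise similarity matrix: entry [i][j] compares images i and j."""
    n = len(neighbor_sets)
    return [[jaccard(neighbor_sets[i], neighbor_sets[j]) for j in range(n)]
            for i in range(n)]


def fuse_graphs(graph_rgb, graph_depth):
    """Hypothetical fusion: element-wise mean of the RGB and depth graphs."""
    return [[(r + d) / 2.0 for r, d in zip(row_rgb, row_depth)]
            for row_rgb, row_depth in zip(graph_rgb, graph_depth)]


# Made-up neighbour sets for three images, one list per modality.
rgb_neighbors = [{1, 2}, {0, 2}, {0, 1}]
depth_neighbors = [{1}, {0}, {0, 1}]

fused = fuse_graphs(jaccard_graph(rgb_neighbors), jaccard_graph(depth_neighbors))
```

Each entry of `fused` lies in [0, 1], with larger values marking image pairs that share more neighbours in both modalities.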

2018 ◽  
Vol 6 (3) ◽  
pp. 134-146
Author(s):  
Daniil Igorevich Mikhalchenko ◽  
Arseniy Ivin ◽  
Dmitrii Malov

Purpose Single image depth prediction extracts depth information from an ordinary 2D image without special sensors such as laser sensors or stereo cameras. The purpose of this paper is to solve the problem of obtaining depth information from a 2D image by applying deep neural networks (DNNs). Design/methodology/approach Several experiments and topologies are presented: a DNN that uses three inputs (a sequence of 2D images from a video stream) and a DNN that uses only one input. However, there is no data set that contains a video stream and corresponding depth maps for every frame, so a technique for creating data sets using the Blender software is presented in this work. Findings Despite the problem of an insufficient amount of available data sets, the problem of overfitting was encountered. Although the created models work on the data sets, they are still overfitted and cannot predict a correct depth map for the random images that were included in the data sets. Originality/value Existing techniques for creating depth images are tested using DNNs.


2019 ◽  
Vol 16 (04) ◽  
pp. 1941002 ◽  
Author(s):  
Jing Li ◽  
Yang Mi ◽  
Gongfa Li ◽  
Zhaojie Ju

Facial expression recognition has been widely used in human computer interaction (HCI) systems. Over the years, researchers have proposed different feature descriptors, implemented different classification methods, and carried out a number of experiments on various datasets for automatic facial expression recognition. However, most of them used 2D static images or 2D video sequences for the recognition task. The main limitations of 2D-based analysis are problems associated with variations in pose and illumination, which reduce the recognition accuracy. Therefore, an alternative is to incorporate depth information acquired by a 3D sensor, because it is invariant to both pose and illumination. In this paper, we present a two-stream convolutional neural network (CNN)-based facial expression recognition system and test it on our own RGB-D facial expression dataset, collected with Microsoft Kinect for XBOX in unspontaneous scenarios, since Kinect is an inexpensive and portable device for capturing both RGB and depth information. Our fully annotated dataset includes seven expressions (i.e., neutral, sadness, disgust, fear, happiness, anger, and surprise) for 15 subjects (9 males and 6 females) aged from 20 to 25. The two individual CNNs are identical in architecture but do not share parameters. To combine the detection results produced by these two CNNs, we propose a late fusion approach. The experimental results demonstrate that the proposed two-stream network using RGB-D images is superior to using only RGB images or only depth images.
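
Late fusion, as described above, combines the outputs of the two streams after each has produced its own class scores. The sketch below is one common variant (a weighted average of per-stream softmax probabilities); the abstract does not specify the fusion rule, so the averaging, the `rgb_weight` parameter and the example scores are all assumptions.

```python
import math

# The seven expression classes named in the abstract.
EXPRESSIONS = ["neutral", "sadness", "disgust", "fear",
               "happiness", "anger", "surprise"]


def softmax(scores):
    """Turn raw class scores into probabilities (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(rgb_scores, depth_scores, rgb_weight=0.5):
    """Assumed fusion rule: weighted mean of the two streams' probabilities."""
    p_rgb = softmax(rgb_scores)
    p_depth = softmax(depth_scores)
    return [rgb_weight * r + (1.0 - rgb_weight) * d
            for r, d in zip(p_rgb, p_depth)]


def predict_expression(rgb_scores, depth_scores, rgb_weight=0.5):
    """Return the label with the highest fused probability."""
    fused = late_fusion(rgb_scores, depth_scores, rgb_weight)
    best = max(range(len(fused)), key=fused.__getitem__)
    return EXPRESSIONS[best]


# Hypothetical raw scores from the RGB stream and the depth stream.
label = predict_expression([0, 0, 0, 0, 3.0, 0, 0], [0, 0, 0, 0, 1.0, 0, 0])
```

Because each stream is normalised separately before mixing, neither network's score scale dominates the fused decision.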


Author(s):  
Yan Wu ◽  
Jiqian Li ◽  
Jing Bai

RGB-D-based object recognition has been enthusiastically investigated in the past few years. RGB and depth images provide useful and complementary information. Fusing RGB and depth features can significantly increase the accuracy of object recognition. However, previous works simply take the depth image as the fourth channel of the RGB image and concatenate the RGB and depth features, ignoring the differing power of RGB and depth information for different objects. In this paper, a new method containing three different classifiers is proposed to fuse features extracted from the RGB image and the depth image for RGB-D-based object recognition. Firstly, an RGB classifier and a depth classifier are trained by cross-validation to get the accuracy difference between RGB and depth features for each object. Then a variant RGB-D classifier is trained with different initialization parameters for each class according to the accuracy difference. The variant RGB-D classifier can result in a more robust classification performance. The proposed method is evaluated on two benchmark RGB-D datasets. It achieves performance comparable to the state-of-the-art method.
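
The first step above, measuring how much better one modality is than the other for each object class, can be sketched as follows. This only illustrates the per-class accuracy difference; the labels and predictions are invented, and how the authors turn the difference into initialization parameters is not described in the abstract and is not modelled here.

```python
from collections import defaultdict


def per_class_accuracy(y_true, y_pred):
    """Fraction of correctly classified samples for each true class."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, guess in zip(y_true, y_pred):
        total[truth] += 1
        correct[truth] += int(truth == guess)
    return {label: correct[label] / total[label] for label in total}


def accuracy_difference(y_true, rgb_pred, depth_pred):
    """Per-class (RGB accuracy - depth accuracy); positive means RGB is stronger."""
    acc_rgb = per_class_accuracy(y_true, rgb_pred)
    acc_depth = per_class_accuracy(y_true, depth_pred)
    return {label: acc_rgb[label] - acc_depth[label] for label in acc_rgb}


# Made-up held-out labels and predictions from the two single-modality classifiers.
y_true = ["cup", "cup", "ball", "ball"]
rgb_pred = ["cup", "cup", "ball", "cup"]
depth_pred = ["cup", "ball", "ball", "ball"]

diff = accuracy_difference(y_true, rgb_pred, depth_pred)
```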


2014 ◽  
Vol 31 (8) ◽  
pp. 1709-1719
Author(s):  
Ming-Yuan Shieh ◽  
Chung-Yu Hsieh ◽  
Tsung-Min Hsieh

Purpose – The purpose of this paper is to propose a fast object detection algorithm based on structural light analysis, which aims to detect and recognize human gesture and pose and then to derive the respective commands for human-robot interaction control. Design/methodology/approach – In this paper, the human poses are estimated and analyzed by the proposed scheme, and then the resulting data produced by the fuzzy decision-making system are used to launch respective robotic motions. The RGB camera and the infrared light module are used to estimate the distance of a body or several bodies. Findings – The modules provide not only image perception but also objective skeleton detection. A laser source in the infrared light module emits invisible infrared light, which passes through a filter and is scattered into a semi-random but constant pattern of small dots that is projected onto the environment in front of the sensor. The reflected pattern is then detected by an infrared camera and analyzed for depth estimation. Since the depth of an object is a key parameter for pose recognition, one can estimate the distance to each dot and then get depth information by calculation of distance between emitter and receiver. Research limitations/implications – Future work will consider reducing the computation time for objective estimation and tuning parameters adaptively. Practical implications – The experimental results demonstrate the feasibility of the proposed system. Originality/value – This paper achieves real-time human-robot interaction by visual detection based on structural light analysis.
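
The depth calculation from the emitter-receiver geometry can be illustrated with the standard pinhole triangulation relation Z = f · b / d, where f is the focal length in pixels, b the emitter-to-camera baseline and d the observed shift (disparity) of a projected dot. This is a simplified textbook model, not necessarily the computation used in the paper, and the numeric values below are hypothetical.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth in metres of one projected dot via triangulation: Z = f * b / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical sensor parameters: 580 px focal length, 7.5 cm baseline.
depth_m = depth_from_disparity(disparity_px=29.0,
                               focal_length_px=580.0,
                               baseline_m=0.075)
```

Depth is inversely proportional to disparity, so dots that shift less between emitter and camera views lie farther from the sensor.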


Author(s):  
Giorgio Metta

This chapter outlines a number of research lines that, starting from the observation of nature, attempt to mimic human behavior in humanoid robots. Humanoid robotics is one of the most exciting proving grounds for the development of biologically inspired hardware and software—machines that try to recreate billions of years of evolution with some of the abilities and characteristics of living beings. Humanoids could be especially useful for their ability to “live” in human-populated environments, occupying the same physical space as people and using tools that have been designed for people. Natural human–robot interaction is also an important facet of humanoid research. Finally, learning and adapting from experience, the hallmark of human intelligence, may require some approximation to the human body in order to attain similar capacities to humans. This chapter focuses particularly on compliant actuation, soft robotics, biomimetic robot vision, robot touch, and brain-inspired motor control in the context of the iCub humanoid robot.


2020 ◽  
Vol 47 (3) ◽  
pp. 547-560 ◽  
Author(s):  
Darush Yazdanfar ◽  
Peter Öhman

Purpose The purpose of this study is to empirically investigate determinants of financial distress among small and medium-sized enterprises (SMEs) during the global financial crisis and post-crisis periods. Design/methodology/approach Several statistical methods, including multiple binary logistic regression, were used to analyse a longitudinal cross-sectional panel data set of 3,865 Swedish SMEs operating in five industries over the 2008–2015 period. Findings The results suggest that financial distress is influenced by macroeconomic conditions (i.e. the global financial crisis) and, in particular, by various firm-specific characteristics (i.e. performance, financial leverage and financial distress in previous year). However, firm size and industry affiliation have no significant relationship with financial distress. Research limitations Due to data availability, this study is limited to a sample of Swedish SMEs in five industries covering eight years. Further research could examine the generalizability of these findings by investigating other firms operating in other industries and other countries. Originality/value This study is the first to examine determinants of financial distress among SMEs operating in Sweden using data from a large-scale longitudinal cross-sectional database.


2017 ◽  
Vol 55 (4) ◽  
pp. 376-389 ◽  
Author(s):  
Alice Huguet ◽  
Caitlin C. Farrell ◽  
Julie A. Marsh

Purpose The use of data for instructional improvement is prevalent in today’s educational landscape, yet policies calling for data use may result in significant variation at the school level. The purpose of this paper is to focus on tools and routines as mechanisms of principal influence on data-use professional learning communities (PLCs). Design/methodology/approach Data were collected through a comparative case study of two low-income, low-performing schools in one district. The data set included interview and focus group transcripts, observation field notes and documents, and was iteratively coded. Findings The two principals in the study employed tools and routines differently to influence ways that teachers interacted with data in their PLCs. Teachers who were given leeway to co-construct data-use tools found them to be more beneficial to their work. Findings also suggest that teachers’ data use may benefit from more flexibility in their day-to-day PLC routines. Research limitations/implications Closer examination of how tools are designed and time is spent in data-use PLCs may help the authors further understand the influence of the principal’s role. Originality/value Previous research has demonstrated that data use can improve teacher instruction, yet the varied implementation of data-use PLCs in this district illustrates that not all students have an equal opportunity to learn from teachers who meaningfully engage with data.


Author(s):  
Jing Qi ◽  
Kun Xu ◽  
Xilun Ding

Abstract Hand segmentation is the initial step for hand posture recognition. To reduce the effect of variable illumination in the hand segmentation step, a new CbCr-I component Gaussian mixture model (GMM) is proposed to detect the skin region. The hand region is selected as a region of interest from the image using the skin detection technique based on the presented CbCr-I component GMM and a new adaptive threshold. A new hand shape distribution feature described in polar coordinates is proposed to extract hand contour features, to solve the false recognition problem in some shape-based methods and to effectively recognize the hand posture in cases when different hand postures have the same number of outstretched fingers. A multiclass support vector machine classifier is utilized to recognize the hand posture. Experiments were carried out on our data set to verify the feasibility of the proposed method. The results showed the effectiveness of the proposed approach compared with other methods.
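
Skin detection with a chrominance-based GMM can be sketched as below. The RGB-to-CbCr conversion uses the standard full-range BT.601 (JPEG) coefficients; everything else is an assumption for illustration: the mixture parameters in `SKIN_GMM` are invented, the covariances are taken as diagonal, and the paper's additional I component and adaptive threshold are not modelled.

```python
import math


def rgb_to_cbcr(r, g, b):
    """Full-range BT.601 chrominance (Cb, Cr) of an 8-bit RGB pixel."""
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return cb, cr


def gmm_density(cb, cr, components):
    """Density of a 2-D diagonal-covariance Gaussian mixture at (cb, cr)."""
    total = 0.0
    for weight, (mu_cb, mu_cr), (var_cb, var_cr) in components:
        norm = 1.0 / (2.0 * math.pi * math.sqrt(var_cb * var_cr))
        exponent = -0.5 * ((cb - mu_cb) ** 2 / var_cb + (cr - mu_cr) ** 2 / var_cr)
        total += weight * norm * math.exp(exponent)
    return total


# Hypothetical skin model: (weight, (mean Cb, mean Cr), (var Cb, var Cr)).
SKIN_GMM = [
    (0.6, (110.0, 150.0), (60.0, 50.0)),
    (0.4, (120.0, 140.0), (80.0, 70.0)),
]


def is_skin(r, g, b, threshold=1e-3):
    """Label a pixel as skin if its mixture density exceeds a fixed threshold."""
    cb, cr = rgb_to_cbcr(r, g, b)
    return gmm_density(cb, cr, SKIN_GMM) > threshold
```

Working in CbCr rather than RGB separates chrominance from luminance, which is the usual motivation for reducing sensitivity to illumination changes.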


2017 ◽  
Vol 37 (1) ◽  
pp. 1-12 ◽  
Author(s):  
Haluk Ay ◽  
Anthony Luscher ◽  
Carolyn Sommerich

Purpose The purpose of this study is to design and develop a testing device to simulate interaction between human hand–arm dynamics, right-angle (RA) computer-controlled power torque tools and joint-tightening task-related variables. Design/methodology/approach The testing rig can simulate a variety of tools, tasks and operator conditions. The device includes custom data-acquisition electronics and graphical user interface-based software. The simulation of the human hand–arm dynamics is based on the rig’s four-bar mechanism-based design and mechanical components that provide adjustable stiffness (via pneumatic cylinder) and mass (via plates) and non-adjustable damping. The stiffness and mass values used are based on an experimentally validated hand–arm model that includes a database of model parameters. This database is with respect to gender and working posture, corresponding to experienced tool operators from a prior study. Findings The rig measures tool handle force and displacement responses simultaneously. Peak force and displacement coefficients of determination (R²) between rig estimations and human testing measurements were 0.98 and 0.85, respectively, for the same set of tools, tasks and operator conditions. The rig also provides predicted tool operator acceptability ratings, using a data set from a prior study of discomfort in experienced operators during torque tool use. Research limitations/implications Deviations from linearity may influence handle force and displacement measurements. Stiction (Coulomb friction) in the overall rig, as well as in the air cylinder piston, is neglected. The rig’s mechanical damping is not adjustable, despite the fact that human hand–arm damping varies with respect to gender and working posture. Deviations from these assumptions may affect the correlation of the handle force and displacement measurements with those of human testing for the same tool, task and operator conditions.
Practical implications This test rig will allow the rapid assessment of the ergonomic performance of DC torque tools, saving considerable time in lineside applications and reducing the risk of worker injury. DC torque tools are an extremely effective way of increasing production rate and improving torque accuracy. Being a complex dynamic system, however, the performance of DC torque tools varies in each application. Changes in worker mass, damping and stiffness, as well as joint stiffness and tool program, make each application unique. This test rig models all of these factors and allows quick assessment. Social implications The use of this tool test rig will help to identify and understand risk factors that contribute to musculoskeletal disorders (MSDs) associated with the use of torque tools. Tool operators are subjected to large impulsive handle reaction forces, as joint torque builds up while tightening a fastener. Repeated exposure to such forces is associated with muscle soreness, fatigue and physical stress which are also risk factors for upper extremity injuries (MSDs; e.g. tendinosis, myofascial pain). Eccentric exercise exertions are known to cause damage to muscle tissue in untrained individuals and affect subsequent performance. Originality/value The rig provides a novel means for quantitative, repeatable dynamic evaluation of RA powered torque tools and objective selection of tightening programs. Compared to current static tool assessment methods, dynamic testing provides a more realistic tool assessment relative to the tool operator’s experience. This may lead to improvements in tool or controller design and reduction in associated musculoskeletal discomfort in operators.
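
The coefficient of determination reported in the Findings can be computed as R² = 1 − SS_res / SS_tot. The sketch below uses that common definition; the abstract does not say which variant the authors used (for example, the squared Pearson correlation), and the function and argument names are illustrative.

```python
def r_squared(measured, estimated):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(measured) != len(estimated) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    mean = sum(measured) / len(measured)
    ss_tot = sum((m - mean) ** 2 for m in measured)
    ss_res = sum((m - e) ** 2 for m, e in zip(measured, estimated))
    if ss_tot == 0:
        raise ValueError("measured values have zero variance")
    return 1.0 - ss_res / ss_tot
```

Here `measured` would hold the human testing measurements and `estimated` the rig's values for the same tool, task and operator conditions; a value close to 1 means the rig reproduces most of the variance in the human data.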

