Image-Text Joint Learning for Social Images with Spatial Relation Model

Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-11
Author(s):  
Jiangfan Feng ◽  
Xuejun Fu ◽  
Yao Zhou ◽  
Yuling Zhu ◽  
Xiaobo Luo

Rapid developments in sensor technology and mobile devices have brought a flourishing of social images, and large-scale social image collections have attracted increasing attention from researchers. Existing approaches generally rely on recognizing object instances individually using geo-tags, visual patterns, etc. However, a social image represents a web of interconnected relations; these relations between entities carry semantic meaning and help a viewer differentiate between instances of a substance. This article explores the joint learning of social images from the perspective of spatial relationships. Specifically, the model consists of three parts: (a) a module for deep semantic understanding of images based on a residual network (ResNet); (b) a deep semantic analysis module for text that goes beyond traditional bag-of-words methods; (c) a joint reasoning module in which text weights are obtained by applying self-attention with image features, together with a novel tree-based clustering algorithm. Experimental results on the Flickr30k and Microsoft COCO datasets demonstrate the effectiveness of the approach. In addition, the method takes spatial relations into account during matching.
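Part (c) of the abstract above mentions text weights derived from image features through attention. The abstract does not give the formulation, so the following is only a minimal sketch under stated assumptions: one image feature vector acts as the query, each word embedding acts as both key and value, and the weights are a softmax over scaled dot products. The function names `attention_weights` and `weighted_text_vector` are hypothetical and are not taken from the paper.

```python
import math


def attention_weights(image_vec, word_vecs):
    """Softmax over scaled dot products between one image vector and each word vector.

    Assumption: the image feature is the query and the word embeddings are the keys.
    """
    d = len(image_vec)
    scores = [
        sum(i * w for i, w in zip(image_vec, word_vec)) / math.sqrt(d)
        for word_vec in word_vecs
    ]
    # Subtract the maximum score for numerical stability before exponentiating.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_text_vector(image_vec, word_vecs):
    """Combine the word vectors into one text vector using the attention weights."""
    weights = attention_weights(image_vec, word_vecs)
    d = len(image_vec)
    return [sum(w * vec[k] for w, vec in zip(weights, word_vecs)) for k in range(d)]
```

With this formulation, words whose embeddings align with the image feature receive larger weights, which is one plausible reading of "text weights obtained using image features".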

2014 ◽  
Vol 667 ◽  
pp. 277-285
Author(s):  
Fang Chen ◽  
Yan Hui Zhou

With the rapid development of the Internet, tag technology has been widely used on various sites. Brief text labels on network resources make it much more convenient for people to access massive data. Social tagging allows users to tag network objects with any word and to share these tags; because of its simple and flexible operation, it has become a popular application. However, there are problems such as noisy tags, a lack of usage criteria, and sparse distribution. In particular, tag sparsity seriously limits the application of tags in the semantic analysis of web pages. This paper uses a user-related tag expansion method to overcome this problem and, at the same time, uses the topic model LDA to model web tags, mine latent topics from large-scale web pages, and obtain the topic distribution of the text for text clustering analysis. The experimental results show that, compared with the traditional clustering algorithm, the LDA-based clustering method applied to web tags yields a considerable improvement.


Author(s):  
Noralhuda Alabid

Interpreting spatial relations among objects allows applications such as video surveillance, robotics, and scene understanding systems to be used efficiently for different purposes. The vast majority of known models for spatial relationships operate on a single image. However, owing to advances in technology, three-dimensional scenes have become available. To our knowledge, most interpreted spatial relations have been defined between stationary objects in images. This paper presents a technique for determining the dynamic spatial relation between a moving object and a stationary one in a time-varying scene. The spatial relationships are determined using motion-based object tracking together with a hypergraph object-oriented model. The type of spatial relationship between a single stationary object and a moving human body is defined in two steps: each object is delimited by a bounding box, and the locations of these boxes are then compared by applying conditional rules. This study identifies some of the spatial relationships in three dimensions across streaming frames, using a proposed algorithm described as highly accurate and efficient. The following relations are studied: “direct in front of”, “in front of on the Right/Left”, “direct behind of”, “behind of on the Right/Left”, “to the Right”, “to the Left”, “On”, “Under”, “Besides”, and “Besides on to the Right/Left”. The experimental results, obtained on actual indoor streaming frames, show that the system is effective and reliable.
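The bounding-box comparison described in the abstract above can be illustrated with a small rule set. This is a sketch under assumptions rather than the authors' algorithm: the axis convention (x grows to the right, y grows upward, z grows away from the camera), the `Box3D` type, and the exact rules are all hypothetical, and only a subset of the listed relations is covered (the "Besides" variants are left out because the abstract does not define them).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box3D:
    """Axis-aligned 3D bounding box (hypothetical type for this sketch)."""
    x_min: float
    x_max: float
    y_min: float
    y_max: float
    z_min: float
    z_max: float


def _overlaps(a_min, a_max, b_min, b_max):
    """True when two 1D intervals share interior points."""
    return a_min < b_max and b_min < a_max


def relation(moving: Box3D, fixed: Box3D) -> str:
    """Label the position of `moving` relative to `fixed` using conditional rules."""
    footprint_overlap = (
        _overlaps(moving.x_min, moving.x_max, fixed.x_min, fixed.x_max)
        and _overlaps(moving.z_min, moving.z_max, fixed.z_min, fixed.z_max)
    )
    # Vertical relations apply only when the horizontal footprints overlap.
    if footprint_overlap and moving.y_min >= fixed.y_max:
        return "On"
    if footprint_overlap and moving.y_max <= fixed.y_min:
        return "Under"

    # Depth: smaller z is assumed to be closer to the camera, i.e. "in front".
    if moving.z_max <= fixed.z_min:
        depth = "front"
    elif moving.z_min >= fixed.z_max:
        depth = "behind"
    else:
        depth = "level"

    if moving.x_min >= fixed.x_max:
        side = "Right"
    elif moving.x_max <= fixed.x_min:
        side = "Left"
    else:
        side = "direct"

    if depth == "level":
        # "overlapping" is a fallback label of this sketch, not one of the paper's relations.
        return "overlapping" if side == "direct" else f"to the {side}"
    prefix = "in front of" if depth == "front" else "behind of"
    return f"direct {prefix}" if side == "direct" else f"{prefix} on the {side}"
```

For example, with a unit cube as the stationary object, a box shifted along positive x at the same depth is labelled "to the Right", and a box resting on top of the cube is labelled "On".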


2009 ◽  
Vol 35 (7) ◽  
pp. 859-866
Author(s):  
Ming LIU ◽  
Xiao-Long WANG ◽  
Yuan-Chao LIU

2021 ◽  
Vol 13 (3) ◽  
pp. 355
Author(s):  
Weixian Tan ◽  
Borong Sun ◽  
Chenyu Xiao ◽  
Pingping Huang ◽  
Wei Xu ◽  
...  

Classification based on polarimetric synthetic aperture radar (PolSAR) images is an emerging technology, and recent years have seen the introduction of various classification methods that have proven effective at identifying typical features of many terrain types. Among the many study regions, the Hunshandake Sandy Land in Inner Mongolia, China, stands out for its vast area of sandy land, variety of ground objects, and intricate structure, with more irregular characteristics than conventional land cover. To account for the particular surface features of the Hunshandake Sandy Land, this study proposes an unsupervised classification method based on a new decomposition and large-scale spectral clustering with superpixels (ND-LSC). First, the polarization scattering parameters are extracted through a new decomposition, rather than other decomposition approaches, which gives a more accurate feature vector estimate. Second, large-scale spectral clustering is applied to cope with the vast area and complex terrain. More specifically, this involves an initial sub-step of superpixel generation via the Adaptive Simple Linear Iterative Clustering (ASLIC) algorithm, in which the feature vector combined with spatial coordinate information is used as input, then a sub-step of representative point selection and bipartite graph formation, followed by the spectral clustering algorithm to complete the classification task. Finally, testing and analysis are conducted on the RADARSAT-2 fully polarimetric SAR dataset acquired over the Hunshandake Sandy Land in 2016. Both qualitative and quantitative comparisons with several classification methods show that the proposed method can significantly improve classification performance.


Sensors ◽  
2021 ◽  
Vol 21 (14) ◽  
pp. 4804
Author(s):  
Marcin Piekarczyk ◽  
Olaf Bar ◽  
Łukasz Bibrzycki ◽  
Michał Niedźwiecki ◽  
Krzysztof Rzecki ◽  
...  

Gamification is known to enhance users’ participation in education and research projects that follow the citizen science paradigm. The Cosmic Ray Extremely Distributed Observatory (CREDO) experiment is designed for the large-scale study of various radiation forms that continuously reach the Earth from space, collectively known as cosmic rays. The CREDO Detector app relies on a network of involved users and now runs worldwide on phones and other CMOS sensor-equipped devices. To broaden the user base and activate current users, CREDO makes extensive use of gamification solutions such as the periodic Particle Hunters Competition. However, an adverse effect of gamification is that the number of artefacts, i.e., signals unrelated to cosmic ray detection or openly related to cheating, increases substantially. To tag the artefacts appearing in the CREDO database, we propose a method based on machine learning. The approach involves training a Convolutional Neural Network (CNN) to recognise the morphological difference between signals and artefacts. As a result, we obtain a CNN-based trigger that is able to mimic the signal vs. artefact assignments of human annotators as closely as possible. To enhance the method, the input image signal is adaptively thresholded and then transformed using Daubechies wavelets. In this exploratory study, we use wavelet transforms to amplify distinctive image features. As a result, we obtain a very good recognition ratio of almost 99% for both signals and artefacts. The proposed solution makes it possible to eliminate manual supervision of the competition process.
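The preprocessing step mentioned in the abstract above (adaptive thresholding followed by a Daubechies wavelet transform) can be sketched in plain Python. The details here are assumptions, not the CREDO pipeline: the threshold is taken as mean plus `k` standard deviations of the image, and the transform is a single-level 2D Haar transform, which is the simplest member of the Daubechies family (db1). The function names are hypothetical, and the image must have even width and height.

```python
import math
from statistics import mean, pstdev


def adaptive_threshold(img, k=1.0):
    """Zero out pixels below mean + k * std of the image (an assumed threshold rule)."""
    flat = [p for row in img for p in row]
    threshold = mean(flat) + k * pstdev(flat)
    return [[p if p >= threshold else 0.0 for p in row] for row in img]


def haar_1d(values):
    """One level of the orthonormal Haar transform: averages first, then differences."""
    s = math.sqrt(2.0)
    approx = [(values[i] + values[i + 1]) / s for i in range(0, len(values), 2)]
    detail = [(values[i] - values[i + 1]) / s for i in range(0, len(values), 2)]
    return approx + detail


def haar_2d(img):
    """Apply the 1D Haar step to every row, then to every column."""
    rows = [haar_1d(row) for row in img]
    cols = [haar_1d(list(col)) for col in zip(*rows)]
    # Transpose back so the result has the same row/column layout as the input.
    return [list(row) for row in zip(*cols)]
```

After `haar_2d`, the top-left quadrant holds the coarse approximation and the other three quadrants hold the detail coefficients, which is the kind of representation that could then be fed to a classifier.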


2021 ◽  
Vol 10 (7) ◽  
pp. 432
Author(s):  
Nicolai Moos ◽  
Carsten Juergens ◽  
Andreas P. Redecker

This paper describes a methodological approach for analysing socio-demographic and socio-economic data in large-scale spatial detail. Based on two variables, population density and annual income, the spatial relationship between them is investigated with the help of bivariate choropleth maps to identify locations of imbalance or disparity. The aim is to gain deeper insight into the spatial components of socioeconomic nexuses, such as the relationship between the two variables, especially for high-resolution spatial units. The methodology can assist political decision-making, target-group advertising in geo-marketing, site searches for new shop locations, as well as further socioeconomic research and urban planning. The developed methodology was tested in a national case study in Germany and is easily transferable to other countries with comparable datasets. The analysis used data on population density and average annual income linked to spatially referenced polygons of postal codes. These were first disaggregated via a readapted three-class dasymetric mapping approach and allocated to large-scale city block polygons. Univariate and bivariate choropleth maps generated from the resulting datasets were then used to identify and compare spatial economic disparities for a study area in North Rhine-Westphalia (NRW), Germany. Subsequently, based on these variables, a multivariate clustering approach was applied to a demonstration area in Dortmund. The results showed that the spatially disaggregated data allow more detailed insight into spatial patterns of socioeconomic attributes than the coarser data related to postal code polygons.
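The idea behind a bivariate choropleth map, as used in the abstract above, is to classify each spatial unit on two variables at once. The sketch below assumes a 3 x 3 scheme with tercile breaks, which is a common choice but is not stated in the abstract; the function names and the sample figures in the usage note are hypothetical.

```python
def tercile_breaks(values):
    """Return the two break values that split the sorted data into three groups."""
    ordered = sorted(values)
    n = len(ordered)
    return ordered[n // 3], ordered[(2 * n) // 3]


def classify(value, breaks):
    """Map a value to class 0 (low), 1 (medium), or 2 (high)."""
    low, high = breaks
    if value < low:
        return 0
    if value < high:
        return 1
    return 2


def bivariate_classes(density, income):
    """Pair the density class and the income class for each spatial unit."""
    density_breaks = tercile_breaks(density)
    income_breaks = tercile_breaks(income)
    return [
        (classify(d, density_breaks), classify(i, income_breaks))
        for d, i in zip(density, income)
    ]
```

Calling `bivariate_classes([100, 200, 300, 400, 500, 600], [60, 50, 40, 30, 20, 10])` gives one `(density_class, income_class)` pair per unit; pairs such as `(0, 2)` or `(2, 0)` mark units where the two variables diverge most, which is the kind of imbalance the maps are meant to expose.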


Author(s):  
Ju-Wei Chen ◽  
Suh-Yin Lee

Chinese characters are constructed by basic strokes based on structural rules. In handwritten characters, the shapes of the strokes may vary to some extent, but the spatial relations and geometric configurations of the strokes are usually maintained. Therefore these spatial relations and configurations could be regarded as invariant features and could be used in the recognition of handwritten Chinese characters. In this paper, we investigate the structural knowledge in Chinese characters and propose the stroke spatial relationship representation (SSRR) to describe Chinese characters. An On-Line Chinese Character Recognition (OLCCR) method using the SSRR is also presented. With SSRR, each character is processed and is represented by an attribute graph. The process of character recognition is thereby transformed into a graph matching problem. After careful analysis, the basic spatial relationship between strokes can be characterized into five classes. A bitwise representation is adopted in the design of the data structure to reduce storage requirements and to speed up character matching. The strategy of hierarchical search in the preclassification improves the recognition speed. Basically, the attribute graph model is a generalized character representation that provides a useful and convenient representation for newly added characters in an OLCCR system with automatic learning capability. The significance of the structural approach of character recognition using spatial relationships is analyzed and is proved by experiments. Realistic testing is provided to show the effectiveness of the proposed method.
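The bitwise representation mentioned in the abstract above can be illustrated as follows. The abstract states only that stroke relationships fall into five classes and that a bitwise layout is used; it does not name the classes or describe the layout. This sketch therefore assumes that each relationship is a class index from 0 to 4 stored in 3 bits, and that two characters are compared by XOR-ing their packed codes. All names here are hypothetical.

```python
BITS_PER_RELATION = 3  # enough for five classes (0..4); an assumption of this sketch
MASK = (1 << BITS_PER_RELATION) - 1


def pack(relations):
    """Pack a sequence of class indices (0..4) into a single integer."""
    code = 0
    for position, rel in enumerate(relations):
        if not 0 <= rel <= 4:
            raise ValueError("relation class must be in 0..4")
        code |= rel << (BITS_PER_RELATION * position)
    return code


def unpack(code, count):
    """Recover `count` class indices from a packed integer."""
    return [(code >> (BITS_PER_RELATION * i)) & MASK for i in range(count)]


def mismatches(code_a, code_b, count):
    """Count the stroke pairs whose relation class differs between two packed codes."""
    diff = code_a ^ code_b
    return sum(1 for i in range(count) if (diff >> (BITS_PER_RELATION * i)) & MASK)
```

Packing keeps the whole relation table of a character in one machine integer, so a candidate can be rejected with a single XOR and a short scan instead of a field-by-field comparison.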


2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Víctor Acedo-Matellán

Prefixed verbs in Latin may take an argument in the dative case, interpreted as the ground of the spatial relation codified by the preverb. This phenomenon is constrained by the semantics of that spatial relation: while preverbs encoding a location, a goal, or a source of motion generally accept the dative argument, preverbs encoding a route do not. I propose a syntactic analysis of this phenomenon, framed within the Spanning framework. I assume an analysis of the spatial dative as an applied argument interpreted as a possessor of the final location of motion. Developing a configurational theory of spatial relations, I show how only the syntax-semantics of the preverbs interpreted as encoding a location, be this final (a goal), initial (a source), or unrelated to motion (a static location), is compatible with the projection of an Appl(icative)P integrating the dative argument. By the same token, pure route preverbs, involving a path but not a location, are correctly predicted to disallow the projection of ApplP, and hence the spatial dative.

