A Bio-Inspired Integration Method for Object Semantic Representation

2016
Vol 6 (3)
pp. 137-154
Author(s):  
Hui Wei

Abstract We have two motivations. First, the semantic gap is a tough problem that puzzles almost all subfields of Artificial Intelligence. We think the semantic gap is the conflict between the abstractness of high-level symbolic definitions and the details and diversity of low-level stimuli. Second, in object recognition, a predefined prototype of an object is crucial and indispensable for bidirectional perception processing. On the one hand, this prototype is learned from perceptual experience; on the other hand, it should be able to guide future downward processing. Humans can do this very well, so the physiological mechanism is simulated here. We use the mechanism of classical and non-classical receptive fields (nCRF) to design a hierarchical model and form a multi-layer prototype of an object. This is also a realistic definition of a concept and a representation for denoting semantics. We regard this model as the most fundamental infrastructure that can ground semantics. Here an AND-OR tree is constructed to record the prototypes of a concept, in which either raw data at the low level or symbols at the high level are feasible, and explicit production rules are also available. For pixel processing, knowledge should be represented in a data form; for scene reasoning, knowledge should be represented in a symbolic form. The physiological mechanism happens to be the bridge that joins them seamlessly. This offers a possible solution to the semantic gap problem and prevents discontinuity in low-order structures.
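To make the AND-OR tree idea concrete, here is a minimal sketch of how such a tree could record a prototype and be checked against observed parts. It is an illustration only, not the author's model: the `Node` class, the `matches` function, and the "cup" prototype with its part names are all hypothetical, and leaves are reduced to symbolic labels rather than raw data.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Node:
    """One node of a hypothetical AND-OR prototype tree."""
    label: str
    kind: str = "LEAF"  # one of "AND", "OR", "LEAF"
    children: List["Node"] = field(default_factory=list)


def matches(node: Node, observed: Set[str]) -> bool:
    """Return True if the observed part labels satisfy the prototype."""
    if node.kind == "LEAF":
        return node.label in observed
    results = [matches(child, observed) for child in node.children]
    # AND: every component must be present; OR: any alternative suffices.
    return all(results) if node.kind == "AND" else any(results)


# Hypothetical prototype: a cup is a body AND (a left handle OR a right handle).
cup = Node("cup", "AND", [
    Node("body"),
    Node("handle", "OR", [Node("handle_left"), Node("handle_right")]),
])

with_handle = matches(cup, {"body", "handle_left"})
without_handle = matches(cup, {"body"})
```

Each AND node reads as a production rule (cup → body, handle), and each OR node lists alternative realizations of the same part.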

2021
Author(s):  
Maryam Nematollahi Arani

Object recognition has become a central topic in computer vision applications such as image search, robotics, and vehicle safety systems. However, it is a challenging task because low-level visual features have limited discriminative power for describing the highly diverse high-level visual semantics of objects. The semantic gap between low-level visual features and high-level concepts is a bottleneck in most systems, and new content analysis models need to be developed to bridge it. In this thesis, algorithms based on conditional random fields (CRF) from the class of probabilistic graphical models are developed to tackle the problem of multiclass image labeling for object recognition. Image labeling assigns a specific semantic category from a predefined set of object classes to each pixel in the image. Because it captures spatial interactions of visual concepts well, CRF modeling has proved to be a successful tool for image labeling. This thesis proposes novel approaches to strengthening CRF modeling for robust image labeling. Our primary contributions are twofold. First, to better represent the feature distributions of CRF potentials, new feature functions based on generalized Gaussian mixture models (GGMM) are designed and their efficacy is investigated. Owing to its shape parameter, GGMM can provide a proper fit to the multi-modal and skewed distributions of data in natural images. The new model proves more successful than Gaussian and Laplacian mixture models. It also outperforms a deep neural network model on the Corel image set by 1% in accuracy. Second, we apply scene-level contextual information to integrate the global visual semantics of the image with the pixel-wise dense inference of a fully connected CRF, in order to preserve small objects of foreground classes and to make dense inference robust to initial misclassifications by the unary classifier. The proposed inference algorithm factorizes the joint probability of the labeling configuration and the image scene type to obtain prediction update equations for labeling individual image pixels and the overall scene type of the image. The proposed context-based dense CRF model outperforms the conventional dense CRF model by about 2% in labeling accuracy on the MSRC image set and by 4% on the SIFT Flow image set. The proposed model also obtains the highest scene classification rate, 86%, on the MSRC dataset.


2017
Vol 1 (1)
Author(s):  
Roseilla Nora Izaach

This study aimed to describe the level of grit among students of Nursing Academy X in the Aru Islands. Grit is one of the latest theories in Positive Psychology. It emphasizes two important aspects, perseverance of effort and consistency of interest, which determine the success of individuals in achieving their life goals. The goal of achieving future success through education is the reason this research was conducted. Respondents in this study were students in 2014. There were 51 respondents, all female. The measuring instrument used in this study was a grit scale consisting of 12 items, with a reliability of 0.85 and validity coefficients ranging from 0.44 to 0.82 (Duckworth et al., 2007). Descriptive analysis of the data showed that the majority of respondents (86.3%) have a low level of grit. For the perseverance of effort aspect, the majority of respondents (90.2%) have a low level, whereas for the consistency of interest aspect, the majority (66.7%) have a high level. The students' socioeconomic status, based on the parents' type of work, shows no tendency to be related to the level of grit. Further research could investigate more deeply how personality factors, differences in cultural background, and demographics affect grit. Keywords: Grit, socioeconomic status, demographics


2020
Vol 9 (4)
pp. 256
Author(s):
Liguo Weng
Yiming Xu
Min Xia
Yonghong Zhang
Jia Liu
...  

Changes in lakes and rivers are of great significance for the study of global climate change, and accurate segmentation of lakes and rivers is critical to studying those changes. However, almost all traditional water area segmentation methods share the following deficiencies: high computational requirements, poor generalization performance, and low extraction accuracy. In recent years, semantic segmentation algorithms based on deep learning have been emerging. To address the problems of a very large number of parameters, low accuracy, and network degradation during the training process, this paper proposes a separable residual SegNet (SR-SegNet) to perform water area segmentation using remote sensing images. On the one hand, without compromising the ability of feature extraction, the problem of network degradation is alleviated by adding modified residual blocks to the encoder, the number of parameters is limited by introducing depthwise separable convolutions, and the ability of feature extraction is improved by using dilated convolutions to expand the receptive field. On the other hand, SR-SegNet removes the convolution layers with relatively more convolution kernels in the encoding stage, and uses a cascading method to fuse the low-level and high-level features of the image. As a result, the whole network can obtain more spatial information. Experimental results show that the proposed method exhibits significant improvements over several traditional methods, including FCN, DeconvNet, and SegNet.
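As a rough illustration of why depthwise separable convolutions limit the parameter count and how dilation widens the receptive field, the standard arithmetic can be sketched as follows. The layer sizes (64 input channels, 128 output channels, 3×3 kernel, dilation 2) are made-up values for the example, not taken from SR-SegNet, and bias terms are ignored.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


def effective_kernel_size(k: int, dilation: int) -> int:
    """Spatial extent covered by a k x k kernel with the given dilation."""
    return k + (k - 1) * (dilation - 1)


# Hypothetical layer: 64 -> 128 channels with a 3 x 3 kernel.
std = standard_conv_params(64, 128, 3)        # 73728
sep = depthwise_separable_params(64, 128, 3)  # 8768
extent = effective_kernel_size(3, 2)          # 5
```

For this assumed layer the separable form needs roughly one eighth of the weights, while a dilation of 2 lets the same 3×3 kernel cover a 5×5 area without adding any parameters.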


2015
Vol 20 (4)
pp. 46-56
Author(s):
V.A. Ilyin
E.V. Khrisanova

The article presents the results of a study of the intellectual development of high-status, middle-status, and low-status members of preschool educational groups. It is shown that the intellectual development of high-status and middle-status children aged 4-5 years is higher than that of their low-status peers, especially in such aspects as perception, attention, and memory. The integral indicator for high-status subjects corresponds to an average or high level of intelligence, and for most subjects in this category it is at a high level. The integral indicator of intellectual development of middle-status children is comparable to that of high-status children. In fact, there is only one, though important, difference between the two categories: among high-status children there are no children whose integral indicator of intellectual development is below average. The integral indicator of intellectual development of most low-status subjects corresponds to a low level of intelligence. We analyzed the dialectical relationship between the intellectual, social, and psychological development of preschool children according to the concept of the «interpersonal situation of development». The article presents methodological support for determining the structure of interpersonal relations in preschool educational groups. The study proposes a number of scientific and practical recommendations.


2021
Vol 6 (2)
pp. 161-167
Author(s):
Eduard Yakubchykt
Iryna Yurchak

Finding images similar to a visual sample is a difficult AI task to which many works are devoted. The problem is to determine the essential properties of images at low and higher semantic levels. Based on these properties, a feature vector is built, which is then used to compare pairs of images. Each pair always includes an image from the collection and the sample image that the user is looking for. The result of the comparison is a quantity called the visual relativity of the images. Image properties are called features and are evaluated by calculation algorithms. Image features can be divided into low-level and high-level. Low-level features include basic colors, textures, shapes, and significant elements of the whole image. These features are used as part of more complex recognition tasks. The main progress is in the definition of high-level features, which is associated with understanding the content of images. This paper surveys modern algorithms for finding similar images in large multimedia databases. The main problems of determining high-level image features, algorithms for overcoming them, and the application of effective algorithms are described. Algorithms used to quickly determine semantic content and improve the search accuracy for similar images are presented. The aim of the work is to conduct a comparative analysis of modern image retrieval algorithms and identify their weaknesses and strengths.
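The comparison of a sample image against a collection through feature vectors can be sketched with a low-level color feature. This is a minimal illustration under stated assumptions: the abstract does not say which features or which measure the surveyed algorithms use, so the coarse RGB histogram, the cosine similarity, and the names `color_histogram`, `cosine_similarity`, and `rank_collection` are all choices made for this example.

```python
import math
from typing import Dict, List, Tuple

Pixel = Tuple[int, int, int]  # (r, g, b), each in 0..255


def color_histogram(pixels: List[Pixel], bins: int = 4) -> List[float]:
    """Normalized joint RGB histogram with bins**3 cells (a low-level feature)."""
    hist = [0.0] * (bins ** 3)
    step = 256 // bins
    for r, g, b in pixels:
        idx = (r // step) * bins * bins + (g // step) * bins + (b // step)
        hist[idx] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def cosine_similarity(u: List[float], v: List[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_collection(sample: List[Pixel],
                    collection: Dict[str, List[Pixel]]) -> List[Tuple[str, float]]:
    """Pair the sample with every collection image and sort by similarity."""
    sample_vec = color_histogram(sample)
    scored = [(name, cosine_similarity(sample_vec, color_histogram(px)))
              for name, px in collection.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Tiny made-up "images": four identical pixels each.
sample_image = [(250, 10, 10)] * 4
images = {"reddish": [(200, 20, 30)] * 4, "bluish": [(10, 10, 250)] * 4}
ranking = rank_collection(sample_image, images)
```

With these toy inputs the reddish image falls in the same histogram cell as the sample and ranks first. A color histogram alone says nothing about image content, which is the gap that high-level features are meant to close.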


Sensors
2021
Vol 21 (21)
pp. 7136
Author(s):
Zhiqiang Zhang
Xin Qiu
Yongzhou Li

Feature Pyramid Network (FPN) is used as the neck of currently popular object detection networks. Research has shown that the structure of FPN has some defects. In addition to the loss of information caused by reducing the number of channels, the feature scales of different levels differ, and the corresponding information at different levels of abstraction also differs, resulting in a semantic gap between levels. We call this semantic gap level imbalance. Correlation convolution is one way to alleviate the imbalance between adjacent layers; however, how to alleviate the imbalance between all levels is another problem. In this article, we propose a simple but effective new network structure called Scale-Equalizing Feature Pyramid Network (SEFPN), which generates multiple features of different scales by iteratively fusing the features of each level. SEFPN improves the overall performance of the network by balancing the semantic representation of each layer of features. Experimental results on the MS-COCO2017 dataset show that integrating SEFPN as a standalone module into a one-stage network can further improve the performance of the detector by ∼1 AP, and can improve the detection performance of Faster R-CNN, a typical two-stage network, especially for large objects (APL, ∼2 AP).


2001
Vol 01 (01)
pp. 63-81
Author(s):
ALAN HANJALIC
REGINALD L. LAGENDIJK
JAN BIEMOND

This paper addresses the problem of automatically partitioning a video into semantic segments using visual low-level features only. Semantic segments may be understood as building content blocks of a video with a clear sequential content structure. Examples are reports in a news program, episodes in a movie, scenes of a situation comedy or topic segments of a documentary. In some video genres like news programs or documentaries, the usage of different media (visual, audio, speech, text) may be beneficial or is even unavoidable for reliably detecting the boundaries between semantic segments. In many other genres, however, the pay-off in using different media for the purpose of high-level segmentation is not high. On the one hand, relating the audio, speech or text to the semantic temporal structure of video content is generally very difficult. This is especially so in "acting" video genres like movies and situation comedies. On the other hand, the information contained in the visual stream of these video genres often seems to provide the major clue about the position of semantic segment boundaries. Partitioning a video into semantic segments can be performed by measuring the coherence of the content along neighboring video shots of a sequence. The segment boundaries are then found at places (e.g., shot boundaries) where the values of content coherence are sufficiently low. On the basis of two state-of-the-art techniques for content coherence modeling, we illustrate in this paper the current possibilities for detecting the boundaries of semantic segments using visual low-level features only.
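The general idea of placing segment boundaries where content coherence drops can be sketched as follows. This is not one of the two techniques the paper discusses: the histogram-intersection similarity, the look-back and look-ahead window, the threshold of 0.5, and the names `coherence_values` and `segment_boundaries` are all assumptions made for illustration, and each shot is reduced to a single normalized feature histogram.

```python
from typing import List

Histogram = List[float]  # normalized low-level feature histogram of one shot


def histogram_intersection(h1: Histogram, h2: Histogram) -> float:
    """Similarity in [0, 1] for two normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def coherence_values(shots: List[Histogram], window: int = 2) -> List[float]:
    """Coherence at each shot boundary i (between shot i-1 and shot i).

    Coherence is taken as the best similarity between any of the `window`
    shots before the boundary and any of the `window` shots after it.
    """
    values = []
    for i in range(1, len(shots)):
        before = shots[max(0, i - window):i]
        after = shots[i:i + window]
        values.append(max(histogram_intersection(a, b)
                          for a in before for b in after))
    return values


def segment_boundaries(shots: List[Histogram],
                       threshold: float = 0.5,
                       window: int = 2) -> List[int]:
    """Indices i where a new semantic segment is assumed to start at shot i."""
    return [i for i, c in enumerate(coherence_values(shots, window), start=1)
            if c < threshold]


# Five made-up shots: three visually similar, then two of a different kind.
example_shots = [[1.0, 0.0], [0.9, 0.1], [1.0, 0.0], [0.0, 1.0], [0.1, 0.9]]
boundaries = segment_boundaries(example_shots)
```

Comparing windows of shots rather than only adjacent pairs keeps a single deviating shot, such as a cutaway, from splitting a segment whose surrounding shots still resemble each other.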


Author(s):  
Subalalitha C. N.

This chapter discusses how text summaries could be generated by using a high-level semantic representation. The semantic representation is built using a discourse structure composed of three text representation techniques, namely, universal networking language (UNL), rhetorical structure theory (RST), and Saṅgatis. Saṅgati is an ancient concept used in Sanskrit literature to capture coherence. This discourse structure is indexed using a concept called sūtra, which has been used in both Tamil and Sanskrit literature. The chapter mainly focusses on how a summary could be generated by using this unique discourse structure and the sūtra indexing technique. The Forum for Information Retrieval (FIRE) corpus has been used to test the system, and its performance has been compared with that of one of the state-of-the-art summary generation systems built on discourse structure.


2015
Vol 114 (2)
pp. 846-856
Author(s):
Ronen Sosnik
Eliyahu Chaim
Tamar Flash

Stopping performance is known to depend on low-level motion features, such as movement velocity. It is not known, however, whether it is also subject to high-level motion constraints. Here, we report results of 15 subjects instructed to connect four target points depicted on a digitizing tablet and stop “as rapidly as possible” upon hearing a “stop” cue (tone). Four subjects connected target points with straight paths, whereas 11 subjects generated movements corresponding to coarticulation between adjacent movement components. For the noncoarticulating and coarticulating subjects, stopping performance was not correlated or only weakly correlated with motion velocity, respectively. The generation of a straight, point-to-point movement or a smooth, curved trajectory was not disturbed by the occurrence of a stop cue. Overall, the results indicate that stopping performance is subject to high-level motion constraints, such as the completion of a geometrical plan, and that globally planned movements, once started, must run to completion, providing evidence for the definition of a motion primitive as an unstoppable motion element.

