A Generic Model to Compose Vision Modules for Holistic Scene Understanding

Author(s):  
Congcong Li ◽  
Adarsh Kowdle ◽  
Ashutosh Saxena ◽  
Tsuhan Chen

2003 ◽
Vol 42 (03) ◽  
pp. 203-211 ◽  
Author(s):  
J. L. G. Dietz ◽  
A. Hasman ◽  
P. F. de Vries Robbé ◽  
H. J. Tange

Summary. Objectives: Many shared-care projects feel the need for electronic patient-record (EPR) systems. In the absence of practical experience from paper record keeping, a theoretical model is the only reference for the design of these systems. In this article, we review existing models of individual clinical practice and integrate their useful elements. We then present a generic model of clinical practice that is applicable to both individual and collaborative clinical practice. Methods: We followed the principles of the conversation-for-action theory and the DEMO method. According to these principles, information can only be generated by a conversation between two actors. An actor is a role that can be played by one or more human subjects, so the model does not distinguish between inter-individual and intra-individual conversations. Results: Clinical practice has been divided into four actors: service provider, problem solver, coordinator, and worker. Each actor represents a level of clinical responsibility. Any information in the patient record is the result of a conversation between two of these actors. Connecting different conversations to one another can create a process view with meta-information about the rationale of clinical practice. Such a process view can be implemented as an extension to the EPR. Conclusions: The model has the potential to cover all professional activities, but needs to be further validated. It can serve as a theoretical basis for the design of EPR systems for shared care, but a successful EPR system needs more than just a theoretical model.
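The four-actor conversation structure lends itself to a simple data model. Below is a minimal Python sketch of how conversations between actors could be recorded and chained into a process view; the class and field names are illustrative assumptions, not part of the published model.

```python
from dataclasses import dataclass
from typing import Optional

# The four levels of clinical responsibility named in the model.
ACTORS = {"service provider", "problem solver", "coordinator", "worker"}

@dataclass
class Conversation:
    """A unit of patient-record information: generated by exactly two actors."""
    initiator: str                            # actor requesting the action
    executor: str                             # actor performing the action
    result: str                               # the record entry produced
    parent: Optional["Conversation"] = None   # link to the triggering conversation

    def __post_init__(self):
        assert {self.initiator, self.executor} <= ACTORS

def process_view(conv: Conversation) -> list[str]:
    """Walk the conversation chain to recover the rationale (meta-information)."""
    chain = []
    while conv is not None:
        chain.append(f"{conv.initiator} -> {conv.executor}: {conv.result}")
        conv = conv.parent
    return list(reversed(chain))

# Example: a coordinator asks a worker to draw blood as part of a care plan.
plan = Conversation("problem solver", "coordinator", "order glucose work-up")
draw = Conversation("coordinator", "worker", "blood sample taken", parent=plan)
print("\n".join(process_view(draw)))
```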


2012 ◽  
Author(s):  
Laurent Itti ◽  
Nader Noori ◽  
Lior Elazary

Symmetry ◽  
2021 ◽  
Vol 13 (4) ◽  
pp. 563
Author(s):  
Babu Rajendiran ◽  
Jayashree Kanniappan

Nowadays, many business organizations operate in the cloud in order to reduce their operating costs and to select the best services from among many cloud providers. The growing number of cloud services on the market obliges the cloud consumer to be careful in selecting the most apt cloud service provider, one that satisfies functional as well as QoS requirements. Many disciplines of computer-based applications use standardized ontologies to represent information in their fields, which indicates the need for an ontology-based representation here as well. The proposed generic model can help service consumers identify the interrelations of QoS parameters in the cloud-services-selection ontology at run-time, and can help service providers enhance their business by interpreting the various relations. The ontology has been developed using the intended QoS attributes of various service providers. A generic model has been developed and tested against the developed ontology.
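As an illustration of what such an ontology-based representation might look like, here is a minimal sketch using Python's rdflib; the namespace, class names, and the QoS properties (availability, responseTime) are assumptions for demonstration, not the ontology the authors developed.

```python
from rdflib import Graph, Literal, Namespace, RDF, RDFS, XSD

# Hypothetical namespace standing in for the cloud-services-selection ontology.
CSO = Namespace("http://example.org/cloud-service-ontology#")

g = Graph()
g.bind("cso", CSO)

# Classes: services expose QoS parameters as datatype properties.
g.add((CSO.CloudService, RDF.type, RDFS.Class))
g.add((CSO.QoSParameter, RDF.type, RDFS.Class))
g.add((CSO.availability, RDFS.domain, CSO.CloudService))
g.add((CSO.responseTime, RDFS.domain, CSO.CloudService))

# Two candidate services with illustrative QoS values.
g.add((CSO.ServiceA, RDF.type, CSO.CloudService))
g.add((CSO.ServiceA, CSO.availability, Literal(99.95, datatype=XSD.double)))
g.add((CSO.ServiceB, RDF.type, CSO.CloudService))
g.add((CSO.ServiceB, CSO.availability, Literal(99.50, datatype=XSD.double)))

# Run-time selection: keep only services meeting a minimum availability.
q = """
SELECT ?s ?a WHERE {
  ?s a cso:CloudService ;
     cso:availability ?a .
  FILTER (?a >= 99.9)
}
"""
for row in g.query(q):
    print(row.s, row.a)
```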


Author(s):  
Jin Zhou ◽  
Qing Zhang ◽  
Jian-Hao Fan ◽  
Wei Sun ◽  
Wei-Shi Zheng

Abstract: Recent image aesthetic assessment methods have achieved remarkable progress due to the emergence of deep convolutional neural networks (CNNs). However, these methods focus primarily on predicting the generally perceived preference of an image, which limits their practicability, since each user may have completely different preferences for the same image. To address this problem, this paper presents a novel approach for predicting personalized image aesthetics that fit an individual user's personal taste. We achieve this in a coarse-to-fine manner, by joint regression and learning from pairwise rankings. Specifically, we first collect a small subset of personal images from a user and invite him/her to rank the preference of some randomly sampled image pairs. We then search for the K-nearest neighbors of the personal images within a large-scale dataset labeled with average human aesthetic scores, and use these images, as well as the associated scores, to train a generic aesthetic assessment model by CNN-based regression. Next, we fine-tune the generic model to accommodate the personal preference by training over the rankings with a pairwise hinge loss. Experiments demonstrate that our method can effectively learn personalized image aesthetic preferences, clearly outperforming state-of-the-art methods. Moreover, we show that the learned personalized image aesthetics benefit a wide variety of applications.
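The fine-tuning step described above is essentially a ranking problem. A minimal PyTorch sketch of the idea follows; the backbone, margin, and data shapes are illustrative assumptions, since the paper's exact architecture is not reproduced here.

```python
import torch
import torch.nn as nn
from torchvision import models

# Generic model: a CNN regressor trained on average aesthetic scores (coarse stage).
backbone = models.resnet18(weights=None)
backbone.fc = nn.Linear(backbone.fc.in_features, 1)  # scalar aesthetic score

# Fine stage: adapt to one user's taste from pairwise rankings with a hinge loss.
# MarginRankingLoss(x1, x2, y) = max(0, -y * (x1 - x2) + margin), a pairwise hinge.
criterion = nn.MarginRankingLoss(margin=0.1)
optimizer = torch.optim.SGD(backbone.parameters(), lr=1e-4)

def finetune_step(img_preferred, img_other):
    """One update on a user-ranked pair: the preferred image should score higher."""
    s1 = backbone(img_preferred).squeeze(1)
    s2 = backbone(img_other).squeeze(1)
    target = torch.ones_like(s1)  # +1 means s1 should rank above s2
    loss = criterion(s1, s2, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Illustrative batch of two 224x224 RGB image pairs.
a, b = torch.randn(2, 3, 224, 224), torch.randn(2, 3, 224, 224)
print(finetune_step(a, b))
```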


2021 ◽  
Vol 10 (7) ◽  
pp. 488
Author(s):  
Peng Li ◽  
Dezheng Zhang ◽  
Aziguli Wulamu ◽  
Xin Liu ◽  
Peng Chen

A deep understanding of our visual world involves more than isolated perception of a series of objects; the relationships between them also carry rich semantic information. This is especially true of satellite remote sensing images, whose span is so large that the objects they contain vary in size and form complex spatial compositions. Recognizing semantic relations therefore strengthens the understanding of remote sensing scenes. In this paper, we propose a novel multi-scale semantic fusion network (MSFN). In this framework, dilated convolution is introduced into a graph convolutional network (GCN) based on an attentional mechanism to fuse and refine multi-scale semantic context, which is crucial to strengthening the cognitive ability of our model. Besides, based on the mapping between visual features and semantic embeddings, we design a sparse relationship-extraction module to remove meaningless connections among entities and improve the efficiency of scene graph generation. Meanwhile, to further promote research on scene understanding in the remote sensing field, this paper also proposes a remote sensing scene graph dataset (RSSGD). We carry out extensive experiments, and the results show that our model significantly outperforms previous methods on scene graph generation. In addition, RSSGD effectively bridges the huge semantic gap between low-level perception and high-level cognition of remote sensing images.
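To make the multi-scale fusion idea concrete, here is a minimal PyTorch sketch combining dilated convolutions (for multi-scale context) with a basic GCN layer over object-node features; the layer sizes and the fusion scheme are assumptions for illustration and do not reproduce MSFN itself.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleContext(nn.Module):
    """Parallel dilated convolutions capture context at several receptive fields."""
    def __init__(self, ch):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(ch, ch, 3, padding=d, dilation=d) for d in (1, 2, 4)
        )
        self.fuse = nn.Conv2d(3 * ch, ch, 1)

    def forward(self, x):
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))

class GCNLayer(nn.Module):
    """One graph-convolution step over N object nodes: H' = relu(A_hat H W)."""
    def __init__(self, dim):
        super().__init__()
        self.lin = nn.Linear(dim, dim)

    def forward(self, h, adj):
        # Row-normalize the adjacency (with self-loops) before propagation.
        a = adj + torch.eye(adj.size(0))
        a = a / a.sum(dim=1, keepdim=True)
        return F.relu(self.lin(a @ h))

# Illustrative sizes: a 64-channel feature map and 5 object nodes with 64-d features.
fmap = torch.randn(1, 64, 32, 32)
nodes, adj = torch.randn(5, 64), (torch.rand(5, 5) > 0.5).float()
ctx = MultiScaleContext(64)(fmap)
out = GCNLayer(64)(nodes, adj)
print(ctx.shape, out.shape)
```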


2021 ◽  
Vol 10 (1) ◽  
pp. 32
Author(s):  
Abhishek V. Potnis ◽  
Surya S. Durbha ◽  
Rajat C. Shinde

Earth Observation data possess tremendous potential for understanding the dynamics of our planet. We propose the Semantics-driven Remote Sensing Scene Understanding (Sem-RSSU) framework for rendering comprehensive, grounded, spatio-contextual scene descriptions for enhanced situational awareness. To minimize the semantic gap in remote-sensing-scene understanding, the framework puts forward the transformation of scenes, using semantic-web technologies, into Remote Sensing Scene Knowledge Graphs (RSS-KGs). The knowledge-graph representation of scenes has been formalized through the development of a Remote Sensing Scene Ontology (RSSO), a core ontology for an inclusive remote-sensing-scene data product. The RSS-KGs are enriched both spatially and contextually, using a deductive reasoner, by mining for implicit spatio-contextual relationships between land-cover classes in the scenes. At its core, Sem-RSSU comprises novel Ontology-driven Spatio-Contextual Triple Aggregation and Realization algorithms that transform KGs into grounded natural-language scene descriptions. Considering the significance of scene understanding for informed decision-making during a flood, we selected flood scenes as a test scenario to demonstrate the utility of the framework. To that end, a Flood Scene Ontology (FSO) encoding contextual domain knowledge has been developed. Extensive experimental evaluations show promising results, further validating the efficacy of the framework.
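A toy version of the KG-to-text idea can be sketched with Python's rdflib: build a few scene triples, deduce an implicit relation with a hand-written rule, and realize the triples as sentences. The namespace, relations, and the single inference rule are assumptions for illustration only.

```python
from rdflib import Graph, Namespace, RDF

# Hypothetical namespace standing in for the Remote Sensing Scene Ontology (RSSO).
RSS = Namespace("http://example.org/rsso#")

g = Graph()
g.bind("rss", RSS)
for cls, inst in [(RSS.Building, RSS.building1), (RSS.Road, RSS.road1),
                  (RSS.FloodWater, RSS.water1)]:
    g.add((inst, RDF.type, cls))
g.add((RSS.water1, RSS.adjacentTo, RSS.road1))
g.add((RSS.road1, RSS.adjacentTo, RSS.building1))

# Toy deductive enrichment: anything adjacent to flood water is inferred flooded.
for s, _, o in g.triples((None, RSS.adjacentTo, None)):
    if (s, RDF.type, RSS.FloodWater) in g:
        g.add((o, RSS.state, RSS.Flooded))

# Naive realization: map predicates to phrases and emit one sentence per triple.
def label(term):
    return str(term).split("#")[-1]

PHRASES = {RSS.adjacentTo: "is adjacent to", RSS.state: "is"}
for s, p, o in g:
    if p in PHRASES:
        print(f"The {label(s)} {PHRASES[p]} {label(o)}.")
```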


2021 ◽  
pp. 1-31
Author(s):  
S.H. Derrouaoui ◽  
Y. Bouzid ◽  
M. Guiatni

Abstract: Recently, transformable Unmanned Aerial Vehicles (UAVs) have become a subject of great interest in the field of flying systems, due to their maneuverability, agility, and morphological capacities. They can be used for specific missions and in more congested spaces. Moreover, this novel class of UAVs is considered a viable solution for providing flying robots with specific and versatile functionalities. In this paper, we propose (i) a new design of a transformable quadrotor with (ii) a generic model and (iii) an adaptive control strategy. The proposed UAV is able to change its flight configuration by rotating its four arms independently around a central body, thanks to its adaptive geometry. To simplify and lighten the prototype, a simple mechanism with a light mechanical structure is proposed. Since the Center of Gravity (CoG) of the UAV moves according to the desired morphology of the system, the inertia and the allocation matrix vary instantly. These dynamic parameters play an important role in the control of the system and its stability, representing a key difference compared with the classic quadrotor. Thus, a new generic model is developed, taking into account all of these variations together with aerodynamic effects. To validate this model and ensure the stability of the designed UAV, an adaptive backstepping control strategy based on the change in flight configuration is applied. MATLAB simulations are provided to evaluate and illustrate the performance and efficiency of the proposed controller. Finally, some experimental tests are presented.
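The configuration-dependent quantities the abstract mentions can be illustrated numerically. Below is a small numpy sketch that, for given arm angles, recomputes the rotor positions, the CoG offset, and the thrust-allocation matrix of a four-arm vehicle; the masses, arm length, and thrust/drag coefficients are invented placeholder values, not those of the actual prototype.

```python
import numpy as np

L = 0.25                   # arm length [m] (placeholder)
m_body, m_arm = 0.8, 0.1   # central-body and per-arm masses [kg] (placeholders)
kf, km = 1e-5, 1e-7        # thrust and drag coefficients (placeholders)

def configuration(angles_deg):
    """Rotor positions, CoG offset, and allocation matrix for given arm angles."""
    th = np.radians(angles_deg)
    rotors = L * np.stack([np.cos(th), np.sin(th)], axis=1)  # (4, 2) xy positions

    # The CoG shifts because each arm's mass (lumped at mid-arm) moves with it.
    total = m_body + 4 * m_arm
    cog = m_arm * (rotors / 2).sum(axis=0) / total

    # Allocation matrix: [total thrust; roll; pitch; yaw] = A @ rotor_thrusts.
    # Moments are taken about the shifted CoG; rotor spin directions alternate.
    x, y = rotors[:, 0] - cog[0], rotors[:, 1] - cog[1]
    signs = np.array([1, -1, 1, -1])
    A = np.vstack([np.ones(4), y, -x, signs * km / kf])
    return cog, A

# Classic "X" configuration vs. an asymmetric morphology.
for angles in ([45, 135, 225, 315], [30, 150, 200, 340]):
    cog, A = configuration(angles)
    print(angles, "CoG offset:", cog.round(4))
```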


2021 ◽  
Vol 11 (9) ◽  
pp. 3952
Author(s):  
Shimin Tang ◽  
Zhiqiang Chen

With the ubiquitous use of mobile imaging devices, the collection of perishable disaster-scene data has become unprecedentedly easy. However, these images carry significant complexity and uncertainty, and computational methods struggle to understand them. In this paper, the authors investigate the problem of disaster-scene understanding through a deep-learning approach. Two attributes of images are considered: hazard type and damage level. Three deep-learning models are trained and their performance assessed. Specifically, the best model for hazard-type prediction has an overall accuracy (OA) of 90.1%, and the best damage-level classification model has an explainable OA of 62.6%; both adopt the Faster R-CNN architecture with a ResNet50 network as the feature extractor. It is concluded that hazard types are more identifiable than damage levels in disaster-scene images. Further insights are revealed: damage-level recognition suffers more from inter- and intra-class variation, and hazard-agnostic damage leveling contributes further to the underlying uncertainty.
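As a hedged sketch of this kind of pipeline, the snippet below builds a torchvision Faster R-CNN with a ResNet50-FPN backbone and swaps its box predictor for a small label set; the class counts and the idea of reusing one architecture for both attributes are assumptions for illustration, not the authors' released code.

```python
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

def build_model(num_classes):
    """Faster R-CNN with a ResNet50-FPN backbone, re-headed for our label set."""
    model = fasterrcnn_resnet50_fpn(weights="DEFAULT")
    in_feats = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_feats, num_classes)
    return model

# Illustrative label sets (background + classes); the real taxonomies may differ.
hazard_model = build_model(num_classes=1 + 4)   # e.g. flood, earthquake, fire, wind
damage_model = build_model(num_classes=1 + 3)   # e.g. low, medium, high damage

hazard_model.eval()
with torch.no_grad():
    preds = hazard_model([torch.rand(3, 480, 640)])  # one RGB image, 0..1 floats
print(preds[0]["labels"], preds[0]["scores"])
```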

