A COMPUTATIONAL MODEL FOR CONTEXT-BASED IMAGE CATEGORIZATION AND DESCRIPTION

2012 ◽  
Vol 12 (01) ◽  
pp. 1250001 ◽  
Author(s):  
TAREK HELMY

Automatic image categorization and description are key components for many applications, e.g., multimedia database management, web content analysis, human–computer interaction, and biometrics. In general, image description is a difficult task because of the wide variety of objects potentially to be recognized and the complexity and variety of backgrounds. This paper introduces a computational model for context-based image categorization and description. First, for a given image, a classifier is trained on the associated text features using advanced concepts, so that it can assign the image to a specific category. Then, a similarity matching with that category's annotated templates is performed for images in every other category. The proposed model uses novel text and image features that allow it to differentiate between geometrical images (GIs) and ordinary images. The experimental results show that the model is able to correctly categorize images, with an expected increase in similarity matching as larger datasets and a neural document classifier (NDC) are used. An important feature of the proposed model is that its specific matching techniques, suitable for a particular category, can be easily integrated and developed for other categories.

Author(s):  
Huimin Lu ◽  
Rui Yang ◽  
Zhenrong Deng ◽  
Yonglin Zhang ◽  
Guangwei Gao ◽  
...  

Chinese image description generation tasks usually face several challenges, such as single-feature extraction, lack of global information, and lack of detailed description of the image content. To address these limitations, we propose a fuzzy attention-based DenseNet-BiLSTM Chinese image captioning method in this article. In the proposed method, we first improve the densely connected network to extract features of the image at different scales and to enhance the model’s ability to capture weak features. At the same time, a bidirectional LSTM is used as the decoder to enhance the use of context information. The introduction of an improved fuzzy attention mechanism effectively alleviates the problem of correspondence between image features and contextual information. We conduct experiments on the AI Challenger dataset to evaluate the performance of the model. The results show that, compared with other models, our proposed model achieves higher scores in objective quantitative evaluation indicators, including BLEU , BLEU , METEOR, ROUGE-L, and CIDEr. The generated description sentences can accurately express the image content.


Agronomy ◽  
2021 ◽  
Vol 11 (7) ◽  
pp. 1307
Author(s):  
Haoriqin Wang ◽  
Huaji Zhu ◽  
Huarui Wu ◽  
Xiaomin Wang ◽  
Xiao Han ◽  
...  

In the question-and-answer (Q&A) communities of the “China Agricultural Technology Extension Information Platform”, thousands of rice-related Chinese questions are newly added every day. The rapid detection of semantically identical questions is the key to the success of a rice-related intelligent Q&A system. To allow the fast and automatic detection of semantically identical rice-related questions, we propose a new method based on the Coattention-DenseGRU (Gated Recurrent Unit). According to the rice-related question characteristics, we applied Word2vec with the TF-IDF (Term Frequency–Inverse Document Frequency) method to process and analyze the text data and compared it with the Word2vec, GloVe, and TF-IDF methods. Combined with the agricultural word segmentation dictionary, we applied Word2vec with the TF-IDF method, effectively solving the problem of high dimension and sparse data in the rice-related text. Each network layer employed the connection information of features and all previous recursive layers’ hidden features. To alleviate the problem of feature vector size increasing due to dense splicing, an autoencoder was used after dense concatenation. The experimental results show that rice-related question similarity matching based on Coattention-DenseGRU can improve the utilization of text features, reduce the loss of features, and achieve fast and accurate similarity matching of the rice-related question dataset. The precision and F1 values of the proposed model were 96.3% and 96.9%, respectively. Compared with seven other question similarity matching models, the proposed method achieves state-of-the-art results on our rice-related question dataset.
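The abstract above does not say exactly how Word2vec and TF-IDF are combined. One common approach, shown here only as a minimal sketch and not as the authors' method, is to average the word vectors of a question using TF-IDF scores as weights and then compare questions by cosine similarity. The toy two-dimensional embeddings, the smoothed IDF formula, and all function names are assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Return one {token: tf-idf score} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each token.
    df = Counter(tok for doc in docs for tok in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            # Smoothed IDF (an assumption): log((1 + N) / (1 + df)) + 1.
            tok: (count / len(doc)) * (math.log((1 + n_docs) / (1 + df[tok])) + 1.0)
            for tok, count in tf.items()
        })
    return weights


def weighted_vector(doc_weights, embeddings):
    """TF-IDF-weighted average of the word vectors of one document."""
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    norm = 0.0
    for tok, w in doc_weights.items():
        if tok not in embeddings:
            continue  # skip out-of-vocabulary tokens
        total = [t + w * e for t, e in zip(total, embeddings[tok])]
        norm += w
    return [t / norm for t in total] if norm > 0 else total


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0


# Hypothetical toy embeddings; a real system would load trained Word2vec vectors.
toy_embeddings = {
    "rice": [1.0, 0.0], "blast": [0.9, 0.1], "disease": [0.8, 0.2],
    "fertilizer": [0.0, 1.0], "dose": [0.1, 0.9],
}
questions = [["rice", "blast", "disease"], ["rice", "disease"], ["fertilizer", "dose"]]
vectors = [weighted_vector(w, toy_embeddings) for w in tfidf_weights(questions)]
```

With these toy vectors, the first two questions end up much closer to each other than either is to the third, which is the behaviour a similarity matcher relies on.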


2013 ◽  
Vol 12 (2) ◽  
pp. 055-062
Author(s):  
Stefan Pradelok ◽  
Piotr Bętkowski ◽  
Adam Rudzik ◽  
Piotr Łaziński

This paper presents a method of engineering modelling of structural details, which enables the analysis of local static and dynamic effects in a complex structure with the use of a personal computer. An analysed structural detail, modelled with the use of shell finite elements, is mounted to a spatial truss member system. Then, on the basis of the prepared computational model, a static or dynamic analysis is carried out. The proposed model allows the detection of local effects in a theoretical. The analyses conducted confirmed the correct operation of such a computational model. Hence, the method of modelling presented in this paper allows the local effects to be analysed on an ordinary personal computer and, more importantly, the results of such calculations are available within a relatively short period of time. The calculations are carried out by analysing the local effects in a steel node of a truss railway bridge.


2014 ◽  
Vol 2014 ◽  
pp. 1-15
Author(s):  
Mohamed Abdo Abd Al-Hady ◽  
Amr Ahmed Badr ◽  
Mostafa Abd Al-Azim Mostafa

The immune system has a cognitive ability to differentiate between healthy and unhealthy cells. The immune system response (ISR) is stimulated by a disorder in the temporary fuzzy state that oscillates between the healthy and unhealthy states. However, modeling the immune system is an enormous challenge; the paper introduces an extensive summary of how the immune system response functions, as an overview of a complex topic, to present the immune system as a cognitive intelligent agent. The homogeneity and perfection of the natural immune system have always stood out as the sought-after model that we attempted to imitate while building our proposed model of cognitive architecture. The paper divides the ISR into four logical phases: setting a computational architectural diagram for each phase, proceeding from functional perspectives (input, process, and output), and their consequences. The proposed architecture components are defined by matching biological operations with computational functions and hence with the framework of the paper. On the other hand, the architecture focuses on the interoperability of the main theoretical immunological perspectives (classic, cognitive, and danger theory), as related to computer science terminologies. The paper presents a descriptive model of the immune system, to figure out the nature of the response, deemed to be intrinsic for building a hybrid computational model based on a cognitive intelligent agent perspective and inspired by natural biology. To that end, this paper highlights the ISR phases as applied to a case study on hepatitis C virus, while illustrating our proposed architecture perspective.


2020 ◽  
Vol 21 (S6) ◽  
Author(s):  
Jianqiang Li ◽  
Guanghui Fu ◽  
Yueda Chen ◽  
Pengzhi Li ◽  
Bo Liu ◽  
...  

Background: Screening of brain computerised tomography (CT) images is a primary method currently used for the initial detection of patients with brain trauma or other conditions. In recent years, deep learning techniques have shown remarkable advantages in clinical practice. Researchers have attempted to use deep learning methods to detect brain diseases from CT images. Methods often used to detect diseases choose images with visible lesions from full-slice brain CT scans, which need to be labelled by doctors. This is an inaccurate method because doctors detect brain disease from a full sequence scan of CT images and one patient may have multiple concurrent conditions in practice. The method cannot take into account the dependencies between the slices and the causal relationships among various brain diseases. Moreover, labelling images slice by slice costs much time and expense. Detecting multiple diseases from full-slice brain CT images is, therefore, an important research subject with practical implications.

Results: In this paper, we propose a model called the slice dependencies learning model (SDLM). It learns image features from a series of variable-length brain CT images and slice dependencies between different slices in a set of images to predict abnormalities. The model only requires labels for the diseases reflected in the full-slice brain scan. We use the CQ500 dataset to evaluate our proposed model, which contains 1194 full sets of CT scans from a total of 491 subjects. Each set of data from one subject contains scans with one to eight different slice thicknesses and various diseases that are captured in a range of 30 to 396 slices in a set. The evaluation results show that the precision is 67.57%, the recall is 61.04%, the F1 score is 0.6412, and the area under the receiver operating characteristic curve (AUC) is 0.8934.

Conclusion: The proposed model is a new architecture that uses a full-slice brain CT scan for multi-label classification, unlike the traditional methods, which only classify the brain images at the slice level. It has great potential for application to multi-label detection problems, especially with regard to brain CT images.
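As a reading aid for the metrics quoted above, the F1 score is the harmonic mean of precision and recall. The short sketch below shows that relationship; the function name is an assumption, and the abstract does not state how the multi-label scores were averaged, so a small gap between a value recomputed from the rounded precision and recall and the reported 0.6412 is to be expected.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0  # convention when both metrics are zero
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract, converted to fractions.
f1 = f1_from_precision_recall(0.6757, 0.6104)
print(round(f1, 3))
```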


2019 ◽  
Vol 277 ◽  
pp. 02036
Author(s):  
Yu Li ◽  
Lizhuang Liu

In this work we investigate the use of deep learning for the image quality classification problem. We use a pre-trained Convolutional Neural Network (CNN) for image description, and a Support Vector Machine (SVM) model is trained as an image quality classifier whose inputs are normalized features extracted by the CNN model. We report on different design choices, ranging from the use of various CNN architectures to the use of features extracted from different layers of a CNN model. To cope with the lack of adequate amounts of distorted picture data, we introduce a novel multi-scale training strategy, which selects a new image size for training after several batches, combined with data augmentation. The experimental results on actual monitoring video images show that the proposed model can accurately classify distorted images.
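The multi-scale training strategy described above, in which a new image size is selected after several batches, can be sketched as a simple schedule. This is an illustration under stated assumptions rather than the authors' implementation: the candidate sizes, the interval of 10 batches, the uniform random choice, and the function name are all hypothetical.

```python
import random


def multiscale_schedule(num_batches, sizes=(224, 256, 288, 320), interval=10, seed=0):
    """Yield (batch_index, image_size), drawing a new size every `interval` batches."""
    rng = random.Random(seed)  # seeded so that the schedule is reproducible
    size = rng.choice(sizes)
    for batch_idx in range(num_batches):
        # Re-draw the training resolution at the start of each new block of batches.
        if batch_idx > 0 and batch_idx % interval == 0:
            size = rng.choice(sizes)
        yield batch_idx, size


# In a training loop, each batch would be resized to `size` before the forward pass.
for batch_idx, size in multiscale_schedule(num_batches=30):
    pass  # e.g. resize the batch to (size, size), then train on it
```

Keeping the size fixed within a block of batches means every image in a batch shares one resolution, which is what allows the batch to be stacked into a single tensor.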


2015 ◽  
Vol 33 (1) ◽  
pp. 35-51 ◽  
Author(s):  
Anusha Lakmini Wijayaratne ◽  
Diljit Singh

Purpose – The purpose of this paper is to introduce a library website model. Further, the paper discusses a designer’s checklist and an evaluative instrument that were constructed based on the proposed model. Design/methodology/approach – The model was developed through a Delphi study in which two panels of experts participated. The researcher communicated with the panel members via e-mail using two Delphi instruments designed out of two item pools that were developed based on the knowledge gained from surveying the literature, visiting the selected libraries and exploring the library websites. Then, a designer’s checklist and an evaluative instrument were derived from the proposed model through a series of brainstorming sessions. Findings – The proposed model consisted of 140 items altogether (60 web content elements and 80 web design features). The designer’s checklist comprises all 140 items, and the evaluative instrument comprises 60 content elements and 57 design features. Research limitations/implications – This study has developed an academic library website model and derived two instruments based on the proposed model. Further studies are needed to customize this conceptual model, particularly its web content pillar, to meet the specific needs of different types of libraries, including public libraries, special libraries, school libraries, etc. Practical implications – The designer’s checklist and the evaluative instrument derived from the proposed model are useful tools for library professionals in designing, re-designing, maintaining and evaluating their library websites. Librarians may use these tools for both institutional and research purposes. Originality/value – The model and the two instruments proposed by this study are unique in focus, origin, content and presentation.


2011 ◽  
Vol 3 (1) ◽  
pp. 27-38
Author(s):  
Marco Campenní ◽  
Federico Cecconi

In this paper, the authors present a computational model of a fundamental social phenomenon in the study of animal behavior: foraging. The purpose of this work is, first, to test the validity of the proposed model compared to another existing model, the flocking model; then, to try to understand whether the model may provide useful suggestions in studying the size of the group in some species of social mammals.


1990 ◽  
Vol 5 (6) ◽  
pp. 547-555 ◽  
Author(s):  
D. I. Flitcroft

Accommodation is more accurate with polychromatic stimuli than with narrowband or monochromatic stimuli. The aim of this paper is to develop a computational model for how the visual system uses the extra information in polychromatic stimuli to increase the accuracy of accommodation responses. The proposed model is developed within the context of both trichromacy and also the organization of spatial and chromatic processing within the visual cortex. The refractive error present in the retinal image can be estimated by comparing image quality with and without small additional changes in refractive state. In polychromatic light, the chromatic aberration of the eye results in differences in ocular refractive power for light of different wavelengths. As a result, the refractive state of the eye can be estimated by comparing image quality in the three types of cone photoreceptor. The ability of cortical neurons to perform such comparisons on image quality with a crude form of spatial-frequency analysis is examined theoretically. It is found that spatially band-pass chromatically opponent neurons (that may correspond to double opponent neurons) can perform such calculations and that chromatic cues to accommodation are extracted most effectively by neurons responding to spatial frequencies of between 2 and 8 cycles/deg.

