An Adaptive Scheme to Achieve Fine Grained Video Scaling

Author(s):  
S Safinaz ◽  
A. V. Ravi Kumar

<p>A robust Adaptive Reconstruction Error Minimization Convolution Neural Network (<strong>ARemCNN</strong>) architecture is introduced to provide high reconstruction quality from low-resolution input using a parallel configuration. Our proposed model can be trained easily on bulky datasets such as YUV21 and Videoset4. Our experimental results show that the model outperforms many existing techniques in terms of PSNR, SSIM and reconstruction quality. On the Videoset4 dataset, the average PSNR is 39.81 for upscale-2, 35.56 for upscale-3 and 33.77 for upscale-4, which is very high in contrast to other existing techniques. Similarly, on the YUV21 dataset, the average PSNR is 38.71 for upscale-2, 34.58 for upscale-3 and 33.047 for upscale-4.</p>
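
For context, PSNR compares a reconstructed frame with its reference through the mean squared error. The sketch below uses the standard definition, PSNR = 10 log10(MAX^2 / MSE), on flat lists of pixel values. The function name `psnr` and the 8-bit peak value of 255 are assumptions made for illustration, not details taken from the paper.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio between two equal-length pixel sequences.

    `max_value` is the assumed peak pixel value (255 for 8-bit samples).
    """
    if len(reference) != len(reconstructed) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - x) ** 2 for r, x in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        # Identical signals: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    ref = [52, 55, 61, 66]
    rec = [53, 54, 62, 65]  # every pixel off by one, so MSE = 1
    print(round(psnr(ref, rec), 2))
```

A higher value means the reconstruction is closer to the reference. Video results such as those above are typically averaged over frames.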

2021 ◽  
pp. 1-11
Author(s):  
Jinglei Shi ◽  
Junjun Guo ◽  
Zhengtao Yu ◽  
Yan Xiang

Unsupervised aspect identification is a challenging task in aspect-based sentiment analysis. Traditional topic models are usually used for this task, but they are not appropriate for short texts such as product reviews. In this work, we propose an aspect identification model based on aspect vector reconstruction. A key feature of our model is that we connect sentence vectors and multi-grained aspect vectors using a fuzzy k-means membership function. Furthermore, to make full use of different aspect representations in vector space, we reconstruct sentence vectors from coarse-grained and fine-grained aspect vectors simultaneously. The resulting model can therefore learn better aspect representations. Experimental results on two datasets from different domains show that our proposed model can outperform a few baselines in terms of aspect identification and topic coherence of the extracted aspect terms.
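
As a rough illustration of the membership idea, the sketch below computes the standard fuzzy k-means (fuzzy c-means) membership of one sentence vector with respect to a set of aspect vectors, u_j = 1 / sum_k (d_j / d_k)^(2/(m-1)). The abstract does not give the exact formulation the authors use, so the function name `fuzzy_memberships`, the Euclidean distance, and the fuzzifier `m = 2` are assumptions.

```python
import math


def fuzzy_memberships(sentence_vec, aspect_vecs, m=2.0):
    """Standard fuzzy c-means membership of one vector in each cluster centre.

    `m` > 1 is the fuzzifier; larger values give softer memberships.
    """
    if m <= 1.0:
        raise ValueError("fuzzifier m must be greater than 1")
    dists = [math.dist(sentence_vec, a) for a in aspect_vecs]
    # If the vector coincides with one or more centres, split membership among them.
    zero = [d == 0.0 for d in dists]
    if any(zero):
        n = sum(zero)
        return [1.0 / n if z else 0.0 for z in zero]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_j / d_k) ** exponent for d_k in dists) for d_j in dists]


if __name__ == "__main__":
    sentence = [0.0, 0.0]
    aspects = [[1.0, 0.0], [3.0, 0.0]]  # hypothetical aspect vectors
    print(fuzzy_memberships(sentence, aspects))
```

The memberships sum to one, so they can serve as soft weights when a sentence vector is reconstructed from several aspect vectors.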


The most critical inputs for fine-grained opinion extraction are the opinion goals and opinion terms extracted from online comments. The key part of this process is identifying the connections between terms. To do this, the Word Alignment Model (WAM) was introduced, in which the word associated with an opinion goal is identified by word alignment. Nevertheless, it was less successful at extracting opinion words. To treat the identification of opinion connections as an alignment process, the Partially Supervised Word Alignment Model (PSWAM) was therefore created. A visual co-ranking algorithm was then applied together with the Opinion Relationship Map to model all the candidates and to measure the confidence of each one. Higher-confidence candidates were extracted as opinion goals or opinion terms. This method, though, involves an additional kind of relation between terms, such as topical connections in the graph. The current relationship is therefore assumed in this report in order to model the candidates and derive the feelings, views and opinions. Using this method effectively improves the efficiency of co-extracting thoughts, viewpoints and issues. The experimental results further indicate that the proposed model is more efficient than the existing paradigm.


2018 ◽  
Vol 8 (10) ◽  
pp. 1906 ◽  
Author(s):  
Zhicheng Zhao ◽  
Ze Luo ◽  
Jian Li ◽  
Kaihua Wang ◽  
Bingying Shi

The main purpose of fine-grained classification is to distinguish among many subcategories of a single basic category, such as birds or flowers. We propose a model based on a triple network and bilinear methods for fine-grained bird identification. Our proposed model can be trained in an end-to-end manner, which effectively increases the inter-class distance of the network extraction features and improves the accuracy of bird recognition. When experimentally tested on 1096 birds in a custom-built dataset and on Caltech-UCSD (a public bird dataset), the model achieved an accuracy of 88.91% and 85.58%, respectively. The experimental results confirm the high generalization ability of our model in fine-grained image classification. Moreover, our model requires no additional manual annotation information such as object-labeling frames and part-labeling points, which guarantees good versatility and robustness in fine-grained bird recognition.


2020 ◽  
Vol 34 (07) ◽  
pp. 12144-12151
Author(s):  
Guan-An Wang ◽  
Tianzhu Zhang ◽  
Yang Yang ◽  
Jian Cheng ◽  
Jianlong Chang ◽  
...  

RGB-Infrared (IR) person re-identification is very challenging due to the large cross-modality variations between RGB and IR images. The key solution is to learn aligned features that bridge the RGB and IR modalities. However, due to the lack of correspondence labels between every pair of RGB and IR images, most methods try to alleviate the variations with set-level alignment by reducing the distance between the entire RGB and IR sets. This set-level alignment may lead to misalignment of some instances, which limits the performance of RGB-IR Re-ID. Unlike existing methods, in this paper we propose to generate cross-modality paired images and perform both global set-level and fine-grained instance-level alignments. Our proposed method enjoys several merits. First, our method can perform set-level alignment by disentangling modality-specific and modality-invariant features. Compared with conventional methods, ours can explicitly remove the modality-specific features, so the modality variation can be better reduced. Second, given cross-modality unpaired images of a person, our method can generate cross-modality paired images from exchanged images. With them, we can directly perform instance-level alignment by minimizing the distances between every pair of images. Extensive experimental results on two standard benchmarks demonstrate that the proposed model performs favourably against state-of-the-art methods. In particular, on the SYSU-MM01 dataset, our model achieves gains of 9.2% and 7.7% in terms of Rank-1 and mAP, respectively. Code is available at https://github.com/wangguanan/JSIA-ReID.


Author(s):  
Le Wu ◽  
Lei Chen ◽  
Yonghui Yang ◽  
Richang Hong ◽  
Yong Ge ◽  
...  

When recommending or advertising items to users, an emerging trend is to present each multimedia item with a key frame image (e.g., the poster of a movie). As each multimedia item can be represented as multiple fine-grained visual images (e.g., related images of the movie), personalized key frame recommendation is necessary in these applications to appeal to users' unique visual preferences. However, previous personalized key frame recommendation models relied on users' fine-grained image behavior on multimedia items (e.g., user-image interaction behavior), which is often not available in real scenarios. In this paper, we study the general problem of joint multimedia item and key frame recommendation in the absence of fine-grained user-image behavior. We argue that the key challenge of this problem lies in discovering users' visual profiles for key frame recommendation, as most recommendation models would fail without any fine-grained image behavior from users. To tackle this challenge, we leverage users' item behavior by projecting users (items) into two latent spaces: a collaborative latent space and a visual latent space. We further design a model to discern both the collaborative and visual dimensions of users, and model how users make decisive item preferences from these two spaces. As a result, the learned user visual profiles can be directly applied to key frame recommendation. Finally, experimental results on a real-world dataset clearly show the effectiveness of our proposed model on the two recommendation tasks.


2020 ◽  
Vol 2020 (14) ◽  
pp. 305-1-305-6
Author(s):  
Tianyu Li ◽  
Camilo G. Aguilar ◽  
Ronald F. Agyei ◽  
Imad A. Hanhan ◽  
Michael D. Sangid ◽  
...  

In this paper, we extend our previous 2D connected-tube marked point process (MPP) model to a 3D connected-tube MPP model for fiber detection. In the 3D case, a tube is represented by a cylinder model with two spherical areas at its ends. The spherical area is used to define connection priors that encourage connection of tubes that belong to the same fiber. Since each long fiber can be fitted by a series of connected short tubes, the proposed model is capable of detecting curved long tubes. We present experimental results on fiber-reinforced composite material images to show the performance of our method.


2021 ◽  
Vol 11 (5) ◽  
pp. 2083
Author(s):  
Jia Xie ◽  
Zhu Wang ◽  
Zhiwen Yu ◽  
Bin Guo ◽  
Xingshe Zhou

Ischemic stroke is one of the typical chronic diseases caused by the degeneration of the neural system, which usually causes great harm to human beings and reduces quality of life significantly. It is therefore crucial to extract useful predictors from physiological signals, and to diagnose or predict ischemic stroke when there are no apparent symptoms. Specifically, in this study, we put forward a novel prediction method that explores sleep-related features. First, to characterize the pattern of ischemic stroke accurately, we extract a set of effective features from several aspects, including clinical features, fine-grained sleep structure-related features and electroencephalogram-related features. Second, a two-step prediction model is designed, which combines commonly used classifiers with a data filter model to optimize the prediction result. We evaluate the framework using a real polysomnogram dataset that contains 20 stroke patients and 159 healthy individuals. Experimental results demonstrate that the proposed model can predict stroke events effectively, and the Precision, Recall, Precision Recall Curve and Area Under the Curve are 63%, 85%, 0.773 and 0.919, respectively.
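
For readers less familiar with the reported metrics, precision and recall follow directly from confusion-matrix counts, as the minimal sketch below shows. The function name `precision_recall` and the counts in the usage example are hypothetical, chosen only for illustration, and are not taken from the paper.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Return (precision, recall) from confusion-matrix counts.

    precision = TP / (TP + FP): share of predicted positives that are correct.
    recall    = TP / (TP + FN): share of actual positives that are found.
    """
    predicted_pos = true_pos + false_pos
    actual_pos = true_pos + false_neg
    # Guard against division by zero when a class is empty.
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    recall = true_pos / actual_pos if actual_pos else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical counts, not from the study.
    p, r = precision_recall(true_pos=8, false_pos=2, false_neg=4)
    print(f"precision={p:.3f} recall={r:.3f}")
```

On an imbalanced dataset such as one with far more healthy individuals than patients, reporting both values is more informative than accuracy alone.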


Author(s):  
Peilian Zhao ◽  
Cunli Mao ◽  
Zhengtao Yu

Aspect-Based Sentiment Analysis (ABSA), a fine-grained opinion mining task that aims to extract the sentiment toward a specific target from text, is important in many real-world applications, especially in the legal field. In this paper, we study two problems of End-to-End Aspect-Based Sentiment Analysis (E2E-ABSA) in the legal field: the limited availability of labeled training data and the neglect of in-domain knowledge representation. We propose a new deep learning method, named Semi-ETEKGs, which applies an E2E framework using knowledge graph (KG) embedding in the legal field after data augmentation (DA). Specifically, we pre-train the BERT embedding and the in-domain KG embedding on unlabeled data and on labeled data with case elements after DA, and then feed the two embeddings into the E2E framework to classify the polarity of the target entity. Finally, we build a case-related dataset based on a popular ABSA benchmark to demonstrate the efficiency of Semi-ETEKGs. Experiments on the case-related dataset from microblog comments show that our proposed model significantly outperforms the other compared methods.


2018 ◽  
Vol 15 (5) ◽  
pp. 593-625 ◽  
Author(s):  
Chi-Hé Elder ◽  
Michael Haugh

Dominant accounts of “speaker meaning” in post-Gricean contextualist pragmatics tend to focus on single utterances, making the theoretical assumption that the object of pragmatic analysis is restricted to cases where speakers and hearers agree on utterance meanings, leaving instances of misunderstandings out of their scope. However, we know that divergences in understandings between interlocutors do often arise, and that when they do, speakers can engage in a local process of meaning negotiation. In this paper, we take insights from interactional pragmatics to offer an empirically informed view on speaker meaning that incorporates both speakers’ and hearers’ perspectives, alongside a formalization of how to model speaker meanings in such a way that we can account for both understandings – the canonical cases – and misunderstandings, but critically, also the process of interactionally negotiating meanings between interlocutors. We highlight that utterance-level theories of meaning provide only a partial representation of speaker meaning as it is understood in interaction, and show that inferences about a given utterance at any given time are formally connected to prior and future inferences of participants. Our proposed model thus provides a more fine-grained account of how speakers converge on speaker meanings in real time, showing how such meanings are often subject to a joint endeavor of complex inferential work.

