scholarly journals A Novel Feature-based Bayesian Model for Query Focused Multi-document Summarization

Author(s):  
Jiwei Li ◽  
Sujian Li

Supervised learning methods and LDA based topic model have been successfully applied in the field of multi-document summarization. In this paper, we propose a novel supervised approach that can incorporate rich sentence features into Bayesian topic models in a principled way, thus taking advantages of both topic model and feature based supervised learning methods. Experimental results on DUC2007, TAC2008 and TAC2009 demonstrate the effectiveness of our approach.

Symmetry ◽  
2019 ◽  
Vol 11 (12) ◽  
pp. 1486
Author(s):  
Zhinan Gou ◽  
Zheng Huo ◽  
Yuanzhen Liu ◽  
Yi Yang

Supervised topic modeling has been successfully applied in the fields of document classification and tag recommendation in recent years. However, most existing models neglect the fact that topic terms have the ability to distinguish topics. In this paper, we propose a term frequency-inverse topic frequency (TF-ITF) method for constructing a supervised topic model, in which the weight of each topic term indicates the ability to distinguish topics. We conduct a series of experiments with not only the symmetric Dirichlet prior parameters but also the asymmetric Dirichlet prior parameters. Experimental results demonstrate that the result of introducing TF-ITF into a supervised topic model outperforms several state-of-the-art supervised topic models.


2021 ◽  
Author(s):  
Yue Niu ◽  
Hongjie Zhang

With the growth of the internet, short texts such as tweets from Twitter, news titles from the RSS, or comments from Amazon have become very prevalent. Many tasks need to retrieve information hidden from the content of short texts. So ontology learning methods are proposed for retrieving structured information. Topic hierarchy is a typical ontology that consists of concepts and taxonomy relations between concepts. Current hierarchical topic models are not specially designed for short texts. These methods use word co-occurrence to construct concepts and general-special word relations to construct taxonomy topics. But in short texts, word cooccurrence is sparse and lacking general-special word relations. To overcome this two problems and provide an interpretable result, we designed a hierarchical topic model which aggregates short texts into long documents and constructing topics and relations. Because long documents add additional semantic information, our model can avoid the sparsity of word cooccurrence. In experiments, we measured the quality of concepts by topic coherence metric on four real-world short texts corpus. The result showed that our topic hierarchy is more interpretable than other methods.


Sensors ◽  
2021 ◽  
Vol 21 (5) ◽  
pp. 1894
Author(s):  
Jiangzhong Cao ◽  
Yunfei Huang ◽  
Qingyun Dai ◽  
Wing-Kuen Ling

Aiming at the high cost of data labeling and ignoring the internal relevance of features in existing trademark retrieval methods, this paper proposes an unsupervised trademark retrieval method based on attention mechanism. In the proposed method, the instance discrimination framework is adopted and a lightweight attention mechanism is introduced to allocate a more reasonable learning weight to key features. With an unsupervised way, this proposed method can obtain good feature representation of trademarks and improve the performance of trademark retrieval. Extensive comparative experiments on the METU trademark dataset are conducted. The experimental results show that the proposed method is significantly better than traditional trademark retrieval methods and most existing supervised learning methods. The proposed method obtained a smaller value of NAR (Normalized Average Rank) at 0.051, which verifies the effectiveness of the proposed method in trademark retrieval.


Author(s):  
Wenbo Hu ◽  
Jun Zhu ◽  
Hang Su ◽  
Jingwei Zhuo ◽  
Bo Zhang

Supervised topic models leverage label information to learn discriminative latent topic representations. As collecting a fully labeled dataset is often time-consuming, semi-supervised learning is of high interest. In this paper, we present an effective semi-supervised max-margin topic model by naturally introducing manifold posterior regularization to a regularized Bayesian topic model, named LapMedLDA. The model jointly learns latent topics and a related classifier with only a small fraction of labeled documents. To perform the approximate inference, we derive an efficient stochastic gradient MCMC method. Unlike the previous semi-supervised topic models, our model adopts a tight coupling between the generative topic model and the discriminative classifier. Extensive experiments demonstrate that such tight coupling brings significant benefits in quantitative and qualitative performance.


2021 ◽  
Vol 7 (3) ◽  
pp. 51
Author(s):  
Emanuela Paladini ◽  
Edoardo Vantaggiato ◽  
Fares Bougourzi ◽  
Cosimo Distante ◽  
Abdenour Hadid ◽  
...  

In recent years, automatic tissue phenotyping has attracted increasing interest in the Digital Pathology (DP) field. For Colorectal Cancer (CRC), tissue phenotyping can diagnose the cancer and differentiate between different cancer grades. The development of Whole Slide Images (WSIs) has provided the required data for creating automatic tissue phenotyping systems. In this paper, we study different hand-crafted feature-based and deep learning methods using two popular multi-classes CRC-tissue-type databases: Kather-CRC-2016 and CRC-TP. For the hand-crafted features, we use two texture descriptors (LPQ and BSIF) and their combination. In addition, two classifiers are used (SVM and NN) to classify the texture features into distinct CRC tissue types. For the deep learning methods, we evaluate four Convolutional Neural Network (CNN) architectures (ResNet-101, ResNeXt-50, Inception-v3, and DenseNet-161). Moreover, we propose two Ensemble CNN approaches: Mean-Ensemble-CNN and NN-Ensemble-CNN. The experimental results show that the proposed approaches outperformed the hand-crafted feature-based methods, CNN architectures and the state-of-the-art methods in both databases.


2021 ◽  
Vol 38 (1) ◽  
pp. 31-41
Author(s):  
Florent Chiaroni ◽  
Mohamed-Cherif Rahal ◽  
Nicolas Hueber ◽  
Frederic Dufaux

Sign in / Sign up

Export Citation Format

Share Document