BATS: A Spectral Biclustering Approach to Single Document Topic Modeling and Segmentation

Existing topic modeling and text segmentation methodologies generally require large datasets for training, limiting their capabilities when only small collections of text are available. In this work, we reexamine the inter-related problems of “topic identification” and “text segmentation” for sparse document learning, when there is a single new text of interest. In developing a methodology to handle single documents, we face two major challenges. First is sparse information: with access to only one document, we cannot train traditional topic models or deep learning algorithms. Second is significant noise: a considerable portion of words in any single document will produce only noise and not help discern topics or segments. To tackle these issues, we design an unsupervised, computationally efficient methodology called Biclustering Approach to Topic modeling and Segmentation (BATS). BATS leverages three key ideas to simultaneously identify topics and segment text: (i) a new mechanism that uses word order information to reduce sample complexity, (ii) a statistically sound graph-based biclustering technique that identifies latent structures of words and sentences, and (iii) a collection of effective heuristics that remove noise words and award important words to further improve performance. Experiments on six datasets show that our approach outperforms several state-of-the-art baselines when considering topic coherence, topic diversity, segmentation, and runtime comparison metrics.
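
The abstract does not spell out the biclustering step, so the following is only a minimal sketch of classic bipartite spectral co-clustering on a sentence-by-word count matrix, not the BATS algorithm itself. The function name `spectral_bipartition`, the toy matrix `counts`, and the two-cluster sign split are all assumptions made for illustration.

```python
import math

def spectral_bipartition(A, iters=200):
    """Split rows (sentences) and columns (words) of a nonnegative
    count matrix into two co-clusters. Every row and column of A
    must have a positive sum."""
    m, n = len(A), len(A[0])
    row_sum = [sum(r) for r in A]
    col_sum = [sum(A[i][j] for i in range(m)) for j in range(n)]
    # Degree-normalised matrix An = D1^{-1/2} A D2^{-1/2}.
    An = [[A[i][j] / math.sqrt(row_sum[i] * col_sum[j]) for j in range(n)]
          for i in range(m)]
    # The leading right singular vector of An is sqrt(col_sum), normalised;
    # it carries no cluster information, so it is projected out below.
    scale = math.sqrt(sum(col_sum))
    top = [math.sqrt(c) / scale for c in col_sum]
    v = [float(j + 1) for j in range(n)]  # arbitrary deterministic start
    for _ in range(iters):
        dot = sum(a * b for a, b in zip(v, top))
        v = [a - dot * b for a, b in zip(v, top)]
        # One power-iteration step on An^T An.
        u = [sum(An[i][j] * v[j] for j in range(n)) for i in range(m)]
        v = [sum(An[i][j] * u[i] for i in range(m)) for j in range(n)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    u = [sum(An[i][j] * v[j] for j in range(n)) for i in range(m)]
    # Cluster labels come from the sign of the second singular vectors.
    rows = [int(u[i] > 0) for i in range(m)]
    cols = [int(v[j] > 0) for j in range(n)]
    return rows, cols

# Toy document: sentences 0-1 mostly use words 0-2, sentences 2-3 use words 3-4.
counts = [
    [2, 1, 1, 0, 0],
    [1, 2, 1, 1, 0],
    [0, 0, 0, 2, 1],
    [0, 0, 0, 1, 2],
]
rows, cols = spectral_bipartition(counts)
```

In this toy case the row labels group consecutive sentences, which is how a joint word and sentence clustering can double as a segmentation; a real single-document setting would also need the noise-word handling the abstract mentions.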

Enhanced context-aware recommendation using topic modeling and particle swarm optimization

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-210331 ◽

2021 ◽

pp. 1-16

Author(s):

Ibtissem Gasmi ◽

Mohamed Walid Azizi ◽

Hassina Seridi-Bouchelaghem ◽

Nabiha Azizi ◽

Samir Brahim Belhaouari

Keyword(s):

Topic Modeling ◽

Latent Dirichlet Allocation ◽

State Of The Art ◽

Weighting Function ◽

Contextual Factors ◽

Pearson Correlation ◽

Correlation Coefficients ◽

Pso Algorithm ◽

Context Aware ◽

Proposed Model

A Context-Aware Recommender System (CARS) suggests more relevant services by adapting them to the user’s specific context situation. Nevertheless, the use of many contextual factors can increase data sparsity, while too few context parameters fail to introduce the contextual effects in recommendations. Moreover, several CARSs are based on similarity algorithms, such as cosine and Pearson correlation coefficients. These methods are not very effective on sparse datasets. This paper presents a context-aware model to integrate contextual factors into the prediction process when there are insufficient co-rated items. The proposed algorithm uses Latent Dirichlet Allocation (LDA) to learn the latent interests of users from the textual descriptions of items. Then, it integrates both the explicit contextual factors and their degree of importance in the prediction process by introducing a weighting function. The particle swarm optimization (PSO) algorithm is employed to learn and optimize the weights of these features. The results on the MovieLens 1M dataset show that the proposed model can achieve an F-measure of 45.51% with a precision of 68.64%. Furthermore, the enhancement in MAE and RMSE can respectively reach 41.63% and 39.69% compared with the state-of-the-art techniques.
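
To illustrate why correlation-based similarity struggles when there are few co-rated items, here is a small sketch of Pearson correlation restricted to the items two users have both rated. The function name `pearson_on_corated` and the toy rating dictionaries are hypothetical and are not taken from the paper.

```python
import math

def pearson_on_corated(ratings_a, ratings_b):
    """Pearson correlation over items rated by both users.

    ratings_a, ratings_b: dicts mapping item id -> rating.
    Returns None when the value is undefined: fewer than two
    co-rated items, or zero variance on the overlap.
    """
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < 2:
        return None
    xs = [ratings_a[i] for i in common]
    ys = [ratings_b[i] for i in common]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / math.sqrt(var_x * var_y)

alice = {"m1": 5, "m2": 3, "m3": 4}
bob = {"m1": 4, "m2": 2, "m3": 3, "m4": 5}
carol = {"m4": 1, "m5": 5}

sim_alice_bob = pearson_on_corated(alice, bob)      # three co-rated items
sim_alice_carol = pearson_on_corated(alice, carol)  # no overlap -> None
sim_bob_carol = pearson_on_corated(bob, carol)      # one co-rated item -> None
```

With sparse rating data most user pairs fall into the `None` cases, which is the situation the abstract describes as having insufficient co-rated items.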

Regularized Training and Tight Certification for Randomized Smoothed Classifier with Provable Robustness

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.5798 ◽

2020 ◽

Vol 34 (04) ◽

pp. 3858-3865

Author(s):

Huijie Feng ◽

Chunpeng Wu ◽

Guoyang Chen ◽

Weifeng Zhang ◽

Yang Ning

Keyword(s):

Neural Network ◽

High Probability ◽

Deep Neural Network ◽

State Of The Art ◽

Computationally Efficient ◽

Base Classifier ◽

Training Scheme ◽

Adversarial Training ◽

Gaussian Perturbation ◽

Probabilistic Robustness

Recently, smoothing deep neural network based classifiers via isotropic Gaussian perturbation has been shown to be an effective and scalable way to provide state-of-the-art probabilistic robustness guarantees against ℓ2 norm bounded adversarial perturbations. However, how to train a good base classifier that is accurate and robust when smoothed has not been fully investigated. In this work, we derive a new regularized risk, in which the regularizer can adaptively encourage the accuracy and robustness of the smoothed counterpart when training the base classifier. It is computationally efficient and can be implemented in parallel with other empirical defense methods. We discuss how to implement it under both standard (non-adversarial) and adversarial training schemes. At the same time, we also design a new certification algorithm, which can leverage the regularization effect to provide a tighter robustness lower bound that holds with high probability. Our extensive experimentation demonstrates the effectiveness of the proposed training and certification approaches on CIFAR-10 and ImageNet datasets.
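
For context, the widely used baseline certificate for Gaussian randomized smoothing gives an ℓ2 radius of (σ/2)(Φ⁻¹(p_A) − Φ⁻¹(p_B)), where p_A is a lower bound on the top-class probability under noise and p_B an upper bound on the runner-up. The sketch below computes that baseline radius only; it does not reproduce the tighter certification algorithm proposed in this paper, and the function name `certified_l2_radius` is an assumption.

```python
from statistics import NormalDist

def certified_l2_radius(sigma, p_a_lower, p_b_upper=None):
    """Baseline randomized-smoothing radius:
    R = sigma / 2 * (Phi^-1(p_a_lower) - Phi^-1(p_b_upper)).

    If p_b_upper is omitted, the common bound p_b <= 1 - p_a is used.
    Probabilities must lie strictly between 0 and 1.
    Returns 0.0 when the top class is not certifiably ahead.
    """
    if p_b_upper is None:
        p_b_upper = 1.0 - p_a_lower
    if p_a_lower <= p_b_upper:
        return 0.0  # no certificate; a real system would abstain
    inv_cdf = NormalDist().inv_cdf
    return 0.5 * sigma * (inv_cdf(p_a_lower) - inv_cdf(p_b_upper))

# With noise level 0.25 and a top-class lower bound of Phi(1),
# the two inverse-CDF terms are +1 and -1, so the radius equals sigma.
p_a = NormalDist().cdf(1.0)
radius = certified_l2_radius(0.25, p_a)
```

The radius grows with the gap between the two probability bounds, which is why a training scheme that makes the smoothed classifier more confident can translate into larger certified radii.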

DiagnosisQA: A semi-automated pipeline for developing clinician validated diagnosis specific QA datasets.

10.1101/2021.11.10.21266184 ◽

2021 ◽

Author(s):

Shreya Mishra ◽

Raghav Awasthi ◽

Frank Papay ◽

Kamal Maheshawari ◽

Jacek B Cywinski ◽

...

Keyword(s):

Question Answering ◽

State Of The Art ◽

Healthcare Providers ◽

Unstructured Data ◽

Significant Progress ◽

Improve Performance ◽

Healthcare Data ◽

Research Areas ◽

Benchmark Datasets ◽

Automated Pipeline

Question answering (QA) is one of the oldest research areas of AI and Computational Linguistics. QA has seen significant progress with the development of state-of-the-art models and benchmark datasets over the last few years. However, pre-trained QA models perform poorly for clinical QA tasks, presumably due to the complexity of electronic healthcare data. With the digitization of healthcare data and the increasing volume of unstructured data, it is extremely important for healthcare providers to have a mechanism to query the data to find appropriate answers. Since diagnosis is central to any decision-making for the clinicians and patients, we have created a pipeline to develop diagnosis-specific QA datasets and curated a QA database for the Cerebrovascular Accident (CVA). CVA, also commonly known as Stroke, is an important and commonly occurring diagnosis amongst critically ill patients. Our method, when compared to clinician validation, achieved an accuracy of 0.90 (with 90% CI [0.82, 0.99]). Using our method, we hope to overcome the key challenges of building and validating a highly accurate QA dataset in a semi-automated manner, which can help improve the performance of QA models.

Complete Bottom-Up Predicate Invention in Meta-Interpretive Learning

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/320 ◽

2020 ◽

Author(s):

Céline Hocquette ◽

Stephen H. Muggleton

Keyword(s):

State Of The Art ◽

Order Logic ◽

Learning Performance ◽

Sample Complexity ◽

Logic Programs ◽

Top Down ◽

Bottom Up ◽

Predicate Invention ◽

Feature Discovery ◽

Second Order Logic

Predicate Invention in Meta-Interpretive Learning (MIL) is generally based on a top-down approach, and the search for a consistent hypothesis is carried out starting from the positive examples as goals. We consider augmenting top-down MIL systems with a bottom-up step during which the background knowledge is generalised with an extension of the immediate consequence operator for second-order logic programs. This new method provides a way to perform extensive predicate invention useful for feature discovery. We demonstrate this method is complete with respect to a fragment of dyadic datalog. We theoretically prove this method reduces the number of clauses to be learned for the top-down learner, which in turn can reduce the sample complexity. We formalise an equivalence relation for predicates which is used to eliminate redundant predicates. Our experimental results suggest pairing the state-of-the-art MIL system Metagol with an initial bottom-up step can significantly improve learning performance.
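
As background for the bottom-up step, the following sketch applies an ordinary first-order immediate consequence operator to dyadic chain rules of the form head(X, Y) :- body1(X, Z), body2(Z, Y) and iterates it to a fixpoint. It is a simplified illustration only: the second-order extension and the predicate invention described in the paper are not reproduced, and the function names and family-tree facts are hypothetical.

```python
def immediate_consequence(facts, rules):
    """One bottom-up step: keep the current facts and add every head
    derivable from them by a chain rule (head, body1, body2)."""
    derived = set(facts)
    for head, body1, body2 in rules:
        for (p, x, z) in facts:
            if p != body1:
                continue
            for (q, z2, y) in facts:
                if q == body2 and z2 == z:
                    derived.add((head, x, y))
    return derived

def least_fixpoint(facts, rules):
    """Apply the operator until no new facts appear."""
    current = set(facts)
    while True:
        nxt = immediate_consequence(current, rules)
        if nxt == current:
            return current
        current = nxt

background = {
    ("parent", "ann", "bob"),
    ("parent", "bob", "cy"),
    ("parent", "cy", "dee"),
}
rules = [
    ("grandparent", "parent", "parent"),
    ("greatgrandparent", "grandparent", "parent"),
]
closure = least_fixpoint(background, rules)
```

The first step derives the two grandparent facts and the second step derives the great-grandparent fact, showing how bottom-up saturation enriches the background knowledge before a top-down search begins.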

Two-Level Transformer and Auxiliary Coherence Modeling for Improved Text Segmentation

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6284 ◽

2020 ◽

Vol 34 (05) ◽

pp. 7797-7804

Author(s):

Goran Glavaš ◽

Swapna Somasundaran

Keyword(s):

State Of The Art ◽

Language Transfer ◽

Text Segmentation ◽

Word Embeddings ◽

Neural Architecture ◽

Text Coherence ◽

Sentence Level ◽

Proposed Model ◽

Benchmark Datasets ◽

Cross Lingual

Breaking down the structure of long texts into semantically coherent segments makes the texts more readable and supports downstream applications like summarization and retrieval. Starting from an apparent link between text coherence and segmentation, we introduce a novel supervised model for text segmentation with simple but explicit coherence modeling. Our model – a neural architecture consisting of two hierarchically connected Transformer networks – is a multi-task learning model that couples the sentence-level segmentation objective with the coherence objective that differentiates correct sequences of sentences from corrupt ones. The proposed model, dubbed Coherence-Aware Text Segmentation (CATS), yields state-of-the-art segmentation performance on a collection of benchmark datasets. Furthermore, by coupling CATS with cross-lingual word embeddings, we demonstrate its effectiveness in zero-shot language transfer: it can successfully segment texts in languages unseen in training.

Insertion-based Decoding with Automatically Inferred Generation Order

Transactions of the Association for Computational Linguistics ◽

10.1162/tacl_a_00292 ◽

2019 ◽

Vol 7 ◽

pp. 661-676 ◽

Cited By ~ 3

Author(s):

Jiatao Gu ◽

Qi Liu ◽

Kyunghyun Cho

Keyword(s):

Machine Translation ◽

Real World ◽

Word Order ◽

Code Generation ◽

State Of The Art ◽

Generation Model ◽

Beam Search ◽

Input Information ◽

Sequence Generation ◽

Image Caption

Conventional neural autoregressive decoding commonly assumes a fixed left-to-right generation order, which may be sub-optimal. In this work, we propose a novel decoding algorithm, InDIGO, which supports flexible sequence generation in arbitrary orders through insertion operations. We extend Transformer, a state-of-the-art sequence generation model, to efficiently implement the proposed approach, enabling it to be trained with either a pre-defined generation order or adaptive orders obtained from beam search. Experiments on four real-world tasks, including word order recovery, machine translation, image captioning, and code generation, demonstrate that our algorithm can generate sequences following arbitrary orders, while achieving competitive or even better performance compared with the conventional left-to-right generation. The generated sequences show that InDIGO adopts adaptive generation orders based on input information.
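
The idea of building a sequence through insertions can be sketched in a few lines. The example below replays a list of (token, slot) insertion steps with plain list operations; it is a toy illustration with a hypothetical `apply_insertions` helper and absolute slot indices, and it does not model how InDIGO actually predicts positions.

```python
def apply_insertions(steps):
    """Build a sequence from (token, slot) pairs, where slot is the
    index at which the token is inserted into the partial sequence."""
    sequence = []
    for token, slot in steps:
        if not 0 <= slot <= len(sequence):
            raise ValueError(f"slot {slot} is outside the partial sequence")
        sequence.insert(slot, token)
    return sequence

# The same sentence produced in two different generation orders.
left_to_right = [("I", 0), ("have", 1), ("a", 2), ("dream", 3)]
arbitrary = [("dream", 0), ("I", 0), ("have", 1), ("a", 2)]

sentence_a = apply_insertions(left_to_right)
sentence_b = apply_insertions(arbitrary)
```

Left-to-right decoding is the special case in which every token is inserted at the end, so an insertion-based decoder strictly generalises it.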

CharTeC-Net: An Efficient and Lightweight Character-Based Convolutional Network for Text Classification

Journal of Electrical and Computer Engineering ◽

10.1155/2020/9701427 ◽

2020 ◽

Vol 2020 ◽

pp. 1-7 ◽

Cited By ~ 2

Author(s):

Aboubakar Nasser Samatin Njikam ◽

Huan Zhao

Keyword(s):

Text Classification ◽

Building Block ◽

Large Scale ◽

State Of The Art ◽

Building Blocks ◽

Training Data ◽

Superior Performance ◽

Classification Problems ◽

Computationally Efficient ◽

Convolutional Network

This paper introduces an extremely lightweight (roughly two hundred thousand parameters) and computationally efficient CNN architecture, named CharTeC-Net (Character-based Text Classification Network), for character-based text classification problems. This new architecture is composed of four building blocks for feature extraction. Each of these building blocks, except the last one, uses 1 × 1 pointwise convolutional layers to add more nonlinearity to the network and to increase the dimensions within each building block. In addition, shortcut connections are used in each building block to facilitate the flow of gradients over the network, but more importantly to ensure that the original signal present in the training data is shared across each building block. Experiments on eight standard large-scale text classification and sentiment analysis datasets demonstrate that CharTeC-Net outperforms baseline methods and yields competitive accuracy compared with state-of-the-art methods, although CharTeC-Net has only between 181,427 and 225,323 parameters and weighs less than 1 megabyte.
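
A quick parameter count shows why 1 × 1 pointwise convolutions help keep a network small. The sketch below uses the standard formula for a 1-D convolution with bias, kernel_size × in_channels × out_channels + out_channels; the channel widths are hypothetical and are not CharTeC-Net's actual layer sizes.

```python
def conv1d_params(in_channels, out_channels, kernel_size, bias=True):
    """Number of trainable parameters in a standard 1-D convolution."""
    weights = kernel_size * in_channels * out_channels
    return weights + (out_channels if bias else 0)

# Hypothetical widths: expanding 64 channels to 128.
pointwise = conv1d_params(64, 128, kernel_size=1)  # 1 x 1 convolution
wide = conv1d_params(64, 128, kernel_size=3)       # kernel of width 3
```

For the same channel expansion the pointwise layer needs about a third of the weights of a width-3 kernel, so it is a cheap way to add nonlinearity and change dimensionality.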

Deep Image Demosaicing for Submicron Image Sensors

Journal of Imaging Science and Technology ◽

10.2352/j.imagingsci.technol.2019.63.6.060410 ◽

2019 ◽

Vol 63 (6) ◽

pp. 60410-1-60410-12

Author(s):

Irina Kim ◽

Seongwook Song ◽

Soonkeun Chang ◽

Sukhwan Lim ◽

Kai Guo

Keyword(s):

Image Quality ◽

Mobile Devices ◽

State Of The Art ◽

Image Sensor ◽

Sensor Technology ◽

Color Filter ◽

Computationally Efficient ◽

Linear Feature ◽

Visual Artifacts ◽

Very High

The latest trend in image sensor technology, which allows submicron pixel sizes for high-end mobile devices, comes with very high image resolutions and an irregularly sampled Quad Bayer color filter array (CFA). Sustaining image quality becomes a challenge for the image signal processor (ISP), particularly for demosaicing. Inspired by the success of deep learning approaches to standard Bayer demosaicing, we investigate how the artifact-prone Quad Bayer array can benefit from them. We found that deeper networks can improve image quality and reduce artifacts; however, they can hardly be deployed on mobile devices given very high image resolutions: 24MP, 36MP, 48MP. In this article, we propose an efficient end-to-end solution to bridge this gap: a duplex pyramid network (DPN). A deep hierarchical structure, residual learning, and linear feature map depth growth allow a very large receptive field, yielding better detail restoration and artifact reduction while staying computationally efficient. Experiments show that the proposed network outperforms the state of the art for standard and Quad Bayer demosaicing. For the challenging Quad Bayer CFA, the proposed method reduces visual artifacts better than state-of-the-art deep networks, including artifacts existing in conventional commercial solutions. While superior in image quality, it is 2 to 25 times faster than state-of-the-art deep neural networks and therefore feasible for deployment on mobile devices, paving the way for a new era of on-device deep ISPs.
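
The difference between the two filter layouts is easy to see in code. This sketch assumes an RGGB ordering (other orderings exist) and treats Quad Bayer as the same pattern with each colour expanded to a 2 × 2 block of pixels; the helper names are hypothetical.

```python
# Assumed RGGB base tile; real sensors may use a different ordering.
BAYER_TILE = (("R", "G"), ("G", "B"))

def bayer_color(row, col):
    """Colour filter over pixel (row, col) in a standard Bayer CFA."""
    return BAYER_TILE[row % 2][col % 2]

def quad_bayer_color(row, col):
    """Colour filter in a Quad Bayer CFA: each Bayer cell covers 2 x 2 pixels."""
    return BAYER_TILE[(row // 2) % 2][(col // 2) % 2]

def render(color_fn, size=4):
    """Return the top-left size x size corner of the pattern as strings."""
    return ["".join(color_fn(r, c) for c in range(size)) for r in range(size)]

bayer_pattern = render(bayer_color)
quad_pattern = render(quad_bayer_color)
```

In the Quad Bayer layout same-coloured samples are clustered, so the nearest sample of a missing colour can be up to two pixels away, which is one reason demosaicing it is more artifact-prone than the standard pattern.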

ERLP: Ensembles of Reinforcement Learning Policies (Student Abstract)

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i10.7225 ◽

2020 ◽

Vol 34 (10) ◽

pp. 13905-13906

Author(s):

Rohan Saphal ◽

Balaraman Ravindran ◽

Dheevatsa Mudigere ◽

Sasikanth Avancha ◽

Bharat Kaul

Keyword(s):

Reinforcement Learning ◽

State Of The Art ◽

Multiple Models ◽

Model Parameters ◽

Continuous Control ◽

Sample Complexity ◽

Local Minima ◽

Single Model ◽

Learning Policies ◽

Reinforcement Learning Models

Reinforcement learning algorithms are sensitive to hyper-parameters and require tuning and tweaking for specific environments to improve performance. Ensembles of reinforcement learning models, on the other hand, are known to be much more robust and stable. However, training multiple models independently on an environment suffers from high sample complexity. We present here a methodology to create multiple models from a single training instance that can be used in an ensemble through directed perturbation of the model parameters at regular intervals. This allows training a single model that converges to several local minima during the optimization process as a result of the perturbation. By saving the model parameters at each such instance, we obtain multiple policies during training that are ensembled during evaluation. We evaluate our approach on challenging discrete and continuous control tasks and also discuss various ensembling strategies. Our framework is substantially sample efficient, computationally inexpensive, and is seen to outperform state-of-the-art (SOTA) approaches.
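
The abstract mentions several ensembling strategies without naming them. As one plausible example for discrete actions, the sketch below combines saved policy snapshots by majority vote; the threshold policies, the `make_policy` helper, and the snapshot values are toy assumptions rather than the authors' setup.

```python
from collections import Counter

def make_policy(threshold):
    """Toy policy from a single saved parameter: action 1 if the
    scalar state exceeds the threshold, otherwise action 0."""
    return lambda state: 1 if state > threshold else 0

def majority_vote(policies, state):
    """Pick the action chosen by most policies; ties go to the
    action that was voted for first."""
    votes = [policy(state) for policy in policies]
    return Counter(votes).most_common(1)[0][0]

# Hypothetical parameter snapshots saved during one training run.
snapshots = [-0.5, 0.1, 0.4]
ensemble = [make_policy(t) for t in snapshots]

action_high = majority_vote(ensemble, 0.2)   # votes: 1, 1, 0
action_mid = majority_vote(ensemble, 0.0)    # votes: 1, 0, 0
action_low = majority_vote(ensemble, -1.0)   # votes: 0, 0, 0
```

For continuous control, averaging the actions proposed by the snapshots would be the analogous choice.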

A Method for Constructing Supervised Topic Model Based on Term Frequency-Inverse Topic Frequency

Symmetry ◽

10.3390/sym11121486 ◽

2019 ◽

Vol 11 (12) ◽

pp. 1486

Author(s):

Zhinan Gou ◽

Zheng Huo ◽

Yuanzhen Liu ◽

Yi Yang

Keyword(s):

Topic Modeling ◽

Topic Model ◽

State Of The Art ◽

Topic Models ◽

Document Classification ◽

Experimental Results ◽

Tag Recommendation ◽

Term Frequency ◽

Series Of Experiments ◽

Dirichlet Prior

Supervised topic modeling has been successfully applied in the fields of document classification and tag recommendation in recent years. However, most existing models neglect the fact that topic terms have the ability to distinguish topics. In this paper, we propose a term frequency-inverse topic frequency (TF-ITF) method for constructing a supervised topic model, in which the weight of each topic term indicates the ability to distinguish topics. We conduct a series of experiments with both symmetric and asymmetric Dirichlet prior parameters. Experimental results demonstrate that a supervised topic model with TF-ITF outperforms several state-of-the-art supervised topic models.
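
The abstract does not give the TF-ITF formula. By analogy with TF-IDF, one plausible reading is to multiply a term's relative frequency within a topic by the logarithm of the number of topics divided by the number of topics containing the term. The sketch below implements that assumed form only; the function name `tf_itf` and the toy counts are hypothetical, and the paper's actual definition may differ.

```python
import math

def tf_itf(topic_term_counts):
    """Assumed TF-ITF weighting, modelled on TF-IDF.

    topic_term_counts: list of dicts, one per topic, mapping term -> count.
    Returns a list of dicts mapping term -> weight for each topic.
    """
    num_topics = len(topic_term_counts)
    # Number of topics in which each term occurs at least once.
    topics_with_term = {}
    for counts in topic_term_counts:
        for term, count in counts.items():
            if count > 0:
                topics_with_term[term] = topics_with_term.get(term, 0) + 1
    weights = []
    for counts in topic_term_counts:
        total = sum(counts.values())
        weights.append({
            term: (count / total) * math.log(num_topics / topics_with_term[term])
            for term, count in counts.items() if count > 0
        })
    return weights

topics = [
    {"goal": 3, "team": 1},   # a sports-like topic
    {"vote": 2, "team": 2},   # a politics-like topic
]
weights = tf_itf(topics)
```

Under this assumed form, "team" occurs in every topic and receives weight zero, while "goal" keeps a positive weight, matching the intuition that a term's weight should reflect how well it distinguishes topics.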
