Infusing Knowledge into the Textual Entailment Task Using Graph Convolutional Networks

2020, Vol. 34 (05), pp. 8074–8081
Author(s): Pavan Kapanipathi, Veronika Thost, Siva Sankalp Patel, Spencer Whitehead, Ibrahim Abdelaziz, et al.

Textual entailment is a fundamental task in natural language processing. Most approaches for solving this problem use only the textual content present in training data. A few approaches have shown that information from external knowledge sources like knowledge graphs (KGs) can add value, in addition to the textual content, by providing background knowledge that may be critical for a task. However, the proposed models do not fully exploit the information in the usually large and noisy KGs, and it is not clear how it can be effectively encoded to be useful for entailment. We present an approach that complements text-based entailment models with information from KGs by (1) using Personalized PageRank to generate contextual subgraphs with reduced noise and (2) encoding these subgraphs using graph convolutional networks to capture the structural and semantic information in KGs. We evaluate our approach on multiple textual entailment datasets and show that the use of external knowledge helps the model to be robust and improves prediction accuracy. This is particularly evident in the challenging BreakingNLI dataset, where we see an absolute improvement of 5-20% over multiple text-based entailment models.
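A minimal sketch of step (1), assuming a dense adjacency matrix and plain power iteration; the toy graph, node names, and parameters are illustrative, not the paper's setup:

```python
import numpy as np

def personalized_pagerank(adj, seed_nodes, alpha=0.85, iters=50):
    """Power iteration for PPR over a dense (n, n) adjacency matrix."""
    n = adj.shape[0]
    out_deg = adj.sum(axis=0)
    # Column-normalize to get transition probabilities; isolated nodes stay 0.
    trans = np.divide(adj, out_deg, out=np.zeros_like(adj), where=out_deg > 0)
    # Restart distribution concentrated on the concepts mentioned in the text.
    p = np.zeros(n)
    p[seed_nodes] = 1.0 / len(seed_nodes)
    r = p.copy()
    for _ in range(iters):
        r = alpha * (trans @ r) + (1 - alpha) * p
    return r

# Toy KG: 0=dog, 1=animal, 2=cat, 3=car (an off-topic node).
adj = np.array([[0., 1., 0., 0.],
                [1., 0., 1., 0.],
                [0., 1., 0., 0.],
                [0., 0., 0., 0.]])
scores = personalized_pagerank(adj, seed_nodes=[0])  # premise mentions "dog"
subgraph_nodes = np.argsort(-scores)[:3]             # top-k contextual subgraph
print(subgraph_nodes, scores.round(3))
```

Because the restart vector concentrates probability mass on the concepts mentioned in the premise and hypothesis, high-scoring nodes stay topically close to the text, which is how off-topic regions of a large, noisy KG get filtered out before the GCN encoding step.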

Author(s): Hao Zhou, Tom Young, Minlie Huang, Haizhou Zhao, Jingfang Xu, et al.

Commonsense knowledge is vital to many natural language processing tasks. In this paper, we present a novel open-domain conversation generation model to demonstrate how large-scale commonsense knowledge can facilitate language understanding and generation. Given a user post, the model retrieves relevant knowledge graphs from a knowledge base and then encodes the graphs with a static graph attention mechanism, which augments the semantic information of the post and thus supports better understanding of the post. Then, during word generation, the model attentively reads the retrieved knowledge graphs and the knowledge triples within each graph, through a dynamic graph attention mechanism, to facilitate better generation. This is the first attempt to use large-scale commonsense knowledge in conversation generation. Furthermore, unlike existing models that use knowledge triples (entities) separately and independently, our model treats each knowledge graph as a whole, encoding more structured, connected semantic information from the graphs. Experiments show that the proposed model can generate more appropriate and informative responses than state-of-the-art baselines.
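A minimal sketch of the static-attention idea under stated assumptions: the triple embeddings, the projection matrix, and the dot-product scoring below are illustrative, and the paper's exact formulation may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8                                    # embedding size (assumption)
post_vec = rng.normal(size=d)            # encoded user post
triples = rng.normal(size=(5, 3, d))     # 5 triples, each (head, rel, tail)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Project each (h, r, t) concatenation down to a single triple vector.
proj = rng.normal(size=(3 * d, d))
triple_vecs = triples.reshape(len(triples), -1) @ proj

# Static attention: weights depend only on the post and the (fixed) graph.
attn = softmax(triple_vecs @ post_vec)
graph_vec = attn @ triple_vecs                           # weighted graph summary
augmented_post = np.concatenate([post_vec, graph_vec])   # fed to the decoder
print(attn.round(3), augmented_post.shape)
```

The "dynamic" counterpart would recompute such attention at every decoding step, conditioned on the decoder state, so the model can read different triples as it generates each word.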


Author(s): Liang Yao, Chengsheng Mao, Yuan Luo

Text classification is an important and classical problem in natural language processing. There have been a number of studies that applied convolutional neural networks (convolution on regular grids, e.g., sequences) to classification. However, only a limited number of studies have explored the more flexible graph convolutional neural networks (convolution on non-grid structures, e.g., arbitrary graphs) for the task. In this work, we propose to use graph convolutional networks for text classification. We build a single text graph for a corpus based on word co-occurrence and document-word relations, then learn a Text Graph Convolutional Network (Text GCN) for the corpus. Our Text GCN is initialized with one-hot representations for words and documents; it then jointly learns the embeddings for both words and documents, supervised by the known class labels for documents. Our experimental results on multiple benchmark datasets demonstrate that a vanilla Text GCN without any external word embeddings or knowledge outperforms state-of-the-art methods for text classification. Text GCN also learns predictive word and document embeddings. In addition, experimental results show that the improvement of Text GCN over state-of-the-art comparison methods becomes more prominent as we lower the percentage of training data, suggesting that Text GCN is robust to limited training data in text classification.
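A minimal sketch of the graph construction described above, assuming whole (short) documents serve as co-occurrence windows; the toy corpus and the positive-PMI threshold are illustrative:

```python
import math
from collections import Counter
from itertools import combinations

docs = ["the cat sat", "the dog sat", "dogs chase cats"]
tokens = [d.split() for d in docs]
vocab = sorted({w for t in tokens for w in t})

# Document-word edges: TF-IDF.
df = Counter(w for t in tokens for w in set(t))
tfidf = {}
for i, t in enumerate(tokens):
    tf = Counter(t)
    for w, c in tf.items():
        tfidf[(i, w)] = (c / len(t)) * math.log(len(docs) / df[w])

# Word-word edges: PMI over sliding windows (window = whole short doc here).
pair_cnt, word_cnt, n_win = Counter(), Counter(), len(tokens)
for t in tokens:
    for w in set(t):
        word_cnt[w] += 1
    for a, b in combinations(sorted(set(t)), 2):
        pair_cnt[(a, b)] += 1
pmi = {p: math.log((c / n_win) / ((word_cnt[p[0]] / n_win) * (word_cnt[p[1]] / n_win)))
       for p, c in pair_cnt.items()}
word_edges = {p: w for p, w in pmi.items() if w > 0}  # keep positive PMI only
print(len(vocab), len(tfidf), len(word_edges))
```

A two-layer GCN over this single heterogeneous graph then propagates label information from documents through shared words to other documents, which is consistent with the observed robustness to small training fractions.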


2021
Author(s): Jiaoyan Chen, Pan Hu, Ernesto Jimenez-Ruiz, Ole Magnus Holter, Denvar Antonyrajah, et al.

Semantic embedding of knowledge graphs has been widely studied and used for prediction and statistical analysis tasks across various domains such as Natural Language Processing and the Semantic Web. However, less attention has been paid to developing robust methods for embedding OWL (Web Ontology Language) ontologies, which contain richer semantic information than plain knowledge graphs and have been widely adopted in domains such as bioinformatics. In this paper, we propose a random walk and word embedding based ontology embedding method named OWL2Vec*, which encodes the semantics of an OWL ontology by taking into account its graph structure, lexical information, and logical constructors. Our empirical evaluation with three real-world datasets suggests that OWL2Vec* benefits from these three different aspects of an ontology in class membership prediction and class subsumption prediction tasks. Furthermore, OWL2Vec* often significantly outperforms the state-of-the-art methods in our experiments.
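A minimal sketch of the random-walk-plus-word-embedding recipe, assuming a pre-parsed subsumption graph and gensim's skip-gram implementation; the tiny hierarchy and walk parameters are illustrative, and this omits the lexical and logical-constructor channels the abstract mentions:

```python
import random
from gensim.models import Word2Vec

# Toy subsumption graph: child -> parents (stand-in for a parsed OWL ontology).
graph = {
    "Dog": ["Mammal"], "Cat": ["Mammal"],
    "Mammal": ["Animal"], "Animal": [],
}

def random_walk(start, length=4):
    """Walk upward through the hierarchy, emitting one 'sentence' of tokens."""
    walk, node = [start], start
    for _ in range(length - 1):
        nbrs = graph.get(node, [])
        if not nbrs:
            break
        node = random.choice(nbrs)
        walk.append(node)
    return walk

random.seed(0)
corpus = [random_walk(n) for n in graph for _ in range(20)]
model = Word2Vec(corpus, vector_size=16, window=2, min_count=1, sg=1, epochs=30)
print(model.wv.most_similar("Dog", topn=2))  # classes near "Dog" in the walks
```

Treating walks as sentences lets a standard word-embedding objective place ontologically related classes near each other, which is what the membership and subsumption prediction tasks then exploit.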


2020, Vol. 34 (09), pp. 13389–13396
Author(s): Jiaqi Lun, Jia Zhu, Yong Tang, Min Yang

Automatic short answer scoring (ASAS) is a research subject of intelligent education and a hot field of natural language understanding. Many experiments have confirmed that ASAS systems are not yet good enough, because their performance is limited by the training data. To address this problem, we propose MDA-ASAS, multiple data augmentation strategies for improving performance on automatic short answer scoring. MDA-ASAS is designed to learn language representations enhanced by data augmentation strategies, which include back-translation, correct answer as reference answer, and swap content. We argue that external knowledge has a profound impact on the ASAS process. Meanwhile, the Bidirectional Encoder Representations from Transformers (BERT) model has been shown to be effective for improving many natural language processing tasks; it acquires semantic, grammatical, and other features from large amounts of unsupervised data, and in effect adds external knowledge. Combined with the latest BERT model, our experimental results on the ASAS dataset show that MDA-ASAS brings a significant gain over the state of the art. We also perform extensive ablation studies and suggest parameters for practical use.
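A minimal sketch of two of the named strategies, under stated assumptions: "swap content" is read here as exchanging two sentences within a student answer, and "correct answer as reference answer" as pairing the gold answer with itself at the maximum score; back-translation is left as a stub because it depends on an external MT system.

```python
import random

MAX_SCORE = 5  # assumed top of the scoring scale

def swap_content(answer: str) -> str:
    """'Swap content': exchange two randomly chosen sentences in an answer."""
    sents = [s.strip() for s in answer.split(".") if s.strip()]
    if len(sents) < 2:
        return answer
    i, j = random.sample(range(len(sents)), 2)
    sents[i], sents[j] = sents[j], sents[i]
    return ". ".join(sents) + "."

def back_translate(answer: str) -> str:
    """Stub: would round-trip the text through another language via an MT system."""
    return answer

def augment(dataset):
    """dataset: list of (reference_answer, student_answer, score) triples."""
    out = list(dataset)
    for ref, ans, score in dataset:
        out.append((ref, swap_content(ans), score))
        out.append((ref, back_translate(ans), score))
        out.append((ref, ref, MAX_SCORE))  # gold answer paired with itself
    return out

random.seed(1)
data = [("Water boils at 100 C. Lower pressure lowers the boiling point.",
         "It boils at 100 C. At altitude it boils at a lower temperature.", 4)]
print(augment(data))
```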


2020, Vol. 10 (12), pp. 4144
Author(s): Sheng Xu, Xingfa Shen, Fumiyo Fukumoto, Jiyi Li, Yoshimi Suzuki, et al.

Paraphrase identification has been one of the major topics in Natural Language Processing (NLP). However, how to interpret the diverse contexts within a sentence, such as lexical and semantic information, as relevant features is still an open problem. This paper addresses the problem and presents an approach for leveraging contextual features with a neural-based learning model. Our Lexical, Syntactic, and Sentential Encodings (LSSE) learning model incorporates Relational Graph Convolutional Networks (R-GCNs) to make use of different features from local contexts, i.e., word encoding, position encoding, and full dependency structures. By utilizing the hidden states obtained by the R-GCNs as well as lexical and sentential encodings from Bidirectional Encoder Representations from Transformers (BERT), our model learns the contextual similarity between sentences effectively. Experimental results on two benchmark datasets, the Microsoft Research Paraphrase Corpus (MRPC) and Quora Question Pairs (QQP), show improvements over the baseline BERT sentential-encodings model of 1.7% F1-score on MRPC and 1.0% F1-score on QQP. Moreover, we verified that the combination of position encoding and syntactic features contributes to the performance improvement.
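A minimal sketch of one R-GCN layer of the kind LSSE uses over dependency structures: unlike a plain GCN, each dependency-relation type gets its own weight matrix, and a node aggregates neighbors per relation before a shared nonlinearity. Sizes, the toy edges, and the initialization are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_nodes, d_in, d_out, n_rel = 4, 8, 8, 2      # 4 words, 2 relation types
H = rng.normal(size=(n_nodes, d_in))          # word/position encodings
W = rng.normal(size=(n_rel, d_in, d_out)) * 0.1
W_self = rng.normal(size=(d_in, d_out)) * 0.1

# edges[r] is the adjacency matrix for relation type r (e.g. nsubj, dobj).
edges = np.zeros((n_rel, n_nodes, n_nodes))
edges[0, 1, 0] = 1      # relation 0: message from node 0 to node 1
edges[1, 2, 3] = 1      # relation 1: message from node 3 to node 2

def rgcn_layer(H, edges, W, W_self):
    out = H @ W_self                          # self-loop term
    for r in range(edges.shape[0]):
        deg = edges[r].sum(axis=1, keepdims=True).clip(min=1)
        out += (edges[r] / deg) @ H @ W[r]    # normalized per-relation message
    return np.maximum(out, 0)                 # ReLU

print(rgcn_layer(H, edges, W, W_self).shape)  # (4, 8) hidden states
```

Keeping relation types separate is what lets the model distinguish, say, a subject edge from an object edge in the dependency parse rather than collapsing both into generic adjacency.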


Genetics, 2021
Author(s): Marco Lopez-Cruz, Gustavo de los Campos

Genomic prediction uses DNA sequences and phenotypes to predict genetic values. In homogeneous populations, theory indicates that the accuracy of genomic prediction increases with sample size. However, differences in allele frequencies and in linkage disequilibrium patterns can lead to heterogeneity in SNP effects. In this context, calibrating genomic predictions using a large, potentially heterogeneous, training data set may not lead to optimal prediction accuracy. Some studies have tried to address this sample size/homogeneity trade-off using training set optimization algorithms; however, this approach assumes that a single training data set is optimal for all individuals in the prediction set. Here, we propose an approach that identifies, for each individual in the prediction set, a subset of the training data (i.e., a set of support points) from which predictions are derived. The methodology we propose is a Sparse Selection Index (SSI) that integrates Selection Index methodology with sparsity-inducing techniques commonly used for high-dimensional regression. The sparsity of the resulting index is controlled by a regularization parameter (λ); G-BLUP (the prediction method most commonly used in plant and animal breeding) appears as the special case λ = 0. In this study, we present the methodology and demonstrate (using two wheat data sets with phenotypes collected in ten different environments) that the SSI can achieve significant gains in prediction accuracy, anywhere between 5 and 10%, relative to G-BLUP.
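For orientation, a hedged restatement of the index in equation form; this follows the classical selection-index derivation with an added L1 penalty and may differ in notation or normalization from the paper's exact criterion:

$$
\hat{\boldsymbol\beta}_i=\arg\min_{\boldsymbol\beta}\ \tfrac12\,\boldsymbol\beta'\operatorname{Var}(\mathbf y)\,\boldsymbol\beta-\boldsymbol\beta'\operatorname{Cov}(\mathbf y,u_i)+\lambda\lVert\boldsymbol\beta\rVert_1,
\qquad
\hat u_i=\hat{\boldsymbol\beta}_i'\,\mathbf y,
$$

where $\operatorname{Var}(\mathbf y)=\mathbf G\sigma_u^2+\mathbf I\sigma_e^2$ over the training set and $\operatorname{Cov}(\mathbf y,u_i)$ holds the genomic covariances between candidate $i$ and the training individuals. At $\lambda=0$ the weights are dense and the index reduces to G-BLUP; as $\lambda$ grows, most weights shrink to zero, leaving a small, candidate-specific set of support points.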


2021, pp. 2–11
Author(s): David Aufreiter, Doris Ehrlinger, Christian Stadlmann, Margarethe Uberwimmer, Anna Biedersberger, et al.

On the servitization journey, manufacturing companies complement their offerings with new industrial and knowledge-based services, which brings challenges of uncertainty and risk. In addition to the required adjustment of internal factors, the international selling of services is a major challenge. This paper presents the initial results of an international research project aimed at assisting advanced manufacturers in making decisions about exporting their service offerings to foreign markets. Within this project, a tool is being developed to support managers in their service export decisions through the automated generation of market information based on Natural Language Processing and Machine Learning. The paper presents a roadmap for progressing towards an Artificial Intelligence-based market information solution. It describes the research process steps of analyzing problem statements of relevant industry partners, selecting target countries and markets, defining parameters for the scope of the tool, classifying different service offerings and their components into categories, and developing an annotation scheme for generating reliable and focused training data for the Artificial Intelligence solution. This paper demonstrates good practices in essential steps and highlights common pitfalls to avoid for researchers and managers working on future research projects supported by Artificial Intelligence. In the end, the paper aims to support and motivate researchers and managers to discover AI application and research opportunities within the servitization field.


Author(s): Feiwei Qin, Hairui Xu, Weicheng Zhang, Lin Yuan, Ming Li, et al.

Online shopping has become much easier and more popular, and meanwhile brings new challenges and opportunities to the field of product design and marketing. On one hand, product manufacturers find it challenging to produce new, popularly accepted products that meet customers' needs; on the other hand, end customers often find it difficult to buy the ideal goods they really want, even after navigating a huge number of commodities. There is indeed a 'communication gap' between customers and manufacturers. As an effort to partially resolve this issue, this paper proposes a novel product synthesis approach that works from the 'voice of the customer' over product knowledge graphs. Here the voice of the customer mainly refers to buyers' product reviews from online shopping platforms or blogs, while the product knowledge graph is constructed from ontological models and contains professional, hierarchical knowledge of a product's properties. Using natural language processing techniques, we first extract the customers' polarities on each specific aspect of a product, which are then translated into design requirements on the product's design components. Based on the extracted requirements and the pre-built product knowledge, semantic web and reasoning techniques are utilized to synthesize a novel product that meets more customer needs. Typical case studies on mobile phones using raw online data demonstrate the proposed approach's performance.
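A minimal sketch of the first step described above: extracting per-aspect customer polarities from review text and mapping them to design components via a product-knowledge lookup. The lexicons, the window-based scoping, and the aspect-to-component mapping are toy stand-ins for the paper's NLP and ontology machinery.

```python
aspect_terms = {"battery": "battery", "screen": "display", "camera": "camera"}
polarity_lex = {"great": 1, "good": 1, "sharp": 1, "poor": -1, "dies": -1}
component_of = {"battery": "power_module", "display": "front_panel",
                "camera": "camera_module"}   # toy knowledge-graph fragment

def aspect_polarities(review: str):
    """Score each mentioned aspect by nearby polarity words."""
    words = [w.strip(".,!?") for w in review.lower().split()]
    scores = {}
    for i, w in enumerate(words):
        if w in aspect_terms:
            aspect = aspect_terms[w]
            window = words[max(0, i - 3): i + 4]   # naive 3-token scope
            s = sum(polarity_lex.get(t, 0) for t in window)
            scores[aspect] = scores.get(aspect, 0) + s
    return scores

review = "Great camera and sharp screen, but the battery dies quickly."
polar = aspect_polarities(review)
requirements = {component_of[a]: ("improve" if s < 0 else "keep")
                for a, s in polar.items()}
print(polar, requirements)
```

In the full approach, these per-component requirements would feed the semantic-web reasoning step that assembles a product configuration satisfying as many of them as possible.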

