Commonsense Knowledge Base Completion with Structural and Semantic Context

2020 ◽  
Vol 34 (03) ◽  
pp. 2925-2933 ◽  
Author(s):  
Chaitanya Malaviya ◽  
Chandra Bhagavatula ◽  
Antoine Bosselut ◽  
Yejin Choi

Automatic KB completion for commonsense knowledge graphs (e.g., ATOMIC and ConceptNet) poses unique challenges compared to much-studied conventional knowledge bases (e.g., Freebase). Commonsense knowledge graphs use free-form text to represent nodes, resulting in orders of magnitude more nodes than conventional KBs (∼18x more nodes in ATOMIC compared to Freebase (FB15K-237)). Importantly, this implies significantly sparser graph structures, a major challenge for existing KB completion methods that assume densely connected graphs over a relatively smaller set of nodes. In this paper, we present novel KB completion models that address these challenges by exploiting the structural and semantic context of nodes. Specifically, we investigate two key ideas: (1) learning from local graph structure, using graph convolutional networks and automatic graph densification, and (2) transfer learning from pre-trained language models to knowledge graphs for enhanced contextual representation of knowledge. We describe our method to incorporate information from both these sources in a joint model and provide the first empirical results for KB completion on ATOMIC and evaluation with ranking metrics on ConceptNet. Our results demonstrate the effectiveness of language model representations in boosting link prediction performance and the advantages of learning from local graph structure (+1.5 points in MRR for ConceptNet) when training on subgraphs for computational efficiency. Further analysis of model predictions sheds light on the types of commonsense knowledge that language models capture well.
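As a rough illustration of how these two sources might be combined, the sketch below is an assumption about the general shape, not the authors' code: language-model embeddings of node text are fused with a graph-convolutional layer over the (densified) local graph, and triples are scored with a DistMult-style bilinear product. The LM embeddings are faked with random tensors here.

```python
# A minimal sketch (assumptions throughout): fuse semantic context from a
# frozen language-model encoder with structural context from a GCN layer,
# then score (head, relation, tail) triples for link prediction.
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.linear = nn.Linear(dim, dim)

    def forward(self, x, adj):
        # Mean-aggregate neighbour features, then transform.
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
        return torch.relu(self.linear(adj @ x / deg))

class LinkScorer(nn.Module):
    def __init__(self, dim, num_relations):
        super().__init__()
        self.gcn = GCNLayer(dim)
        self.rel = nn.Embedding(num_relations, dim)

    def forward(self, text_emb, adj, heads, rels, tails):
        # Fuse semantic (LM) and structural (GCN) context by summation.
        h = text_emb + self.gcn(text_emb, adj)
        # DistMult-style bilinear score for each triple.
        return (h[heads] * self.rel(rels) * h[tails]).sum(dim=-1)

# Usage with random stand-ins for LM embeddings and a densified adjacency:
n, d = 100, 64
text_emb = torch.randn(n, d)              # would come from a frozen LM encoder
adj = (torch.rand(n, n) > 0.95).float()   # densified local graph structure
scorer = LinkScorer(d, num_relations=9)
scores = scorer(text_emb, adj, torch.tensor([0]), torch.tensor([3]), torch.tensor([7]))
```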

2015 ◽  
Vol 764-765 ◽  
pp. 955-959
Author(s):  
Jui Feng Yeh ◽  
Cheng Hsien Lee ◽  
Yun Yun Lu ◽  
Guan Huei Wu ◽  
Yao Yi Wang

This paper proposes spelling error detection and correction using linguistic features and a knowledge resource. The linguistic features mainly come from a language model that describes the probability of a sentence. In practice, a formal document with typos is defective and falls short of specifications; since typos and errors hidden in a printed document are frequent, rework causes a waste of paper and ink. This paper therefore proposes an approach that addresses spelling errors before printing. In this method, the linguistic features are complemented by an additional new feature: an Internet-search function based on knowledge bases. By combining these approaches, this paper aims to improve the detection rate of typos and reduce the waste of resources. Experimental results show that the proposed method is practical and efficient for users to detect typos in documents before they are printed.
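For intuition, here is a minimal sketch of the language-model feature: a Laplace-smoothed bigram model scores each word in context and flags words whose conditional probability is suspiciously low. The toy corpus and threshold are illustrative assumptions, not the paper's data.

```python
# A minimal sketch of bigram-based typo flagging (illustrative data).
from collections import Counter

corpus = [["the", "cat", "sat", "on", "the", "mat"],
          ["the", "dog", "sat", "on", "the", "rug"]]
unigrams = Counter(w for s in corpus for w in s)
bigrams = Counter(pair for s in corpus for pair in zip(s, s[1:]))

def bigram_prob(prev, word, alpha=1.0):
    # Laplace-smoothed conditional probability P(word | prev).
    vocab = len(unigrams)
    return (bigrams[(prev, word)] + alpha) / (unigrams[prev] + alpha * vocab)

def flag_typos(sentence, threshold=0.1):
    # Flag words whose in-context probability falls below the threshold.
    return [w for prev, w in zip(sentence, sentence[1:])
            if bigram_prob(prev, w) < threshold]

print(flag_typos(["the", "cat", "sat", "on", "the", "mut"]))  # ['mut']
```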


2021 ◽  
Vol 2021 ◽  
pp. 1-8
Author(s):  
Ruiqing Yan ◽  
Lanchang Sun ◽  
Fang Wang ◽  
Xiaoming Zhang

Recently, pretrained language models, such as BERT and XLNet, have rapidly advanced the state of the art on many NLP tasks. They model implicit semantic information between words in the text, but only at the token level, without considering background knowledge. Intuitively, background knowledge influences the efficacy of text understanding. Inspired by this, we focus on improving model pretraining by leveraging external knowledge. Unlike recent research that optimizes pretraining models with knowledge-masking strategies, we propose a simple but general method to transfer explicit knowledge during pretraining. Specifically, we first match knowledge facts from a knowledge base (KB) and then add a knowledge injection layer to a transformer directly, without changing its architecture. This study seeks to determine the direct impact of explicit knowledge on model pretraining. We conduct experiments on 7 datasets using 5 knowledge bases in different downstream tasks. Our investigation reveals promising results in all the tasks. The experiments also verify that domain-specific knowledge is superior to open-domain knowledge in domain-specific tasks, and that different knowledge bases perform differently in different tasks.
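The sketch below illustrates one plausible reading of such a layer (an assumption, not the authors' implementation): matched KB facts are projected into the encoder's hidden space and fused into token states through a gated residual, leaving the transformer architecture itself untouched.

```python
# A minimal sketch of a knowledge-injection layer (illustrative design):
# matched KB facts are encoded as vectors and fused into the token
# representations produced by an off-the-shelf transformer encoder.
import torch
import torch.nn as nn

class KnowledgeInjection(nn.Module):
    def __init__(self, hidden_dim, kb_dim):
        super().__init__()
        self.project = nn.Linear(kb_dim, hidden_dim)
        self.gate = nn.Linear(2 * hidden_dim, hidden_dim)

    def forward(self, token_states, kb_vectors, alignment):
        # alignment[i, j] = 1 where token i is matched to KB fact j.
        kb_context = alignment @ self.project(kb_vectors)
        fused = torch.cat([token_states, kb_context], dim=-1)
        # Gated residual fusion leaves unmatched tokens essentially unchanged.
        return token_states + torch.sigmoid(self.gate(fused)) * kb_context

# Usage: 8 token states from the encoder, 3 matched KB facts.
tokens = torch.randn(8, 768)
facts = torch.randn(3, 100)
align = torch.zeros(8, 3)
align[2, 0] = 1  # token 2 matched to fact 0
align[5, 1] = 1  # token 5 matched to fact 1
injected = KnowledgeInjection(768, 100)(tokens, facts, align)
```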


Author(s):  
ROMAN BERTOLAMI ◽  
HORST BUNKE

Current multiple classifier systems for unconstrained handwritten text recognition do not provide a straightforward way to utilize language model information. In this paper, we describe a generic method to integrate a statistical n-gram language model into the combination of multiple offline handwritten text line recognizers. The proposed method first builds a word transition network and then rescores this network with an n-gram language model. Experimental evaluation conducted on a large dataset of offline handwritten text lines shows that the proposed approach improves the recognition accuracy over a reference system as well as over the original combination method that does not include a language model.
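A minimal sketch of the rescoring step follows, with a toy network and a stand-in bigram model (all scores and the LM weight are illustrative assumptions): a Viterbi search picks the path through the word transition network that maximizes the combined recognizer and language-model score.

```python
# A minimal sketch of n-gram rescoring over a word transition network.
# Each position holds candidate words with recognizer log-scores
# (merged from multiple recognizers); values here are illustrative.
network = [
    {"the": -0.1, "she": -0.9},
    {"cat": -0.3, "cut": -0.4},
    {"sleeps": -0.2, "slips": -0.8},
]

def lm_logprob(prev, word):
    # Stand-in for a trained bigram model; a real system would look up counts.
    table = {("the", "cat"): -0.5, ("cat", "sleeps"): -0.6}
    return table.get((prev, word), -3.0)

def rescore(network, lm_weight=0.7):
    # Viterbi: best[w] = (best log-score of a path ending in w, that path).
    best = {w: (s, [w]) for w, s in network[0].items()}
    for column in network[1:]:
        best = {
            w: max((ps + s + lm_weight * lm_logprob(pw, w), path + [w])
                   for pw, (ps, path) in best.items())
            for w, s in column.items()
        }
    return max(best.values())

score, words = rescore(network)
print(words)  # ['the', 'cat', 'sleeps']
```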


AI ◽  
2021 ◽  
Vol 2 (1) ◽  
pp. 1-16
Author(s):  
Juan Cruz-Benito ◽  
Sanjay Vishwakarma ◽  
Francisco Martin-Fernandez ◽  
Ismael Faro

In recent years, the use of deep learning in language models has gained much attention. Some research projects claim that they can generate text that reads as human writing, enabling new possibilities in many application areas. Among the different areas related to language processing, one of the most notable applications of this type of modeling is programming languages. For years, the machine learning community has researched this software engineering area, pursuing goals like auto-completing, generating, fixing, or evaluating code programmed by humans. Considering the increasing popularity of the deep-learning-enabled language model approach, we found a lack of empirical papers that compare different deep learning architectures for creating and using language models based on programming code. This paper compares different neural network architectures, namely Average Stochastic Gradient Descent (ASGD) Weight-Dropped LSTMs (AWD-LSTMs), AWD-Quasi-Recurrent Neural Networks (QRNNs), and Transformers, using transfer learning and different forms of tokenization to see how they behave when building language models over a Python dataset for code generation and mask-filling tasks. Considering the results, we discuss each approach's strengths and weaknesses and the gaps we found in evaluating the language models or applying them in a real programming context.
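One of the experimental axes, the form of tokenization, is easy to illustrate. The sketch below (not the paper's pipeline) contrasts Python's own lexer with character-level tokens for the same snippet, showing how much longer the sequences a character-level model must handle become.

```python
# A minimal sketch comparing two tokenizations of the same Python source:
# language-aware tokens from the stdlib lexer versus raw characters.
import io
import tokenize

src = "def add(a, b):\n    return a + b\n"

# Keyword, identifier, and operator tokens from Python's own lexer
# (whitespace-only tokens such as NEWLINE and INDENT are filtered out).
lexer_tokens = [t.string
                for t in tokenize.generate_tokens(io.StringIO(src).readline)
                if t.string.strip()]
print(lexer_tokens)

# Character-level tokens: a much longer sequence for the model to learn.
char_tokens = list(src)
print(len(lexer_tokens), len(char_tokens))
```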


Author(s):  
NUR ZALIKHA MAT RADZI ◽  
NASIRIN ABDILLAH ◽  
DAENG HALIZA DAENG JAMAL

Hatimu Aisyah is a work by Zurinah Hassan, the 13th National Laureate and recipient of the Southeast Asian Writers Award (SEA Write Award) in 2004. Her string of successes has made her a focus for researchers examining aspects of women's authorship. Hatimu Aisyah, the first novel produced by Zurinah Hassan, emphasizes the customary practices of earlier generations as they are swallowed by the currents of modernization, and it foregrounds women who put custom first in the context of communal life. This study of Zurinah Hassan's work draws on Elaine Showalter's model of language, from a gynocritical perspective, to examine the female characters. The discussion focuses on symbolic language and on language as an expression of women's consciousness. The overall findings show that Zurinah Hassan uses language consistent with Showalter's notion of women's language, though somewhat mutedly, owing to the limits that the sociocultural norms of Malay society place on language use. The study's findings on the model of women's language emerge through symbolic language and language as an expression of women's consciousness. Looking ahead, the study suggests that women voice protest and criticism through the patterns of their writing, even while remaining restrained.


2020 ◽  
Vol 14 (4) ◽  
pp. 471-484
Author(s):  
Suraj Shetiya ◽  
Saravanan Thirumuruganathan ◽  
Nick Koudas ◽  
Gautam Das

Accurate selectivity estimation for string predicates is a long-standing research challenge in databases. Supporting pattern matching on strings (such as prefix, substring, and suffix) makes this problem much more challenging, thereby necessitating a dedicated study. Traditional approaches often build pruned summary data structures such as tries, followed by selectivity estimation using statistical correlations. However, this produces insufficiently accurate cardinality estimates, resulting in the selection of sub-optimal plans by the query optimizer. Recently proposed deep learning based approaches leverage techniques from natural language processing, such as embeddings, to encode the strings and use them to train a model. While this is an improvement over traditional approaches, there is still large scope for improvement. We propose Astrid, a framework for string selectivity estimation that synthesizes ideas from traditional and deep learning based approaches. We make two complementary contributions. First, we propose an embedding algorithm that is query-type (prefix, substring, and suffix) and selectivity aware. Consider three strings 'ab', 'abc', and 'abd' whose prefix frequencies are 1000, 800, and 100, respectively. Our approach ensures that the embedding for 'ab' is closer to 'abc' than to 'abd'. Second, we describe how neural language models could be used for selectivity estimation. While they work well for prefix queries, their performance for substring queries is sub-optimal. We modify the objective function of the neural language model so that it can be used for estimating selectivities of pattern matching queries. We also propose a novel and efficient algorithm for optimizing the new objective function. We conduct extensive experiments over benchmark datasets and show that our proposed approaches achieve state-of-the-art results.
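The 'ab'/'abc'/'abd' example suggests how selectivity awareness could enter the training objective. The sketch below is one plausible rendering, a triplet-style loss whose margin grows with the frequency gap; this is an assumption for illustration, not Astrid's actual loss.

```python
# A minimal sketch of selectivity-aware embedding training (illustrative):
# pull 'abc' (prefix frequency 800) closer to 'ab' (1000) than 'abd' (100).
import torch
import torch.nn as nn

freq = {"ab": 1000.0, "abc": 800.0, "abd": 100.0}
idx = {s: i for i, s in enumerate(freq)}
emb = nn.Embedding(len(freq), 16)

def selectivity_triplet_loss(anchor, close, far):
    d_close = (emb.weight[idx[anchor]] - emb.weight[idx[close]]).norm()
    d_far = (emb.weight[idx[anchor]] - emb.weight[idx[far]]).norm()
    # Margin grows with the gap in selectivity, so the anchor ends up
    # closer to the high-frequency string than to the low-frequency one.
    margin = torch.log(torch.tensor(freq[close] / freq[far]))
    return torch.relu(d_close - d_far + margin)

opt = torch.optim.Adam(emb.parameters(), lr=0.05)
for _ in range(200):
    opt.zero_grad()
    loss = selectivity_triplet_loss("ab", "abc", "abd")
    loss.backward()
    opt.step()
```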


Author(s):  
Yuta Ojima ◽  
Eita Nakamura ◽  
Katsutoshi Itoyama ◽  
Kazuyoshi Yoshii

This paper describes automatic music transcription with chord estimation for music audio signals. We focus on the fact that concurrent structures of musical notes, such as chords, form the basis of harmony and are central considerations in music composition. Since chords and musical notes are deeply linked with each other, we propose joint pitch and chord estimation based on a Bayesian hierarchical model that consists of an acoustic model representing the generative process of a spectrogram and a language model representing the generative process of a piano roll. The acoustic model is formulated as a variant of non-negative matrix factorization with binary variables indicating a piano roll. The language model is formulated as a hidden Markov model that has chord labels as its latent variables and emits a piano roll; the sequential dependency of a piano roll can thus be represented in the language model. Both models are integrated through a piano roll in a hierarchical Bayesian manner. All the latent variables and parameters are estimated using Gibbs sampling. The experimental results showed the great potential of the proposed method for unified music transcription and grammar induction.
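The hierarchical structure can be sketched generatively as follows (all distributions and sizes are illustrative assumptions): a chord HMM emits a binary piano roll, which then masks the gains of an NMF-style spectrogram model.

```python
# A minimal generative sketch of the hierarchical model (illustrative):
# language model = chord HMM emitting a binary piano roll;
# acoustic model = NMF-style spectrogram masked by that piano roll.
import numpy as np

rng = np.random.default_rng(0)
n_chords, n_pitches, n_frames, n_bins = 4, 12, 8, 64

# Language model: uniform chord transitions and, per chord, a probability
# that each pitch is active.
trans = np.full((n_chords, n_chords), 1.0 / n_chords)
pitch_on = rng.uniform(0.05, 0.4, size=(n_chords, n_pitches))

chords = [rng.integers(n_chords)]
for _ in range(n_frames - 1):
    chords.append(rng.choice(n_chords, p=trans[chords[-1]]))
piano_roll = np.array([rng.random(n_pitches) < pitch_on[c] for c in chords]).T

# Acoustic model: spectral templates times gains, with the binary
# piano roll switching pitches on and off (an NMF variant).
templates = rng.gamma(2.0, 1.0, size=(n_bins, n_pitches))
gains = rng.gamma(2.0, 1.0, size=(n_pitches, n_frames))
spectrogram = templates @ (gains * piano_roll)
print(spectrogram.shape)  # (64, 8)
```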


Author(s):  
Kelvin Guu ◽  
Tatsunori B. Hashimoto ◽  
Yonatan Oren ◽  
Percy Liang

We propose a new generative language model for sentences that first samples a prototype sentence from the training corpus and then edits it into a new sentence. Compared to traditional language models that generate from scratch either left-to-right or by first sampling a latent sentence vector, our prototype-then-edit model improves perplexity on language modeling and generates higher quality outputs according to human evaluation. Furthermore, the model gives rise to a latent edit vector that captures interpretable semantics such as sentence similarity and sentence-level analogies.
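The two-step generative process is easy to sketch. In the toy version below the neural editor is replaced by a hypothetical stand-in, since the paper's editor is a trained seq2seq decoder conditioned on the latent edit vector.

```python
# A minimal sketch of prototype-then-edit generation (the editor here is
# a hypothetical stand-in, not the paper's trained model).
import random

corpus = [
    "the movie was great",
    "the food was terrible",
    "i loved the soundtrack",
]

def sample_edit_vector(dim=4):
    # The paper samples a latent edit vector; here, Gaussian noise.
    return [random.gauss(0.0, 1.0) for _ in range(dim)]

def apply_edit(prototype, edit_vector):
    # Stand-in for the neural editor: a trained decoder would map
    # (prototype, edit_vector) to a fluent new sentence.
    words = prototype.split()
    i = int(abs(edit_vector[0]) * 100) % len(words)
    words[i] = "<edited>"
    return " ".join(words)

prototype = random.choice(corpus)                           # step 1: sample a prototype
new_sentence = apply_edit(prototype, sample_edit_vector())  # step 2: edit it
print(prototype, "->", new_sentence)
```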


2021 ◽  
Vol 102 ◽  
pp. 02001
Author(s):  
Anja Wilhelm ◽  
Wolfgang Ziegler

The primary focus of technical communication (TC) in the past decade has been the system-assisted generation and utilization of standardized, structured, and classified content for dynamic output solutions. Nowadays, machine learning (ML) approaches offer a new opportunity to integrate unstructured data into existing knowledge bases without the need to manually organize information into topic-based content enriched with semantic metadata. To make the field of artificial intelligence (AI) more accessible to technical writers and content managers, cloud-based machine learning as a service (MLaaS) solutions provide a starting point for domain-specific ML modelling while relieving the modelling process of extensive coding, data processing, and storage demands. Information architects can therefore focus on information extraction tasks and on prospects for including pre-existing knowledge from other systems in the ML modelling process. In this paper, the capability and performance of a cloud-based ML service, IBM Watson, are analysed to assess its value for semantic context analysis. The ML model is based on a supervised learning method and features deep learning (DL) and natural language processing (NLP) techniques. The subject of the analysis is a corpus of scientific publications on the 2019 coronavirus disease. The analysis focuses on information extraction regarding preventive measures and the effects of the pandemic on healthcare workers.
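As a concrete starting point, the call below shows roughly how such an extraction might be issued with the ibm-watson Python SDK. The API key, service URL, version date, and sample text are placeholders, and the chosen features are an assumption rather than the authors' exact configuration.

```python
# A minimal sketch of an extraction request against IBM Watson NLU,
# assuming the ibm-watson SDK; credentials and text are placeholders.
from ibm_watson import NaturalLanguageUnderstandingV1
from ibm_watson.natural_language_understanding_v1 import (
    Features, ConceptsOptions, KeywordsOptions)
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

nlu = NaturalLanguageUnderstandingV1(
    version="2021-08-01",  # placeholder API version date
    authenticator=IAMAuthenticator("YOUR_API_KEY"))
nlu.set_service_url("YOUR_SERVICE_URL")

abstract = ("Healthcare workers face elevated infection risk; preventive "
            "measures include protective equipment and triage protocols.")
result = nlu.analyze(
    text=abstract,
    features=Features(keywords=KeywordsOptions(limit=5),
                      concepts=ConceptsOptions(limit=3))).get_result()

for kw in result["keywords"]:
    print(kw["text"], kw["relevance"])
```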

