Korean Text Generation Using Word Embedding Enhanced by Domain Information

2019 ◽  
Vol 29 (2) ◽  
pp. 142-147
Author(s):  
Mei-ying Ren ◽  
Shin-jae Kang
Author(s):  
Diana Purwitasari ◽  
Ana Alimatus Zaqiyah ◽  
Chastine Fatichah

2020 ◽  
Author(s):  
Remo De Oliveira Gresta ◽  
Elder Cirilo

Identifiers are one of the most important sources of domain information in software development. Therefore, it is recognized that the proper use of names directly impacts the code's comprehensibility, maintainability, and quality. Our goal in this work is to expand the current knowledge about names by considering not only their quality but also their contextual similarity. To achieve that, we extracted names from four large-scale open-source projects written in Java. Then, we computed the semantic similarity between classes and their attributes/variables using Fasttext, a word embedding algorithm. As a result, we observed that source code, in general, preserves an acceptable level of contextual similarity; that developers avoid using names outside the default dictionary (e.g., domain terms); and that files with more changes, maintained by distinct contributors, tend to have better contextual similarity.
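The general idea of comparing class names with their attribute/variable names via embeddings can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the gensim-based fastText model, the toy corpus, and the identifier names are all assumptions.

```python
# Minimal sketch: contextual similarity between a class name and its
# attributes using fastText embeddings (via gensim). Hypothetical data.
import re
import numpy as np
from gensim.models import FastText

def split_identifier(name):
    """Split a camelCase or snake_case identifier into lowercase terms."""
    spaced = re.sub(r'([a-z0-9])([A-Z])', r'\1 \2', name).replace('_', ' ')
    return [t.lower() for t in spaced.split()]

def embed(model, name):
    """Average the embeddings of an identifier's terms."""
    return np.mean([model.wv[t] for t in split_identifier(name)], axis=0)

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy corpus of identifier terms; a real study would train on project code.
corpus = [["shopping", "cart", "item", "price", "total", "discount"]]
model = FastText(sentences=corpus, vector_size=50, min_count=1, epochs=50)

class_name = "ShoppingCart"                    # hypothetical class
attributes = ["itemPrice", "totalDiscount"]    # hypothetical attributes
for attr in attributes:
    sim = cosine(embed(model, class_name), embed(model, attr))
    print(f"{class_name} ~ {attr}: {sim:.2f}")
```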


CounterText ◽  
2015 ◽  
Vol 1 (3) ◽  
pp. 348-365 ◽  
Author(s):  
Mario Aquilina

What if the post-literary also meant that which operates in a literary space (almost) devoid of language as we know it: for instance, a space in which language simply frames the literary or poetic rather than ‘containing’ it? What if the countertextual also meant the (en)countering of literary text with non-textual elements, such as mathematical concepts, or with texts that we would not normally think of as literary, such as computer code? This article addresses these issues in relation to Nick Montfort's #!, a 2014 print collection of poems that presents readers with the output of computer programs as well as the programs themselves, which are designed to operate on principles of text generation regulated by specific constraints. More specifically, it focuses on two works in the collection, ‘Round’ and ‘All the Names of God’, which are read in relation to the notions of the ‘computational sublime’ and the ‘event’.


2015 ◽  
Author(s):  
Oren Melamud ◽  
Omer Levy ◽  
Ido Dagan

Author(s):  
Sheng Zhang ◽  
Qi Luo ◽  
Yukun Feng ◽  
Ke Ding ◽  
Daniela Gifu ◽  
...  

Background: As a well-known key phrase extraction algorithm, TextRank is an analogue of the PageRank algorithm and relies heavily on term frequency statistics in the manner of co-occurrence analysis. Objective: This frequency-based characteristic is a bottleneck for performance enhancement, and various improved TextRank algorithms have been proposed in recent years. Most of these improvements incorporate semantic information into the key phrase extraction algorithm and achieve gains. Method: In this research, taking both syntactic and semantic information into consideration, we integrate a syntactic tree algorithm with word embedding and put forward the Word Embedding and Syntactic Information Algorithm (WESIA), which improves the accuracy of the TextRank algorithm. Results: Applying our method to a self-made test set and a public test set, the results imply that the proposed unsupervised key phrase extraction algorithm outperforms the other algorithms to some extent.
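A rough illustration of the underlying idea (weighting the TextRank co-occurrence graph with word-embedding similarity rather than raw counts) might look like the sketch below. It is not the WESIA implementation; the toy sentences, window size, and the Word2Vec/networkx choices are assumptions.

```python
# Sketch of a TextRank variant whose edge weights come from word-embedding
# similarity instead of plain co-occurrence counts. Illustrative only.
import numpy as np
import networkx as nx
from gensim.models import Word2Vec

sentences = [["graph", "based", "ranking", "extracts", "key", "phrases"],
             ["word", "embedding", "captures", "semantic", "similarity"],
             ["semantic", "similarity", "improves", "graph", "ranking"]]

model = Word2Vec(sentences, vector_size=50, min_count=1, epochs=100)

def sim(u, v):
    a, b = model.wv[u], model.wv[v]
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Connect words that co-occur within a small window, weighting each edge
# by embedding similarity (shifted to keep weights non-negative).
graph = nx.Graph()
window = 3
for sent in sentences:
    for i, u in enumerate(sent):
        for v in sent[i + 1:i + window]:
            if u != v:
                graph.add_edge(u, v, weight=1.0 + sim(u, v))

scores = nx.pagerank(graph, weight="weight")
top = sorted(scores, key=scores.get, reverse=True)[:5]
print("candidate key terms:", top)
```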

