General Purpose Text Embeddings from Pre-trained Language Models for Scalable Inference

Author(s):  
Jingfei Du ◽  
Myle Ott ◽  
Haoran Li ◽  
Xing Zhou ◽  
Veselin Stoyanov


2019 ◽  
Vol 1 (3) ◽  
Author(s):  
A. Aziz Altowayan ◽  
Lixin Tao

We consider the following problem: given neural language models (embeddings), each of which is trained on an unknown data set, how can we determine which model would provide a better result when used for feature representation in a downstream task such as text classification or entity recognition? In this paper, we assess the word-similarity measure by analyzing its impact on word embeddings learned from various datasets and how these embeddings perform in a simple classification task. Word representations were learned and assessed under the same conditions. For training word vectors, we used the implementation of Continuous Bag of Words described in [1]. To assess the quality of the vectors, we applied the analogy-questions test for word similarity described in the same paper. Further, to measure the retrieval rate of an embedding model, we introduced a new metric (Average Retrieval Error), which measures the percentage of missing words in the model. We observe that scoring high accuracy on syntactic and semantic similarities between word pairs is not an indicator of better classification results. This observation can be justified by the fact that a domain-specific corpus contributes more to performance than a general-purpose corpus. For reproducibility, we release our experiment scripts and results.
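The Average Retrieval Error metric is defined above as the percentage of task-vocabulary words missing from an embedding model. A minimal sketch under that definition (the dictionary-backed model and the example words below are hypothetical, not the paper's data):

```python
def average_retrieval_error(model_vocab, task_vocab):
    """Percentage of task-vocabulary words missing from the embedding model.

    A lower value means better lexical coverage of the downstream task.
    """
    if not task_vocab:
        return 0.0
    missing = sum(1 for w in task_vocab if w not in model_vocab)
    return 100.0 * missing / len(task_vocab)

# Hypothetical example: a model trained on a general-purpose corpus
# may lack domain-specific terms from the classification task.
model = {"cell": [0.1], "protein": [0.2], "the": [0.3]}
task_words = ["cell", "protein", "apoptosis", "kinase"]
print(average_retrieval_error(model, task_words))  # 50.0
```

A high score flags a coverage mismatch between the training corpus and the task vocabulary, which is consistent with the paper's finding that domain fit matters more than similarity-benchmark accuracy.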


2014 ◽  
Vol 40 (3) ◽  
pp. 687-723 ◽  
Author(s):  
Cyril Allauzen ◽  
Bill Byrne ◽  
Adrià de Gispert ◽  
Gonzalo Iglesias ◽  
Michael Riley

This article describes the use of pushdown automata (PDA) in the context of statistical machine translation and alignment under a synchronous context-free grammar. We use PDAs to compactly represent the space of candidate translations generated by the grammar when applied to an input sentence. General-purpose PDA algorithms for replacement, composition, shortest path, and expansion are presented. We describe HiPDT, a hierarchical phrase-based decoder using the PDA representation and these algorithms. We contrast the complexity of this decoder with a decoder based on a finite-state automata representation, showing that PDAs provide a more suitable framework to achieve exact decoding for larger synchronous context-free grammars and smaller language models. We assess this experimentally on a large-scale Chinese-to-English alignment and translation task. In translation, we propose a two-pass decoding strategy involving a weaker language model in the first pass to address the results of the PDA complexity analysis. We study in depth the experimental conditions and trade-offs in which HiPDT can achieve state-of-the-art performance for large-scale SMT.
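The two-pass strategy described above can be sketched in miniature: a weak first-pass language model (here a unigram model) prunes the candidate space, and only the survivors are rescored with a stronger model (here a bigram model). All candidates and scores below are invented toy values, not the paper's Chinese-to-English system:

```python
import math

# Toy candidate translations with hypothetical translation-model log scores.
candidates = {
    "the cat sat": -1.0,
    "cat the sat": -1.2,
    "the sat cat": -1.1,
}

UNIGRAM = {"the": 0.5, "cat": 0.3, "sat": 0.2}       # weak first-pass LM
BIGRAM = {("the", "cat"): 0.6, ("cat", "sat"): 0.7,  # stronger second-pass LM
          ("the", "sat"): 0.1, ("sat", "cat"): 0.1,
          ("cat", "the"): 0.1, ("sat", "the"): 0.1}

def lm_score(sent, bigram=False):
    words = sent.split()
    if not bigram:
        return sum(math.log(UNIGRAM[w]) for w in words)
    return sum(math.log(BIGRAM.get((a, b), 1e-6))
               for a, b in zip(words, words[1:]))

# First pass: translation score + weak LM, keep only the 2 best candidates.
first = sorted(candidates, key=lambda s: candidates[s] + lm_score(s),
               reverse=True)[:2]
# Second pass: rescore only the survivors with the stronger LM.
best = max(first, key=lambda s: candidates[s] + lm_score(s, bigram=True))
print(best)  # the cat sat
```

The point of the two-pass design is that the cheap model keeps the exact first-pass search tractable over the large PDA-represented space, while the expensive model only ever touches the pruned set.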


2010 ◽  
Vol 17 (4) ◽  
pp. 455-483 ◽  
Author(s):  
YUQING GUO ◽  
HAIFENG WANG ◽  
JOSEF VAN GENABITH

Abstract This paper presents a general-purpose, wide-coverage, probabilistic sentence generator based on dependency n-gram models. This is particularly interesting as many semantic or abstract syntactic input specifications for sentence realisation can be represented as labelled bi-lexical dependencies or typed predicate-argument structures. Our generation method captures the mapping between semantic representations and surface forms by linearising a set of dependencies directly, rather than via the application of grammar rules as in more traditional chart-style or unification-based generators. In contrast to conventional n-gram language models over surface word forms, we exploit structural information and various linguistic features inherent in the dependency representations to constrain the generation space and improve the generation quality. A series of experiments shows that dependency-based n-gram models generalise well to different languages (English and Chinese) and representations (LFG and CoNLL). Compared with state-of-the-art generation systems, our general-purpose sentence realiser is highly competitive with the added advantages of being simple, fast, robust and accurate.
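The linearisation idea can be sketched as follows: enumerate surface orders of a head and its labelled dependents, and keep the order that scores best under an n-gram model over dependency labels. The bigram probabilities below are invented toy values, not estimates from the paper's treebanks:

```python
from itertools import permutations
import math

# Hypothetical label-bigram model: log-probability of label b
# immediately following label a in surface order.
BIGRAM_LOGP = {
    ("<s>", "SUBJ"): math.log(0.6), ("<s>", "OBJ"): math.log(0.1),
    ("<s>", "HEAD"): math.log(0.3),
    ("SUBJ", "HEAD"): math.log(0.7), ("SUBJ", "OBJ"): math.log(0.1),
    ("HEAD", "OBJ"): math.log(0.8), ("HEAD", "SUBJ"): math.log(0.1),
    ("OBJ", "HEAD"): math.log(0.2), ("OBJ", "SUBJ"): math.log(0.1),
    ("HEAD", "</s>"): math.log(0.1), ("OBJ", "</s>"): math.log(0.8),
    ("SUBJ", "</s>"): math.log(0.1),
}

def score(order):
    """Sum of label-bigram log-probabilities along one surface order."""
    labels = ["<s>"] + [lab for lab, _ in order] + ["</s>"]
    return sum(BIGRAM_LOGP.get((a, b), math.log(1e-6))
               for a, b in zip(labels, labels[1:]))

def linearise(head, dependents):
    """Pick the highest-scoring surface order of the head and its dependents."""
    items = [("HEAD", head)] + dependents
    best = max(permutations(items), key=score)
    return " ".join(word for _, word in best)

print(linearise("saw", [("SUBJ", "John"), ("OBJ", "Mary")]))  # John saw Mary
```

A real realiser constrains the search rather than enumerating all permutations, and conditions on richer features than bare labels, but the scoring-and-ranking shape is the same.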


2021 ◽  
Vol 9 ◽  
pp. 226-242
Author(s):  
Zhaofeng Wu ◽  
Hao Peng ◽  
Noah A. Smith

Abstract For natural language processing systems, two kinds of evidence support the use of text representations from neural language models "pretrained" on large unannotated corpora: performance on application-inspired benchmarks (Peters et al., 2018, inter alia), and the emergence of syntactic abstractions in those representations (Tenney et al., 2019, inter alia). On the other hand, the lack of grounded supervision calls into question how well these representations can ever capture meaning (Bender and Koller, 2020). We apply novel probes to recent language models—specifically focusing on predicate-argument structure as operationalized by semantic dependencies (Ivanova et al., 2012)—and find that, unlike syntax, semantics is not brought to the surface by today's pretrained models. We then use convolutional graph encoders to explicitly incorporate semantic parses into task-specific finetuning, yielding benefits to natural language understanding (NLU) tasks in the GLUE benchmark. This approach demonstrates the potential for general-purpose (rather than task-specific) linguistic supervision, above and beyond conventional pretraining and finetuning. Several diagnostics help to localize the benefits of our approach.
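A probing experiment of the kind described above can be sketched as a linear classifier trained on frozen representations: if the probe reaches high accuracy, the target information is linearly decodable, i.e. "brought to the surface". The vectors and labels below are synthetic stand-ins, not actual model states or semantic-dependency annotations:

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-ins for frozen contextual features of word pairs (w_i, w_j), with a
# binary label: does a semantic dependency edge hold between them?
# In a real probe the features come from a pretrained model's hidden states.
pairs = rng.normal(size=(200, 16))
true_w = rng.normal(size=16)
labels = (pairs @ true_w > 0).astype(float)  # synthetic, linearly decodable

def train_probe(X, y, lr=0.5, steps=300):
    """Logistic-regression probe trained by gradient descent; the model
    supplying X stays frozen, only the probe weights are learned."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))
        w -= lr * X.T @ (p - y) / len(y)
    return w

w = train_probe(pairs, labels)
acc = ((pairs @ w > 0).astype(float) == labels).mean()
print(round(acc, 2))  # near-perfect here, since the labels are linear by design
```

On real representations the interesting outcome is the opposite one: the paper's finding is that such probes recover syntax far better than predicate-argument structure.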


2021 ◽  
Vol 11 (17) ◽  
pp. 7814
Author(s):  
Buddhika Kasthuriarachchy ◽  
Madhu Chetty ◽  
Adrian Shatte ◽  
Darren Walls

Obtaining meaning-rich representations of social media inputs, such as Tweets (unstructured and noisy text), from general-purpose pre-trained language models has become challenging, as these inputs typically deviate from mainstream English usage. The proposed research establishes effective methods for improving the comprehension of noisy texts. For this, we propose a new generic methodology to derive a diverse set of sentence vectors by combining and extracting various linguistic characteristics from the latent representations of multi-layer, pre-trained language models. Further, we clearly establish how BERT, a state-of-the-art pre-trained language model, comprehends the linguistic attributes of Tweets, in order to identify appropriate sentence representations. Five new probing tasks are developed for Tweets, which can serve as benchmark probing tasks to study noisy text comprehension. Experiments are carried out for classification accuracy by deriving the sentence vectors from GloVe-based pre-trained models and Sentence-BERT, and by using different hidden layers from the BERT model. We show that the initial and middle layers of BERT are better at capturing the key linguistic characteristics of noisy texts than its later layers. With complex predictive models, we further show that sentence-vector length matters less for capturing linguistic information, and that the proposed sentence vectors for noisy texts outperform existing state-of-the-art sentence vectors.
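A common way to turn one hidden layer into a sentence vector, as compared across layers above, is masked mean pooling over token vectors. The sketch below uses random arrays in place of real BERT hidden states (which the transformers library exposes via output_hidden_states), so the shapes, not the values, are the point:

```python
import numpy as np

def sentence_vector(hidden_states, layer, attention_mask):
    """Mean-pool the token vectors of one hidden layer into a sentence vector.

    hidden_states: list of (num_tokens, dim) arrays, one per layer
                   (index 0 = embedding layer, as in BERT's hidden-state output).
    attention_mask: (num_tokens,) 0/1 array marking real (non-padding) tokens.
    """
    layer_out = hidden_states[layer]              # (num_tokens, dim)
    mask = attention_mask[:, None].astype(float)  # (num_tokens, 1)
    return (layer_out * mask).sum(axis=0) / mask.sum()

# Stand-in for a BERT-base forward pass: 12 layers + the embedding layer.
rng = np.random.default_rng(0)
states = [rng.normal(size=(5, 8)) for _ in range(13)]
mask = np.array([1, 1, 1, 1, 0])                  # last token is padding

middle = sentence_vector(states, layer=6, attention_mask=mask)
print(middle.shape)  # (8,)
```

Swapping the `layer` argument is all it takes to compare initial, middle, and later layers, which is the axis along which the paper finds noisy-text information to vary.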


Author(s):  
Andri Setyorini ◽  
Niken Setyaningrum

Background: Old age is the final stage of the human life cycle, an inevitable part of life that every individual will experience. At this stage, individuals undergo many physical and mental changes, especially declines in the functions and abilities they once had. A preliminary study at the Tresna Werdha Yogyakarta Social House, Budhi Luhur Unit, found 16 elderly residents experiencing physical immobilization. The social house runs various activities for elderly residents who are still active, but residents with muscle weakness are unable to join these exercises, so Range of Motion (ROM) exercise is needed. Objective: The general purpose of this research is to determine the effect of active-assistive Range of Motion (ROM) exercise on increasing the range of joint motion in elderly people with impaired physical mobility at the Tresna Werdha Yogyakarta Social House, Budhi Luhur Unit. Method: This was a pre-experimental study using a one-group pretest-posttest design, in which the range of joint motion was measured before (pretest) and after (posttest) the ROM exercise. The subjects were all 14 elderly residents with impaired physical mobility at the Tresna Werdha Yogyakarta Social House, Budhi Luhur Unit. Data were analyzed using a paired-sample t-test. Result: The results show that active ROM (Range of Motion) exercise increases the range of joint motion in elderly people with impaired physical mobility at the Tresna Werdha Yogyakarta Social House, Budhi Luhur Unit. Conclusion: Active ROM (Range of Motion) exercise increases the range of joint motion in elderly people with impaired physical mobility at the Tresna Werdha Yogyakarta Social House, Budhi Luhur Unit.

