Constructing Distributed Semantic Lexical Representations using a Machine Readable Dictionary

Lexical Knowledge Acquisition

10.1093/oxfordhb/9780199276349.013.0021 ◽

2012 ◽

Author(s):

Yuji Matsumoto

Keyword(s):

Natural Language Processing ◽

Mutual Information ◽

Natural Language ◽

Language Processing ◽

Lexical Knowledge ◽

Lexical Representations ◽

Domain Specific ◽

Text Corpora ◽

Corpus Data ◽

Machine Readable

This article deals with the acquisition of lexical knowledge, which is instrumental in supporting the inherently ambiguous process of natural language processing (NLP). Lexical representations are imprecise in nature and are mostly simple and superficial; the thesaurus is an apt example. The two primary resources for acquiring lexical knowledge are corpora and machine-readable dictionaries (MRDs). Corpora are mostly domain-specific and monolingual, while definitions in an MRD are generally given as a ‘genus term’ followed by a set of differentiae. The article also covers auxiliary aspects of the acquisition process, such as ‘lexical collocation’ and ‘association’, which refer to the co-occurrence of words that together form a new meaning, one that is lost whenever a synonym replaces either word. The first seminal work on collocation extraction from large text corpora appeared in the early 1990s and used inter-word mutual information to locate collocations. Abundant corpus data is obtainable from the Linguistic Data Consortium (LDC).
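As a rough illustration of the inter-word mutual information mentioned in the abstract, the sketch below scores adjacent word pairs by pointwise mutual information, PMI(x, y) = log2(P(x, y) / (P(x) P(y))). It is a minimal sketch and not the method of the cited early-1990s work: the function name `pmi_scores`, the `min_count` threshold, the restriction to adjacent pairs, and the toy sentence are all assumptions made for the example.

```python
import math
from collections import Counter


def pmi_scores(tokens, min_count=2):
    """Return {(w1, w2): PMI} for adjacent word pairs seen at least min_count times.

    Hypothetical helper for illustration; probabilities are plain relative
    frequencies with no smoothing.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)

    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            # Rare pairs get unreliable, inflated PMI values, so skip them.
            continue
        p_xy = count / n_bi
        p_x = unigrams[w1] / n_uni
        p_y = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))
    return scores


# Toy corpus (assumed data): "strong tea" recurs, so it passes the threshold.
tokens = "strong tea and strong tea but powerful computer and strong tea".split()
scores = pmi_scores(tokens)
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 3))
```

A positive score means the pair co-occurs more often than independence would predict. On a real corpus the frequency threshold matters, because PMI tends to overrate pairs of rare words.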


Changing Dynamics in the Mental Lexicon: New Lexical Representations Strengthen Over Time

PsycEXTRA Dataset ◽

10.1037/e527312012-502 ◽

2008 ◽

Author(s):

Jakke Tamminen ◽

Gareth Gaskell

Keyword(s):

Mental Lexicon ◽

Lexical Representations ◽

Over Time


Machine-readable codes

Production Engineer ◽

10.1049/tpe.1986.0006 ◽

1986 ◽

Vol 65 (1) ◽

pp. 9

Author(s):

C.W. Painter

Keyword(s):

Machine Readable


The Use of Medical Record Linkage for Population and Genetic Studies

Methods of Information in Medicine ◽

10.1055/s-0038-1635962 ◽

1969 ◽

Vol 08 (01) ◽

pp. 07-11 ◽

Cited By ~ 9

Author(s):

H. B. Newcombe

Keyword(s):

Record Linkage ◽

Large Scale ◽

Medical Record Linkage ◽

Canadian Province ◽

Genetic Studies ◽

Parental Characteristics ◽

Family Histories ◽

The Family ◽

Large Populations ◽

Machine Readable

Methods are described for deriving personal and family histories of birth, marriage, procreation, ill health and death, for large populations, from existing civil registrations of vital events and the routine records of ill health. Computers have been used to group together and »link« the separately derived records pertaining to successive events in the lives of the same individuals and families, rapidly and on a large scale. Most of the records employed are already available as machine readable punchcards and magnetic tapes, for statistical and administrative purposes, and only minor modifications have been made to the manner in which these are produced. As applied to the population of the Canadian province of British Columbia (currently about 2 million people) these methods have already yielded substantial information on the risks of disease: a) in the population, b) in relation to various parental characteristics, and c) as correlated with previous occurrences in the family histories.
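The idea of grouping separately derived records that belong to the same person can be sketched as below. This is only a minimal exact-match illustration under stated assumptions, not the procedure used in the study, which the abstract does not detail: the function `link_records`, the field names (`source`, `surname`, `given`, `birth_year`), and the sample records are all hypothetical.

```python
from collections import defaultdict


def link_records(records):
    """Group records that share a normalized (surname, given name, birth year) key.

    Hypothetical helper: exact matching on a normalized key, with no handling
    of spelling variants or missing fields.
    """
    groups = defaultdict(list)
    for rec in records:
        key = (
            rec["surname"].strip().lower(),
            rec["given"].strip().lower(),
            rec["birth_year"],
        )
        groups[key].append(rec)
    return dict(groups)


# Assumed sample data: two event records for one person, one for another.
records = [
    {"source": "birth", "surname": "Smith", "given": "Ann", "birth_year": 1950},
    {"source": "marriage", "surname": "SMITH ", "given": "ann", "birth_year": 1950},
    {"source": "death", "surname": "Jones", "given": "Tom", "birth_year": 1931},
]
linked = link_records(records)
for key, recs in linked.items():
    print(key, [r["source"] for r in recs])
```

Exact keys like this break down on misspellings and name changes, which is why practical linkage systems usually add approximate or weighted matching on top of this basic grouping step.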


A Structured Method for the Recognition of Complex Historical Tables

History and Computing ◽

10.3366/hac.1997.9.1-3.58 ◽

1997 ◽

Vol 9 (1-3) ◽

pp. 58-77

Author(s):

Vitaly Kliatskine ◽

Eugene Shchepin ◽

Gunnar Thorvaldsen ◽

Konstantin Zingerman ◽

Valery Lazarev

Keyword(s):

Nineteenth Century ◽

Character Recognition ◽

Optical Character Recognition ◽

Complex Structure ◽

Source Material ◽

Historical Sources ◽

Tax Assessment ◽

Optical Character ◽

Algorithmic Model ◽

Machine Readable

In principle, printed source material should be made machine-readable with systems for Optical Character Recognition, rather than being typed once more. Off-the-shelf commercial OCR programs tend, however, to be inadequate for lists with a complex layout. The tax assessment lists that assess most nineteenth-century farms in Norway constitute one example among a series of valuable sources which can only be interpreted successfully with specially designed OCR software. This paper considers the problems involved in the recognition of material with a complex table structure, outlining a new algorithmic model based on ‘linked hierarchies’. Within the scope of this model, a variety of tables and layouts can be described and recognized. The ‘linked hierarchies’ model has been implemented in the ‘CRIPT’ OCR software system, which successfully reads tables with a complex structure from several different historical sources.


More Alike Than Unalike: Cross-language Similarity Influences Novel Word Lexicalization in Bilingual Individuals

10.31234/osf.io/phcqj ◽

2020 ◽

Author(s):

Pauline Palma ◽

Marie-France Marin ◽

K. Onishi ◽

Debra Titone

Keyword(s):

Lexical Decision ◽

Lexical Decision Task ◽

Decision Task ◽

The Novel ◽

Lexical Representations ◽

Before And After ◽

Word Forms ◽

Explicit Recognition ◽

Novel Word Learning ◽

Cross Language

Although several studies have focused on novel word learning and consolidation in native (presumably monolingual) speakers, less is known about how bilinguals add novel words to their mental lexicon. Here, we trained 33 English-French bilinguals on novel word-forms that were neighbors to “hermit” English words (i.e., words with no existing neighbors). Importantly, these English words varied in terms of orthographic overlap with their French translation equivalent (i.e., cognates vs. noncognates). We measured explicit recognition of the novel neighbors and the interaction between novel neighbors and English words through a lexical decision task, both before and after a sleep interval. In the lexical decision task, we found evidence of immediate facilitation for English words with novel neighbors, and evidence of competition after a sleep interval for cognate words only. These results suggest that higher quality of existing lexical representations predicts an earlier onset for novel word lexicalization.


From the history of the development of electronic dictionaries: foreign and domestic experience

Humanitarian studios: pedagogics, psychology, philosophy ◽

10.31548/hspedagog2020.01.006 ◽

2020 ◽

Vol 11 (1) ◽

pp. 6-12

Author(s):

O. Y. Balalaieva

Keyword(s):

Human Activity ◽

Current Level ◽

Ease Of Use ◽

Advanced Search ◽

Wide Range ◽

Promising Area ◽

History Of ◽

Machine Readable ◽

Everyday Communication ◽

Educational Professional

The purpose of the article is to study the dynamics of electronic dictionary development abroad and in Ukraine using methods of analysis of scientific sources, comparison, generalization and systematization. Electronic dictionaries are found to be a relatively new phenomenon in the lexicographic market, having evolved over the decades from machine-readable dictionaries, which were exact copies of paper editions, into complex digital lexicographic systems with a powerful arsenal of functions. The stages of development of autonomous and online dictionaries are described. Owing to their advanced search capabilities, speed, simplicity, ease of use, accessibility and compactness, electronic dictionaries have gained popularity among a wide range of users. Today they are used in many spheres of human activity: scientific, educational, professional and everyday communication. However, analysis of the current level of development of Ukrainian electronic resources indicates a shortage of electronic dictionaries of both common and terminological vocabulary. This shortage is due to a number of objective problems, both practical and theoretical, which is why domestic computer lexicography is a promising area for further research.
