Bilingual Mental Lexicon and Collocational Processing

2022 ◽  
pp. 706-722
Author(s):  
Hakan Cangır

The chapter starts with a definition of the mental dictionary and an overview of its models. It then builds on bilingual lexical activation models and goes on to discuss formulaic language, collocations in particular. After explaining the basics of formulaic language processing, the author addresses Hoey's theory of lexical and collocational priming, which has its roots in cognitive linguistics and usage-based language models. Finally, some suggestions for future research are offered to address the needs of the lexical research literature in the Turkish setting.


2021 ◽  
Author(s):  
Anna Siyanova ◽  
R Martinez

© 2014 Oxford University Press. John Sinclair's Idiom Principle famously posited that most texts are largely composed of multi-word expressions that 'constitute single choices' in the mental lexicon. At the time that assertion was made, little actual psycholinguistic evidence existed in support of that holistic, 'single choice', view of formulaic language. In the intervening years, a number of studies have shown that multi-word expressions are indeed processed differently from novel phrases. This processing advantage, however, does not necessarily support the holistic view of formulaic language. The present review brings together studies on the processing of multi-word expressions in a first and second language that have used a range of psycholinguistic techniques, and explains why such research is important. Practical implications and pathways for future research are discussed.


Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Maha Al-Yahya ◽  
Hend Al-Khalifa ◽  
Heyam Al-Baity ◽  
Duaa AlSaeed ◽  
Amr Essam

Fake news detection (FND) involves predicting the likelihood that a particular news article (news report, editorial, exposé, etc.) is intentionally deceptive. Arabic FND has received more attention over the last decade, and many detection approaches have demonstrated some ability to detect fake news on multiple datasets. However, most existing approaches do not consider recent advances in natural language processing, namely the use of neural networks and transformers. This paper presents a comprehensive comparative study of neural network and transformer-based language models for Arabic FND, comparing their performance against each other. We also conduct an extensive analysis of the possible reasons for the difference in performance results obtained by different approaches. The results demonstrate that transformer-based models outperform the neural network-based solutions: the F1 score increased from 0.83 (best neural network-based model, GRU) to 0.95 (best transformer-based model, QARiB), and accuracy improved by 16% compared to the best neural network-based solution. Finally, we highlight the main gaps in Arabic FND research and suggest future research directions.
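
The F1 score reported above is the harmonic mean of precision and recall. The following is a minimal sketch of that formula only; the function name and the counts are hypothetical and chosen for illustration, and nothing here reproduces the paper's models or data.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall for the positive ("fake") class.

    tp, fp, fn are counts of true positives, false positives and
    false negatives. Returns 0.0 when there are no true positives.
    """
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only (not taken from the paper).
print(round(f1_score(tp=90, fp=10, fn=10), 2))
```

Because F1 combines precision and recall, it can differ noticeably from plain accuracy when the fake and genuine classes are imbalanced, which may be one reason abstracts such as this one report both figures.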


Electronics ◽  
2021 ◽  
Vol 10 (21) ◽  
pp. 2656
Author(s):  
Ayato Kuwana ◽  
Atsushi Oba ◽  
Ranto Sawai ◽  
Incheon Paik

In recent years, automatic ontology generation has received significant attention in information science as a means of systemizing vast amounts of online data. As our initial attempt at ontology generation with a neural network, we proposed a recurrent neural network-based method. However, developments in natural language processing (NLP) now make it possible to update that architecture. In particular, transfer learning with language models trained on a large, unlabeled corpus has yielded a breakthrough in NLP. Inspired by these achievements, we propose a novel workflow for ontology generation comprising two-stage learning. Our results showed that our best method improved accuracy by over 12.5%. As an application example, we applied our model to the Stanford Question Answering Dataset to show ontology generation in a real field. The results showed that our model can generate a good ontology, with some exceptions in the real field, indicating future research directions to improve the quality.


2020 ◽  
Author(s):  
Jonathan P. Scaccia ◽  
Victoria C. Scott

Abstract Introduction: Moving evidence-based practices into the hands of practitioners requires the synthesis and translation of research literature. However, the growing pace of scientific publications across disciplines makes it increasingly difficult to stay abreast of research literature. Natural Language Processing (NLP) methods are emerging as a valuable strategy for conducting content analyses of academic literature. We sought to apply NLP to identify publication trends in the field's flagship journal, Implementation Science, including key topic clusters and the distribution of topics over time. A parallel study objective was to demonstrate how NLP can be used in research synthesis. Methods: We examined 1711 Implementation Science abstracts published from February 22, 2006 to October 1, 2020. We retrieved the study data using PubMed’s Application Programming Interface (API) to assemble a database. Following standard preprocessing steps, we used topic modeling with latent Dirichlet allocation (LDA) to cluster the abstracts following a minimization algorithm. Results: We examined 30 topics and computed topic model statistics of quality. Analyses revealed that published articles largely reflect i) characteristics of research, or ii) domains of practice. Emergent topic clusters encompassed key terms both salient and common to implementation science. HIV and stroke(s) represent the most commonly published clinical areas. Systematic reviews have grown in topic prominence and coherence, whereas articles pertaining to knowledge translation (KT) have dropped in prominence since 2013. Articles on HIV and implementation effectiveness have increased in topic exclusivity over time. Discussion: We demonstrated how NLP can be used as a synthesis and translation method to identify trends and topics across a large number of (over 1700) articles. With applicability to a variety of research domains, NLP is a promising approach to accelerate the dissemination and uptake of research literature. For future research in implementation science, we encourage the inclusion of more equity-focused studies to expand the impact of implementation science on disadvantaged communities.


2021 ◽  
Vol 16 (1) ◽  
Author(s):  
Jonathan P. Scaccia ◽  
Victoria C. Scott

Abstract Introduction: Moving evidence-based practices into the hands of practitioners requires the synthesis and translation of research literature. However, the growing pace of scientific publications across disciplines makes it increasingly difficult to stay abreast of research literature. Natural language processing (NLP) methods are emerging as a valuable strategy for conducting content analyses of academic literature. We sought to apply NLP to identify publication trends in the journal Implementation Science, including key topic clusters and the distribution of topics over time. A parallel study objective was to demonstrate how NLP can be used in research synthesis. Methods: We examined 1711 Implementation Science abstracts published from February 22, 2006, to October 1, 2020. We retrieved the study data using PubMed’s Application Programming Interface (API) to assemble a database. Following standard preprocessing steps, we used topic modeling with latent Dirichlet allocation (LDA) to cluster the abstracts following a minimization algorithm. Results: We examined 30 topics and computed topic model statistics of quality. Analyses revealed that published articles largely reflect (i) characteristics of research, or (ii) domains of practice. Emergent topic clusters encompassed key terms both salient and common to implementation science. HIV and stroke represent the most commonly published clinical areas. Systematic reviews have grown in topic prominence and coherence, whereas articles pertaining to knowledge translation (KT) have dropped in prominence since 2013. Articles on HIV and implementation effectiveness have increased in topic exclusivity over time. Discussion: We demonstrated how NLP can be used as a synthesis and translation method to identify trends and topics across a large number of (over 1700) articles. With applicability to a variety of research domains, NLP is a promising approach to accelerate the dissemination and uptake of research literature. For future research in implementation science, we encourage the inclusion of more equity-focused studies to expand the impact of implementation science on disadvantaged communities.


2016 ◽  
Vol 11 (1) ◽  
pp. 34
Author(s):  
Maral Babapour Chafi

Designers engage in various activities, dealing with different materials and media to externalise and represent their form ideas. This paper presents a review of the design research literature on externalisation activities in the design process: sketching, building physical models and digital modelling. The aim is to review research on the roles of media and representations in design processes, and to highlight knowledge gaps and questions for future research.


2021 ◽  
Vol 11 (1) ◽  
pp. 428
Author(s):  
Donghoon Oh ◽  
Jeong-Sik Park ◽  
Ji-Hwan Kim ◽  
Gil-Jin Jang

Speech recognition consists of converting input sound into a sequence of phonemes, then finding text for the input using language models. Therefore, phoneme classification performance is a critical factor for the successful implementation of a speech recognition system. However, correctly distinguishing phonemes with similar characteristics is still a challenging problem even for state-of-the-art classification methods, and classification errors are hard to recover from in the subsequent language processing steps. This paper proposes a hierarchical phoneme clustering method that applies more suitable recognition models to different phonemes. The phonemes of the TIMIT database are carefully analyzed using a confusion matrix from a baseline speech recognition model. Using automatic phoneme clustering results, a set of phoneme classification models optimized for the generated phoneme groups is constructed and integrated into a hierarchical phoneme classification method. According to the results of a number of phoneme classification experiments, the proposed hierarchical phoneme group models improved performance over the baseline by 3%, 2.1%, 6.0%, and 2.2% for fricative, affricate, stop, and nasal sounds, respectively. The average accuracy was 69.5% and 71.7% for the baseline and proposed hierarchical models, respectively, showing a 2.2% overall improvement.
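
The idea of deriving phoneme groups from a baseline confusion matrix can be sketched as follows. This is an illustrative assumption, not the authors' algorithm: it merges labels whose averaged mutual confusion rate reaches a threshold (single-linkage grouping via union-find). The function name, the threshold, and the toy matrix are all hypothetical.

```python
from itertools import combinations


def cluster_by_confusion(labels, confusion, threshold):
    """Group labels whose symmetric confusion rate is at least `threshold`.

    confusion[i][j] is the number of times true label i was predicted
    as label j. Every row is assumed to have a non-zero total.
    """
    n = len(labels)
    row_totals = [sum(row) for row in confusion]

    def similarity(i, j):
        # Average of the two directional confusion rates.
        return 0.5 * (confusion[i][j] / row_totals[i]
                      + confusion[j][i] / row_totals[j])

    # Union-find over label indices (single-linkage merging).
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if similarity(i, j) >= threshold:
            parent[find(j)] = find(i)

    groups = {}
    for idx in range(n):
        groups.setdefault(find(idx), []).append(labels[idx])
    return sorted(groups.values())


# Toy confusion matrix (hypothetical counts, not TIMIT results).
labels = ["s", "z", "m", "n"]
confusion = [
    [80, 18, 1, 1],   # true "s"
    [20, 78, 1, 1],   # true "z"
    [1, 1, 70, 28],   # true "m"
    [2, 0, 30, 68],   # true "n"
]
print(cluster_by_confusion(labels, confusion, threshold=0.1))
```

In a hierarchical setup of the kind the abstract describes, each resulting group could then be handled by a classifier specialized for that group; how the groups and models are actually built in the paper is not specified here.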


2021 ◽  
Vol 11 (2) ◽  
pp. 58
Author(s):  
Lars Fuglsang ◽  
Anne Vorre Hansen ◽  
Ines Mergel ◽  
Maria Taivalsaari Røhnebæk

The public administration literature and adjacent fields have devoted increasing attention to living labs as environments and structures enabling the co-creation of public sector innovation. However, living labs remain a somewhat elusive concept and phenomenon, and their versatile nature is poorly understood. To gain a deeper understanding of the multiple dimensions of living labs, this article provides a review assessing how the environments, methods and outcomes of living labs are addressed in the extant research literature. The findings are drawn together in a model synthesizing how living labs link to public sector innovation, followed by an outline of knowledge gaps and future research avenues.

