A longitudinal neuroimaging dataset on language processing in children ages 5, 7, and 9 years old

2022
Vol 9 (1)
Author(s):  
Jin Wang ◽  
Marisa N. Lytle ◽  
Yael Weiss ◽  
Brianna L. Yamasaki ◽  
James R. Booth

Abstract: This dataset examines language development with a longitudinal design and includes diffusion- and T1-weighted structural magnetic resonance imaging (MRI), task-based functional MRI (fMRI), and a battery of psycho-educational assessments and parental questionnaires. We collected data from 5.5–6.5-year-old children (ses-5) and followed them up at 7–8 years old (ses-7) and again at 8.5–10 years old (ses-9). To increase the sample size at the older time points, another cohort of 7–8-year-old children (ses-7) was recruited and followed up at 8.5–10 years old (ses-9). In total, 322 children who completed at least one structural and functional scan were included. Children performed four fMRI tasks: two word-level tasks examining phonological and semantic processing and two sentence-level tasks investigating semantic and syntactic processing. Because this longitudinal design combines multiple imaging modalities and tasks, the MRI data are valuable for examining changes over time in interactive specialization. In addition, the extensive psycho-educational assessments and questionnaires provide opportunities to explore brain-behavior and brain-environment associations.
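The ses-5/ses-7/ses-9 labels suggest a BIDS-style directory layout. A minimal Python sketch of how one might tally longitudinal coverage under that assumption (the dataset root path is hypothetical):

    from pathlib import Path
    from collections import defaultdict

    def sessions_per_subject(bids_root):
        """Map each sub-* folder to the ses-* sessions it contains."""
        sessions = defaultdict(list)
        for sub in sorted(Path(bids_root).glob("sub-*")):
            for ses in sorted(sub.glob("ses-*")):
                sessions[sub.name].append(ses.name)
        return sessions

    # Hypothetical root; counts children who contributed all three waves.
    waves = {"ses-5", "ses-7", "ses-9"}
    per_sub = sessions_per_subject("/data/language-dev")
    complete = [s for s, ses in per_sub.items() if waves <= set(ses)]
    print(f"{len(complete)} subjects with all three sessions")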


Electronics ◽  
2021
Vol 10 (21)
pp. 2671
Author(s):  
Yu Zhang ◽  
Junan Yang ◽  
Xiaoshuai Li ◽  
Hui Liu ◽  
Kun Shao

Recent studies have shown that natural language processing (NLP) models are vulnerable to adversarial examples: maliciously designed inputs created by adding small, human-imperceptible perturbations to benign inputs, which lead the target model to false predictions. Compared to character- and sentence-level textual adversarial attacks, word-level attacks can generate higher-quality adversarial examples, especially in a black-box setting. However, existing attack methods usually require a huge number of queries to successfully deceive the target model, which is costly in a real adversarial scenario and makes such attacks difficult to mount in practice. We therefore propose a novel attack method whose main idea is to fully utilize the adversarial examples generated by a local model, transferring part of the attack to the local model so that it is completed ahead of time and the cost of attacking the target model is reduced. Extensive experiments on three public benchmarks show that our attack method not only improves the success rate but also reduces the cost, outperforming the baselines by a significant margin.
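The query-saving idea lends itself to a short sketch: finish as much of the word-level substitution search as possible against a free local surrogate, and spend target-model queries only on candidates that already fool the surrogate. This illustrates the general strategy, not the authors' exact algorithm; local_predict, target_predict, and get_synonyms are hypothetical stand-ins.

    def word_level_attack(tokens, label, local_predict, target_predict, get_synonyms):
        """Greedy word substitution; returns (adversarial tokens, target queries)."""
        queries = 0
        for i, word in enumerate(tokens):
            for candidate in get_synonyms(word):
                trial = tokens[:i] + [candidate] + tokens[i + 1:]
                if local_predict(trial) == label:
                    continue          # free check: surrogate still correct, skip
                queries += 1          # only now pay for a black-box query
                if target_predict(trial) != label:
                    return trial, queries
        return None, queries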


2020
Author(s):  
Amelia Burroughs ◽  
Nina Kazanina ◽  
Conor Houghton

Abstract: The interlocking roles of lexical, syntactic, and semantic processing in language comprehension have been the subject of longstanding debate. Recently, the cortical response to a frequency-tagged linguistic stimulus has been shown to track the rate of phrase and sentence, as well as syllable, presentation. This could be interpreted as evidence for the hierarchical processing of speech, or as a response to the repetition of grammatical category. To examine the extent to which hierarchical structure plays a role in language processing, we recorded EEG from human participants as they listened to isochronous streams of monosyllabic words. Comparing responses to sequences in which grammatical category strictly alternates and is chosen so that two-word phrases can be grammatically constructed ("cold food loud room") with sequences in which phrase structure is absent ("rough give ill tell") showed that cortical entrainment at the two-word phrase rate was present only in the grammatical condition. Thus, grammatical category repetition alone does not yield entrainment at a level higher than the word. On the other hand, cortical entrainment was reduced for the mixed-phrase condition, which contained two-word phrases but no grammatical category repetition ("that word send less"); this is not what would be expected if the measured entrainment reflected purely abstract hierarchical syntactic units. Our results support a model in which word-level grammatical category information is required to build larger units.
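A minimal NumPy sketch of the frequency-tagging measurement: evoked power at the word rate versus the two-word-phrase rate (half the word rate). The 4 Hz word rate, 500 Hz sampling rate, and epoch length are illustrative assumptions, not the study's parameters.

    import numpy as np

    def power_at(epoch, fs, freq):
        """Spectral power of a 1-D EEG epoch at a tagged frequency (Hz)."""
        spectrum = np.abs(np.fft.rfft(epoch)) ** 2
        freqs = np.fft.rfftfreq(epoch.size, d=1.0 / fs)
        return spectrum[np.argmin(np.abs(freqs - freq))]

    fs, word_rate = 500.0, 4.0               # assumed sampling and word rates
    epoch = np.random.randn(int(10 * fs))    # stand-in for a 10 s EEG epoch
    ratio = power_at(epoch, fs, word_rate / 2) / power_at(epoch, fs, word_rate)
    print(ratio)   # entrainment at the phrase rate relative to the word rate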


2021
Vol 11 (21)
pp. 9938
Author(s):  
Kun Shao ◽  
Yu Zhang ◽  
Junan Yang ◽  
Hui Liu

Deep learning models are vulnerable to backdoor attacks: in existing research, the success rate of textual backdoor attacks based on data poisoning is as high as 100%. To strengthen natural language processing models' defenses against backdoor attacks, we propose a textual backdoor defense method via poisoned-sample recognition. Our method consists of two parts. The first step adds a controlled noise layer after the model's embedding layer and trains a preliminary model in which the backdoor is embedded incompletely or not at all, which reduces the effectiveness of poisoned samples; we then use this model to make an initial identification of poisoned samples in the training set, narrowing the search range. The second step uses all the training data to train an infected model in which the backdoor is embedded, and this model reclassifies the samples selected in the first step to finally identify the poisoned samples. Through detailed experiments, we show that our defense method can effectively defend against a variety of backdoor attacks (character-level, word-level, and sentence-level) and performs better than the baseline method. For a BERT model trained on the IMDB dataset, this method can even reduce the success rate of word-level backdoor attacks to 0%.
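A sketch of the first step's noise layer, assuming a PyTorch model; the noise form and scale follow the abstract's description rather than released code. Training a preliminary model through this layer weakens whatever backdoor the poisoned samples try to implant, so they stand out when that model pre-screens the training set.

    import torch
    import torch.nn as nn

    class NoisyEmbedding(nn.Module):
        """Embedding layer followed by a controlled noise layer."""
        def __init__(self, vocab_size, dim, noise_std=0.1):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, dim)
            self.noise_std = noise_std   # assumed scale; a tuning knob in practice

        def forward(self, token_ids):
            x = self.embed(token_ids)
            if self.training:            # perturb only while training the
                x = x + torch.randn_like(x) * self.noise_std  # preliminary model
            return x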


2012
Vol 2 (4)
pp. 31-44
Author(s):  
Mohamed H. Haggag ◽  
Bassma M. Othman

Context processing plays an important role in different Natural Language Processing applications, and sentence ordering is one of the critical tasks in text generation. The order in which sentences appear in the raw source texts need not be preserved in the generated text, so chronological sentence ordering is of high importance in this regard. Some researchers have followed linguistic syntactic analysis and others have used statistical approaches. This paper proposes a new model for sentence ordering based on semantic analysis, in which word-level semantics seeds sentence-level semantic relations. The model introduces a clustering technique based on the relatedness of sentence senses. Sentences are then ordered chronologically through two main steps: overlap detection and chronological cause-effect rules. Overlap detection drills down into each cluster to step through its sentences in chronological sequence, while cause-effect rules form the linguistic knowledge controlling relations between sentences. Evaluation showed that the proposed model can process texts of any size, is not domain specific, and allows the cause-effect rules to be extended for specific ordering needs.
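The clustering stage might look like the following sketch, where relatedness is a hypothetical stand-in for the paper's sense-relatedness score (e.g., a WordNet-based similarity) and the 0.5 threshold is arbitrary:

    def cluster_by_relatedness(sentences, relatedness, threshold=0.5):
        """Greedy clustering: a sentence joins the first cluster whose members
        it relates to strongly enough, otherwise it starts a new cluster."""
        clusters = []
        for sent in sentences:
            for cluster in clusters:
                if all(relatedness(sent, other) >= threshold for other in cluster):
                    cluster.append(sent)
                    break
            else:
                clusters.append([sent])
        return clusters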


2020
Vol 2020
pp. 1-10
Author(s):  
Hanqian Wu ◽  
Mumu Liu ◽  
Shangbin Zhang ◽  
Zhike Wang ◽  
Siliang Cheng

Online product reviews are proliferating on e-commerce platforms, and mining the aspect-level product information contained in those reviews has great economic benefit. Aspect category classification is a basic task in aspect-level sentiment analysis, which has become a hot research topic in the natural language processing (NLP) field over the last decades. On various e-commerce platforms, user-generated question-answering (QA) reviews have emerged that generally contain much aspect-related product information. Although some researchers have devoted their efforts to aspect category classification for traditional product reviews, existing deep learning-based approaches cannot represent QA-style reviews well. Thus, we propose a four-dimension (4D) textual representation model that represents text at four levels: word level, sentence level, QA interaction level, and hyperinteraction level. In our experiments, empirical studies on datasets from three domains demonstrate that our proposals perform better than traditional sentence-level representation approaches, especially in the Digit domain.
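The four levels can be pictured as successive aggregations. The NumPy sketch below uses plain mean pooling and concatenation purely to illustrate the word to sentence to QA-interaction to hyperinteraction hierarchy; the actual model learns each level rather than pooling.

    import numpy as np

    def sentence_level(word_vecs):                # word level -> sentence level
        return np.mean(word_vecs, axis=0)

    def qa_interaction_level(question, answer):   # sentences -> QA interaction
        return np.concatenate([question, answer, question * answer])

    def hyperinteraction_level(interactions):     # interactions -> review level
        return np.mean(interactions, axis=0)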


2021
Vol 2021
pp. 1-10
Author(s):  
Zhen Liu ◽  
XiaoQiang Di ◽  
Wei Song ◽  
WeiWu Ren

Relation classification is an important semantic processing task in the field of natural language processing (NLP). Training data are generally built with distant supervision, which automatically generates large-scale labeled data but inevitably introduces label noise. A further challenge is that the important information can appear anywhere in the sentence. This paper presents a sentence-level joint relation classification model with two modules: a reinforcement learning (RL) agent and a joint network model. In particular, we combine a bidirectional long short-term memory (Bi-LSTM) network with an attention mechanism as the joint model that processes the textual features of sentences and classifies the relation between two entities, the attention mechanism serving to discover hidden information anywhere in the sentence. Jointly training the two modules addresses the noise problem in relation extraction, sentence-level information extraction, and relation classification. Experimental results demonstrate that the model can effectively deal with data noise and achieves better relation classification performance at the sentence level.
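A hedged sketch of the joint model's encoder, a Bi-LSTM with attention over tokens (PyTorch; layer sizes are placeholders, and the RL agent that handles noisy labels is omitted):

    import torch
    import torch.nn as nn

    class BiLSTMAttnClassifier(nn.Module):
        def __init__(self, vocab_size, embed_dim, hidden, n_relations):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, embed_dim)
            self.lstm = nn.LSTM(embed_dim, hidden,
                                bidirectional=True, batch_first=True)
            self.attn = nn.Linear(2 * hidden, 1)
            self.out = nn.Linear(2 * hidden, n_relations)

        def forward(self, token_ids):
            h, _ = self.lstm(self.embed(token_ids))       # (batch, time, 2*hidden)
            weights = torch.softmax(self.attn(h), dim=1)  # attention over tokens
            context = (weights * h).sum(dim=1)            # weighted sentence vector
            return self.out(context)                      # relation logits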


2021
Vol 11 (3)
pp. 359
Author(s):  
Katharina Hogrefe ◽  
Georg Goldenberg ◽  
Ralf Glindemann ◽  
Madleen Klonowski ◽  
Wolfram Ziegler

Assessment of semantic processing capacities often relies on verbal tasks which are, however, sensitive to impairments at several language processing levels. Especially for persons with aphasia, there is a strong need for a tool that measures semantic processing skills independent of verbal abilities. Furthermore, in order to assess a patient's potential for using alternative means of communication in cases of severe aphasia, semantic processing should be assessed in different nonverbal conditions. The Nonverbal Semantics Test (NVST) is a tool that captures semantic processing capacities through three tasks: Semantic Sorting, Drawing, and Pantomime. The main aim of the current study was to investigate the relationship between the NVST and measures of standard neurolinguistic assessment. Fifty-one persons with aphasia caused by left hemisphere brain damage were administered the NVST as well as the Aachen Aphasia Test (AAT). A principal component analysis (PCA) was conducted across all AAT and NVST subtests. The analysis resulted in a two-factor model that captured 69% of the variance of the original data, with all linguistic tasks loading high on one factor and the NVST subtests loading high on the other. These findings suggest that nonverbal tasks assessing semantic processing capacities should be administered alongside standard neurolinguistic aphasia tests.
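The reported analysis maps onto a few lines of scikit-learn. A sketch, with random numbers standing in for the 51-patient matrix of AAT and NVST subtest scores (the subtest count and any factor rotation are assumptions):

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.preprocessing import StandardScaler

    scores = np.random.randn(51, 9)       # stand-in: 51 patients x 9 subtests
    pca = PCA(n_components=2)
    pca.fit(StandardScaler().fit_transform(scores))

    print(pca.explained_variance_ratio_.sum())  # ~0.69 reported in the study
    print(pca.components_)  # loadings: verbal vs. nonverbal subtests per factor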


Author(s):  
Dang Van Thin ◽  
Ngan Luu-Thuy Nguyen ◽  
Tri Minh Truong ◽  
Lac Si Le ◽  
Duy Tin Vo

Aspect-based sentiment analysis has been studied in both the research and industrial communities over recent years. For low-resource languages, standard benchmark corpora play an important role in the development of methods. In this article, we introduce the two largest sentence-level benchmark corpora for two tasks in Vietnamese: Aspect Category Detection and Aspect Polarity Classification. Our corpora are annotated with high inter-annotator agreement for the restaurant and hotel domains, and their release should push forward the low-resource language processing community. In addition, we deploy and compare the effectiveness of supervised learning methods using single- and multi-task approaches based on deep learning architectures. Experimental results on our corpora show that the multi-task approach based on the BERT architecture outperforms the neural network architectures and the single-task approach. Our corpora and source code are published on the footnoted site.
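The multi-task setup can be sketched as one shared encoder with two heads, one per task. This assumes the Hugging Face transformers API; the checkpoint name and head shapes are placeholders, not the authors' released configuration.

    import torch.nn as nn
    from transformers import AutoModel

    class MultiTaskABSA(nn.Module):
        def __init__(self, n_categories, n_polarities,
                     name="bert-base-multilingual-cased"):  # placeholder checkpoint
            super().__init__()
            self.encoder = AutoModel.from_pretrained(name)
            dim = self.encoder.config.hidden_size
            self.acd_head = nn.Linear(dim, n_categories)  # Aspect Category Detection
            self.apc_head = nn.Linear(dim, n_polarities)  # Aspect Polarity Classification

        def forward(self, input_ids, attention_mask):
            cls = self.encoder(input_ids,
                               attention_mask=attention_mask).last_hidden_state[:, 0]
            return self.acd_head(cls), self.apc_head(cls)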

