A Rational Model of Incremental Argument Interpretation: The Comprehension of Swedish Transitive Clauses

2021 ◽  
Vol 12 ◽  
Author(s):  
Thomas Hörberg ◽  
T. Florian Jaeger

A central component of sentence understanding is verb-argument interpretation, determining how the referents in the sentence are related to the events or states expressed by the verb. Previous work has found that comprehenders change their argument interpretations incrementally as the sentence unfolds, based on morphosyntactic (e.g., case, agreement), lexico-semantic (e.g., animacy, verb-argument fit), and discourse cues (e.g., givenness). However, it is still unknown whether these cues have a privileged role in language processing, or whether their effects on argument interpretation originate in implicit expectations based on the joint distribution of these cues with argument assignments experienced in previous language input. We compare the former, linguistic account against the latter, expectation-based account, using data from production and comprehension of transitive clauses in Swedish. Based on a large corpus of Swedish, we develop a rational (Bayesian) model of incremental argument interpretation. This model predicts the processing difficulty experienced at different points in the sentence as a function of the Bayesian surprise associated with changes in expectations over possible argument interpretations. We then test the model against reading times from a self-paced reading experiment on Swedish. We find Bayesian surprise to be a significant predictor of reading times, complementing effects of word surprisal. Bayesian surprise also captures the qualitative effects of morpho-syntactic and lexico-semantic cues. Additional model comparisons find that it—with a single degree of freedom—captures much, if not all, of the effects associated with these cues. This suggests that the effects of form- and meaning-based cues to argument interpretation are mediated through expectation-based processing.
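The abstract does not give the formula behind "Bayesian surprise". A common definition, assumed here, is the Kullback-Leibler divergence from the prior to the posterior distribution over interpretations. The sketch below illustrates that quantity only; the two-way interpretation labels and the probabilities are hypothetical and are not taken from the authors' model.

```python
import math

def bayesian_surprise(prior, posterior):
    """KL divergence D(posterior || prior), in bits, over a shared set of interpretations.

    Assumes every interpretation with non-zero posterior probability
    also has non-zero prior probability.
    """
    return sum(
        q * math.log2(q / prior[k])
        for k, q in posterior.items()
        if q > 0
    )

# Hypothetical beliefs about the role of the first noun phrase (NP1),
# before and after a disambiguating cue is read.
prior = {"NP1=subject": 0.9, "NP1=object": 0.1}
posterior = {"NP1=subject": 0.2, "NP1=object": 0.8}

surprise = bayesian_surprise(prior, posterior)
print(f"Bayesian surprise: {surprise:.3f} bits")
```

Under this definition, a cue that leaves expectations unchanged yields a surprise of zero, while a cue that reverses a strongly held interpretation yields a large value.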

Interpreting ◽  
2017 ◽  
Vol 19 (1) ◽  
pp. 1-20 ◽  
Author(s):  
Ena Hodzik ◽  
John N. Williams

We report a study on prediction in shadowing and simultaneous interpreting (SI), both considered as forms of real-time, ‘online’ spoken language processing. The study comprised two experiments, focusing on: (i) shadowing of German head-final sentences by 20 advanced students of German, all native speakers of English; (ii) SI of the same sentences into English head-initial sentences by 22 advanced students of German, again native English speakers, and also by 11 trainee and practising interpreters. Latency times for input and production of the target verbs were measured. Drawing on studies of prediction in English-language reading production, we examined two cues to prediction in both experiments: contextual constraints (semantic cues in the context) and transitional probability (the statistical likelihood of words occurring together in the language concerned). While context affected prediction during both shadowing and SI, transitional probability appeared to favour prediction during shadowing but not during SI. This suggests that the two cues operate on different levels of language processing in SI.
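Transitional probability, described above as the statistical likelihood of words occurring together, is often estimated as the count of a word pair divided by the count of its first word. The sketch below assumes that simple bigram estimate; the function name and the toy sentence are illustrative and do not come from the study.

```python
from collections import Counter

def transitional_probability(tokens, first, second):
    """Estimate P(second | first) as count(first second) / count(first as a bigram start)."""
    bigrams = Counter(zip(tokens, tokens[1:]))
    starts = Counter(tokens[:-1])  # the last token never starts a bigram
    if starts[first] == 0:
        return 0.0
    return bigrams[(first, second)] / starts[first]

# Toy corpus, purely for illustration
tokens = "the dog chased the cat and the dog barked".split()
tp = transitional_probability(tokens, "the", "dog")
print(f"P(dog | the) = {tp:.3f}")
```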


Author(s):  
Christina Blomquist ◽  
Rochelle S. Newman ◽  
Yi Ting Huang ◽  
Jan Edwards

Purpose: Children with cochlear implants (CIs) are more likely to struggle with spoken language than their age-matched peers with normal hearing (NH), and new language processing literature suggests that these challenges may be linked to delays in spoken word recognition. The purpose of this study was to investigate whether children with CIs use language knowledge via semantic prediction to facilitate recognition of upcoming words and help compensate for uncertainties in the acoustic signal. Method: Five- to 10-year-old children with CIs heard sentences with an informative verb (draws) or a neutral verb (gets) preceding a target word (picture). The target referent was presented on a screen, along with a phonologically similar competitor (pickle). Children's eye gaze was recorded to quantify efficiency of access of the target word and suppression of phonological competition. Performance was compared to both an age-matched group and a vocabulary-matched group of children with NH. Results: Children with CIs, like their peers with NH, demonstrated use of informative verbs to look more quickly to the target word and look less to the phonological competitor. However, children with CIs demonstrated less efficient use of semantic cues relative to their peers with NH, even when matched for vocabulary ability. Conclusions: Children with CIs use semantic prediction to facilitate spoken word recognition but do so to a lesser extent than children with NH. Children with CIs experience challenges in predictive spoken language processing above and beyond limitations from delayed vocabulary development. Children with CIs with better vocabulary ability demonstrate more efficient use of lexical-semantic cues. Clinical interventions focusing on building knowledge of words and their associations may support efficiency of spoken language processing for children with CIs. Supplemental Material: https://doi.org/10.23641/asha.14417627


2021 ◽  
Author(s):  
Jiaming Zeng ◽  
Michael F. Gensheimer ◽  
Daniel L. Rubin ◽  
Susan Athey ◽  
Ross D. Shachter

In medicine, randomized clinical trials (RCTs) are the gold standard for informing treatment decisions. Observational comparative effectiveness research (CER) is often plagued by selection bias, and expert-selected covariates may not be sufficient to adjust for confounding. We explore how the unstructured clinical text in electronic medical records (EMRs) can be used to reduce selection bias and improve medical practice. We develop a method based on natural language processing to uncover interpretable potential confounders from the clinical text. We validate our method by comparing the hazard ratio (HR) from survival analysis with and without the confounders against the results from established RCTs. We apply our method to four study cohorts built from localized prostate and lung cancer datasets from the Stanford Cancer Institute Research Database and show that our method adjusts the HR estimate towards the RCT results. We further confirm that the uncovered terms can be interpreted by an oncologist as potential confounders. This research helps enable more credible causal inference using data from EMRs, offers a transparent way to improve the design of observational CER, and could inform high-stakes medical decisions. Our method can also be applied to studies within and beyond medicine to extract important information from observational data to support decisions.


2021 ◽  
Author(s):  
Fabian Braesemann ◽  
Fabian Stephany ◽  
Leonie Neuhäuser ◽  
Niklas Stoehr ◽  
Philipp Darius ◽  
...  

The global spread of Covid-19 has caused major economic disruptions. Governments around the world provide considerable financial support to mitigate the economic downturn. However, effective policy responses require reliable data on the economic consequences of the corona pandemic. We propose the CoRisk-Index: a real-time economic indicator of Covid-19 related risk assessments by industry. Using data mining, we analyse all reports from US companies filed since January 2020, representing more than a third of all US employees. We construct two measures, the number of 'corona' words in each report and the average text negativity of the sentences mentioning corona in each industry, that are aggregated in the CoRisk-Index. The index correlates with U.S. unemployment data and preempts stock market losses of February 2020. Moreover, thanks to topic modelling and natural language processing techniques, the CoRisk data provides unique granularity with regard to the particular contexts of the crisis and the concerns of individual industries about them. The data presented here help researchers and decision makers to measure the previously unobserved risk awareness of industries with regard to Covid-19, bridging the quantification gap between highly volatile stock market dynamics and long-term macro-economic figures. For immediate access to the data, we provide all findings and raw data on an interactive online dashboard in real time.
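The two measures named in this abstract, a count of 'corona' words and the average negativity of sentences that mention corona, can be sketched for a single report as follows. This is a minimal illustration under stated assumptions: the keyword list, the toy negative-word lexicon, the per-sentence negativity score, and the function names are all hypothetical, and the abstract specifies neither the authors' sentiment method nor how the measures are aggregated into the index.

```python
import re

# Assumed keyword list and toy sentiment lexicon (not from the paper)
CORONA_TERMS = {"corona", "coronavirus", "covid", "covid-19"}
NEGATIVE_WORDS = {"loss", "risk", "decline", "disruption", "adverse"}

def tokenize(text):
    """Lowercase the text and split it into word-like tokens (hyphens kept)."""
    return re.findall(r"[a-z0-9\-]+", text.lower())

def corona_measures(report):
    """Return (corona word count, mean negativity of corona-mentioning sentences)."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", report) if s.strip()]
    corona_count = 0
    negativity = []
    for sentence in sentences:
        words = tokenize(sentence)
        hits = sum(w in CORONA_TERMS for w in words)
        corona_count += hits
        if hits:
            # Negativity here is the share of words found in the toy lexicon
            negativity.append(sum(w in NEGATIVE_WORDS for w in words) / len(words))
    mean_negativity = sum(negativity) / len(negativity) if negativity else 0.0
    return corona_count, mean_negativity

report = (
    "Covid-19 poses a risk to supply. Revenue grew last year. "
    "The coronavirus caused adverse disruption."
)
count, neg = corona_measures(report)
print(count, round(neg, 3))
```

An industry-level figure would then require averaging such per-report values across the companies in each industry, a step this sketch leaves out.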


2021 ◽  
Author(s):  
Guangjie Li ◽  
Yi Tang ◽  
Biyi Yi ◽  
Xiang Zhang ◽  
Yan He

Code completion is one of the most useful features provided by advanced IDEs and is widely used by software developers. However, recommending arguments for method calls, one kind of code completion, is less widely used. Most existing argument recommendation approaches provide a long list of syntactically correct candidate arguments, from which it is difficult for software engineers to select the correct one. To this end, we propose a deep learning based approach to recommending arguments instantly when programmers type in the names of the methods they intend to invoke. First, we extract context information from a large corpus of open-source applications. Second, we preprocess the extracted dataset, which involves natural language processing and data embedding. Third, we feed the preprocessed dataset to a specially designed convolutional neural network to rank and recommend actual arguments. With the resulting CNN model trained on sample applications, we can sort the candidate arguments in a reasonable order and recommend the first one as the correct argument. We evaluate the proposed approach on 100 open-source Java applications. Results suggest that the proposed approach outperforms the state-of-the-art approaches in recommending arguments.


Languages ◽  
2019 ◽  
Vol 4 (3) ◽  
pp. 54
Author(s):  
Sokolova ◽  
Slabakova

The article investigates non-native sentence processing and examines the existing scholarly approaches to L2 processing with a population of L3 learners of English, whose native language is Russian. In a self-paced reading experiment, native speakers of Russian and English, as well as (low) intermediate L3 learners of English, read ambiguous relative clauses (RC) and decided on their attachment interpretation: high attachment (HA) or low attachment (LA). In the two-by-two design, linguistic decision-making was prompted by lexical semantic cues vs. a structural change caused by a certain type of matrix verb. The results show that whenever a matrix verb caused a change of syntactic modification, which entailed HA, both native and non-native speakers abandoned the default English-like LA and chose HA. Lexical semantic cues did not have any significant effect in RC attachment resolution. The study provides experimental evidence in favor of the similarity of native and non-native processing strategies. Both native speakers and L3 learners of English apply structural processing strategies and show similar sensitivity to a linguistic prompt that shapes RC resolution. Native and non-native processing is found to be prediction-based; structure building is performed in a top-down manner.


2018 ◽  
Vol 7 (4.5) ◽  
pp. 728
Author(s):  
Rasmita Rautray ◽  
Lopamudra Swain ◽  
Rasmita Dash ◽  
Rajashree Dash

At present, text summarization is a popular and active field of research in both the Information Retrieval (IR) and Natural Language Processing (NLP) communities. Summarization is important for IR because it efficiently identifies useful information by condensing documents drawn from a large corpus of data. This study presents different aspects of text summarization methods, along with their strengths, limitations, and gaps.


2019 ◽  
Vol 10 (1) ◽  
Author(s):  
Daniel Pinto dos Santos ◽  
Sebastian Brodehl ◽  
Bettina Baeßler ◽  
Gordon Arnhold ◽  
Thomas Dratsch ◽  
...  

Background: Training deep learning networks usually requires large amounts of accurately labeled data. These labels are usually extracted from reports using natural language processing or by time-consuming manual review. The aim of this study was therefore to develop and evaluate a workflow for using data from structured reports as labels in a deep learning application. Materials and methods: We included all plain anteroposterior radiographs of the ankle for which structured reports were available. A workflow was designed and implemented in which a script automatically retrieved, converted, and anonymized the respective radiographs of cases where fractures were either present or absent from the institution’s picture archiving and communication system (PACS). These images were then used to retrain a pretrained deep convolutional neural network. Finally, performance was evaluated on a set of previously unseen radiographs. Results: Once implemented and configured, completion of the whole workflow took under 1 h. A total of 157 structured reports were retrieved from the reporting platform. For all structured reports, corresponding radiographs were successfully retrieved from the PACS and fed into the training process. On an unseen validation subset, the model showed a satisfactory performance with an area under the curve of 0.850 (95% CI 0.634–1.000) for detection of fractures. Conclusion: We demonstrate that data obtained from structured reports written in clinical routine can be used to successfully train deep learning algorithms. This highlights the potential role of structured reporting for the future of radiology, especially in the context of deep learning.
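The area under the (ROC) curve reported above can be read as the probability that a randomly chosen fracture case receives a higher model score than a randomly chosen non-fracture case. The sketch below computes it from that pairwise definition; the labels and scores are made up for illustration and are unrelated to the study's data, and the function name is an assumption.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Count the positive/negative pairs in which the positive case is ranked higher
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = fracture, 0 = no fracture) and model scores
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

With a small validation set, few such pairs exist, which is one reason a confidence interval as wide as the one reported can arise.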


2017 ◽  
Vol 28 (4) ◽  
pp. 327-352 ◽  
Author(s):  
Ahmed Abdel-Raheem

Using data from the Egyptian public discourse on the United States, this article lays out the foundation for building a general theory of pictorial framing. In this theory, at the most general level, the concept of pictorial framing refers to subtle alterations in the visual presentation of judgment and choice problems. Specifically, pictures are viewed as constructions, and pictorial meaning is seen as an intricate web of connected frames. The article thus adopts the view that a visual grammar is part of cognitive science and is fundamentally concerned with the relation between what goes on in the human mind and manifestations of this activity. The article draws on insights from the blending model (Fauconnier and Turner, 2002), Relevance Theory (Sperber and Wilson, 1995), and frame semantics (Fillmore, 1985), discussing a large corpus of 90 multimodal cartoons on the United States.


2018 ◽  
Vol 2018 ◽  
pp. 1-7 ◽  
Author(s):  
J. Bouaziz ◽  
R. Mashiach ◽  
S. Cohen ◽  
A. Kedem ◽  
A. Baron ◽  
...  

Endometriosis is a disease characterized by the development of endometrial tissue outside the uterus, but its cause remains largely unknown. Numerous genes have been studied and proposed to help explain its pathogenesis. However, the large number of these candidate genes has made functional validation through experimental methodologies nearly impossible. Computational methods could provide a useful alternative for prioritizing those most likely to be susceptibility genes. Using artificial intelligence applied to text mining, this study analyzed the genes involved in the pathogenesis, development, and progression of endometriosis. The data extraction by text mining of the endometriosis-related genes in the PubMed database was based on natural language processing, and the data were filtered to remove false positives. Using data from the text mining and gene network information as input for the web-based tool, 15,207 endometriosis-related genes were ranked according to their score in the database. Characterization of the filtered gene set through gene ontology, pathway, and network analysis provided information about the numerous mechanisms hypothesized to be responsible for the establishment of ectopic endometrial tissue, as well as the migration, implantation, survival, and proliferation of ectopic endometrial cells. Finally, the human genome was scanned through various databases using filtered genes as a seed to determine novel genes that might also be involved in the pathogenesis of endometriosis but which have not yet been characterized. These genes could be promising candidates to serve as useful diagnostic biomarkers and therapeutic targets in the management of endometriosis.

