Morpheme Ordering Across Languages Reflects Optimization for Processing Efficiency

Open Mind ◽  
2022 ◽  
pp. 1-25
Author(s):  
Michael Hahn ◽  
Rebecca Mathew ◽  
Judith Degen

Abstract The ordering of morphemes in a word displays well-documented regularities across languages. Previous work has explained these in terms of notions such as semantic scope, relevance, and productivity. Here, we test a recently formulated processing theory of the ordering of linguistic units, the efficient tradeoff hypothesis (Hahn et al., 2021). The claim of the theory is that morpheme ordering can partly be explained by the optimization of a tradeoff between memory and surprisal. This claim has received initial empirical support from two languages. In this work, we test this idea more extensively using data from four additional agglutinative languages with significant amounts of morphology, and by considering nouns in addition to verbs. We find that the efficient tradeoff hypothesis predicts ordering in most cases with high accuracy, and accounts for cross-linguistic regularities in noun and verb inflection. Our work adds to a growing body of work suggesting that many ordering properties of language arise from a pressure for efficient language processing.
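The memory-surprisal tradeoff at the heart of this abstract can be illustrated with a toy sketch (this is an assumed simplification, not the authors' implementation): estimate the conditional entropy of the next morpheme given k morphemes of context. Orders for which entropy drops quickly as k grows allow low surprisal with little memory.

```python
import math
from collections import Counter

def conditional_entropy(corpus, k):
    """Empirical H(next morpheme | previous k morphemes), in bits, with <s> padding."""
    ctx_counts, joint_counts = Counter(), Counter()
    for seq in corpus:
        padded = ("<s>",) * k + tuple(seq)
        for i in range(k, len(padded)):
            ctx = padded[i - k:i]
            ctx_counts[ctx] += 1
            joint_counts[ctx + (padded[i],)] += 1
    total = sum(joint_counts.values())
    h = 0.0
    for gram, n in joint_counts.items():
        # p(ctx, next) * -log2 p(next | ctx), summed over observed n-grams
        h -= (n / total) * math.log2(n / ctx_counts[gram[:k]])
    return h

# Hypothetical agglutinative verb forms: stem + causative + passive + tense.
corpus = [
    ("stem", "caus", "pass", "past"),
    ("stem", "caus", "past"),
    ("stem", "pass", "past"),
    ("stem", "past"),
]
h0 = conditional_entropy(corpus, 0)  # surprisal with no memory of context
h1 = conditional_entropy(corpus, 1)  # surprisal with one morpheme of memory
```

Comparing such curves across candidate morpheme orders is the kind of computation the efficient tradeoff hypothesis relies on: an order is favored when a small amount of remembered context already yields a large surprisal reduction.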

2021 ◽  
pp. 001872672110029
Author(s):  
Yuying Lin ◽  
Mengxi Yang ◽  
Matthew J Quade ◽  
Wansi Chen

How do supervisors who treat the bottom line as more important than anything else influence team success? Drawing from social information processing theory, we explore how and when supervisor bottom-line mentality (i.e. an exclusive focus on bottom-line outcomes at the expense of other priorities) exerts influence on the bottom line itself, in the form of team performance. We argue that a supervisor’s bottom-line mentality provides significant social cues for the team that securing bottom-line objectives is of sole importance, which stimulates team performance avoidance goal orientation and thus decreases team performance. Further, we argue that performing tension (i.e. tension between contradictory needs, demands, and goals), serving as team members’ mutual perception of the confusing environment, will strengthen the indirect negative relationship between supervisor bottom-line mentality and team performance through team performance avoidance goal orientation. We conduct a path analysis using data from 258 teams in a Chinese food-chain company, which provides support for our hypotheses. Overall, our findings suggest that a supervisor’s exclusive focus on the bottom line can impede team performance. Theoretical contributions and practical implications are discussed.


Author(s):  
Falk Schwendicke ◽  
Akhilanand Chaurasia ◽  
Lubaina Arsiwala ◽  
Jae-Hong Lee ◽  
Karim Elhennawy ◽  
...  

Abstract Objectives Deep learning (DL) has been increasingly employed for automated landmark detection, e.g., for cephalometric purposes. We performed a systematic review and meta-analysis to assess the accuracy and underlying evidence for DL for cephalometric landmark detection on 2-D and 3-D radiographs. Methods Diagnostic accuracy studies published in 2015–2020 in Medline/Embase/IEEE/arXiv and employing DL for cephalometric landmark detection were identified and extracted by two independent reviewers. Random-effects meta-analysis, subgroup analysis, and meta-regression were performed, and study quality was assessed using QUADAS-2. The review was registered (PROSPERO no. 227498). Data From 321 identified records, 19 studies (published 2017–2020) were included, all employing convolutional neural networks, mainly on 2-D lateral radiographs (n=15), using data from publicly available datasets (n=12), and testing the detection of a mean of 30 (SD: 25; range: 7–93) landmarks. The reference test was established by two experts (n=11), one expert (n=4), three experts (n=3), or a set of annotators (n=1). Risk of bias was high, and applicability concerns were detected for most studies, mainly regarding data selection and reference test conduct. Landmark prediction error centered around the 2-mm error threshold (mean difference: –0.581 mm; 95% CI: –1.264 to 0.102 mm). The proportion of landmarks detected within this 2-mm threshold was 0.799 (95% CI: 0.770 to 0.824). Conclusions DL shows relatively high accuracy for detecting landmarks on cephalometric imagery. The overall body of evidence is consistent but suffers from high risk of bias. Demonstrating the robustness and generalizability of DL for landmark detection is needed. Clinical significance Existing DL models show consistent and largely high accuracy for automated detection of cephalometric landmarks. The majority of studies so far focused on 2-D imagery; data on 3-D imagery are sparse but promising.
Future studies should focus on demonstrating generalizability, robustness, and clinical usefulness of DL for this objective.
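The random-effects pooling this review reports can be sketched with the standard DerSimonian-Laird estimator (a generic textbook sketch; the review does not specify its analysis software, and the input numbers below are illustrative):

```python
def dersimonian_laird(effects, variances):
    """Pool per-study effects with the DerSimonian-Laird random-effects model.

    Returns (pooled effect, between-study variance tau^2, standard error).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]             # fixed-effect weights
    sw = sum(w)
    y_fe = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - y_fe) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)           # between-study heterogeneity
    ws = [1.0 / (v + tau2) for v in variances]   # random-effects weights
    sws = sum(ws)
    pooled = sum(wi * yi for wi, yi in zip(ws, effects)) / sws
    return pooled, tau2, (1.0 / sws) ** 0.5

# Two hypothetical studies reporting mean landmark error offsets (mm):
pooled, tau2, se = dersimonian_laird([-0.5, -0.7], [0.04, 0.04])
```

When study effects agree, tau^2 collapses to zero and the estimator reduces to a fixed-effect inverse-variance average; heterogeneous studies inflate tau^2 and widen the pooled confidence interval.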


2021 ◽  
pp. 1-14
Author(s):  
Nicholas M. Watanabe ◽  
Hanhan Xue ◽  
Joshua I. Newman ◽  
Grace Yan

With the expansion of the esports industry, there is a growing body of literature examining the motivations and behaviors of consumers and participants. The current study advances this line of research by considering esports consumption through an economic framework, which has been underutilized in this context. Specifically, the “attention economy” is introduced as a theoretical approach—which operates with the understanding that due to increased connectivity and availability of information, it is the attention of consumers that becomes a scarce resource for which organizations must compete. Using data from the Twitch streaming platform, the results of econometric analysis further highlight the importance of structural factors in drawing attention from online viewers. As such, this research advances the theoretical and empirical understanding of online viewership behaviors, while also providing important ramifications for both esports and traditional sport organizations attempting to capture the attention of users in the digital realm.


Author(s):  
Pouneh Shabani-Jadidi

Psycholinguistics encompasses the psychology of language as well as linguistic psychology. Although they might sound similar, they are actually distinct. The first is a branch of linguistics, while the latter is a subdivision of psychology. In the psychology of language, the means are the research tools adopted from psychology and the end is the study of language. However, in linguistic psychology, the means are the data derived from linguistic studies and the end is psychology. This chapter focuses on the first of these two components; that is, the psychology of language. The goal of this chapter is to give a state-of-the-art perspective on the small but growing body of research using psycholinguistic tools to study Persian with a focus on two areas: presenting longstanding debates about the mental lexicon, language impairments and language processing; and introducing a source of data for the linguistic analysis of Persian.


2021 ◽  
Vol 72 ◽  
pp. 1385-1470
Author(s):  
Alexandra N. Uma ◽  
Tommaso Fornaciari ◽  
Dirk Hovy ◽  
Silviu Paun ◽  
Barbara Plank ◽  
...  

Many tasks in Natural Language Processing (NLP) and Computer Vision (CV) offer evidence that humans disagree, from objective tasks such as part-of-speech tagging to more subjective tasks such as classifying an image or deciding whether a proposition follows from certain premises. While most learning in artificial intelligence (AI) still relies on the assumption that a single (gold) interpretation exists for each item, a growing body of research aims to develop learning methods that do not rely on this assumption. In this survey, we review the evidence for disagreements on NLP and CV tasks, focusing on tasks for which substantial datasets containing this information have been created. We discuss the most popular approaches to training models from datasets containing multiple judgments potentially in disagreement. We systematically compare these different approaches by training them with each of the available datasets, considering several ways to evaluate the resulting models. Finally, we discuss the results in depth, focusing on four key research questions, and assess how the type of evaluation and the characteristics of a dataset determine the answers to these questions. Our results suggest, first of all, that even if we abandon the assumption of a gold standard, it is still essential to reach a consensus on how to evaluate models. This is because the relative performance of the various training methods is critically affected by the chosen form of evaluation. Secondly, we observed a strong dataset effect. With substantial datasets, providing many judgments by high-quality coders for each item, training directly with soft labels achieved better results than training from aggregated or even gold labels. This result holds for both hard and soft evaluation. But when the above conditions do not hold, leveraging both gold and soft labels generally achieved the best results in the hard evaluation. 
All datasets and models employed in this paper are freely available as supplementary materials.
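The contrast between training from aggregated gold labels and training directly with soft labels can be made concrete with a minimal sketch (assumed illustration, not the survey's code): the soft label keeps the full distribution of annotator judgments, while aggregation keeps only the majority vote.

```python
import math
from collections import Counter

def soft_label(judgments):
    """Normalized distribution over labels from raw annotator judgments."""
    counts = Counter(judgments)
    n = len(judgments)
    return {label: c / n for label, c in counts.items()}

def aggregated_label(judgments):
    """Majority-vote ("gold") label, as used in single-label training."""
    return Counter(judgments).most_common(1)[0][0]

def soft_cross_entropy(pred_probs, target_dist):
    """Loss against a soft target; reduces to standard CE for one-hot targets."""
    return -sum(p * math.log(pred_probs.get(label, 1e-12))
                for label, p in target_dist.items())

# Hypothetical NLI item rated by five annotators:
judgments = ["entail", "entail", "neutral", "entail", "neutral"]
target = soft_label(judgments)       # keeps the 60/40 disagreement
gold = aggregated_label(judgments)   # discards it
```

Training with `target` rather than `gold` is what the survey means by learning from soft labels: the model is rewarded for reproducing the annotator distribution rather than a single adjudicated answer.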


2021 ◽  
Author(s):  
Jiaming Zeng ◽  
Michael F. Gensheimer ◽  
Daniel L. Rubin ◽  
Susan Athey ◽  
Ross D. Shachter

Abstract In medicine, randomized clinical trials (RCT) are the gold standard for informing treatment decisions. Observational comparative effectiveness research (CER) is often plagued by selection bias, and expert-selected covariates may not be sufficient to adjust for confounding. We explore how the unstructured clinical text in electronic medical records (EMR) can be used to reduce selection bias and improve medical practice. We develop a method based on natural language processing to uncover interpretable potential confounders from the clinical text. We validate our method by comparing the hazard ratio (HR) from survival analysis with and without the confounders against the results from established RCTs. We apply our method to four study cohorts built from localized prostate and lung cancer datasets from the Stanford Cancer Institute Research Database and show that our method adjusts the HR estimate towards the RCT results. We further confirm that the uncovered terms can be interpreted by an oncologist as potential confounders. This research helps enable more credible causal inference using data from EMRs, offers a transparent way to improve the design of observational CER, and could inform high-stake medical decisions. Our method can also be applied to studies within and beyond medicine to extract important information from observational data to support decisions.
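One ingredient of such a pipeline can be sketched in a highly simplified form (illustrative only; the paper's actual method combines richer NLP with survival analysis): flag clinical-text terms whose prevalence differs sharply between treatment arms, since imbalanced terms are candidate confounders worth adjusting for.

```python
from collections import Counter

def doc_freq(docs):
    """Fraction of documents containing each term; each doc is a token set."""
    counts = Counter()
    for d in docs:
        counts.update(set(d))
    return {t: c / len(docs) for t, c in counts.items()}

def candidate_confounders(docs_arm_a, docs_arm_b, min_gap=0.5):
    """Terms whose document frequency differs by at least min_gap between arms."""
    fa, fb = doc_freq(docs_arm_a), doc_freq(docs_arm_b)
    terms = set(fa) | set(fb)
    return sorted(t for t in terms
                  if abs(fa.get(t, 0.0) - fb.get(t, 0.0)) >= min_gap)

# Hypothetical tokenized notes: "metastasis" appears mostly in one arm,
# suggesting the arms differ in disease severity.
arm_a = [{"pain", "metastasis"}, {"metastasis", "fatigue"}, {"pain"}]
arm_b = [{"pain"}, {"fatigue"}, {"pain", "fatigue"}]
flagged = candidate_confounders(arm_a, arm_b)
```

In the paper, terms surfaced this way would then enter the survival model as covariates, which is how including text-derived confounders can move the estimated HR toward the RCT benchmark.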


2021 ◽  
Author(s):  
Fabian Braesemann ◽  
Fabian Stephany ◽  
Leonie Neuhäuser ◽  
Niklas Stoehr ◽  
Philipp Darius ◽  
...  

Abstract The global spread of Covid-19 has caused major economic disruptions. Governments around the world provide considerable financial support to mitigate the economic downturn. However, effective policy responses require reliable data on the economic consequences of the corona pandemic. We propose the CoRisk-Index: a real-time economic indicator of Covid-19-related risk assessments by industry. Using data mining, we analyse all reports from US companies filed since January 2020, representing more than a third of all US employees. We construct two measures, the number of 'corona' words in each report and the average text negativity of the sentences mentioning corona in each industry, which are aggregated in the CoRisk-Index. The index correlates with U.S. unemployment data and preempted the stock market losses of February 2020. Moreover, thanks to topic modelling and natural language processing techniques, the CoRisk data provide unique granularity with regard to the particular contexts of the crisis and the concerns of individual industries about them. The data presented here help researchers and decision makers to measure the previously unobserved risk awareness of industries with regard to Covid-19, bridging the quantification gap between highly volatile stock market dynamics and long-term macroeconomic figures. For immediate access, we provide all findings and raw data on an interactive online dashboard in real time.
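The two report-level measures can be sketched as follows (an assumed simplification: the authors' exact preprocessing and sentiment lexicon are not specified here, and `NEGATIVE_WORDS` is a tiny stand-in lexicon):

```python
import re

# Stand-in negativity lexicon (hypothetical, for illustration only).
NEGATIVE_WORDS = {"loss", "decline", "risk", "disruption", "uncertainty"}

def corisk_features(report_text):
    """Return (corona word count, mean negativity of corona-mentioning sentences)."""
    sentences = re.split(r"(?<=[.!?])\s+", report_text.lower())
    corona_sents = [s for s in sentences if "corona" in s or "covid" in s]
    n_corona = sum(s.count("corona") + s.count("covid") for s in sentences)
    if not corona_sents:
        return n_corona, 0.0

    def negativity(sentence):
        words = re.findall(r"[a-z']+", sentence)
        return sum(w in NEGATIVE_WORDS for w in words) / max(len(words), 1)

    return n_corona, sum(map(negativity, corona_sents)) / len(corona_sents)

report = ("Corona has caused major disruption to our supply chain. "
          "We expect revenue decline due to covid. "
          "Our board met in March.")
n_mentions, mean_negativity = corisk_features(report)
```

Aggregating these per-report features by industry, as the abstract describes, yields the index's two components: mention intensity and the tone of the sentences doing the mentioning.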


Author(s):  
Waleed Shakeel ◽  
Ming Lu

Deriving a reliable earthwork job cost estimate entails analysing the interaction of numerous variables defined in a highly complex and dynamic system. Using simulation to plan earthwork haul jobs delivers high accuracy in cost estimating. However, given practical limitations of time and expertise, simulation remains prohibitively expensive and is rarely applied in the construction field. The development of a pragmatic tool for field applications that mimics simulation-derived results while consuming less time was thus warranted. In this research, a spreadsheet-based analytical tool was developed using data from industry benchmark databases (such as the CAT Handbook and RSMeans). Based on a case study, the proposed methodology outperformed commonly used estimating methods and compared closely to the results obtained from simulation in controlled experiments.
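A back-of-envelope calculation of the kind such a spreadsheet tool encodes can be sketched as follows (all figures and the efficiency factor are hypothetical, not taken from the paper):

```python
def haul_unit_cost(payload_m3, load_min, haul_min, dump_min, return_min,
                   hourly_rate, n_trucks, efficiency=0.83):
    """Estimated cost per m3 hauled by a truck fleet.

    efficiency is a working-minutes-per-hour factor covering idle and
    queuing time (0.83 ~ a 50-minute hour, a common handbook assumption).
    """
    cycle_min = load_min + haul_min + dump_min + return_min
    trips_per_hour = (60.0 / cycle_min) * efficiency
    production_m3_per_hour = trips_per_hour * payload_m3 * n_trucks
    fleet_cost_per_hour = hourly_rate * n_trucks
    return fleet_cost_per_hour / production_m3_per_hour

# Hypothetical job: 10 m3 trucks, 18-minute cycle, $120/hr per truck.
cost_per_m3 = haul_unit_cost(10, 2, 8, 1, 7, 120, 3)
```

Note that in this simple model the unit cost is independent of fleet size, because loader-truck queuing is absorbed into a single static efficiency factor; capturing that interaction dynamically is precisely what simulation adds and what the paper's benchmark-calibrated tool approximates.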

