Linguistic levelling in Spanish: The analogical strong preterites

Author(s):  
Enrique Pato

Abstract Certain Peninsular Spanish varieties have two third-person plural forms in the simple past indicative of verbs with ‘strong’ (stem-stressed) preterites. While this phenomenon is documented in large-scale linguistic atlas surveys, its current geographic distribution and diachronic origins remain under-studied. This paper sets out to: 1) establish the geographic distribution of these variants; the differing methodologies and epochs of the data sources make them particularly interesting to compare, showing that these analogical strong preterites have suffered a drastic decline over the last century; 2) use historical corpus data to show that the vernacular variant is by no means a recent phenomenon; 3) examine external history as a source of explanation in linguistic reconstruction, showing that this process of analogical levelling took place after the reconquest and resettlement of these regions. These findings support the hypothesis of a feature which spread over the centuries by linguistic diffusion.

2021 ◽  
Vol 0 (0) ◽  
Author(s):  
María Mare

Abstract One of the main discussions about the interaction between morphology and syntax revolves around the richness or poverty of features and whether this richness/poverty is found in the syntactic structure or in the lexical items. A phenomenon subject to this debate has been syncretism, especially in theories that assume late insertion such as Distributed Morphology. This paper delves into the syncretism observed between the first person plural and the third person in the clitic domain in some Spanish dialects. Our analysis will lead to a revision of the distribution of person features and their relationship with plural number, while at the same time it will shed light on other morphological alternations displayed in Spanish dialects; that is, subject-verb unagreement and mesoclisis in imperatives. In order to explain the behavior of the data under discussion, I propose that lexical items are specified for all the relevant features at the moment of insertion, although the values of these features can be neutralized. I argue that the distribution proposed allows for some fundamental generalizations about the vocabulary inventories in Spanish varieties, and shows that the variation pattern exhibits an *ABA effect, i.e., only contiguous cells in a paradigm are syncretic.


Epidemiologia ◽  
2021 ◽  
Vol 2 (3) ◽  
pp. 315-324
Author(s):  
Juan M. Banda ◽  
Ramya Tekumalla ◽  
Guanyu Wang ◽  
Jingyuan Yu ◽  
Tuo Liu ◽  
...  

As the COVID-19 pandemic continues to spread worldwide, an unprecedented amount of open data is being generated for medical, genetic, and epidemiological research. The unparalleled rate at which many research groups around the world are releasing data and publications on the ongoing pandemic is allowing other scientists to learn from local experiences and data generated on the front lines of the COVID-19 pandemic. However, there is a need to integrate additional data sources that map and measure the role of social dynamics of such a unique worldwide event in biomedical, biological, and epidemiological analyses. For this purpose, we present a large-scale curated dataset of over 1.12 billion tweets, growing daily, related to COVID-19 chatter generated from 1 January 2020 to 27 June 2021 at the time of writing. This dataset provides a freely available additional data source for researchers worldwide to conduct a wide and diverse range of research projects, such as epidemiological analyses, studies of emotional and mental responses to social distancing measures, the identification of sources of misinformation, and stratified measurement of sentiment towards the pandemic in near real time, among many others.


2020 ◽  
Vol 14 (3) ◽  
pp. 320-328
Author(s):  
Long Guo ◽  
Lifeng Hua ◽  
Rongfei Jia ◽  
Fei Fang ◽  
Binqiang Zhao ◽  
...  

With the rapid growth of e-commerce in recent years, e-commerce platforms are becoming a primary place for people to find, compare and ultimately purchase products. To improve the online shopping experience for consumers and increase sales for sellers, it is important to understand user intent accurately and to be notified of changes in it promptly. In this way, the right information can be offered to the right person at the right time. To achieve this goal, we propose a unified deep intent prediction network, named EdgeDIPN, which is deployed at the edge, i.e., on the mobile device, and is able to monitor multiple user intents at different granularities simultaneously in real time. We propose to train EdgeDIPN with multi-task learning, by which EdgeDIPN can share representations between different tasks for better performance while saving edge resources. In particular, we propose a novel task-specific attention mechanism which enables different tasks to pick out the most relevant features from different data sources. To extract the shared representations more effectively, we utilize two kinds of attention mechanisms, where the multi-level attention mechanism tries to identify the important actions within each data source and the inter-view attention mechanism learns the interactions between different data sources. In the experiments conducted on a large-scale industrial dataset, EdgeDIPN significantly outperforms the baseline solutions. Moreover, EdgeDIPN has been deployed in the operational system of Alibaba. Online A/B testing results in several business scenarios reveal the potential of monitoring user intent in real time. To the best of our knowledge, EdgeDIPN is the first full-fledged real-time user intent understanding center deployed at the edge and serving hundreds of millions of users in a large-scale e-commerce platform.


2018 ◽  
Vol 188 ◽  
pp. 05004
Author(s):  
Christos Panagiotou ◽  
Christos Antonopoulos ◽  
Stavros Koubias

Wireless sensor networks (WSNs), as adopted in current smart city deployments, must address demanding traffic factors and remain resilient to failures. Furthermore, caching of data in WSNs can significantly benefit resource conservation and network performance. However, data sources generate data volumes that cannot fit in the restricted cache resources of the caching nodes. This unavoidably leads to data items being evicted and replaced. This paper aims to experimentally evaluate the prominent caching techniques in large-scale networks that resemble the smart city paradigm, assessing network performance with respect to critical application and network parameters. Analysis of the results provides valuable insights into the behaviour of caching in typical large-scale WSN scenarios.
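
The eviction-and-replacement behaviour described above can be illustrated with a minimal sketch. The abstract does not name the caching techniques it evaluates, so the least-recently-used (LRU) policy below is an assumption chosen only because it is a common baseline; the class name `LRUCache` and its hit/miss counters are hypothetical and are not taken from the paper.

```python
from collections import OrderedDict


class LRUCache:
    """Toy fixed-capacity cache for a caching node (hypothetical, LRU assumed)."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest last
        self.hits = 0
        self.misses = 0

    def get(self, key):
        """Return the cached value, or None on a miss."""
        if key in self._items:
            self._items.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._items[key]
        self.misses += 1
        return None

    def put(self, key, value):
        """Store a value, evicting the least recently used item if full."""
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the oldest entry


cache = LRUCache(capacity=2)
cache.put("sensor-1", 21.5)
cache.put("sensor-2", 19.0)
cache.get("sensor-1")        # touch sensor-1 so sensor-2 becomes the oldest
cache.put("sensor-3", 22.1)  # cache is full: sensor-2 is evicted
```

With a capacity of two, the third insertion forces one item out, which is the situation the abstract calls unavoidable when data volumes exceed the cache resources of a node.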


2009 ◽  
pp. 101-108
Author(s):  
Jarmila Panevova

The author claims that the Czech polite forms (so-called 'vykani') for addressing the 2nd person should be understood as a legitimate part of the Czech conjugation paradigm. When a single person is addressed politely, some Czech analytical verb forms exhibit 'hybrid' agreement (the auxiliary is in the plural, while the participle is in the singular). However, the paradigm for singular and plural polite forms (addressing a single person, or two or more persons, respectively) is not symmetrical. Whether 2nd person plural polite forms are ambiguous (between the polite meaning and the 2nd person plural non-polite meaning), or whether the semantic distinction 'polite - non-polite' is neutralized in the plural, remains open for further discussion. Some corpus data illustrating the contexts for the 2nd person polite forms are also analyzed.


2021 ◽  
Author(s):  
Yumi Wakabayashi ◽  
Masamitsu Eitoku ◽  
Narufumi Suganuma

Abstract
Background: Interventional studies are the fundamental method for obtaining answers to clinical questions. However, these studies are sometimes difficult to conduct because of insufficient financial or human resources or the rarity of the disease in question. One means of addressing these issues is to conduct a non-interventional observational study using electronic health record (EHR) databases as the data source, although how best to evaluate the suitability of an EHR database when planning a study remains to be clarified. The aim of the present study is to identify and characterize the data sources that have been used for conducting non-interventional observational studies in Japan and to propose a flow diagram to help researchers determine the most appropriate EHR database for their study goals.
Methods: We compiled a list of published articles reporting observational studies conducted in Japan by searching PubMed for relevant articles published in the last 3 years and by searching database providers’ publication lists related to studies using their databases. For each article, we reviewed the abstract and/or full text to obtain information about data source, target disease or therapeutic area, number of patients, and study design (prospective or retrospective). We then characterized the identified EHR databases.
Results: In Japan, non-interventional observational studies have been mostly conducted using data stored locally at individual medical institutions (713/1463) or collected from several collaborating medical institutions (351/1463). Whereas the studies conducted with large-scale integrated databases (195/1463) were mostly retrospective (68.2%), 27.2% of the single-center studies, 46.2% of the multi-center studies, and 74.4% of the post-marketing surveillance studies identified in the present study were conducted prospectively.
Conclusions: Our analysis revealed that non-interventional observational studies in Japan were conducted using data stored locally at individual medical institutions or collected from collaborating medical institutions. Disease registries, disease databases, and large-scale databases would enable researchers to conduct studies with large sample sizes to provide robust data from which strong inferences could be drawn. Using our flow diagram, researchers planning non-interventional observational studies should consider the strengths and limitations of each available database and choose the most appropriate one for their study goals.
Trial registration: Not applicable.


Author(s):  
Yousef Mokhtar Elramli ◽  
Tareq Bashir Maiteq

The aim of this paper is to study regressive vowel harmony induced by a suffixal back round vowel in the Libyan Arabic dialect spoken in the city of Misrata. The skeletal structure in the collected words is a /CVCVC-/ stem followed by the third person plural suffix /-u/. Consequently, the derived form of the examined words becomes /CVCVCV/. Following a rule of re-syllabification, the coda of the ultimate syllable in the stem becomes the onset of the newly formed syllable (ultimate in the derived form). Thus, in the presence of the suffix /-u/ in the derived form, all vowels in the word must harmonise with the [+round] feature of /-u/ unless there is a high front vowel /i/ intervening. In such cases, the high front vowel is defined as an opaque segment that is incompatible with the feature [+round]. Syllable and morpheme boundaries within words do not seem to contribute to blocking the regressive spreading of harmony. An autosegmental approach is adopted here to analyze these words. It is concluded that there are two sources in underlying representations for regressive vowel harmony in Libyan Arabic. One source is floating [+round] and another source is [+round].
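
The spreading-and-blocking pattern described above can be sketched as a toy procedure. This is only an illustration under stated assumptions, not the authors' autosegmental analysis: the stems are schematic placeholders (`C` stands for any consonant), the mapping of a rounded /a/ to `u` is hypothetical, and the sketch assumes that vowels to the left of an opaque /i/ are also left unaffected, which is the usual behaviour of an opaque segment but is not stated in the abstract.

```python
# Hypothetical surface mapping for vowels that receive [+round].
ROUND_MAP = {"a": "u"}
# The high front vowel is treated as opaque: it blocks spreading.
OPAQUE = "i"


def harmonise(stem, suffix="u"):
    """Spread [+round] leftwards from the suffix until an opaque /i/ is met."""
    segments = list(stem)
    # Walk the stem from right to left, i.e. regressively from the suffix.
    for idx in range(len(segments) - 1, -1, -1):
        seg = segments[idx]
        if seg == OPAQUE:
            break  # opaque segment: spreading stops here
        if seg in ROUND_MAP:
            segments[idx] = ROUND_MAP[seg]
    return "".join(segments) + suffix


print(harmonise("CaCaC"))  # both stem vowels are reached by the spreading
print(harmonise("CaCiC"))  # /i/ is adjacent to the suffix and blocks at once
print(harmonise("CiCaC"))  # only the vowel to the right of /i/ is reached
```

Consonants are skipped rather than treated as blockers, in line with the statement that syllable and morpheme boundaries do not seem to block the regressive spreading.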


2020 ◽  
Author(s):  
James A. Fellows Yates ◽  
Aida Andrades Valtueña ◽  
Ashild J. Vågene ◽  
Becky Cribdon ◽  
Irina M. Velsko ◽  
...  

Abstract Ancient DNA and RNA are valuable data sources for a wide range of disciplines. Within the field of ancient metagenomics, the number of published genetic datasets has risen dramatically in recent years, and tracking this data for reuse is particularly important for large-scale ecological and evolutionary studies of individual microbial taxa, microbial communities, and metagenomic assemblages. AncientMetagenomeDir (archived at https://doi.org/10.5281/zenodo.3980833) is a collection of indices of published genetic data deriving from ancient microbial samples that provides basic, standardised metadata and accession numbers to allow rapid data retrieval from online repositories. These collections are community-curated and span multiple sub-disciplines in order to ensure adequate breadth and consensus in metadata definitions, as well as longevity of the database. Internal guidelines and automated checks to facilitate compatibility with established sequence-read archives and term-ontologies ensure consistency and interoperability for future meta-analyses. This collection will also assist in standardising metadata reporting for future ancient metagenomic studies.


Author(s):  
Pattabiraman V. ◽  
Parvathi R.

Natural data arising directly from various data sources, such as text, image, video, audio, and sensor data, is inherently high-dimensional, with a very large number of features. While these features add richness and perspective to the data, the sparsity associated with them adds to the computational complexity of learning and makes the data hard to visualize and interpret, so large-scale computational power is required to derive insights from it. This is famously called the “curse of dimensionality.” This chapter discusses conventional methods for curing the curse of dimensionality and analyzes their performance on given complex datasets. It also discusses the advantages of nonlinear methods over linear methods, as well as neural networks, which could be a better approach than other nonlinear methods. Finally, it discusses future research areas such as the application of deep learning techniques as a cure for this curse.

