Dynamic visualizations of language change

This paper uses diachronic corpus data to visualize language change in a dynamic fashion. Bivariate and multivariate data sets form the input for so-called motion charts, i.e. series of diachronically ordered scatterplots that can be viewed in sequence. Based on data from COHA (Davies 2010), two case studies illustrate recent changes in American English. The first study visualizes change in a diachronic analysis of ambicategorical nouns and verbs such as hope or drink; the second study shows structural change in the behavior of complement-taking predicates such as expect or remember. Whereas motion charts are typically used to represent bivariate data sets, it is argued here that they are also useful for the analysis of multivariate data over time. The present paper submits multivariate diachronic data to a multi-dimensional scaling analysis. Viewing the resulting data points in separate time slices offers a holistic and intuitive representation of complex linguistic change.
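
The motion-chart idea in the abstract above (a series of diachronically ordered scatterplots) can be sketched as a small data structure. This is a minimal illustration, not the paper's implementation: the function name `build_frames`, the record layout, and the numbers in the usage note are all hypothetical placeholders, not COHA counts.

```python
from collections import defaultdict


def build_frames(records):
    """Group (period, label, x, y) records into time-ordered scatterplot frames.

    Each frame is a pair (period, {label: (x, y)}); playing the frames in
    order gives the "motion" of every labelled point through time.
    """
    frames = defaultdict(dict)
    for period, label, x, y in records:
        frames[period][label] = (x, y)
    # Sort by period so the frames can be viewed in diachronic sequence.
    return [(period, frames[period]) for period in sorted(frames)]
```

Called on invented records such as `(1810, "hope", 0.4, 0.6)` and `(1900, "hope", 0.6, 0.4)`, the function returns one frame per period, each of which could be passed to any scatterplot routine.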

Systematically Exploring Associations among Multivariate Data

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.6158 ◽

2020 ◽

Vol 34 (04) ◽

pp. 6786-6794

Author(s):

Lifeng Zhang

Keyword(s):

Interaction Effect ◽

Functional Relationship ◽

Multivariate Data ◽

Coefficient Of Determination ◽

High Dimensional ◽

Data Sets ◽

Statistical Tool ◽

Wide Range ◽

Main Effect ◽

Data Points

Detecting relationships among multivariate data is often of great importance in the analysis of high-dimensional data sets, and has received growing attention for decades from both academic and industrial fields. In this study, we propose a statistical tool named the neighbor correlation coefficient (nCor), which is based on a new idea that measures the local continuity of the reordered data points to quantify the strength of the global association between variables. With sufficient sample size, the new method is able to capture a wide range of functional relationships, whether linear or nonlinear, bivariate or multivariate, main effect or interaction. The nCor score roughly approximates the coefficient of determination (R²) of the data, that is, the proportion of variance in one variable that is predictable from one or more other variables. On this basis, three nCor-based statistics are also proposed here to further characterize the intra- and inter-structures of the associations from the aspects of nonlinearity, interaction effect, and variable redundancy. The mechanisms of these measures are proved in theory and demonstrated with numerical analyses.
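
The abstract above relates the nCor score to the coefficient of determination. The sketch below computes only the ordinary R² of a straight-line least-squares fit. It is not nCor, whose algorithm the abstract does not specify. The function name `linear_r_squared` is an assumption made for illustration.

```python
def linear_r_squared(x, y):
    """Ordinary R^2 of a least-squares straight-line fit of y on x.

    R^2 = 1 - SS_res / SS_tot, the share of the variance in y that the
    fitted line accounts for. Assumes x and y are not constant.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot
```

For a symmetric parabola such as x = -2, ..., 2 with y = x², the fitted slope is zero and the linear R² vanishes even though y is fully determined by x. Nonlinear dependence of this kind is what the abstract says nCor is meant to capture.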

The application of multivariate data analysis in the interpretation of engineering geological parameters

Open Geosciences ◽

10.1515/geo-2016-0005 ◽

2016 ◽

Vol 8 (1) ◽

Author(s):

József Kovács ◽

Nikolett Bodnár ◽

Ákos Török

Keyword(s):

Data Analysis ◽

Multivariate Data Analysis ◽

Multivariate Data ◽

Data Sets ◽

Data Set ◽

Angle Of Internal Friction ◽

Laboratory Test Results ◽

Data Points ◽

Discriminant Analyses ◽

Multivariate Cluster

Abstract The paper presents the evaluation of engineering geological laboratory test results of core drillings along the new metro line (line 4) in Budapest by using a multivariate data analysis. A data set of 30 core drillings with a total coring length of over 1500 meters was studied. Of the eleven engineering geological parameters considered in this study, only the five most reliable (void ratio, dry bulk density, angle of internal friction, cohesion and compressive strength) representing 1260 data points were used for multivariate (cluster and discriminant) analyses. Discriminant analysis was used to test the results of the cluster analysis. The results suggest that the use of multivariate analyses allows the identification of different groups of sediments even when the data sets are overlapping and contain several uncertainties. The tests also prove that the use of these methods for seemingly very scattered parameters is crucial in obtaining reliable engineering geological data for design.

A multi-dimensional scaling analysis of hard decisions

PsycEXTRA Dataset ◽

10.1037/e683302011-056 ◽

1998 ◽

Author(s):

Elizabeth S. Veinott ◽

J. Frank Yates

Keyword(s):

Scaling Analysis ◽

Dimensional Scaling ◽

Multi Dimensional Scaling

Language change across a lifetime: A historical micro-perspective

Linguistics Vanguard ◽

10.1515/lingvan-2020-0029 ◽

2021 ◽

Vol 7 (s2) ◽

Author(s):

Alexander Bergs

Keyword(s):

Communities Of Practice ◽

Social Factors ◽

Language Use ◽

Language Change ◽

Historical Data ◽

Data Sets ◽

The Social ◽

Challenges And Opportunities ◽

Linguistic Behavior ◽

Micro Analysis

Abstract This paper focuses on the micro-analysis of historical data, which allows us to investigate language use across the lifetime of individual speakers. Certain concepts, such as social network analysis or communities of practice, put individual speakers and their social embeddedness and dynamicity at the center of attention. This means that intra-speaker variation can be described and analyzed in quite some detail in certain historical data sets. The paper presents some exemplary empirical analyses of the diachronic linguistic behavior of individual speakers/writers in fifteenth to seventeenth century England. It discusses the social factors that influence this behavior, with an emphasis on the methodological and theoretical challenges and opportunities when investigating intra-speaker variation and change.

Comparative sociolinguistic perspectives on the rate of linguistic change

Journal of Historical Sociolinguistics ◽

10.1515/jhsl-2020-0010 ◽

2020 ◽

Vol 6 (2) ◽

Author(s):

Terttu Nevalainen ◽

Tanja Säily ◽

Turo Vartiainen

Keyword(s):

Middle Ages ◽

Language Change ◽

Sociocultural Factors ◽

Specific Topic ◽

Linguistic Change ◽

Comparative Linguistics ◽

Historical Sociolinguistics ◽

Contact Linguistics ◽

Historical Comparative ◽

Social And Cultural Factors

Abstract This issue of the Journal of Historical Sociolinguistics aims to contribute to our understanding of language change in real time by presenting a group of articles particularly focused on social and sociocultural factors underlying language diversification and change. By analysing data from a varied set of languages, including Greek, English, and the Finnic and Mongolic language families, and mainly focussing their investigation on the Middle Ages, the authors connect various social and cultural factors with the specific topic of the issue, the rate of linguistic change. The sociolinguistic themes addressed include community and population size, conflict and conquest, migration and mobility, bi- and multilingualism, diglossia and standardization. In this introduction, the field of comparative historical sociolinguistics is considered a cross-disciplinary enterprise with a sociolinguistic agenda at the crossroads of contact linguistics, historical comparative linguistics and linguistic typology.

Variation and third age: A sociolinguistic perspective

Linguistics Vanguard ◽

10.1515/lingvan-2020-0030 ◽

2021 ◽

Vol 7 (s2) ◽

Author(s):

Daniel Schreier

Keyword(s):

Language Learning ◽

Language Change ◽

National Council ◽

Speech Community ◽

Linguistic Change ◽

Third Age ◽

Quantitative Evidence ◽

Style Shifting ◽

Age Grading ◽

Ethnic Group Membership

Abstract The correlation between external factors such as age, gender, ethnic group membership and language variation is one of the stalwarts of sociolinguistic theory. The repertoire of individual members of speaker groups, vis-à-vis community-wide variation, represents a somewhat slippery ground for developing and testing models of variation and change and has been researched with reference to accommodation (Bell 1984), style shifting (Rickford, John R. & MacKenzie Price. 2013. Girlz II women: Age-grading, language change and stylistic variation. Journal of Sociolinguistics 17. 143–179) and language change generally (Labov, William. 2001. Principles of linguistic change, vol. 2: Social factors. Oxford: Blackwell). This paper presents and assesses some first quantitative evidence that non-mobile older speakers from Tristan da Cunha, an island in the South Atlantic Ocean, who grew up in an utterly isolated speech community, vary and shift according to external interview parameters (interviewer, topic, place of interview). However, while they respond to the formality of the context, they display variation (both regarding speakers and variables) that is not in line with the constraints attested elsewhere. These findings are assessed with focus on the acquisition of sociolinguistic competence in third-age speakers (particularly style-shifting, Labov, William. 1964. Stages in the acquisition of Standard English. In Roger Shuy, Alva Davis & Robert Hogan (eds.), Social Dialects and Language Learning, 77–104. Champaign: National Council of Teachers of English) and across the life-span generally.

TV-MV Analytics: A visual analytics framework to explore time-varying multivariate data

Information Visualization ◽

10.1177/1473871619858937 ◽

2019 ◽

Vol 19 (1) ◽

pp. 3-23

Author(s):

Aurea Soriano-Vargas ◽

Bernd Hamann ◽

Maria Cristina F de Oliveira

Keyword(s):

Visual Analytics ◽

Visual Analysis ◽

Multivariate Data ◽

Visual Exploration ◽

Data Sets ◽

Time Varying ◽

Domain Experts ◽

Data Mining Algorithms ◽

Temporal Relationships ◽

Visualization Techniques

We present an integrated interactive framework for the visual analysis of time-varying multivariate data sets. As part of our research, we performed in-depth studies concerning the applicability of visualization techniques to obtain valuable insights. We consolidated the considered analysis and visualization methods in one framework, called TV-MV Analytics. TV-MV Analytics effectively combines visualization and data mining algorithms providing the following capabilities: (1) visual exploration of multivariate data at different temporal scales, and (2) a hierarchical small multiples visualization combined with interactive clustering and multidimensional projection to detect temporal relationships in the data. We demonstrate the value of our framework for specific scenarios, by studying three use cases that were validated and discussed with domain experts.

Identifying VIV Vibration Modes by Use of the Empirical Orthogonal Functions Technique

21st International Conference on Offshore Mechanics and Arctic Engineering, Volume 1 ◽

10.1115/omae2002-28425 ◽

2002 ◽

Cited By ~ 3

Author(s):

Gudmund Kleiven

Keyword(s):

Model Test ◽

Vibration Mode ◽

Multivariate Data ◽

Empirical Orthogonal Functions ◽

Mode Shapes ◽

Data Sets ◽

Vibration Modes ◽

Ocean Engineering ◽

Orthogonal Functions ◽

Related Technique

The Empirical Orthogonal Functions (EOF) technique has been widely used by oceanographers and meteorologists, while the related Singular Value Decomposition (SVD) is frequently used in the statistics community. Another related technique, Principal Component Analysis (PCA), is used for instance in pattern recognition. The predominant application of these techniques is data compression of multivariate data sets, which also facilitates subsequent statistical analysis of such data sets. Within Ocean Engineering the EOF technique is not yet widely in use, although there are several areas where multivariate data sets occur and where the EOF technique could represent a supplementary analysis technique. Examples are oceanographic data, in particular current data. Furthermore, data sets of model- or full-scale data of loads and responses of slender bodies, such as pipelines and risers, are relevant examples. One attractive property of the EOF technique is that it does not require any a priori information on the physical system by which the data is generated. In the present paper a description of the EOF technique is given. Thereafter an example of the use of the EOF technique is presented. The example is an analysis of response data from a model test of a pipeline in a long free span exposed to current. The model test program was carried out in order to identify the occurrence of multi-mode vibrations and vibration mode amplitudes. In the present example the EOF technique demonstrates the capability of identifying predominant vibration modes of inline as well as cross-flow vibrations. Vibration mode shapes together with mode amplitudes and frequencies are also estimated. Although the present example is not sufficient for drawing conclusions on the applicability of the EOF technique on a general basis, the results of the present example demonstrate some of the potential of the technique.
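
As a rough illustration of the EOF idea described above, the sketch below extracts only the leading EOF of a multi-sensor record by power iteration on the sensor covariance matrix. It is one possible way to compute a leading EOF, not the paper's procedure. The names `leading_eof` and `synthetic_record` and the two-mode test signal are invented for this example.

```python
import math


def leading_eof(samples, iterations=200):
    """Return (mode_shape, variance_fraction) for the leading EOF.

    samples: list of time snapshots, each a list of sensor readings.
    Assumes the record is not constant in time.
    """
    n_time = len(samples)
    n_sens = len(samples[0])
    # Remove the time mean at each sensor so the EOF describes fluctuations.
    means = [sum(row[j] for row in samples) / n_time for j in range(n_sens)]
    anomalies = [[row[j] - means[j] for j in range(n_sens)] for row in samples]
    # Sample covariance between every pair of sensors.
    cov = [[sum(a[i] * a[j] for a in anomalies) / (n_time - 1)
            for j in range(n_sens)] for i in range(n_sens)]

    def apply_cov(vec):
        return [sum(cov[i][j] * vec[j] for j in range(n_sens))
                for i in range(n_sens)]

    # Power iteration converges to the eigenvector with the largest eigenvalue.
    vec = [1.0 + 0.1 * j for j in range(n_sens)]
    for _ in range(iterations):
        nxt = apply_cov(vec)
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    eigenvalue = sum(v * w for v, w in zip(vec, apply_cov(vec)))
    total_variance = sum(cov[i][i] for i in range(n_sens))
    return vec, eigenvalue / total_variance


def synthetic_record(n_time=200, n_sens=8):
    """Invented two-mode signal for demonstration; not data from the paper."""
    xs = [(j + 1) / (n_sens + 1) for j in range(n_sens)]
    mode1 = [math.sin(math.pi * x) for x in xs]
    mode2 = [math.sin(2 * math.pi * x) for x in xs]
    samples = [[2.0 * math.cos(0.3 * t) * m1 + 0.5 * math.sin(0.7 * t) * m2
                for m1, m2 in zip(mode1, mode2)] for t in range(n_time)]
    return samples, mode1
```

On the synthetic record, the returned mode shape should line up closely with the dominant half-sine pattern, without any prior information about the system having been supplied. Higher EOFs would need deflation or a full eigendecomposition, which this sketch omits.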

ON GENERATING DIGITAL ELEVATION MODELS FROM LIDAR DATA – RESOLUTION VERSUS ACCURACY AND TOPOGRAPHIC WETNESS INDEX INDICES IN NORTHERN PEATLANDS

Geodesy and Cartography ◽

10.3846/20296991.2012.702983 ◽

2012 ◽

Vol 38 (2) ◽

pp. 57-69 ◽

Cited By ~ 12

Author(s):

Abdulghani Hasan ◽

Petter Pilesjö ◽

Andreas Persson

Keyword(s):

Large Scale ◽

Drainage Area ◽

Data Sets ◽

Topographic Wetness Index ◽

Absolute Deviation ◽

Digital Elevation ◽

Elevation Data ◽

Scale Modelling ◽

Data Points ◽

Emission Modelling

Global change and GHG emission modelling are dependent on accurate wetness estimations for predictions of e.g. methane emissions. This study aims to quantify how the slope, drainage area and the TWI vary with the resolution of DEMs for a flat peatland area. Six DEMs with spatial resolutions from 0.5 to 90 m were interpolated with four different search radii. The relationship between accuracy of the DEM and the slope was tested. The LiDAR elevation data was divided into two data sets. The number of data points facilitated an evaluation dataset with data points not more than 10 mm away from the cell centre points in the interpolation dataset. The DEM was evaluated using a quantile-quantile test and the normalized median absolute deviation. It showed independence of the resolution when using the same search radius. The accuracy of the estimated elevation for different slopes was tested using the 0.5 m DEM and it showed a higher deviation from evaluation data for steep areas. The slope estimations between resolutions showed differences with values that exceeded 50%. Drainage areas were tested for three resolutions, with coinciding evaluation points. The model's ability to generate drainage area at each resolution was tested by pairwise comparison of three data subsets and showed differences of more than 50% in 25% of the evaluated points. The results show that consideration of DEM resolution is a necessity for the use of slope, drainage area and TWI data in large scale modelling.
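
The normalized median absolute deviation mentioned above is commonly defined as 1.4826 times the median of the absolute deviations from the median error. The sketch below assumes that common definition, since the abstract does not spell out the formula used. The function name `nmad` and the sample errors in the usage note are illustrative only.

```python
from statistics import median


def nmad(errors):
    """Normalized median absolute deviation of elevation errors.

    The factor 1.4826 makes the result comparable to a standard deviation
    when the errors are normally distributed, while staying robust to
    outliers.
    """
    centre = median(errors)
    deviations = [abs(e - centre) for e in errors]
    return 1.4826 * median(deviations)
```

With invented errors such as `[0.01, -0.02, 0.00, 0.03, 5.0]` (in metres), the single 5 m outlier barely affects the result, which is the reason a robust measure is attractive for DEM evaluation.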

An Information-Aware Framework for Exploring Multivariate Data Sets

IEEE Transactions on Visualization and Computer Graphics ◽

10.1109/tvcg.2013.133 ◽

2013 ◽

Vol 19 (12) ◽

pp. 2683-2692 ◽

Cited By ~ 36

Author(s):

Ayan Biswas ◽

Soumya Dutta ◽

Han-Wei Shen ◽

Jonathan Woodring

Keyword(s):

Multivariate Data ◽

Data Sets
