Large-Scale Diversity Estimation Through Surname Origin Inference

Author(s):  
Antoine Mazières ◽  
Camille Roth

The study of surnames as both linguistic and geographical markers of the past has proven valuable in several research fields spanning from biology and genetics to demography and social mobility. This article builds on the existing literature to conceive and develop a surname origin classifier based on a data-driven typology. This enables us to explore a methodology to describe large-scale estimates of the relative diversity of social groups, especially when such data is scarcely available. We subsequently analyze the representativeness of surname origins for 15 socio-professional groups in France.

Author(s):  
Cheng Meng ◽  
Ye Wang ◽  
Xinlian Zhang ◽  
Abhyuday Mandal ◽  
Wenxuan Zhong ◽  
...  

With advances in technology over the past decade, the amount of data generated and recorded has grown enormously in virtually all fields of industry and science. This extraordinary amount of data provides unprecedented opportunities for data-driven decision-making and knowledge discovery. However, the task of analyzing such large-scale datasets poses significant challenges and calls for innovative statistical methods specifically designed for faster speed and higher efficiency. In this chapter, we review currently available methods for big data, with a focus on subsampling methods using statistical leveraging and on divide-and-conquer methods.


2016 ◽  
Vol 8 (3) ◽  
pp. 310-322 ◽  
Author(s):  
Jordan Carpenter ◽  
Daniel Preotiuc-Pietro ◽  
Lucie Flekova ◽  
Salvatore Giorgi ◽  
Courtney Hagan ◽  
...  

People associate certain behaviors with certain social groups. These stereotypical beliefs consist of both accurate and inaccurate associations. Using large-scale, data-driven methods with social media as a context, we isolate stereotypes by using verbal expression. Across four social categories—gender, age, education level, and political orientation—we identify words and phrases that lead people to incorrectly guess the social category of the writer. Although raters often correctly categorize authors, they overestimate the importance of some stereotype-congruent signal. Findings suggest that data-driven approaches might be a valuable and ecologically valid tool for identifying even subtle aspects of stereotypes and highlighting the facets that are exaggerated or misapplied.


Author(s):  
Meike Klettke ◽  
Uta Störl

Data-driven methods and data science are important scientific methods in many research fields. All data science approaches require professional data engineering components. At the moment, computer science experts are needed to solve these data engineering tasks. At the same time, scientists from many fields (such as natural sciences, medicine, environmental sciences, and engineering) want to analyse their data autonomously. The resulting task for data engineering is the development of tools that support automated data curation and are usable by domain experts. In this article, we introduce four generations of data engineering approaches, classifying the data engineering technologies of the past and present. We show which data engineering tools are needed for the scientific landscape of the next decade.


2021 ◽  
Author(s):  
Hongfei Du ◽  
Yue Liang ◽  
Peilian Chi ◽  
Ronnel B. King

Perceptions of social mobility vary across countries. However, past studies have mostly focused on populations in Western developed countries. Little is known about perceptions of social mobility in non-Western developing countries. The current paper focuses on Chinese perceptions of social mobility using a large-scale nationally representative sample. We found that, overall, Chinese believed in upward social mobility. Moreover, different patterns of perceptions of social mobility were identified, which suggest that respondents experienced either upward or downward social mobility in the past, but all of them thought that they can move up in the future. Perceptions of social mobility were also linked to important socio-demographic and economic factors. We discuss these findings in relation to the Chinese economic context.


2021 ◽  
Author(s):  
Roel Smeets

Fiction has a major social impact, not least because it co-shapes the image that society has of various social groups. Drawing on a collection of 170 contemporary Dutch-language novels, Character Constellations presents a range of data-driven, statistical models to study depictions of characters in terms of gender, race, ethnicity, class, age, sexuality, and other identity categories. Incorporating the tools of network analysis, each chapter highlights an aspect of fictional social networks that affects the representation of social groups: their centrality, their communities, and their conflicts. While reading individual novels in light of emerging statistical patterns, combining the formal methods of social network analysis with the interpretive tools of narratology, this study shows how central societal themes such as (in)equality and emancipation, integration and segregation, and social mobility and class struggle are foregrounded, replicated, or distorted in the Dutch novel. Showcasing what character-based critiques of literary representation gain by integrating data-driven methods into the practice of critical close reading, Character Constellations contributes to societal debates on cultural representation and identity and the role fiction and art have in those debates.


Author(s):  
Dominique Barjot

Historiography on the French post-World War Two economic purge has in the past been very limited. Recently, however, a radical change has occurred as a result of the intersection of two previously separate research fields: on the one hand, economic and business life during the Occupation, and on the other, the purge of elites and other social groups. A conference addressing French firms during the Occupation period paved the way for a synthesis around three axes. Firstly, it was necessary to estimate the effects of measures to seize illicit profits and to assess the impact of purges on business mobility after the War. Secondly, regional approaches could be used to define a French typology, which could then be compared to other occupied countries (Belgium, for example) or occupying nations (Germany). Thirdly, branches, sectors and firms needed to be studied. Among these studies, two sectors have been privileged so far: the car industry, and construction and civil engineering.


Author(s):  
Yulia P. Melentyeva

In recent years, both the general public and specialists have shown great interest in matters of reading. Following the discussion and launch of the “Support and Development of Reading National Program”, many Russian libraries are organizing large-scale events such as marathons, lecture cycles, and bibliographic trainings, which should draw the attention of different social groups to reading. Individual forms of attracting people to reading are used much more rarely. In the author’s view, the main reason for this is the lack of information about forms and methods of attracting people to reading.


2020 ◽  
Author(s):  
Lungwani Muungo

The purpose of this review is to evaluate progress in molecular epidemiology over the past 24 years in cancer etiology and prevention to draw lessons for future research incorporating the new generation of biomarkers. Molecular epidemiology was introduced in the study of cancer in the early 1980s, with the expectation that it would help overcome some major limitations of epidemiology and facilitate cancer prevention. The expectation was that biomarkers would improve exposure assessment, document early changes preceding disease, and identify subgroups in the population with greater susceptibility to cancer, thereby increasing the ability of epidemiologic studies to identify causes and elucidate mechanisms in carcinogenesis. The first generation of biomarkers has indeed contributed to our understanding of risk and susceptibility related largely to genotoxic carcinogens. Consequently, interventions and policy changes have been mounted to reduce risk from several important environmental carcinogens. Several new and promising biomarkers are now becoming available for epidemiologic studies, thanks to the development of high-throughput technologies and theoretical advances in biology. These include toxicogenomics, alterations in gene methylation and gene expression, proteomics, and metabonomics, which allow large-scale studies, including discovery-oriented as well as hypothesis-testing investigations. However, most of these newer biomarkers have not been adequately validated, and their role in the causal paradigm is not clear. There is a need for their systematic validation using principles and criteria established over the past several decades in molecular cancer epidemiology.

