PANDORA Talks: Personality and Demographics on Reddit

Mapping Intimacies ◽

10.31234/osf.io/94xcp ◽

2020 ◽

Author(s):

Matej Gjurković ◽

Mladen Karan ◽

Iva Vukojević ◽

Mihaela Bošnjak ◽

Jan Šnajder

Keyword(s):

Social Sciences ◽

Large Scale ◽

Demographic Variables ◽

Gender Classification ◽

Big 5 ◽

Psychological Theories ◽

Large Scale Dataset ◽

Personality Models

Personality and demographics are important variables in social sciences, while in NLP they can aid in interpretability and removal of societal biases. However, datasets with both personality and demographic labels are scarce. To address this, we present PANDORA, the first large-scale dataset of Reddit comments labeled with three personality models (including the well-established Big 5 model) and demographics (age, gender, and location) for more than 10k users. We showcase the usefulness of this dataset on three experiments, where we leverage the more readily available data from other personality models to predict the Big 5 traits, analyze gender classification biases arising from psycho-demographic variables, and carry out a confirmatory and exploratory analysis based on psychological theories. Finally, we present benchmark prediction models for all personality and demographic variables.

Survey of Clustering Methods for Large Scale Dataset

International Journal of Computer Sciences and Engineering ◽

10.26438/ijcse/v7i5.13381344 ◽

2019 ◽

Vol 7 (5) ◽

pp. 1338-1344

Author(s):

Anupama Jawale ◽

Ganesh Magar

Keyword(s):

Large Scale ◽

Clustering Methods ◽

Large Scale Dataset

Archaeological survey and excavation of the Mingtepa Site in Andijan Region, Uzbekistan

Chinese Archaeology ◽

10.1515/char-2019-0011 ◽

2019 ◽

Vol 19 (1) ◽

pp. 150-162

Keyword(s):

Social Sciences ◽

Inner City ◽

Large Scale ◽

Rammed Earth ◽

Archaeological Survey ◽

Ancient City ◽

City Walls ◽

Academy Of Sciences ◽

Cultural Connotations

Since 2012, the Institute of Archaeology of Chinese Academy of Social Sciences and Institute of Archaeology of the Academy of Sciences of Uzbekistan organized a joint archaeological team and conducted five terms of archaeological survey and excavation of the Mingtepa Ancient City Site in Uzbekistan. The excavation showed that the Mingtepa Ancient City Site is a large-scale city site with nested inner and outer cities; confirmed the coexistence relationships among the architectural sites with high rammed-earth platform foundations, city walls, gates, roads and handicraft workshop remains, which are the scientific evidences for the in-depth researches on the layout and cultural connotations of the inner city; the burials found on the east wall of the outer city provided rare data of the terminus ante quem of the abandoning of the outer city.

Joint regression and learning from pairwise rankings for personalized image aesthetic assessment

Computational Visual Media ◽

10.1007/s41095-021-0207-y ◽

2021 ◽

Author(s):

Jin Zhou ◽

Qing Zhang ◽

Jian-Hao Fan ◽

Wei Sun ◽

Wei-Shi Zheng

Keyword(s):

Large Scale ◽

Assessment Model ◽

Generic Model ◽

Small Subset ◽

Deep Convolutional Neural Networks ◽

Personal Taste ◽

Hinge Loss ◽

Novel Approach ◽

Large Scale Dataset ◽

Image Pairs

Recent image aesthetic assessment methods have achieved remarkable progress due to the emergence of deep convolutional neural networks (CNNs). However, these methods focus primarily on predicting generally perceived preference of an image, making them usually have limited practicability, since each user may have completely different preferences for the same image. To address this problem, this paper presents a novel approach for predicting personalized image aesthetics that fit an individual user’s personal taste. We achieve this in a coarse to fine manner, by joint regression and learning from pairwise rankings. Specifically, we first collect a small subset of personal images from a user and invite him/her to rank the preference of some randomly sampled image pairs. We then search for the K-nearest neighbors of the personal images within a large-scale dataset labeled with average human aesthetic scores, and use these images as well as the associated scores to train a generic aesthetic assessment model by CNN-based regression. Next, we fine-tune the generic model to accommodate the personal preference by training over the rankings with a pairwise hinge loss. Experiments demonstrate that our method can effectively learn personalized image aesthetic preferences, clearly outperforming state-of-the-art methods. Moreover, we show that the learned personalized image aesthetic benefits a wide variety of applications.
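
As a rough illustration of the fine-tuning step described in the abstract above, a pairwise hinge loss over ranked image pairs can be sketched as follows. This is a minimal sketch, not the authors' implementation: the function names, the default margin of 1.0, and the use of plain Python floats instead of CNN outputs are all assumptions made for the example.

```python
def pairwise_hinge_loss(score_preferred, score_other, margin=1.0):
    """Hinge loss for one ranked pair.

    The loss is zero once the preferred image outscores the other
    image by at least `margin` (an assumed value, not from the paper).
    """
    return max(0.0, margin - (score_preferred - score_other))


def mean_pairwise_hinge_loss(score_pairs, margin=1.0):
    """Average the hinge loss over (preferred, other) score pairs."""
    if not score_pairs:
        return 0.0
    total = sum(pairwise_hinge_loss(p, o, margin) for p, o in score_pairs)
    return total / len(score_pairs)
```

A pair that is already ranked correctly with a gap of at least the margin contributes nothing, so training effort concentrates on pairs the model still orders wrongly or too narrowly.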

VIPPrint: Validating Synthetic Image Detection and Source Linking Methods on a Large Scale Dataset of Printed Documents

Journal of Imaging ◽

10.3390/jimaging7030050 ◽

2021 ◽

Vol 7 (3) ◽

pp. 50

Author(s):

Anselmo Ferreira ◽

Ehsan Nowroozi ◽

Mauro Barni

Keyword(s):

Large Scale ◽

State Of The Art ◽

Child Pornography ◽

Forensic Analysis ◽

Synthetic Image ◽

Image Detection ◽

Face Images ◽

Large Scale Dataset ◽

Scanned Images ◽

Analysis Of The Images

The possibility of carrying out a meaningful forensic analysis on printed and scanned images plays a major role in many applications. First of all, printed documents are often associated with criminal activities, such as terrorist plans, child pornography, and even fake packages. Additionally, printing and scanning can be used to hide the traces of image manipulation or the synthetic nature of images, since the artifacts commonly found in manipulated and synthetic images are gone after the images are printed and scanned. A problem hindering research in this area is the lack of large scale reference datasets to be used for algorithm development and benchmarking. Motivated by this issue, we present a new dataset composed of a large number of synthetic and natural printed face images. To highlight the difficulties associated with the analysis of the images of the dataset, we carried out an extensive set of experiments comparing several printer attribution methods. We also verified that state-of-the-art methods to distinguish natural and synthetic face images fail when applied to print and scanned images. We envision that the availability of the new dataset and the preliminary experiments we carried out will motivate and facilitate further research in this area.

The Relevance of Inequality Research in Sociology for Inequality Reduction

Socius Sociological Research for a Dynamic World ◽

10.1177/23780231211020199 ◽

2021 ◽

Vol 7 ◽

pp. 237802312110201

Author(s):

Thomas A. DiPrete ◽

Brittany N. Fox-Williams

Keyword(s):

Social Sciences ◽

Social Policy ◽

Social Inequality ◽

Large Scale ◽

Sociological Research ◽

Life Chances ◽

The Past ◽

The Social ◽

Large Scale Change ◽

The Subject

Social inequality is a central topic of research in the social sciences. Decades of research have deepened our understanding of the characteristics and causes of social inequality. At the same time, social inequality has markedly increased during the past 40 years, and progress on reducing poverty and improving the life chances of Americans in the bottom half of the distribution has been frustratingly slow. How useful has sociological research been to the task of reducing inequality? The authors analyze the stance taken by sociological research on the subject of reducing inequality. They identify an imbalance in the literature between the discipline’s continual efforts to motivate the plausibility of large-scale change and its lesser efforts to identify feasible strategies of change either through social policy or by enhancing individual and local agency with the potential to cumulate into meaningful progress on inequality reduction.

ShadingNet: Image Intrinsics by Fine-Grained Shading Decomposition

International Journal of Computer Vision ◽

10.1007/s11263-021-01477-5 ◽

2021 ◽

Author(s):

Anil S. Baslamisli ◽

Partha Das ◽

Hoang-An Le ◽

Sezer Karaoglu ◽

Theo Gevers

Keyword(s):

Neural Network ◽

Large Scale ◽

State Of The Art ◽

Image Decomposition ◽

Natural Environments ◽

Decomposition Algorithms ◽

Ambient Light ◽

Fine Grained ◽

Large Scale Dataset ◽

Direct Illumination

In general, intrinsic image decomposition algorithms interpret shading as one unified component including all photometric effects. As shading transitions are generally smoother than reflectance (albedo) changes, these methods may fail in distinguishing strong photometric effects from reflectance variations. Therefore, in this paper, we propose to decompose the shading component into direct (illumination) and indirect shading (ambient light and shadows) subcomponents. The aim is to distinguish strong photometric effects from reflectance variations. An end-to-end deep convolutional neural network (ShadingNet) is proposed that operates in a fine-to-coarse manner with a specialized fusion and refinement unit exploiting the fine-grained shading model. It is designed to learn specific reflectance cues separated from specific photometric effects to analyze the disentanglement capability. A large-scale dataset of scene-level synthetic images of outdoor natural environments is provided with fine-grained intrinsic image ground-truths. Large scale experiments show that our approach using fine-grained shading decompositions outperforms state-of-the-art algorithms utilizing unified shading on NED, MPI Sintel, GTA V, IIW, MIT Intrinsic Images, 3DRMS and SRD datasets.

Building Damage Detection Using U-Net with Attention Mechanism from Pre- and Post-Disaster Remote Sensing Datasets

Remote Sensing ◽

10.3390/rs13050905 ◽

2021 ◽

Vol 13 (5) ◽

pp. 905

Author(s):

Chuyi Wu ◽

Feng Zhang ◽

Junshi Xia ◽

Yichen Xu ◽

Guoqing Li ◽

...

Keyword(s):

Damage Assessment ◽

Large Scale ◽

Binary Classification ◽

Open Data ◽

Building Damage ◽

Attention Mechanism ◽

Large Scale Dataset ◽

Data Program ◽

The Impact ◽

Post Disaster

The building damage status is vital to plan rescue and reconstruction after a disaster and is also hard to detect and judge its level. Most existing studies focus on binary classification, and the attention of the model is distracted. In this study, we proposed a Siamese neural network that can localize and classify damaged buildings at one time. The main parts of this network are a variety of attention U-Nets using different backbones. The attention mechanism enables the network to pay more attention to the effective features and channels, so as to reduce the impact of useless features. We train them using the xBD dataset, which is a large-scale dataset for the advancement of building damage assessment, and compare their result balanced F (F1) scores. The score demonstrates that the performance of SEresNeXt with an attention mechanism gives the best performance, with the F1 score reaching 0.787. To improve the accuracy, we fused the results and got the best overall F1 score of 0.792. To verify the transferability and robustness of the model, we selected the dataset on the Maxar Open Data Program of two recent disasters to investigate the performance. By visual comparison, the results show that our model is robust and transferable.
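
The balanced F (F1) score reported in the abstract above is the harmonic mean of precision and recall. A minimal sketch of that computation from raw counts follows; the function name and the zero-denominator convention are assumptions for illustration, and the snippet is not taken from the paper or from the xBD evaluation tooling.

```python
def f1_score(true_pos, false_pos, false_neg):
    """Return the F1 score computed from raw prediction counts.

    Uses the identity F1 = 2*TP / (2*TP + FP + FN), which equals the
    harmonic mean of precision TP/(TP+FP) and recall TP/(TP+FN).
    Returns 0.0 when there are no positives at all (an assumed
    convention for the undefined case).
    """
    denom = 2 * true_pos + false_pos + false_neg
    if denom == 0:
        return 0.0
    return 2 * true_pos / denom
```

For multi-level damage classification, a score like this would typically be computed per class and then combined; how the paper combines them is not stated in the abstract.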

Microcomputer Simulations of Presidential Elections

News for Teachers of Political Science ◽

10.1017/s0197901900005110 ◽

1983 ◽

Vol 38 ◽

pp. 20-20

Author(s):

Robert S. Ross

Keyword(s):

Social Sciences ◽

Large Scale ◽

Role Playing ◽

Science Students ◽

Instructional Programs ◽

Santa Barbara ◽

Large Scale Systems ◽

The Social ◽

User Friendly ◽

Machine Interaction

Simulations have been an important adjunct to instructional programs for some time. These have ranged from games, or role playing exercises, such as SIMSOC or Internation Simulation, to student-machine interaction, such as the inter-school simulation run out of University of California, Santa Barbara in the early 70's, to the all machine activities found in some of the early SETUPS. Having social science students use the mainframe computer, however, always posed problems: it definitely was not user-friendly and most instructors had little if any training or interest in the use of large scale systems. The wide-spread use of the micro computer is not only revolutionizing areas traditionally relying upon the computer, but is going to have an impact on the social sciences as well.

K-AP Clustering Algorithm for Large Scale Dataset

2011 First International Workshop on Complexity and Data Mining ◽

10.1109/iwcdm.2011.28 ◽

2011 ◽

Cited By ~ 1

Author(s):

Chao Liu ◽

Rosemary Hey ◽

Wei Wang

Keyword(s):

Large Scale ◽

Clustering Algorithm ◽

Large Scale Dataset ◽

Ap Clustering

Large-scale dataset from China gives new insights into leaf margin–temperature relationships

Palaeogeography Palaeoclimatology Palaeoecology ◽

10.1016/j.palaeo.2014.03.016 ◽

2014 ◽

Vol 402 ◽

pp. 73-80 ◽

Cited By ~ 10

Author(s):

Wen-Yun Chen ◽

Tao Su ◽

Jonathan M. Adams ◽

Frédéric M.B. Jacques ◽

David K. Ferguson ◽

...

Keyword(s):

Large Scale ◽

Leaf Margin ◽

Large Scale Dataset
