Large-Scale Estimates of LGBQ-Heterosexual Disparities in the Presence of Potentially Mischievous Responders: A Preregistered Replication and Comparison of Methods

Although numerous survey-based studies have found that students who identify as lesbian, gay, bisexual, or questioning (LGBQ) have elevated risk for many negative academic, disciplinary, psychological, and health outcomes, the validity of the types of data on which these results rest have come under increased scrutiny. Over the past several years, a variety of data-validity screening techniques have been used in attempts to scrub data sets of “mischievous responders,” youth who systematically provide extreme and untrue responses to outcome items and who tend to falsely report being LGBQ. We conducted a preregistered replication of Cimpian et al. with the 2017 Youth Risk Behavior Survey to (1) estimate new LGBQ-heterosexual disparities on 20 outcomes; (2) test a broader, mechanistic theory relating mischievousness effects with a feature of items (i.e., item response-option extremity); and (3) compare four techniques used to address mischievous responders. Our results are consistent with Cimpian et al.’s findings that potentially mischievous responders inflate LGBQ-heterosexual disparities, do so more among boys than girls, and affect outcomes differentially. For example, we find that removing students suspected of being mischievous responders can cut male LGBQ-heterosexual disparities in half overall and can completely or mostly eliminate disparities in outcomes including fighting at school, driving drunk, and using cocaine, heroin, and ecstasy. Methodologically, we find that some methods are better than others at addressing the issue of data integrity, with boosted regressions coupled with data removal leading to potentially very large decreases in the estimates of LGBQ-heterosexual disparities, but regression adjustment having almost no effect. While the empirical focus of this article is on LGBQ youth, the issues discussed are relevant to research on other minority groups and youth generally, and speak to survey development, methodology, and the robustness and transparency of research.

Download Full-text

Advanced Bridge Automation

Marine Technology and SNAME News ◽

10.5957/mt1.1993.30.4.276 ◽

1993 ◽

Vol 30 (04) ◽

pp. 276-285

Author(s):

Edward Denham

Keyword(s):

Human Error ◽

Environmental Data ◽

Management Systems ◽

Propulsion Systems ◽

The Past ◽

Key Technologies ◽

Routine Tasks ◽

Information Management Systems ◽

Better Than ◽

Do So

The past thirty years have seen great advances in many areas of the technologies used in naval vessels. Propulsion systems, machinery automation, and information management systems have all undergone revolutionary changes. The bridges of these ships have similarly seen the advent of many new sources of navigational and environmental data. The process of correlating and interpreting all of this information has until now remained very labor-intensive, subject to human error at many stages of the process. In response to this challenge, a suite of new equipment has been developed for distributing, displaying, correlating, and logging shipboard data. This equipment automates most of the low-level, routine tasks involved in navigating a vessel at sea, significantly reducing the stress and workload of bridge personnel. This gives the humans on the bridge more time for doing the job that they do so much better than machines: making decisions. This paper focuses on the key technologies that are used in these new products and the advances in bridge design and automation they make possible. The benefits of these new capabilities to system designers, to shipbuilders, and to ship operators are also explored.

Download Full-text

Not all written in stone: interdisciplinary syntheses in echinoderm paleontology

Canadian Journal of Zoology ◽

10.1139/z00-217 ◽

2001 ◽

Vol 79 (7) ◽

pp. 1209-1231 ◽

Cited By ~ 16

Author(s):

Rich Mooi

Keyword(s):

Evolutionary History ◽

Large Scale ◽

Body Wall ◽

Developmental Trajectories ◽

Large Data ◽

Large Data Sets ◽

Data Sets ◽

Primary Target ◽

The Past ◽

Broad Scale

The fossil record of the Echinodermata is relatively complete, and is represented by specimens retaining an abundance of features comparable to that found in extant forms. This yields a half-billion-year record of evolutionary novelties unmatched in any other major group, making the Echinodermata a primary target for studies of biological change. Not all of this change can be understood by studying the rocks alone, leading to synthetic research programs. Study of literature from the past 20 years indicates that over 1400 papers on echinoderm paleontology appeared in that time, and that overall productivity has remained almost constant. Analysis of papers appearing since 1990 shows that research is driven by new finds including, but not restricted to, possible Precambrian echinoderms, bizarre new edrioasteroids, early crinoids, exquisitely preserved homalozoans, echinoids at the K-T boundary, and Antarctic echinoids, stelleroids, and crinoids. New interpretations of echinoderm body wall homologies, broad-scale syntheses of embryological information, the study of developmental trajectories through molecular markers, and the large-scale ecological and phenotypic shifts being explored through morphometry and analyses of large data sets are integrated with study of the fossils themselves. Therefore, recent advances reveal a remarkable and continuing synergistic expansion in our understanding of echinoderm evolutionary history.

Download Full-text

Time Orientation: Past, Present, and Future Perceptions

Psychological Reports ◽

10.2466/pr0.1989.64.3c.1199 ◽

1989 ◽

Vol 64 (3_suppl) ◽

pp. 1199-1205 ◽

Cited By ~ 26

Author(s):

Leonard A. Jason ◽

Jennifer Schade ◽

Louise Furo ◽

Arne Reichler ◽

Clifford Brickman

Keyword(s):

Quality Of Life ◽

Religious Orientation ◽

Large Scale ◽

Time Orientation ◽

Nuclear War ◽

Future Expectations ◽

The Past ◽

Future Perceptions ◽

Better Than

A survey was conducted to assess people's time orientation or where they spend most of their thinking time: past, present or future. 100 women were also asked about their expectations for the quality of life in 20 co 30 yr. and about the odds of a large-scale nuclear war within 30 yr. Respondents thought almost twice as much about the present and future as the past. They rated the quality of life in 20 to 30 yr. as being the same as or slightly better than now. A nuclear war within 30 yr. was considered possible; religious orientation had a strong effect. No significant relationship was found between time orientation and future expectations.

Download Full-text

Understanding Offline Password-Cracking Methods: A Large-Scale Empirical Study

Security and Communication Networks ◽

10.1155/2021/5563884 ◽

2021 ◽

Vol 2021 ◽

pp. 1-16

Author(s):

Ruixin Shi ◽

Yongbin Zhou ◽

Yong Li ◽

Weili Han

Keyword(s):

Empirical Study ◽

Large Scale ◽

Ad Hoc ◽

Empirical Evaluation ◽

Comparative Investigation ◽

Academic Community ◽

Data Sets ◽

The Past ◽

Cracking Performance ◽

Password Cracking

Researchers proposed several data-driven methods to efficiently guess user-chosen passwords for password strength metering or password recovery in the past decades. However, these methods are usually evaluated under ad hoc scenarios with limited data sets. Thus, this motivates us to conduct a systematic and comparative investigation with a very large-scale data corpus for such state-of-the-art cracking methods. In this paper, we present the large-scale empirical study on password-cracking methods proposed by the academic community since 2005, leveraging about 220 million plaintext passwords leaked from 12 popular websites during the past decade. Specifically, we conduct our empirical evaluation in two cracking scenarios, i.e., cracking under extensive-knowledge and limited-knowledge. The evaluation concludes that no cracking method may outperform others from all aspects in these offline scenarios. The actual cracking performance is determined by multiple factors, including the underlying model principle along with dataset attributes such as length and structure characteristics. Then, we perform further evaluation by analyzing the set of cracked passwords in each targeting dataset. We get some interesting observations that make sense of many cracking behaviors and come up with some suggestions on how to choose a more effective password-cracking method under these two offline cracking scenarios.

Download Full-text

Why the pretense of pursuing democracy? The necessity and rationality of democratic slogans for civil revolutions

Asian Journal of Comparative Politics ◽

10.1177/2057891118792627 ◽

2018 ◽

Vol 4 (3) ◽

pp. 242-257

Author(s):

Taku Yukawa ◽

Kaoru Hidaka

Keyword(s):

Social Inequality ◽

Large Scale ◽

International Support ◽

Current Regime ◽

The Past ◽

Orange Revolution ◽

Per Se ◽

Do So

While democratic revolutions are not uniform in their pursuit of democracy, they do have something in common: those calling for revolution and participating in demonstrations do so under the banner of democracy. However, studies have revealed that these citizens were not at first committed to democracy per se; rather, they took the opportunity to vent their frustration against the current regime because of their struggle against poverty and social inequality. Why, then, do citizens who are not pursuing democracy per se participate in revolutions under the banner of democracy? Previous studies have failed to clarify this point. To fill this gap, we outline three strategic rationalities and necessities behind the use of “democracy” as a common slogan to justify civil revolutions: 1) organizing large scale dissident movements in a country; 2) attracting international support; and 3) imitating successful examples from the past. Evidence from the 2003 Rose Revolution in Georgia and the 2005 Orange Revolution in Ukraine supports this theory.

Download Full-text

Expert Administrators in Popular Government

American Political Science Review ◽

10.2307/1944361 ◽

1913 ◽

Vol 7 (1) ◽

pp. 45-62 ◽

Cited By ~ 3

Author(s):

A. Lawrence Lowell

Keyword(s):

Large Scale ◽

Public Affairs ◽

Human Reason ◽

The Past ◽

Self Confidence ◽

Popular Government ◽

Charles Sumner ◽

The Government ◽

Better Than ◽

Generally True

Presidents, governors and mayors certainly cannot be experts in all the matters with which they are called upon to deal, nor, as a rule, are they thoroughly expert in any of them; and in fact this is generally true of officers elected to administer public affairs. We cannot, therefore, avoid the question whether they do, or do not, need expert assistance if the government is to be efficiently conducted. The problem is not new, for the world struggled with it two thousand years ago. The fate of institutions has sometimes turned upon it, and so may the great experiment we are trying today—that of the permanence of democracy on a large scale. Americans pay little heed to the lessons taught by the painful experience of other lands, and Charles Sumner expressed a common sentiment when he remarked sarcastically his thankfulness that they knew no history in Washington. Our people have an horizon so limited, a knowledge of the past so small, a self-confidence so sublime, a conviction that they are altogether better than their fathers so profound, that they hardly realize the difficulty of their task. We assume unconsciously, as a witty writer has put it, that human reason began about thirty years ago; and yet a candid study of history shows that the essential qualities of human nature have not changed radically; that men have little more capacity or force of character than at other favored epochs. Some improvement in standards has, no doubt, taken place, and certainly the bounds of human sympathy have widened vastly; but there has been no such transformation as to justify a confidence that the men of the present day can accomplish easily and without sacrifice what to earlier generations was unattainable.

Download Full-text

A systematic comparison of two new releases of exome sequencing products: the aim of use determines the choice of product

Biological Chemistry ◽

10.1515/hsz-2015-0300 ◽

2016 ◽

Vol 397 (8) ◽

pp. 791-801 ◽

Cited By ~ 7

Author(s):

Janine Altmüller ◽

Susanne Motameny ◽

Christian Becker ◽

Holger Thiele ◽

Sreyoshi Chatterjee ◽

...

Keyword(s):

Exome Sequencing ◽

Developmental Disorders ◽

Large Scale ◽

Data Sets ◽

Early Access ◽

Whole Exome ◽

Agilent Sureselect ◽

Disease Associated Genes ◽

Diagnostic Setting ◽

Better Than

Abstract We received early access to the newest releases of exome sequencing products, namely Agilent SureSelect v6 (Agilent, Santa Clara, CA, USA) and NimbleGen MedExome (Roche NimbleGen, Basel, Switzerland), and we conducted whole exome sequencing (WES) of several DNA samples with each of these products in order to assess their performance. Here, we provide a detailed evaluation of the original, normalized (with respect to the different target sizes), and trimmed data sets and compare them in terms of the amount of duplicates, the reads on target, and the enrichment evenness. In addition to these general statistics, we performed a detailed analysis of the frequently mutated and newly described genes found in ‘The Deciphering Developmental Disorders Study’ published very recently (Fitzgerald, T.W., Gerety, S.S., Jones, W.D., van Kogelenberg, M., King, D.A., McRae, J., Morley, K.I., Parthiban, V., Al-Turki, S., Ambridge, K., et al. (2015). Large-scale discovery of novel genetic causes of developmental disorders. Nature 519, 223–228.). In our comparison, the Agilent v6 exome performs better than the NimbleGen’s MedExome both in terms of efficiency and evenness of coverage distribution. With its larger target size, it is also more comprehensive, and therefore the better choice in research projects that aim to identify novel disease-associated genes. In contrast, if the exomes are mainly used in a diagnostic setting, we see advantages for the new NimbleGen MedExome. We find a superior coverage here in those genes of high clinical relevance that likely allows for a better detection of relevant, disease-causing mutations.

Download Full-text

RNA Sequencing Data: Hitchhiker's Guide to Expression Analysis

Annual Review of Biomedical Data Science ◽

10.1146/annurev-biodatasci-072018-021255 ◽

2019 ◽

Vol 2 (1) ◽

pp. 139-173 ◽

Cited By ~ 23

Author(s):

Koen Van den Berge ◽

Katharina M. Hembach ◽

Charlotte Soneson ◽

Simone Tiberi ◽

Lieven Clement ◽

...

Keyword(s):

Gene Expression ◽

Rna Sequencing ◽

Large Scale ◽

Science Studies ◽

Data Sets ◽

Rna Seq ◽

Sequencing Data ◽

Data Types ◽

The Past ◽

Long Read

Gene expression is the fundamental level at which the results of various genetic and regulatory programs are observable. The measurement of transcriptome-wide gene expression has convincingly switched from microarrays to sequencing in a matter of years. RNA sequencing (RNA-seq) provides a quantitative and open system for profiling transcriptional outcomes on a large scale and therefore facilitates a large diversity of applications, including basic science studies, but also agricultural or clinical situations. In the past 10 years or so, much has been learned about the characteristics of the RNA-seq data sets, as well as the performance of the myriad of methods developed. In this review, we give an overview of the developments in RNA-seq data analysis, including experimental design, with an explicit focus on the quantification of gene expression and statistical approachesfor differential expression. We also highlight emerging data types, such as single-cell RNA-seq and gene expression profiling using long-read technologies.

Download Full-text

Distributed Pareto Optimization for Subset Selection

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/207 ◽

2018 ◽

Cited By ~ 2

Author(s):

Chao Qian ◽

Guiying Li ◽

Chao Feng ◽

Ke Tang

Keyword(s):

Real World ◽

Large Scale ◽

State Of The Art ◽

Subset Selection ◽

Data Sets ◽

Mapreduce Framework ◽

Real World Data ◽

Real World Applications ◽

Approximation Guarantee ◽

Better Than

The subset selection problem that selects a few items from a ground set arises in many applications such as maximum coverage, influence maximization, sparse regression, etc. The recently proposed POSS algorithm is a powerful approximation solver for this problem. However, POSS requires centralized access to the full ground set, and thus is impractical for large-scale real-world applications, where the ground set is too large to be stored on one single machine. In this paper, we propose a distributed version of POSS (DPOSS) with a bounded approximation guarantee. DPOSS can be easily implemented in the MapReduce framework. Our extensive experiments using Spark, on various real-world data sets with size ranging from thousands to millions, show that DPOSS can achieve competitive performance compared with the centralized POSS, and is almost always better than the state-of-the-art distributed greedy algorithm RandGreeDi.

Download Full-text

Maps, Distant Reading and the Internet Movie Database

VIEW Journal of European Television History and Culture ◽

10.18146/2213-0969.2018.jethc151 ◽

2018 ◽

Vol 7 (14) ◽

pp. 24 ◽

Cited By ~ 1

Author(s):

Giulia Taurino ◽

Marta Boni

Keyword(s):

Cultural Production ◽

Large Scale ◽

Data Sets ◽

Distant Reading ◽

Television Production ◽

The Past ◽

Large Scale Data ◽

Production Distribution ◽

Digital Landscape ◽

Scale Data

The presence of large-scale data sets, made available thanks to information technology, fostered in the past few years a new scholarly interest for the use of computational methods to extract, visualize and observe data in the Humanities. Scholars from various disciplines work on new models of analysis to detect and understand major patterns in cultural production, circulation and reception, following the lead, among others, of Lev Manovich’s cultural analytics. The aim is to use existing raw information in order to develop new questions and offer more answers about today’s digital landscape. Starting from these premises, and witnessing the current digitisation of television production, distribution, and reception, in this paper we ask what digital approaches based on big data can bring to the study of television series and their movements in the global mediascape.

Download Full-text