Continuously Bulk Loading over Range Partitioned Tables for Large Scale Historical Data

Historical data pose a variety of problems to those who seek statistically based understandings of the past. Quantitative historical analysis has been limited by researchers’ reliance on rigid statistics collected by individuals or agencies, or else by researcher access to small samples of raw data. Even digital technologies by themselves have not been enough to overcome the challenges of working with manuscript sources and aligning disaggregated data. However, by coupling the facilities enabled by the web with the enthusiasm of the public for explorations of the past, history has started to make the same strides towards big data evident in other fields. While the use of citizens to crowdsource research data was pioneered within the sciences, a number of projects have similarly begun to draw on the help of citizen historians. This article explores the particular example of the Prosecution Project, which since 2014 has been using crowdsourced volunteers on a research collaboration to build a large-scale relational database of criminal prosecutions throughout Australia from the early 1800s to the 1960s. The article outlines the opportunities and challenges faced by projects seeking to use web technologies to access, store and re-use historical data in an environment that increasingly enables creative collaborations between researchers and other users of social and historical data.

An Analysis of Northeast Asian Maritime Territorial Disputes Using Korean News Articles as Large-Scale Historical Data

International Area Studies Review ◽

10.21212/iasr.24.3.4 ◽

2020 ◽

Vol 24 (3) ◽

pp. 71-100

Author(s):

Jangseop Byeon ◽

Jumong Na ◽

Mihwa Shim

Keyword(s):

Large Scale ◽

Historical Data ◽

Territorial Disputes ◽

Northeast Asian

Penguin: Efficient Query-Based Framework for Replaying Large Scale Historical Data

IEEE Transactions on Parallel and Distributed Systems ◽

10.1109/tpds.2018.2829759 ◽

2018 ◽

Vol 29 (10) ◽

pp. 2333-2345 ◽

Cited By ~ 1

Author(s):

Rong Gu ◽

Yufa Zhou ◽

Zhaokang Wang ◽

Chunfeng Yuan ◽

Yihua Huang

Keyword(s):

Large Scale ◽

Historical Data

Regional response to large-scale emergency events: Building on historical data

International Journal of Critical Infrastructure Protection ◽

10.1016/j.ijcip.2015.07.003 ◽

2015 ◽

Vol 11 ◽

pp. 12-21 ◽

Cited By ~ 5

Author(s):

Carol Romanowski ◽

Rajendra Raj ◽

Jennifer Schneider ◽

Sumita Mishra ◽

Vinay Shivshankar ◽

...

Keyword(s):

Large Scale ◽

Historical Data ◽

Emergency Events

The Ịjọ element in Berbice Dutch

Language in Society ◽

10.1017/s0047404500012124 ◽

1987 ◽

Vol 16 (1) ◽

pp. 49-89 ◽

Cited By ~ 24

Author(s):

Norval S. H. Smith ◽

Ian E. Robertson ◽

Kay Williamson

Keyword(s):

Large Scale ◽

Historical Data ◽

Test Case ◽

European Language ◽

Atlantic Region ◽

The Third ◽

Sequence Of Events ◽

Slave Traders ◽

Creole Languages ◽

Substrate Influence

Abstract: Berbice Dutch is one of two recently rediscovered Dutch-based Creole languages spoken in Guyana. It is spoken in the county of Berbice, which corresponds to the former Dutch colony of Berbice, founded in the early seventeenth century. This language possesses certain features that make it unique in comparison to other European language-based Creoles spoken in the Atlantic region. Because of these unique features, it represents a promising test case for the presence of substrate influence, and as such, is of obvious relevance for the present creolist debate between substratists and universalists. The article discusses four different conceivable hypotheses to explain the origin of Berbice Dutch. The first of these assumes that a mixed Dutch–Kalaḅarị trading jargon was developed in Africa as a result of the operations of the slave traders, and that this formed the basis of Berbice Dutch. The second hypothesis depends critically on the ethnic homogeneity of the slaves. This hypothesis would assume that the planters/overseers in Berbice attempted to learn those aspects of Eastern Ịjọ that could be utilized on the plantations. The third hypothesis assumes that Berbice Dutch is genetically descended from Eastern Ịjọ, but that this is not obvious due to large-scale relexification. The fourth hypothesis assumes that Eastern Ịjọ was replaced by Berbice Dutch under the catalysing influence of (creole) Dutch, rather as the fully inflected Romani language was replaced in England by the creolized Anglo-romani under the catalysing influence of English. The hypothesis that is selected as probably the best is the fourth, where it is argued that Berbice Dutch was adopted as the language of the Berbice slaves because it offered a means of expressing the identity of a newly created “ethnic” group. The most important moral that can be drawn from this article is that the development of each Creole must be examined individually. Only after such an examination has taken place for a significant number of Creoles will it be possible to define what is meant by creolization. In addition to the detailed linguistic examination required, it will also be necessary to carry out detailed (socio)historical work demonstrating if possible that the linguistic sequence of events is supported by the available historical data. (Creole language, substrate, Ịjọ language, ethnicity, mixed language)

Novel Coronavirus Disease (COVID-19) and Suicide: Conceptual and Practical Considerations

10.31234/osf.io/z7drs ◽

2020 ◽

Author(s):

Ravi Philip Rajkumar

Keyword(s):

Large Scale ◽

Preventive Measures ◽

Historical Data ◽

Disease Outbreaks ◽

Distinctive Features ◽

Global Pandemic ◽

Degree Of Control ◽

Novel Coronavirus ◽

Spread Of Infection

The global pandemic of novel coronavirus disease (COVID-19) has had an adverse impact on the mental health of millions. Historical data show that large-scale disease outbreaks are associated with elevated rates of suicide in both the short and the long term. There are certain distinctive features of the COVID-19 outbreak, from a biological as well as a psychological and social perspective, that make it likely that it will be associated with a significant increase in suicidality, which may persist even after a certain degree of control has been achieved over the spread of infection. In this article, relevant historical and current literature pertaining to the association between COVID-19 and suicide is summarized and analyzed, and recommendations for preventive measures are outlined.

Practical Applications of Diagnostic Data Science in Drilling and Completions

10.2118/206234-ms ◽

2021 ◽

Author(s):

Chad Senters ◽

Swathika Jayakumar ◽

Mark Warren ◽

Mike Wells ◽

Rachel Harper ◽

...

Keyword(s):

Case Studies ◽

Large Scale ◽

Data Science ◽

Oil And Gas ◽

Historical Data ◽

Oil And Gas Industry ◽

Diagnostic Data ◽

Practical Applications ◽

The Right

Abstract: The application of data science remains relatively new to the oil and gas industry but continues to gain traction on many projects due to its potential to assist in solving complex problems. The amount and quality of the right type of data can be as much of a limitation as the complex algorithms and programming required. The scope of any data science project should look for easy wins early on and not attempt an all-encompassing solution with the click of a button (although that would be amazing). This paper focuses on several specific applications of data applied to a sizable database to extract useful solutions and provide an approach for data science on future projects. The first step when applying data analytics is to build a suitable database. This might appear rudimentary at first glance, but historical data is seldom catalogued optimally for future projects. This is especially true if specific portions of the recorded data were not known to be of use in solving future problems. The approach to improving the quality of the database for this paper is to establish requirements for the data science objectives and apply this to past, present and future data. Once the data are in the right "format", the extensive process of quality control can begin. Although this part of the paper is not the most exciting, it might be the most important, as most programming yields the same "garbage in = garbage out" equation. After the data have found a home and are quality checked, the data science can be applied. Case studies are presented based on the application of diagnostic data from an extensive project/well database. To leverage historical data in new projects, metrics are created as a benchmarking tool. The case studies in this paper include metrics such as the Known Lateral Contribution (KLC), Heel-to-Toe Ratio (HTR), Communication Intensity (CI), Proppant Efficiency (PE) and stage level performance. These results are compared to additional stimulation and geological information. This paper includes case studies that apply data science to diagnostics on a large scale to deliver actionable results. The results discussed will allow for the utilization of this approach in future projects and provide a roadmap to better understand diagnostic results as they relate to drilling and completion activity.

Large-scale changes in the littoral fish communities of lakes in southeastern Ontario, Canada

Canadian Journal of Zoology ◽

10.1139/cjz-2017-0080 ◽

2018 ◽

Vol 96 (7) ◽

pp. 753-759

Author(s):

Paul A. Finigan ◽

Nicholas E. Mandrak ◽

Bruce L. Tufts

Keyword(s):

Fish Community ◽

Large Scale ◽

Historical Data ◽

Fish Communities ◽

Biodiversity Loss ◽

Data Sets ◽

Inland Lakes ◽

Biodiversity Change ◽

North American Freshwater Fish ◽

Management Approaches

Biodiversity loss is a serious issue for freshwater fishes in temperate climates and there is a need for more information in this area. A study was conducted to assess fish community changes in the littoral zone of 22 lakes over a 45-year period (comparing 1969–1979 with 2014). To compare fish communities, historical seining records were compiled for 22 inland lakes and compared with contemporary data sampled using the same protocol. Fish abundance data analyzed using a multivariate approach identified a shift from cyprinid-dominated communities to centrarchid-dominated communities between time periods. There was no evidence to support a strong influence of invasive species on these communities, but there have been significant changes in temperature and land use around these lakes since the historical data sets were collected. This is an important contribution to our understanding of biodiversity change in North American freshwater fish communities and may influence fisheries management approaches in the future.
