The Changing Space of Families: A Genealogical Approach

2019 ◽  
Vol 43 (1) ◽  
pp. 1-29
Author(s):  
Alice Bee Kasakoff

This article highlights the usefulness of family trees for visualizing and understanding changing patterns of kin dispersion over time. Such spatial patterns are important in gauging how families influence outcomes such as health and social mobility. The article describes how rapidly growing families, originally from England, dispersed over the US North and established hubs, where they first settled, that lasted hundreds of years, even as they repeated the process moving West. Fathers lived much closer to their adult sons in 1850 than they do today, and many more had an adult son within a radius of 30 miles. Big Data from genealogical websites is now available to map large numbers of families. Comparing one such data set with the US Census of 1880 shows that the native-born population is well represented, but the foreign born and African Americans are underrepresented in these data sets. Pedigrees become less and less representative the further back in time they go because they only include lines that have survived into the present. Despite these and other limitations, Big Data makes it possible to study family spatial dispersion going back many generations and to map past spatial connections in a wider variety of historical contexts and at a scale never before possible.

2016 ◽  
Vol 39 (11) ◽  
pp. 1477-1501 ◽  
Author(s):  
Victoria Goode ◽  
Nancy Crego ◽  
Michael P. Cary ◽  
Deirdre Thornlow ◽  
Elizabeth Merwin

Researchers need to evaluate the strengths and weaknesses of candidate data sets when choosing a secondary data set for a health care study. This research method review outlines the major issues investigators must consider when incorporating secondary data into their repertoire of potential research designs, and it surveys the range of approaches investigators may take to answer nursing research questions in a variety of content areas. The researcher needs expertise in locating and judging data sets, as well as the complex data management skills required to handle large numbers of records. Important considerations, such as firm knowledge of the research question supported by the conceptual framework and the selection of appropriate databases, guide the researcher in delineating the unit of analysis. Other, more complex issues to consider when conducting secondary data research include data access, management and security, and complex variable construction.


SPE Journal ◽  
2017 ◽  
Vol 23 (03) ◽  
pp. 719-736 ◽  
Author(s):  
Quan Cai ◽  
Wei Yu ◽  
Hwa Chi Liang ◽  
Jenn-Tai Liang ◽  
Suojin Wang ◽  
...  

Summary The oil-and-gas industry is entering an era of “big data” because of the huge number of wells drilled with the rapid development of unconventional oil-and-gas reservoirs during the past decade. The massive amount of data generated presents a great opportunity for the industry to use data-analysis tools to help make informed decisions. The main challenge is the lack of the application of effective and efficient data-analysis tools to analyze and extract useful information for the decision-making process from the enormous amount of data available. In developing tight shale reservoirs, it is critical to have an optimal drilling strategy, thereby minimizing the risk of drilling in areas that would result in low-yield wells. The objective of this study is to develop an effective data-analysis tool capable of dealing with big and complicated data sets to identify hot zones in tight shale reservoirs with the potential to yield highly productive wells. The proposed tool is developed on the basis of nonparametric smoothing models, which are superior to the traditional multiple-linear-regression (MLR) models in both the predictive power and the ability to deal with nonlinear, higher-order variable interactions. This data-analysis tool is capable of handling one response variable and multiple predictor variables. To validate our tool, we used two real data sets—one with 249 tight oil horizontal wells from the Middle Bakken and the other with 2,064 shale gas horizontal wells from the Marcellus Shale. Results from the two case studies revealed that our tool not only can achieve much better predictive power than the traditional MLR models on identifying hot zones in the tight shale reservoirs but also can provide guidance on developing the optimal drilling and completion strategies (e.g., well length and depth, amount of proppant and water injected). 
By comparing results from the two data sets, we found that our tool can achieve model performance with the big data set (2,064 Marcellus wells) with only four predictor variables that is similar to that with the small data set (249 Bakken wells) with six predictor variables. This implies that, for big data sets, even with a limited number of available predictor variables, our tool can still be very effective in identifying hot zones that would yield highly productive wells. The data sets that we have access to in this study contain very limited completion, geological, and petrophysical information. Results from this study clearly demonstrated that the data-analysis tool is certainly powerful and flexible enough to take advantage of any additional engineering and geology data to allow the operators to gain insights on the impact of these factors on well performance.
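The nonparametric smoothing idea behind the authors' tool can be illustrated with a minimal kernel smoother. The sketch below is not the paper's actual model; it is a toy Nadaraya–Watson estimator showing why a kernel-weighted local average can capture nonlinear predictor–response relationships (e.g. proppant volume versus well yield, a hypothetical pairing here) that a straight-line MLR fit would miss.

```python
import math

def nadaraya_watson(x_train, y_train, x, bandwidth=1.0):
    """Kernel-weighted local average: a minimal nonparametric smoother."""
    weights = [math.exp(-((x - xi) ** 2) / (2 * bandwidth ** 2)) for xi in x_train]
    return sum(w * yi for w, yi in zip(weights, y_train)) / sum(weights)

# Toy nonlinear relationship; a single straight line (MLR) cannot track
# this curvature, but the local average follows it closely.
xs = [0, 1, 2, 3, 4, 5]
ys = [0, 1, 4, 9, 16, 25]

estimate = nadaraya_watson(xs, ys, 2.0, bandwidth=0.3)
```

With a small bandwidth the estimate at x = 2 is pulled almost entirely from the nearby training point, recovering a value close to 4; widening the bandwidth trades variance for bias, which is the key tuning decision in any smoothing model.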


2007 ◽  
Vol 7 (3-4) ◽  
pp. 212-229 ◽  
Author(s):  
Chang Huh ◽  
A.J. Singh

The 2000 Census of Population indicated that 50 million Americans, or 19.3 per cent of the US population, were people with disabilities covered under the Americans with Disabilities Act. The number of families with a member with a disability is expected to grow significantly. Although people with disabilities and their families have sufficient discretionary income and time to take pleasure trips, tourism and hospitality marketers and practitioners have, to date, generally not considered this group a focal market segment. The objective of the study was to determine whether families with a member with a disability should be considered a viable niche market by the tourism and hospitality industry. Two secondary data sets, US Census reports and a six-state longitudinal travel market survey, were used to evaluate the viability of this group as a market segment according to Kotler's criteria for market segmentation. Substantiality, differentiability, and actionability were identified as the three most important criteria for determining that this segment is a viable niche tourism market. The findings indicate that this market can be attracted through discount deals and reached through auto club publications and specially designed web pages. The marketing implications of this study are discussed.


2014 ◽  
Vol 38 (1-2) ◽  
pp. 251-271 ◽  
Author(s):  
Ann L. Magennis ◽  
Michael G. Lacy

This paper analyzes admissions to the Colorado Insane Asylum from 1879 to 1900. We estimate and compare admission rates across sex, age, marital status, occupation, and immigration status using original admission records in combination with US census data from 1870 to 1900. We show the extent to which persons in various status groups, who varied in power and social advantage, differed in their risk of being institutionalized in the context of nineteenth-century Colorado. Our analysis showed that admission or commitment to the Asylum did not entail permanent incarceration, as more than half of those admitted were discharged within six months. Men were admitted at higher rates than women, even after adjusting for age. Marital status also affected the risk of admission; single and divorced persons were admitted at about 1.5 times the rate of their married counterparts. Widowed persons of either sex were even more likely to be admitted, and the risk increased with age. Persons in lower-income, lower-prestige occupations, including those in the US census's domestic and personal service category, were more likely to be institutionalized, and this was evident for both males and females. Foreign-born men and women were admitted at, respectively, twice and three times the rate of their native-born counterparts, with particularly elevated rates among the Irish. In general, admission to the Colorado Insane Asylum appears to differ from that at similar contemporaneous institutions in the East only in a slightly greater admission of males, despite the obvious differences in Colorado's population size and urban concentration.


2020 ◽  
Vol 8 (6) ◽  
pp. 3704-3708

Big data analytics is the field in which we analyse and process information from data sets too large or convoluted to be managed by conventional data-processing methods. It is used to analyse such data and helps in predicting the best outcome from the data sets, and it can be very useful in predicting crime as well as suggesting the best possible way to respond to it. In this system we use a past crime data set to find patterns, and from those patterns we predict the range of an incident. The range of the incident is determined by a decision model, and the prediction is made according to that range. Because the data are nonlinear and in the form of a time series, this system uses the Prophet model, an algorithm designed to analyse non-linear time-series data. The Prophet model decomposes a series into three main components: trend, seasonality, and holidays. This system will help a crime cell predict possible incidents according to the patterns developed by the algorithm, and it will also help deploy the right number of resources to the areas marked as having a high chance of incidents. The system will enhance crime prediction and help crime departments use their resources more efficiently.
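Prophet fits an additive model of the form y(t) = g(t) + s(t) + h(t), with g the trend, s the seasonality, and h holiday effects. The sketch below is not Prophet itself (which fits these terms by regularized curve fitting); it is a minimal additive decomposition on synthetic daily incident counts, separating a linear trend from a weekly pattern by simple least squares and slot averaging.

```python
# Minimal additive decomposition in the spirit of Prophet's
# y(t) = trend g(t) + seasonality s(t); synthetic data, toy method.

def decompose_weekly(series, period=7):
    n = len(series)
    ts = list(range(n))
    # Trend g(t): ordinary least-squares line through (t, y).
    t_mean = sum(ts) / n
    y_mean = sum(series) / n
    slope = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, series)) / \
            sum((t - t_mean) ** 2 for t in ts)
    intercept = y_mean - slope * t_mean
    trend = [intercept + slope * t for t in ts]
    # Seasonality s(t): mean detrended value for each day-of-week slot.
    residual = [y - g for y, g in zip(series, trend)]
    seasonal = [sum(residual[i] for i in range(slot, n, period)) /
                len(range(slot, n, period)) for slot in range(period)]
    return trend, seasonal

# Synthetic daily incident counts: rising trend plus a weekend spike.
data = [10 + 0.5 * t + (5 if t % 7 in (5, 6) else 0) for t in range(28)]
trend, seasonal = decompose_weekly(data)
```

The recovered daily slope is close to the true 0.5, and the weekend slots stand out in the seasonal component, which is the pattern a crime-forecasting system would use to pre-position resources.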


2020 ◽  
Vol 10 (7) ◽  
pp. 2539 ◽  
Author(s):  
Toan Nguyen Mau ◽  
Yasushi Inoguchi

It is challenging to build a real-time information retrieval system, especially for systems with high-dimensional big data. To structure big data, many hashing algorithms that map similar data items to the same bucket to advance the search have been proposed. Locality-Sensitive Hashing (LSH) is a common approach for reducing the number of dimensions of a data set, by using a family of hash functions and a hash table. The LSH hash table is an additional component that supports the indexing of hash values (keys) for the corresponding data/items. We previously proposed the Dynamic Locality-Sensitive Hashing (DLSH) algorithm with a dynamically structured hash table, optimized for storage in the main memory and General-Purpose computation on Graphics Processing Units (GPGPU) memory. This supports the handling of constantly updated data sets, such as songs, images, or text databases. The DLSH algorithm works effectively with data sets that are updated with high frequency and is compatible with parallel processing. However, the use of a single GPGPU device for processing big data is inadequate, due to the small memory capacity of GPGPU devices. When using multiple GPGPU devices for searching, we need an effective search algorithm to balance the jobs. In this paper, we propose an extension of DLSH for big data sets using multiple GPGPUs, in order to increase the capacity and performance of the information retrieval system. Different search strategies on multiple DLSH clusters are also proposed to adapt our parallelized system. With significant results in terms of performance and accuracy, we show that DLSH can be applied to real-life dynamic database systems.
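The bucketing idea at the core of LSH can be shown in a few lines. The sketch below is a toy random-hyperplane (cosine-similarity) LSH family with a plain dictionary as the hash table; it is an illustration of the general technique, not the paper's DLSH algorithm or its GPGPU layout. Vectors pointing in the same direction share a key and land in the same bucket; opposite vectors do not.

```python
import random

def make_hash_family(dim, n_bits, seed=0):
    """Random-hyperplane LSH: each bit is the sign of one projection."""
    rng = random.Random(seed)
    planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]
    def h(vec):
        return tuple(int(sum(p * v for p, v in zip(plane, vec)) >= 0)
                     for plane in planes)
    return h

def build_table(items, h):
    """Hash table mapping each key (bit tuple) to the items in its bucket."""
    table = {}
    for name, vec in items.items():
        table.setdefault(h(vec), []).append(name)
    return table

h = make_hash_family(dim=3, n_bits=8)
items = {"a": [1.0, 0.0, 0.0],
         "b": [2.0, 0.0, 0.0],    # same direction as "a", so same bucket
         "c": [-1.0, 0.0, 0.0]}   # opposite direction, different bucket
table = build_table(items, h)
```

A query is answered by hashing the query vector and scanning only its bucket, which is what makes the search sublinear; DLSH's contribution is keeping such a table dynamic and distributing it across multiple GPGPU memories.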


2019 ◽  
Vol 491 (3) ◽  
pp. 3290-3317 ◽  
Author(s):  
Oliver H E Philcox ◽  
Daniel J Eisenstein ◽  
Ross O’Connell ◽  
Alexander Wiegand

ABSTRACT To make use of clustering statistics from large cosmological surveys, accurate and precise covariance matrices are needed. We present a new code to estimate large-scale galaxy two-point correlation function (2PCF) covariances in arbitrary survey geometries that, due to new sampling techniques, runs ∼10⁴ times faster than previous codes, computing finely binned covariance matrices with negligible noise in less than 100 CPU-hours. As in previous works, non-Gaussianity is approximated via a small rescaling of shot noise in the theoretical model, calibrated by comparing jackknife survey covariances to an associated jackknife model. The flexible code, RascalC, has been publicly released, and automatically takes care of all necessary pre- and post-processing, requiring only a single input data set (without a prior 2PCF model). Deviations between large-scale model covariances from a mock survey and those from a large suite of mocks are found to be indistinguishable from noise. In addition, the choice of input mock is shown to be irrelevant for desired noise levels below that of ∼10⁵ mocks. Coupled with its generalization to multitracer data sets, this shows the algorithm to be an excellent tool for analysis, reducing the need for large numbers of mock simulations to be computed.
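The jackknife calibration mentioned above rests on the classic delete-one resampling estimator. The sketch below is a toy delete-one jackknife variance estimate for a sample mean; RascalC itself jackknifes survey regions and compares full covariance matrices, not single scalars, so this only conveys the resampling idea.

```python
# Delete-one jackknife variance of the sample mean (toy illustration).
# For the mean this reproduces s^2/n exactly; the value of the jackknife
# is that the same recipe works for statistics with no closed-form error.

def jackknife_variance(sample):
    n = len(sample)
    total = sum(sample)
    # Recompute the statistic with each observation left out in turn.
    loo = [(total - x) / (n - 1) for x in sample]
    loo_mean = sum(loo) / n
    return (n - 1) / n * sum((m - loo_mean) ** 2 for m in loo)

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
var_jk = jackknife_variance(data)
```

In the survey setting, each "observation" is a sky region rather than a number: recomputing the 2PCF with each region removed yields an empirical covariance against which the shot-noise rescaling in the model can be calibrated.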


2020 ◽  
Vol 7 (1) ◽  
pp. 163-180
Author(s):  
Saagar S Kulkarni ◽  
Kathryn E Lorenz

This paper examines two CDC data sets in order to provide a comprehensive overview of COVID-19 related deaths within the United States, and their social implications, over the first eight months of 2020. Analyzing the first data set over this eight-month period with the variables of age, race, and individual US state, we found correlations between COVID-19 deaths and these three variables. Overall, our multivariable regression model was statistically significant. When analyzing the second CDC data set, we used the same variables with one exception: gender was used in place of race. From this analysis, trends in age and individual states were significant. Gender, however, was not significant in predicting deaths, so we concluded that gender does not play a significant role in the prognosis of COVID-19 induced deaths, whereas an individual's age and state of residence potentially play a significant role in determining life or death. Socio-economic analysis of the US population confirms the Qualitative socio-economic Logic based Cascade Hypotheses (QLCH) that education, occupation, and income affect race/ethnicity differently. For a given race/ethnicity, education drives occupation, then income, where a person lives, and in turn access to healthcare coverage. Considering the socio-economic data based QLCH framework, we conclude that different races are poised for differing effects of COVID-19, and that Asians and Whites are in a stronger position to combat COVID-19 than Hispanics and Blacks.
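A multivariable regression of the kind used here reduces to solving the normal equations (XᵀX)β = Xᵀy. The sketch below is a minimal two-predictor ordinary-least-squares fit on synthetic data; the column names are hypothetical stand-ins and the CDC analysis itself is not reproduced.

```python
# Two-predictor OLS via the normal equations and Gaussian elimination.
# Each row of X is [1, x1, x2]; the leading 1 carries the intercept.

def ols(X, y):
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    # Forward elimination with partial pivoting.
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(xtx[r][col]))
        xtx[col], xtx[piv] = xtx[piv], xtx[col]
        xty[col], xty[piv] = xty[piv], xty[col]
        for r in range(col + 1, k):
            f = xtx[r][col] / xtx[col][col]
            xtx[r] = [a - f * b for a, b in zip(xtx[r], xtx[col])]
            xty[r] -= f * xty[col]
    # Back substitution.
    beta = [0.0] * k
    for i in reversed(range(k)):
        beta[i] = (xty[i] - sum(xtx[i][j] * beta[j]
                                for j in range(i + 1, k))) / xtx[i][i]
    return beta

# Synthetic rows: [intercept, age_group, state_index] -> outcome (toy numbers).
X = [[1, a, s] for a in range(5) for s in range(4)]
y = [1 + 2 * a + 3 * s for _, a, s in X]
beta = ols(X, y)
```

Because the synthetic outcome is exactly linear in the two predictors, the fit recovers the coefficients (1, 2, 3); on real data one would additionally test each coefficient's significance, which is how the paper judges that gender adds no predictive value.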


2021 ◽  
Author(s):  
Mohammad Shehata ◽  
Hideki Mizunaga

<p>Long-period magnetotelluric (MT) and gravity data were acquired to investigate the crustal structure of the US Cordillera. The magnetotelluric data are being acquired across the continental USA on a quasi-regular grid of ∼70 km spacing as an electromagnetic component of the National Science Foundation EarthScope/USArray Program. The Bureau Gravimétrique International compiled the gravity data at high spatial resolution. Because the two data sets differ in coverage density, geostatistical joint integration was used to map the subsurface structures at adequate resolution. First, a three-dimensional inversion of each data set was applied separately.</p><p>The inversion results of the two data sets show structural similarity. The individual results are resampled at the same locations using the kriging method, with each inversion model used to estimate the coefficients. Then, the density distribution enhanced by the Layer Density Correction (LDC) process was applied to the spatial expansion of the MT data. Simple Kriging with varying Local Means (SKLM) was applied to the residual analysis and integration. For this purpose, the varying local means of the resistivity were estimated from the corrected gravity data by the Non-Linear Indicator Transform (NLIT), taking the spatial correlation into account. The spatial expansion of the sparsely sampled MT data was then attempted using the estimated local mean values and the SKLM method, both along the sections where the MT survey was carried out and across the entire area where density distributions exist. This research presents the integration results alongside the stand-alone three-dimensional inversion results of the gravity and magnetotelluric data.</p>


Author(s):  
Subodh Kesharwani

Everything we create leaves a digital footprint. Big data has ascended as a catchword in recent years. Principally, it means a prodigious aggregate of information generated as trails or by-products of online and offline doings: what we buy using credit cards, where we travel via GPS, what we 'like' on Facebook or retweet on Twitter, or what we bargain for through "apnidukaan" via Amazon, and so on. In this era, the Data as a Service (DaaS) battle is gaining force, spurring one of the fastest growing industries in the world. "Big data" is a term for data sets so gigantic or multi-layered that traditional data-processing application software cannot cope with them. Challenges include capture, storage, analysis, data curation, search, sharing, transmission, visualization, querying, updating, and information privacy. The term "big data" usually refers simply to the use of predictive analytics, user behaviour analytics, or certain other advanced data analytics methods that extract value from data, and only occasionally to a particular size of data set.

