Upscaling human activity data: A statistical ecology approach

PLoS ONE ◽  
2021 ◽  
Vol 16 (7) ◽  
pp. e0253461
Author(s):  
Anna Tovo ◽  
Samuele Stivanello ◽  
Amos Maritan ◽  
Samir Suweis ◽  
Stefano Favaro ◽  
...  

Big data require new techniques to handle the information they come with. Here we consider four datasets (email communication, Twitter posts, Wikipedia articles and Gutenberg books) and propose a novel statistical framework to predict global statistics from random samples. More precisely, we infer the number of senders, hashtags and words of the whole dataset and how their abundances (i.e. the popularity of a hashtag) change through scales from a small sample of sent emails per sender, posts per hashtag and word occurrences. Our approach is grounded on statistical ecology as we map inference of human activities into the unseen species problem in biodiversity. Our findings may have applications to resource management in emails, collective attention monitoring in Twitter and language learning process in word databases.
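To make the unseen species analogy concrete, the sketch below applies Chao1, a classic lower-bound richness estimator from ecology, to a sample of hashtag occurrences. This is a swapped-in textbook technique, not the framework proposed in the paper; the function name `chao1` and the toy data are assumptions made for illustration.

```python
from collections import Counter


def chao1(abundances):
    """Chao1 lower-bound estimate of the total number of types.

    `abundances` holds the number of occurrences observed for each type
    in the sample (e.g. posts per hashtag). Illustrative only; this is
    not the estimator used in the paper.
    """
    counts = [a for a in abundances if a > 0]
    s_obs = len(counts)                        # types seen in the sample
    f1 = sum(1 for a in counts if a == 1)      # types seen exactly once
    f2 = sum(1 for a in counts if a == 2)      # types seen exactly twice
    if f2 > 0:
        return s_obs + f1 * f1 / (2.0 * f2)
    # Bias-corrected form, used when no type was seen exactly twice.
    return s_obs + f1 * (f1 - 1) / 2.0


# Hypothetical sample of hashtag occurrences.
sample = ["a", "a", "b", "c", "c", "d", "e"]
estimate = chao1(Counter(sample).values())
```

The estimate grows with the number of types seen only once, which is the intuition behind mapping rare senders, hashtags or words onto rare species.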

Entropy ◽  
2021 ◽  
Vol 23 (6) ◽  
pp. 740
Author(s):  
Hoshin V. Gupta ◽  
Mohammad Reza Ehsani ◽  
Tirthankar Roy ◽  
Maria A. Sans-Fuentes ◽  
Uwe Ehret ◽  
...  

We develop a simple Quantile Spacing (QS) method for accurate probabilistic estimation of one-dimensional entropy from equiprobable random samples, and compare it with the popular Bin-Counting (BC) and Kernel Density (KD) methods. In contrast to BC, which uses equal-width bins with varying probability mass, the QS method uses estimates of the quantiles that divide the support of the data generating probability density function (pdf) into equal-probability-mass intervals. And, whereas BC and KD each require optimal tuning of a hyper-parameter whose value varies with sample size and shape of the pdf, QS only requires specification of the number of quantiles to be used. Results indicate, for the class of distributions tested, that the optimal number of quantiles is a fixed fraction of the sample size (empirically determined to be ~0.25–0.35), and that this value is relatively insensitive to distributional form or sample size. This provides a clear advantage over BC and KD since hyper-parameter tuning is not required. Further, unlike KD, there is no need to select an appropriate kernel-type, and so QS is applicable to pdfs of arbitrary shape, including those with discontinuous slope and/or magnitude. Bootstrapping is used to approximate the sampling variability distribution of the resulting entropy estimate, and is shown to accurately reflect the true uncertainty. For the four distributional forms studied (Gaussian, Log-Normal, Exponential and Bimodal Gaussian Mixture), expected estimation bias is less than 1% and uncertainty is low even for samples of as few as 100 data points; in contrast, for KD the small sample bias can be as large as -10% and for BC as large as -50%. We speculate that estimating quantile locations, rather than bin-probabilities, results in more efficient use of the information in the data to approximate the underlying shape of an unknown data generating pdf.
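The equal-probability-mass idea can be sketched as follows. This is a naive illustration that assumes a piecewise-constant density between empirical quantiles; it is not the authors' QS algorithm and should not be expected to reproduce the bias figures quoted above. The function name `equal_mass_entropy` and its `fraction` parameter are hypothetical.

```python
import numpy as np


def equal_mass_entropy(samples, fraction=0.3):
    """Naive entropy estimate (in nats) from equal-probability-mass intervals.

    The data range is split at empirical quantiles so that each interval
    carries mass 1/n_q, and the density is treated as constant within each
    interval. Assumes continuous data without ties.
    """
    x = np.asarray(samples, dtype=float)
    if x.size < 2:
        raise ValueError("need at least two samples")
    n_q = max(2, int(round(fraction * x.size)))   # number of intervals
    edges = np.quantile(x, np.linspace(0.0, 1.0, n_q + 1))
    widths = np.diff(edges)
    if np.any(widths <= 0):
        raise ValueError("tied values produce zero-width intervals")
    mass = 1.0 / n_q
    # For a piecewise-constant density p_i = mass / width_i:
    # H = -sum(mass * log(p_i)) = sum(mass * log(width_i / mass))
    return float(np.sum(mass * np.log(widths / mass)))
```

Only the number of intervals has to be chosen, which mirrors the abstract's point that no bin width or kernel bandwidth needs tuning. With very few points per interval this plain version tends to underestimate the entropy.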


2016 ◽  
Vol 41 (5) ◽  
pp. 472-505 ◽  
Author(s):  
Elizabeth Tipton ◽  
Kelly Hallberg ◽  
Larry V. Hedges ◽  
Wendy Chan

Background: Policy makers and researchers are frequently interested in understanding how effective a particular intervention may be for a specific population. One approach is to assess the degree of similarity between the sample in an experiment and the population. Another approach is to combine information from the experiment and the population to estimate the population average treatment effect (PATE). Method: Several methods for assessing the similarity between a sample and population currently exist as well as methods estimating the PATE. In this article, we investigate properties of six of these methods and statistics in the small sample sizes common in education research (i.e., 10–70 sites), evaluating the utility of rules of thumb developed from observational studies in the generalization case. Result: In small random samples, large differences between the sample and population can arise simply by chance and many of the statistics commonly used in generalization are a function of both sample size and the number of covariates being compared. The rules of thumb developed in observational studies (which are commonly applied in generalization) are much too conservative given the small sample sizes found in generalization. Conclusion: This article implies that sharp inferences to large populations from small experiments are difficult even with probability sampling. Features of random samples should be kept in mind when evaluating the extent to which results from experiments conducted on nonrandom samples might generalize.


Entropy ◽  
2020 ◽  
Vol 22 (10) ◽  
pp. 1084
Author(s):  
Stefano Garlaschi ◽  
Anna Fochesato ◽  
Anna Tovo

Recent technological and computational advances have enabled the collection of data at an unprecedented rate. On the one hand, the large amount of data suddenly available has opened up new opportunities for new data-driven research but, on the other hand, it has brought into light new obstacles and challenges related to storage and analysis limits. Here, we strengthen an upscaling approach borrowed from theoretical ecology that allows us to infer with small errors relevant patterns of a dataset in its entirety, although only a limited fraction of it has been analysed. In particular we show that, after reducing the input amount of information on the system under study, by applying our framework it is still possible to recover two statistical patterns of interest of the entire dataset. Tested against big ecological, human activity and genomics data, our framework was successful in the reconstruction of global statistics related to both the number of types and their abundances while starting from limited presence/absence information on small random samples of the datasets. These results pave the way for future applications of our procedure in different life science contexts, from social activities to natural ecosystems.


2020 ◽  
pp. 1321103X1989917
Author(s):  
Dawn Joseph

Work–life balance has become a buzzword in many corporate settings. This study situates itself at a higher education institute in Melbourne (Australia) where African music (singing and drumming) was used as a lever for faculty staff to “break from work” and “learn about a new music and culture”. Drawing on email communication, questionnaire data, and anecdotal feedback, a phenomenological approach was adopted to explore the benefits, challenges, and opportunities staff experienced as a recreational group music activity. Data were analyzed using interpretative phenomenological analysis as a tool. Two overarching themes emerged (group participation and learning, and challenges) and are discussed in the findings. The workshops proved successful and are worthy to be replicated in other places and spaces. Further research is needed to gain insight into whether regular music workshops can influence work–life balance and professional learning for staff.


2012 ◽  
Vol 35 (3) ◽  
pp. 271-289 ◽  
Author(s):  
Michelle Kohler

In the Australian education context, there are typically two cohorts of language learners at the secondary school level: those who commence their study of the target language early in their primary schooling (early starters), and those who commence their study later, at the beginning of secondary school (late starters). The two groups may have undertaken their language study under quite different program conditions, in particular in relation to “time-on-task”. There is little empirical evidence about the nature of student achievement in languages at the end of primary and in junior secondary and its relationship to time-on-task. This paper compares the achievements of a sample of early and late start students of Indonesian in Australia using score data gathered from common measures of achievement. In addition, a small sample of student written responses is analysed in order to highlight issues related to eliciting and describing student achievement that may not be evident from the quantitative data alone. The findings of the study reveal the nature of achievement by early and late starters of Indonesian in the SAALE study, as well as the complexity of investigating a single variable such as time-on-task in relation to student achievement. The paper concludes by recommending that assessment of student achievement in language learning take into consideration methodologies that may capture more holistically a constellation of variables that impact on students’ language learning.


2002 ◽  
Vol 9 (1) ◽  
pp. 38-48 ◽  
Author(s):  
Paul Westhead ◽  
Martin Binks ◽  
Deniz Ucbasaran ◽  
Mike Wright

In 1990/91, survey responses were gathered from 621 independent businesses located in Great Britain. A follow‐on telephone survey was conducted with 150 surviving firms in 1997. This survey gathered information surrounding the propensity of firms to export their goods or services abroad as well as other performance and goal outcomes. Organizational and external environmental variables collected in 1990 are used to explain within a multivariate statistical framework the propensity of a firm to be an exporter in 1997, and the intensity of internationalization activity. Data collected in 1990 is also used to explain variations in several performance variables (i.e. whether exporting was regarded as a path to firm growth; profit performance reported in 1997 relative to competition; and the propensity to report employment growth over the 1990 to 1997 period).


2016 ◽  
Vol 96 (12) ◽  
pp. 1982-1993 ◽  
Author(s):  
Taryn M. Jones ◽  
Blake F. Dear ◽  
Julia M. Hush ◽  
Nickolai Titov ◽  
Catherine M. Dean

Background: People living with acquired brain injury (ABI) are more likely to be physically inactive and highly sedentary and, therefore, to have increased risks of morbidity and mortality. However, many adults with ABI experience barriers to participation in effective physical activity interventions. Remotely delivered self-management programs focused on teaching patients how to improve and maintain their physical activity levels have the potential to improve the overall health of adults with ABI. Objective: The study objective was to evaluate the acceptability and feasibility of a remotely delivered self-management program aimed at increasing physical activity among adults who dwell in the community and have ABI. Design: A single-group design involving comparison of baseline measures with those taken immediately after intervention and at a 3-month follow-up was used in this study. Methods: The myMoves Program comprises 6 modules delivered over 8 weeks via email. Participants were provided with regular weekly contact with an experienced physical therapist via email and telephone. The primary outcomes were the feasibility (participation, attrition, clinician time, accessibility, and adverse events) and acceptability (satisfaction, worthiness of time, and recommendation) of the myMoves Program. The secondary outcomes were objective physical activity data collected from accelerometers, physical activity self-efficacy, psychological distress, and participation. Results: Twenty-four participants commenced the program (20 with stroke, 4 with traumatic injury), and outcomes were collected for 23 and 22 participants immediately after the program and at a 3-month follow-up, respectively. The program required very little clinician contact time, with an average of 32.8 minutes (SD=22.8) per participant during the 8-week program. Acceptability was very high, with more than 95% of participants being either very satisfied or satisfied with the myMoves Program and stating that it was worth their time. All participants stated that they would recommend the program to others with ABI. Limitations: The results were obtained from a small sample; hence, the results may not be generalizable to a larger ABI population. Conclusions: A remotely delivered self-management program aimed at increasing physical activity is feasible and acceptable for adults with ABI. Further large-scale efficacy trials are warranted.


ReCALL ◽  
2009 ◽  
Vol 21 (1) ◽  
pp. 76-95 ◽  
Author(s):  
M’hammed Abdous ◽  
Margaret M. Camarena ◽  
Betty Rose Facer

Integrating Mobile Assisted Language Learning (MALL) technology (personal multimedia players, cell phones, and handheld devices) into the foreign language curriculum is becoming commonplace in many secondary and higher education institutions. Current research has identified both pedagogically sound applications and important benefits to students. In this paper, we present the results of an initial study which compares the academic benefits of integrating podcasts into the curriculum against using them as a supplemental/review tool. The study’s findings indicate that when instructors use podcasts for multiple instructional purposes (e.g., to critique student projects and exams, for student video presentations, for student paired interviews, to complete specific assignments, dictations, in roundtable discussions, or for guest lectures), students are more likely to use this technology and to report academic benefits. While the study is limited by small sample sizes and by some within-group variation in instructional techniques, the study provides initial evidence that podcast technology has the potential to provide greater benefits if it is used more than simply as a tool for reviewing. The study’s positive findings indicate that additional research to examine the effects of specific instructional uses of podcast technology is merited.


2021 ◽  
Author(s):  
Janie Tito

The aim of this study was to examine the issues surrounding Maori language use in secondary schools. This was to test the hypothesis that the learning experience for Maori students is influenced by a school's responsiveness to Maori needs. In particular the focus was on the use of te reo Maori e.g. pronunciation. It was found that when features of te ao Maori are reflected positively in secondary school practices, values and environment, the overall learning experience may be enhanced and become more positive for Maori students. Ultimately such practice has the potential to reduce the disparity between Maori and non-Maori educational achievement. The prevalence and quality of Maori language learning opportunities during and after teacher training are currently not meeting the needs of students and teachers. This shortcoming requires further research and investigation. This mixed method qualitative study followed kaupapa Maori research principles and ethics. It incorporated interviews, repeated focus groups and surveys. Participants were teachers and Maori students from selected Wellington secondary schools. The sixty-four student participants raised issues around teachers and their teaching practice. They saw teachers as important role models for positive attitudes and behaviours towards te reo and tikanga Maori. In particular, correct language use and pronunciation were important. The small sample of teachers reported a variety of concerns. One frequent complaint was their lack of knowledge in using te reo and few chances to learn and improve. This study identified a need for more professional development programmes and educational policy to be introduced in secondary schools, which include aspects of Maori language and tikanga learning. This would help address some of the difficulties faced by teachers when using te reo in the classroom and improve overall teaching and learning for Maori students.

