Large scale wildlife monitoring studies: statistical methods for design and analysis

2002 ◽  
Vol 13 (2) ◽  
pp. 105-119 ◽  
Author(s):  
Kenneth H. Pollock ◽  
James D. Nichols ◽  
Theodore R. Simons ◽  
George L. Farnsworth ◽  
Larissa L. Bailey ◽  
...  
2017 ◽  
Vol 27 (9) ◽  
pp. 2872-2882 ◽  
Author(s):  
Zhuozhao Zhan ◽  
Geertruida H de Bock ◽  
Edwin R van den Heuvel

Clinical trials may use a sequential introduction of a new treatment to determine its efficacy or effectiveness relative to a control treatment. The reasons for choosing a particular switch design vary: a design may be chosen for ethical or logistic reasons, or to study disease-modifying effects. Large-scale pragmatic trials with complex interventions often use stepped wedge designs (SWDs), in which all participants start in the control condition and, at different moments during the trial, the control treatment is switched to the new intervention. These trials typically use cross-sectional data and cluster randomization. By contrast, trials of new drugs for inhibiting cognitive decline in Alzheimer’s or Parkinson’s disease typically use delayed start designs (DSDs), in which participants start in a parallel-group design and, at a certain moment in the trial, (part of) the control group switches to the new treatment. These studies are longitudinal in nature, and individuals are randomized. Statistical methods for these unidirectional switch designs (USDs) are quite complex and not directly comparable, having been developed by various authors under different terminologies, model specifications, and assumptions. This imposes unnecessary barriers for researchers who want to compare results or choose the most appropriate method for their own needs. This paper provides an overview of past and current statistical developments for USDs (SWDs and DSDs). All designs are formulated in a unified framework of treatment patterns to make comparisons between switch designs easier. The focus is primarily on statistical models, methods of estimation, sample size calculation, and optimal designs for estimating the treatment effect. Other relevant open issues are discussed as well, with suggestions for future research on USDs.
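To make the treatment-pattern framework concrete, here is a minimal Python sketch (illustrative only, not from the paper) encoding both USDs as 0/1 pattern matrices, with rows as sequences or arms and columns as periods:

    import numpy as np

    def stepped_wedge_pattern(n_sequences, n_periods):
        """SWD: every sequence starts in control (0) and switches to the
        intervention (1) at a staggered period, never switching back."""
        X = np.zeros((n_sequences, n_periods), dtype=int)
        for s in range(n_sequences):
            X[s, s + 1:] = 1
        return X

    def delayed_start_pattern(n_periods, switch_period):
        """DSD: early-start arm treated throughout; the delayed arm
        switches to treatment at switch_period."""
        early = np.ones(n_periods, dtype=int)
        delayed = np.zeros(n_periods, dtype=int)
        delayed[switch_period:] = 1
        return np.vstack([early, delayed])

    print(stepped_wedge_pattern(4, 5))
    print(delayed_start_pattern(5, 3))

In the SWD every row eventually switches to the intervention; in the DSD only the delayed arm switches, which is what makes both designs unidirectional.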


2017 ◽  
Vol 76 (3) ◽  
pp. 213-219 ◽  
Author(s):  
Johanna Conrad ◽  
Ute Nöthlings

Valid estimation of usual dietary intake in epidemiological studies is a topic of current interest. The aim of the present paper is to review recent literature on innovative approaches, focussing on: (1) the requirements for assessing usual intake and (2) the application in large-scale settings. Recently, a number of technology-based self-administered tools have been developed, including short-term instruments such as web-based 24-h recalls, mobile food records or simple closed-ended questionnaires that assess the food intake of the previous 24 h. Due to their advantages in terms of feasibility and cost-effectiveness, these tools may be superior to conventional assessment methods in large-scale settings. New statistical methods have been developed to combine dietary information from repeated 24-h dietary recalls and FFQ. Conceptually, these statistical methods presume that the usual food intake of a subject equals the probability of consuming a food on a given day, multiplied by the average amount consumed on a typical consumption day. Repeated 24-h recalls from the same individual provide information on consumption probability and amount. In addition, the FFQ can add information on the intake frequency of rarely consumed foods. It has been suggested that this combined approach may provide high-quality dietary information. A promising direction for estimation of usual intake in large-scale settings is the integration of both statistical methods and new technologies. Studies are warranted to assess the validity of estimated usual intake in comparison with biomarkers.
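The two-part idea described above can be illustrated with a minimal moment-based sketch (hypothetical toy data; a real analysis, e.g. the NCI method, would model both parts with mixed-effects regressions and covariates):

    import numpy as np

    # recalls[i, j]: amount (g) of a food reported by subject i on recall
    # day j; zeros are non-consumption days. Toy numbers for illustration.
    recalls = np.array([[0., 120., 0., 95.],
                        [60., 0., 80., 70.],
                        [0., 0., 150., 0.]])

    consumed = recalls > 0
    p_consume = consumed.mean(axis=1)             # consumption probability
    days = np.maximum(consumed.sum(axis=1), 1)    # guard against all-zero rows
    mean_amount = recalls.sum(axis=1) / days      # amount on consumption days
    usual_intake = p_consume * mean_amount        # grams/day per subject
    print(usual_intake)

For subject 0 this gives 0.5 * 107.5 = 53.75 g/day: the probability part and the amount part estimated separately, then multiplied.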


2021 ◽  
Article 112827
Author(s):  
Zongwei Ma ◽  
Sagnik Dey ◽  
Sundar Christopher ◽  
Riyang Liu ◽  
Jun Bi ◽  
...  

Author(s):  
Cheng Meng ◽  
Ye Wang ◽  
Xinlian Zhang ◽  
Abhyuday Mandal ◽  
Wenxuan Zhong ◽  
...  

With advances in technology over the past decade, the amount of data generated and recorded has grown enormously in virtually all fields of industry and science. This extraordinary amount of data provides unprecedented opportunities for data-driven decision-making and knowledge discovery. However, analyzing such large-scale datasets poses significant challenges and calls for innovative statistical methods designed specifically for faster speed and higher efficiency. In this chapter, we review currently available methods for big data, focusing on subsampling methods based on statistical leveraging and on divide-and-conquer methods.
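A minimal sketch of both ideas for ordinary least squares, under the common formulations of leverage-based subsampling and divide-and-conquer averaging (illustrative, not the chapter's own code):

    import numpy as np

    rng = np.random.default_rng(0)
    n, p = 100_000, 5
    X = rng.standard_normal((n, p))
    beta = rng.standard_normal(p)
    y = X @ beta + rng.standard_normal(n)

    # Subsampling with statistical leverage: h_i = x_i' (X'X)^{-1} x_i,
    # the squared row norms of the thin-QR factor Q. Sampling rows with
    # probability proportional to h_i keeps influential observations.
    Q, _ = np.linalg.qr(X)
    lev = (Q ** 2).sum(axis=1)
    prob = lev / lev.sum()
    idx = rng.choice(n, size=2_000, replace=True, p=prob)
    w = 1.0 / np.sqrt(idx.size * prob[idx])   # reweight for unbiasedness
    beta_lev = np.linalg.lstsq(w[:, None] * X[idx], w * y[idx], rcond=None)[0]

    # Divide and conquer: solve on disjoint blocks, average the estimates.
    blocks = np.array_split(np.arange(n), 10)
    beta_dc = np.mean([np.linalg.lstsq(X[b], y[b], rcond=None)[0]
                       for b in blocks], axis=0)

    print(beta_lev, beta_dc, sep="\n")

Both routes avoid ever factorizing the full n-by-p problem at once, which is the point of these methods for big data.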


Water Science ◽  
2018 ◽  
Vol 32 (2) ◽  
pp. 362-379 ◽  
Author(s):  
Muhammad Ashraf ◽  
Ahmed Hussein Soliman ◽  
Entesar El-Ghorab ◽  
Alaa El Zawahry

2012 ◽  
Vol 12 (13) ◽  
pp. 5755-5771 ◽  
Author(s):  
A. Sanchez-Lorenzo ◽  
P. Laux ◽  
H.-J. Hendricks Franssen ◽  
J. Calbó ◽  
S. Vogl ◽  
...  

Abstract. Several studies have claimed to find significant weekly cycles of meteorological variables over large domains, cycles which can hardly be attributed to urban effects alone. Nevertheless, there is an ongoing scientific debate about whether these large-scale weekly cycles exist, as other studies fail to reproduce them with statistical significance. In addition to the lack of positive proof of the existence of these cycles, their possible physical explanations have been discussed controversially in recent years. In this work we review the main results on this topic published during the last two decades, including a summary of the evidence for and against significant weekly weather cycles across different regions of the world, mainly the US, Europe and Asia. In addition, some shortcomings of common statistical methods for analyzing weekly cycles are listed. Finally, we briefly summarize the proposed causes of the weekly cycles, focusing on aerosol-cloud-radiation interactions and their impact on meteorological variables as driven by the weekly cycle of anthropogenic activities, and outline possible directions for future research.
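The kinds of tests whose shortcomings the review discusses can be sketched as follows (synthetic data with no built-in weekly signal; note that both tests ignore the autocorrelation of daily series, one of the criticisms commonly raised):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    days = np.arange(3000)                 # ~8 years of daily anomalies
    anom = rng.standard_normal(3000)       # no weekly signal by construction
    dow = days % 7                         # day-of-week index 0..6

    # One-way ANOVA across the seven day-of-week groups
    groups = [anom[dow == d] for d in range(7)]
    F, pval = stats.f_oneway(*groups)
    print(f"F = {F:.2f}, p = {pval:.3f}")

    # Weekend-minus-weekday contrast, another frequently used statistic
    weekend = anom[(dow == 5) | (dow == 6)]
    weekday = anom[dow < 5]
    t, p = stats.ttest_ind(weekend, weekday, equal_var=False)
    print(f"weekend - weekday: t = {t:.2f}, p = {p:.3f}")

With serially correlated real data, the effective sample size is smaller than these tests assume, inflating the false-positive rate for apparent weekly cycles.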


2012 ◽  
Vol 43 (1) ◽  
pp. 33-43
Author(s):  
Milan Klement ◽  
Jiri Dostál

E-learning has become an integral part of present-day tertiary education, not only in combined (part-time) study modes but, to an increasingly large extent, also in full-time study modes. However, its large-scale deployment has given rise to many problems that warrant research and investigation. The present study examines the attitudes of university students toward e-learning within their course of study and presents partial results of a survey investigation conducted from 2007 to 2011. The survey research focused on monitoring and evaluating students' attitudes toward teaching through e-learning based on electronic study supports enriched with multimedia elements. Research data were collected by means of a non-standardized questionnaire and subsequently analysed using nonparametric statistical methods. This study presents selected outputs of the research, focused primarily on assessing students' satisfaction with the organization of teaching through e-learning and on identifying the elements of electronic study materials that students prefer. Key words: e-learning, e-learning support, nonparametric statistical methods, pedagogical research, survey research, tertiary education.
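As an illustration of the kind of nonparametric analysis mentioned (hypothetical data; the study's actual tests are not specified in the abstract), ordinal Likert-scale responses from two cohorts can be compared without normality assumptions:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    # Hypothetical 1-5 Likert satisfaction scores from two survey waves
    cohort_2007 = rng.integers(1, 6, size=120)
    cohort_2011 = np.clip(rng.integers(1, 6, size=140)
                          + rng.integers(0, 2, size=140), 1, 5)

    # Mann-Whitney U test: suitable for ordinal questionnaire data
    U, p = stats.mannwhitneyu(cohort_2007, cohort_2011,
                              alternative="two-sided")
    print(f"U = {U:.0f}, p = {p:.4f}")

For more than two cohorts, the analogous omnibus test would be Kruskal-Wallis (scipy.stats.kruskal).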


2015 ◽  
Author(s):  
Xiaobei Zhou ◽  
Charity W Law ◽  
Mark D Robinson

benchmarkR is an R package designed to assess and visualize the performance of statistical methods on datasets with an independent truth (e.g., simulations or datasets with large-scale validation), in particular methods that claim to control the false discovery rate (FDR). We augment standard performance plots (e.g., receiver operating characteristic, or ROC, curves) with information about how well the methods are calibrated (i.e., whether they achieve their expected FDR control). For example, performance plots are extended with a point highlighting the power or FDR at a user-set threshold (e.g., at a method's estimated 5% FDR). The package contains general containers to store simulation results (SimResults) and methods to create graphical summaries, such as ROC curves (rocX), false discovery plots (fdX) and power-to-achieved-FDR plots (powerFDR); each plot is augmented with some form of calibration information. We find these plots to be an improved way to interpret the relative performance of statistical methods for genomic datasets where many hypothesis tests are performed. The strategies, however, are general and will find applications in other domains.
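The calibration check at the heart of the package is easy to express once ground truth is known. The following Python sketch (benchmarkR itself is an R package; this only illustrates the concept on simulated q-values) computes achieved FDR and power at a nominal cutoff:

    import numpy as np

    def achieved_fdr_power(qvals, truth, alpha=0.05):
        """Call everything with q <= alpha significant; return the
        achieved FDR and power. A well-calibrated method has achieved
        FDR <= alpha at its own alpha cutoff."""
        called = qvals <= alpha
        if not called.any():
            return 0.0, 0.0
        fdr = (called & (truth == 0)).sum() / called.sum()
        power = (called & (truth == 1)).sum() / max((truth == 1).sum(), 1)
        return fdr, power

    rng = np.random.default_rng(3)
    truth = rng.integers(0, 2, size=1000)          # 0 = null, 1 = alternative
    qvals = np.where(truth == 1,
                     rng.beta(1, 20, 1000),        # small q under alternative
                     rng.uniform(0, 1, 1000))      # uniform q under null
    print(achieved_fdr_power(qvals, truth))

Plotting this (achieved FDR, power) point on an ROC or power curve is exactly the kind of calibration annotation the package adds.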


2019 ◽  
Vol 21 (4) ◽  
pp. 1209-1223 ◽  
Author(s):  
Raphael Petegrosso ◽  
Zhuliu Li ◽  
Rui Kuang

Abstract. Single-cell RNA sequencing (scRNA-seq) technologies have enabled large-scale whole-transcriptome profiling of each individual cell in a cell population. A core analysis of scRNA-seq transcriptome profiles is clustering the single cells to reveal cell subtypes and to infer cell lineages from the relations among the cells. This article reviews the machine learning and statistical methods for clustering scRNA-seq transcriptomes developed in the past few years. The review focuses on how conventional clustering techniques such as hierarchical clustering, graph-based clustering, mixture models, k-means, ensemble learning, neural networks and density-based clustering are modified or customized to tackle the unique challenges of scRNA-seq data analysis, such as the dropout of low-expression genes, low and uneven read coverage of transcripts, highly variable total mRNA counts across single cells and ambiguous cell markers in the presence of technical biases and irrelevant confounding biological variation. We review how cell-specific normalization, imputation of dropouts and dimension reduction methods can be applied with new statistical or optimization strategies to improve the clustering of single cells. We also introduce more advanced approaches for clustering scRNA-seq transcriptomes in time-series data and multiple cell populations and for detecting rare cell types. Several software packages developed to support cluster analysis of scRNA-seq data are also reviewed and experimentally compared to evaluate their performance and efficiency. Finally, we conclude with useful observations and possible future directions in scRNA-seq data analytics. Availability: All source code and data are available at https://github.com/kuanglab/single-cell-review.
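A minimal sketch of the conventional pipeline pattern the review surveys, on synthetic counts (illustrative only; real analyses would use dedicated normalization and imputation methods rather than these stand-ins):

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(4)
    # Synthetic counts: 300 cells x 2000 genes, two crude "cell types"
    counts = rng.poisson(1.0, size=(300, 2000)).astype(float)
    counts[:150, :50] += rng.poisson(5.0, size=(150, 50))

    # Cell-specific normalization: scale each cell to the median library
    # size, then log-transform
    libsize = counts.sum(axis=1, keepdims=True)
    logx = np.log1p(counts / libsize * np.median(libsize))

    # Dimension reduction before clustering mitigates dropout noise
    z = PCA(n_components=20, random_state=0).fit_transform(logx)
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(z)
    print(np.bincount(labels))

The normalize -> reduce -> cluster chain is the skeleton that the reviewed methods customize for dropouts, uneven coverage and confounding variation.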

