incomplete sampling
Recently Published Documents


2022 ◽  
Author(s):  
Joshua W. Lambert ◽  
Pedro Santos Neves ◽  
Richel Bilderbeek ◽  
Luis Valente ◽  
Rampal S. Etienne

Understanding macroevolution on islands requires knowledge of the closest relatives of island species on the mainland. The evolutionary relationships between island and mainland species can be reconstructed using phylogenies, to which models can be fitted to understand the dynamical processes of colonisation and diversification. But how much information on the mainland is needed to gain insight into macroevolution on islands? Here we first test whether species turnover on the mainland and incomplete mainland sampling leave recognisable signatures in community phylogenetic data. We find predictable phylogenetic patterns: colonisation times become older and the perceived proportion of endemic species increases as mainland turnover and incomplete knowledge increase. We then analyse the influence of these factors on the inference performance of the island biogeography model DAISIE, a whole-island community phylogenetic model that assumes that mainland species do not diversify, and that the mainland is fully sampled in the phylogeny. We find that colonisation and diversification rates are estimated with little bias in the presence of mainland extinction and incomplete sampling. By contrast, the rate of anagenesis is overestimated under high levels of mainland extinction and incomplete sampling, because these increase the perceived level of island endemism. We conclude that community-wide phylogenetic and endemism datasets of island species carry a signature of mainland extinction and sampling. The robustness of parameter estimates suggests that island diversification and colonisation can be studied even with limited knowledge of mainland dynamics.


2021 ◽  
Vol 923 (1) ◽  
pp. 54
Author(s):  
Joseph M. Michail ◽  
Mark Wardle ◽  
Farhad Yusef-Zadeh ◽  
Devaky Kunneriath

We present and analyze ALMA submillimeter observations from a multiwavelength campaign of Sgr A* during 2019 July 18. In addition to the submillimeter, we utilize concurrent mid-infrared (mid-IR; Spitzer) and X-ray (Chandra) observations. The submillimeter emission lags less than $\delta t \approx 30$ minutes behind the mid-IR data. However, the entire submillimeter flare was not observed, raising the possibility that the time delay is a consequence of incomplete sampling of the light curve. The decay of the submillimeter emission is not consistent with synchrotron cooling. Therefore, we analyze these data adopting an adiabatically expanding synchrotron source that is initially optically thick or thin in the submillimeter, yielding time-delayed or synchronous flaring with the IR, respectively. The time-delayed model is consistent with a plasma blob of radius $0.8\,R_{\rm S}$ (Schwarzschild radius), electron power-law index $p = 3.5$ ($N(E) \propto E^{-p}$), equipartition magnetic field of $B_{\rm eq} \approx 90$ Gauss, and expansion velocity $v_{\rm exp} \approx 0.004c$. The simultaneous emission is fit by a plasma blob of radius $2\,R_{\rm S}$, $p = 2.5$, $B_{\rm eq} \approx 27$ Gauss, and $v_{\rm exp} \approx 0.014c$. Since the submillimeter time delay is not completely unambiguous, we cannot definitively conclude which model better represents the data. This observation presents the best evidence for a unified flaring mechanism between submillimeter and X-ray wavelengths and places significant constraints on the source size and magnetic field strength. We show that concurrent observations at lower frequencies would be able to determine if the flaring emission is initially optically thick or thin in the submillimeter.


Author(s):  
Pieter Huybrechts ◽  
Maarten Trekels ◽  
Quentin Groom

There are an estimated 8.7 million eukaryotic species globally, and knowledge of those organisms is organised around their scientific names and the specimens we have of those species (Sweetlove 2011, Mora et al. 2011). Likewise, there are between 1.2 and 2.1 billion ($10^9$) specimens held in biodiversity collections globally (Ariño 2010). These collections constitute an infrastructure and scientific tool to understand, catalogue and study biodiversity. Yet we find it hard to answer a simple question: how many species are in a collection? This is not trivial to answer, because collections are not completely inventoried, do not use the same taxonomy, and the volume of data is vast (Samy et al. 2013, Ariño 2010). We have developed a method that allows us to take a list of collections and to estimate the species richness contained within them. By doing this we will have a deeper insight into the scientific value of the world's biodiversity collections. Dealing with non-homogeneous and non-random, but incomplete, sampling of sites is a common issue that occurs in many ecological studies (Magurran and McGill 2011, Colwell et al. 2012, Gotelli and Colwell 2001). By using techniques and toolboxes such as iNEXT (Chao et al. 2014b) and vegan (Oksanen et al. 2020), we can estimate species richness under these conditions. In the case of collections we consider not only the digitized and published proportion of preserved collections, but also extrapolate to the specimens that have not yet made their way to the Global Biodiversity Information Facility (GBIF). Nevertheless, to compute on such large datasets we need to employ innovative Big Data analytic tools. GBIF contains 1.8 billion observations that amount to 120 GB of compressed data. This can then be interrogated in the cloud or locally using tools such as Galaxy, which has made it possible to process large numbers of records in a single batch.
We can now evaluate the biodiversity within collections, and divide the result by taxon and geographical region, and compare them to one another. Ultimately, this work will allow individual collections and consortia to evaluate their coverage of biodiversity and help them better target their collecting strategies.
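
As a rough illustration of richness estimation under incomplete sampling, the sketch below computes the classic Chao1 lower-bound estimator from per-species specimen counts. This is a stand-in technique chosen for brevity, not the authors' pipeline (which relies on iNEXT and vegan); the function name `chao1` and the example counts are hypothetical.

```python
from collections import Counter

def chao1(abundances):
    """Classic Chao1 lower-bound estimate of species richness.

    `abundances` holds one specimen count per observed species.
    """
    counts = [n for n in abundances if n > 0]
    s_obs = len(counts)              # species actually observed
    freq = Counter(counts)
    f1, f2 = freq[1], freq[2]        # singletons and doubletons
    if f2 > 0:
        return s_obs + f1 * f1 / (2 * f2)
    # Bias-corrected form used when there are no doubletons
    return s_obs + f1 * (f1 - 1) / 2

# Hypothetical collection: 7 observed species, 3 singletons, 2 doubletons
example_counts = [1, 1, 1, 2, 2, 5, 10]
estimate = chao1(example_counts)     # 7 + 3*3 / (2*2) = 9.25
```

Many rare species (singletons) relative to doubletons push the estimate well above the observed count, which is the signal that a collection is far from fully sampled.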


Author(s):  
Guang-Xuan Lan ◽  
Jun-Jie Wei ◽  
Hou-Dun Zeng ◽  
Ye Li ◽  
Xue-Feng Wu

In this work, we update and enlarge the long gamma-ray burst (GRB) sample detected by the Swift satellite. Given the incomplete sampling of the faint bursts and the low completeness in redshift measurement, we carefully select a subsample of bright Swift bursts to revisit the GRB luminosity function (LF) and redshift distribution by taking into account the probability of redshift measurement. Here we also explore two general expressions for the GRB LF, i.e. a broken power-law LF and a triple power-law LF. Our results suggest that a strong redshift evolution in luminosity (with an evolution index of $\delta =1.92^{+0.25}_{-0.37}$) or in density ($\delta =1.26^{+0.33}_{-0.34}$) is required to account well for the observations, independent of the assumed expression of the GRB LF. However, in a one-on-one comparison using the Akaike information criterion, the best-fitting evolution model involving the triple power-law LF is statistically preferred over the best-fitting one involving the broken power-law LF with a relative probability of ∼94.3 per cent versus ∼5.7 per cent. Extrapolating our fitting results to the flux limit of the whole Swift sample, and considering the trigger probability of Swift/Burst Alert Telescope in detail, we find that the expectations from our evolution models provide a good representation of the observed distributions of the whole sample without the need for any adjustment of the model free parameters. This further confirms the reliability of our analysis results.
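
The relative probabilities quoted for the two luminosity functions are the kind of quantity usually obtained as Akaike weights. The sketch below shows the standard weight formula and its inverse for a two-model comparison; the function names are hypothetical, and the abstract does not report the underlying AIC values, so only the quoted percentage is used.

```python
import math

def akaike_weights(aic_values):
    """Relative probability of each model given its AIC score."""
    best = min(aic_values)
    rel = [math.exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(rel)
    return [r / total for r in rel]

def delta_aic_for_weight(w_best):
    """AIC gap between two models implied by the better model's weight."""
    return 2.0 * math.log(w_best / (1.0 - w_best))

# A 94.3% vs 5.7% split corresponds to an AIC difference of about 5.6
delta = delta_aic_for_weight(0.943)
weights = akaike_weights([0.0, delta])   # recovers roughly [0.943, 0.057]
```

Only AIC differences matter here, so the absolute AIC values can be shifted freely without changing the weights.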


2021 ◽  
Vol 8 (8) ◽  
pp. 202143
Author(s):  
Manabu Sakamoto ◽  
Michael J. Benton ◽  
Chris Venditti

Through phylogenetic modelling, we previously presented strong support for diversification decline in the three major subclades of dinosaurs (Sakamoto et al. 2016 Proc. Natl Acad. Sci. USA 113, 5036–5040. (doi:10.1073/pnas.1521478113)). Recently, our support for this model has been criticized (Bonsor et al. 2020 R. Soc. Open Sci. 7, 201195. (doi:10.1098/rsos.201195)). Here, we highlight that these criticisms seem to largely stem from a misunderstanding of our study: contrary to Bonsor et al.'s claims, our model accounts for heterogeneity in diversification dynamics, was selected based on deviance information criterion (DIC) scores (not parameter significance), and intercepts were estimated to account for uncertainties in the root age of the phylogenetic tree. We also demonstrate that their new analyses are not comparable to our models: they fit simple, Dinosauria-wide models as a direct comparison to our group-wise models, and their additional trees are subclades that are limited in taxonomic coverage and temporal span, i.e. severely affected by incomplete sampling. We further present results of new analyses on larger, better-sampled trees (N = 961) of dinosaurs, showing support for the time-quadratic model. Disagreements in how we interpret modelled diversification dynamics are to be expected, but criticisms should be based on sound logic and understanding of the model under discussion.


2021 ◽  
Author(s):  
Diana M Tordoff ◽  
Alexander L Greninger ◽  
Pavitra Roychoudhury ◽  
Lasata Shrestha ◽  
Hong Xie ◽  
...  

Background: The first confirmed case of SARS-CoV-2 in North America was identified in Washington state on January 21, 2020. We aimed to quantify the number and temporal trends of out-of-state introductions of SARS-CoV-2 into Washington. Methods: We conducted a phylogenetic analysis of 11,422 publicly available whole genome SARS-CoV-2 sequences from GISAID sampled between December 2019 and September 2020. We used maximum parsimony ancestral state reconstruction methods on time-calibrated phylogenies to enumerate introductions/exports, their likely geographic source (e.g. US, non-US, and between eastern and western Washington), and estimated date of introduction. To incorporate phylogenetic uncertainty into our estimates, we conducted 5,000 replicate analyses by generating 25 random time-stratified samples of non-Washington reference sequences, 20 random polytomy resolutions, and 10 random resolutions of the reconstructed ancestral state. Results: We estimated a minimum 287 separate introductions (median, range 244-320) into Washington and 204 exported lineages (range 188-227) of SARS-CoV-2 out of Washington. Introductions began in mid-January and peaked on March 29, 2020. Lineages with the Spike D614G variant accounted for the majority (88%) of introductions. Overall, 61% (range 55-65%) of introductions into Washington likely originated from a source elsewhere within the US, while the remaining 39% (range 35-45%) likely originated from outside of the US. Intra-state transmission accounted for 65% and 28% of introductions into eastern and western Washington, respectively. Conclusions: There is phylogenetic evidence that the SARS-CoV-2 epidemic in Washington is continually seeded by a large number of introductions, and that there was significant inter- and intra-state transmission. Due to incomplete sampling our data underestimate the true number of introductions.
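
The replicate count in the Methods follows from crossing the three sources of randomness. A minimal sketch of that design is below, assuming the three factors are fully crossed (which the arithmetic 25 × 20 × 10 = 5,000 suggests); the constant names are hypothetical and the per-replicate analysis itself is omitted.

```python
from itertools import product

N_REFERENCE_SUBSAMPLES = 25      # time-stratified samples of non-Washington sequences
N_POLYTOMY_RESOLUTIONS = 20      # random resolutions of polytomies
N_ANCESTRAL_RESOLUTIONS = 10     # random resolutions of the ancestral state

# One (subsample, polytomy, ancestral-state) index triple per replicate analysis
replicates = list(product(
    range(N_REFERENCE_SUBSAMPLES),
    range(N_POLYTOMY_RESOLUTIONS),
    range(N_ANCESTRAL_RESOLUTIONS),
))
n_replicates = len(replicates)   # 5000
```

Summaries such as a median and range of introductions would then be taken over the per-replicate counts.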


2021 ◽  
Author(s):  
Ke Cao ◽  
Jens-Christian Svenning ◽  
Chuan Yan ◽  
Jintun Zhang ◽  
Xiangcheng Mi ◽  
...  

Measures of β-diversity are known to be highly constrained by the variation in γ-diversity across regions (i.e., γ-dependence), making it challenging to infer underlying ecological processes. Undersampling correction methods have attempted to estimate the actual β-diversity in order to minimize the effects of γ-dependence arising from the problem of incomplete sampling. However, no study has systematically tested their effectiveness in removing γ-dependence, and examined how well undersampling-corrected β-metrics reflect true β-diversity patterns that respond to ecological gradients. Here, we conduct these tests by comparing two undersampling correction methods with the widely used individual-based null model approach, using both empirical data and simulated communities along a known ecological gradient across a wide range of γ-diversity and sample sizes. We found that undersampling correction methods using diversity accumulation curves were generally more effective than the null model approach in removing γ-dependence. In particular, the undersampling-corrected β-Shannon diversity index was most independent of γ-diversity and was the most reflective of the true β-diversity pattern along the ecological gradient. Moreover, the null model-corrected Jaccard-Chao index removed γ-dependence more effectively than either approach alone. Our validation of undersampling correction methods as effective tools for accommodating γ-dependence greatly facilitates the comparison of β-diversity across regions.
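
For readers unfamiliar with how β- and γ-diversity are tied together, the sketch below uses the simplest relation, Whittaker's multiplicative partition β = γ / mean(α), computed on raw species richness. It is not one of the undersampling-corrected metrics evaluated in the study; the function name and the toy site lists are hypothetical.

```python
def multiplicative_beta(sites):
    """Whittaker's beta: regional richness divided by mean local richness.

    `sites` is a list of non-empty species collections, one per site.
    """
    site_sets = [set(s) for s in sites]
    gamma = len(set().union(*site_sets))                     # regional richness
    mean_alpha = sum(len(s) for s in site_sets) / len(site_sets)
    return gamma / mean_alpha

identical = multiplicative_beta([["a", "b"], ["a", "b"]])    # 1.0, no turnover
disjoint = multiplicative_beta([["a", "b"], ["c", "d"]])     # 2.0, complete turnover
```

Because γ appears directly in the numerator, species missed by incomplete sampling change both γ and the α values, which is one way to see why raw β values are hard to compare across regions.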


2021 ◽  
Vol 7 (1) ◽  
pp. eabd8180
Author(s):  
Orlando B. Giorgetti ◽  
Prashant Shingate ◽  
Connor P. O’Meara ◽  
Vydianathan Ravi ◽  
Nisha E. Pillai ◽  
...  

The rules underlying the structure of antigen receptor repertoires are not yet fully defined, despite their enormous importance for the understanding of adaptive immunity. With current technology, the large antigen receptor repertoires of mice and humans cannot be comprehensively studied. To circumvent the problems associated with incomplete sampling, we have studied the immunogenetic features of one of the smallest known vertebrates, the cyprinid fish Paedocypris sp. “Singkep” (“minifish”). Despite its small size, minifish has the key genetic facilities characterizing the principal vertebrate lymphocyte lineages. As described for mammals, the frequency distributions of immunoglobulin and T cell receptor clonotypes exhibit the features of fractal systems, demonstrating that self-similarity is a fundamental property of antigen receptor repertoires of vertebrates, irrespective of body size. Hence, minifish achieve immunocompetence via a few thousand lymphocytes organized in robust scale-free networks, thereby ensuring immune reactivity even when cells are lost or clone sizes fluctuate during immune responses.


2020 ◽  
Author(s):  
Sahand Farhoodi ◽  
Uri Eden

Generalized Linear Models (GLMs) have been used extensively in statistical models of spike train data. However, the IRLS algorithm, which is often used to fit such models, can fail to converge in situations where response and non-response can be separated by a single predictor or a linear combination of multiple predictors. Such situations are likely to arise in many neural systems due to properties such as refractoriness and incomplete sampling of the signals that influence spiking. In this paper, we describe multiple classes of approaches to address this problem: standard IRLS with a fixed iteration limit, computing the maximum likelihood solution in the limit, Bayesian estimation, regularization, change of basis, and modifying the search parameters. We demonstrate a specific application of each of these methods to spiking data from rat somatosensory cortex and discuss the advantages and disadvantages of each. We also provide an example of a roadmap for selecting a method based on the problem’s particular analysis issues and scientific goals.
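
The convergence failure described above can be reproduced on a toy problem. The sketch below is a minimal IRLS (Newton) loop for a logistic GLM on perfectly separated data, with an optional ridge penalty as one example of the regularization approach. The logistic link, the penalty strength, and the four-point dataset are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def irls_logistic(X, y, n_iter, ridge=0.0):
    """Fit a logistic GLM by IRLS with a fixed iteration limit.

    `ridge` is an L2 penalty applied to every coefficient,
    including the intercept (a simplification for this sketch).
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-(X @ beta)))
        w = np.clip(mu * (1.0 - mu), 1e-10, None)   # keep the Hessian invertible
        grad = X.T @ (y - mu) - ridge * beta
        hess = X.T @ (X * w[:, None]) + ridge * np.eye(X.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    return beta

# Toy data: the response is 1 exactly when the predictor is positive
x = np.array([-2.0, -1.0, 1.0, 2.0])
X = np.column_stack([np.ones_like(x), x])           # intercept + one predictor
y = np.array([0.0, 0.0, 1.0, 1.0])

beta_5 = irls_logistic(X, y, n_iter=5)              # slope already large
beta_15 = irls_logistic(X, y, n_iter=15)            # slope keeps growing
beta_ridge = irls_logistic(X, y, n_iter=15, ridge=1.0)  # slope settles
```

Without the penalty the slope grows with every iteration, so the result depends on where the loop is stopped; with the penalty the estimate settles at a finite value, at the cost of shrinkage toward zero.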


2020 ◽  
Author(s):  
Søren Wichmann ◽  
Taraka Rama

Two families of quantitative methods have been used to infer geographical homelands of language families: Bayesian phylogeography and the ‘diversity method’. Bayesian methods model how populations may have moved using a phylogenetic tree as a backbone, while the diversity method assumes that the geographical area where linguistic diversity is highest likely corresponds to the homeland. No systematic tests of the performances of the different methods in a linguistic context have so far been published. Here we carry out performance testing by simulating language families, including branching structures and word lists, along with speaker populations moving in space. We test six different methods: two versions of BayesTraits; the relaxed random walk model of BEAST 2; our own RevBayes implementations of a fixed-rates and a variable-rates random walk model; and the diversity method. As a result of the tests we propose a hierarchy of performance of the different methods. Factors such as geographical idiosyncrasies, incomplete sampling, tree imbalance, and small family sizes all have a negative impact on performance, but mostly across the board, the performance hierarchy generally being impervious to such factors.

