kernel density estimates
Recently Published Documents


TOTAL DOCUMENTS: 106 (FIVE YEARS 21)

H-INDEX: 20 (FIVE YEARS 2)

Author(s):  
Kerstin Erfurth ◽  
Marcus Groß ◽  
Ulrich Rendtel ◽  
Timo Schmid

Abstract: Composite spatial data at the administrative area level are often presented as maps. The aim is to detect regional differences in the concentration of subpopulations, such as elderly persons, ethnic minorities, low-educated persons, voters of a political party or persons with a certain disease. Thematic collections of such maps are presented in different atlases. The standard presentation is the choropleth map, where each administrative unit is represented by a single value. These maps can be criticized on three grounds: the implicit assumption of a uniform distribution within the area, the instability of the resulting map with respect to a change of the reference area, and the discontinuities of the maps at the borderlines of the reference areas, which inhibit the detection of regional clusters. In order to address these problems we use a density approach in the construction of maps. This approach does not enforce a local uniform distribution. It does not depend on a specific choice of area reference system and there are no discontinuities in the displayed maps. A standard procedure for estimating densities is kernel density estimation. However, these estimates need the geo-coordinates of the single units, which are not available because we only have access to the aggregates of some area system. To overcome this hurdle, we use a statistical simulation concept. This can be interpreted as a Simulated Expectation Maximisation (SEM) algorithm of Celeux et al. (1996). We simulate observations from the current density estimates which are consistent with the aggregation information (S-step). Then we apply the kernel density estimator to the simulated sample, which gives the next density estimate (E-Step). This concept was first applied to grid data with rectangular areas, see Groß et al. (2017), for the display of ethnic minorities. In a second application we demonstrated the use of this approach for the so-called “change of support” problem (Bradley et al. 2016). Here Groß et al. (2020) used the SEM algorithm to recalculate case numbers between non-hierarchical administrative area systems. Recently Rendtel et al. (2021) applied the SEM algorithm to display spatial-temporal clusters of Corona infections in Germany. Here we present three modifications of the basic SEM algorithm: 1) We introduce a boundary correction which removes the underestimation of kernel density estimates at the borders of the population area. 2) We recognize unsettled areas, like lakes, parks and industrial areas, in the computation of the kernel density. 3) We adapt the SEM algorithm for the computation of local percentages, which are important especially in voting analysis. We evaluate our approach against several standard maps by means of the local voting register with known addresses. In the empirical part we apply our approach to the display of voting results for the 2016 election of the Berlin parliament. We contrast our results with choropleth maps and show new possibilities for reporting spatial voting results.
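
The alternation described in this abstract (simulate points consistent with the area counts, then re-estimate the density from the simulated points) can be illustrated with a minimal one-dimensional sketch. This is not the authors' implementation: the function names `kde_on_grid` and `sem_density`, the fixed Gaussian bandwidth, the discretisation into grid cells and the flat starting density are all assumptions made for illustration, and none of the three modifications listed above (boundary correction, unsettled areas, local percentages) is included.

```python
import math
import random


def kde_on_grid(points, grid, bandwidth):
    """Evaluate a fixed-bandwidth Gaussian KDE of `points` at every grid value."""
    norm = 1.0 / (len(points) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((g - p) / bandwidth) ** 2) for p in points)
        for g in grid
    ]


def sem_density(edges, counts, bandwidth=0.5, n_iter=10, cells_per_area=20, seed=0):
    """Estimate a density from area counts only (hypothetical 1D illustration).

    `edges` are the area boundaries and `counts[a]` is the number of units in
    area a. The total count is assumed to be positive.
    """
    rng = random.Random(seed)

    # Discretise every area into small cells; points are placed at cell midpoints.
    grid, cells_in_area = [], []
    for a in range(len(counts)):
        lo, hi = edges[a], edges[a + 1]
        step = (hi - lo) / cells_per_area
        start = len(grid)
        grid.extend(lo + (k + 0.5) * step for k in range(cells_per_area))
        cells_in_area.append(range(start, start + cells_per_area))

    # Flat start: equivalent to assuming a uniform distribution within each area.
    density = [1.0] * len(grid)

    for _ in range(n_iter):
        # Simulation step: draw exactly counts[a] points inside area a, with
        # probabilities proportional to the current density estimate.
        points = []
        for cells, n in zip(cells_in_area, counts):
            if n == 0:
                continue
            weights = [max(density[i], 1e-300) for i in cells]
            chosen = rng.choices(cells, weights=weights, k=n)
            points.extend(grid[i] for i in chosen)

        # Estimation step: the KDE of the simulated points is the next estimate.
        density = kde_on_grid(points, grid, bandwidth)

    return grid, density


grid, density = sem_density(edges=[0.0, 1.0, 2.0, 3.0], counts=[10, 50, 10])
```

Because every simulation step redraws exactly the observed number of points per area, the aggregate information is preserved at each iteration, while the kernel step smooths across area borders.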


2021 ◽  
Author(s):  
Jared Adolf-Bryfogle ◽  
Jason W Labonte ◽  
John C Kraft ◽  
Maxim Shapavolov ◽  
Sebastian Raemisch ◽  
...  

Carbohydrates and glycoproteins modulate key biological functions. Computational approaches inform function to aid in carbohydrate structure prediction, structure determination, and design. However, experimental structure determination of sugar polymers is notoriously difficult as glycans can sample a wide range of low energy conformations, thus limiting the study of glycan-mediated molecular interactions. In this work, we expanded the RosettaCarbohydrate framework, developed and benchmarked effective tools for glycan modeling and design, and extended the Rosetta software suite to better aid in structural analysis and benchmarking tasks through the SimpleMetrics framework. We developed a glycan-modeling algorithm, GlycanTreeModeler, that computationally builds glycans layer-by-layer, using adaptive kernel density estimates (KDE) of common glycan conformations derived from data in the Protein Data Bank (PDB) and from quantum mechanics (QM) calculations. After a rigorous optimization of kinematic and energetic considerations to improve near-native sampling enrichment and decoy discrimination, GlycanTreeModeler was benchmarked on a test set of diverse glycan structures, or "trees". Structures predicted by GlycanTreeModeler agreed with native structures at high accuracy for both de novo modeling and experimental density-guided building. GlycanTreeModeler algorithms and associated tools were employed to design de novo glycan trees into a protein nanoparticle vaccine that are able to direct the immune response by shielding regions of the scaffold from antibody recognition. This work will inform glycoprotein model prediction, aid in both X-ray and electron microscopy density solutions and refinement, and help lead the way towards a new era of computational glycobiology.
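
As a rough illustration of what an adaptive kernel density estimate is, the sketch below uses one common scheme (an Abramson-style rule in which each data point receives its own bandwidth, wider where a pilot estimate is low). It is an assumption that this resembles the scheme used in GlycanTreeModeler; the abstract does not say which adaptive method is used. All function names and the toy data are hypothetical, and the sketch treats values as points on a line, whereas torsion angles are periodic and would need a circular kernel.

```python
import math


def gaussian(u):
    """Standard normal density."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def fixed_kde(x, data, h):
    """Ordinary KDE with one global bandwidth h."""
    return sum(gaussian((x - d) / h) for d in data) / (len(data) * h)


def adaptive_bandwidths(data, h, alpha=0.5):
    """Per-point bandwidths: larger where the pilot density is low."""
    pilot = [fixed_kde(d, data, h) for d in data]
    # Geometric mean of the pilot densities, used as the reference level.
    g = math.exp(sum(math.log(p) for p in pilot) / len(pilot))
    return [h * (p / g) ** (-alpha) for p in pilot]


def adaptive_kde(x, data, bandwidths):
    """KDE in which every data point contributes a kernel of its own width."""
    return sum(
        gaussian((x - d) / hi) / hi for d, hi in zip(data, bandwidths)
    ) / len(data)


# Toy data: a tight cluster plus one isolated observation.
data = [0.0, 0.1, 0.2, 0.3, 5.0]
bw = adaptive_bandwidths(data, h=0.5)
value_at_cluster = adaptive_kde(0.15, data, bw)
```

The isolated point gets a wider kernel than the clustered points, so sparse regions are smoothed more heavily while dense regions keep their detail.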


2021 ◽  
Vol 17 (7) ◽  
pp. e1009225
Author(s):  
Shirit Dvorkin ◽  
Reut Levi ◽  
Yoram Louzoun

Recent advances in T cell receptor (TCR) repertoire sequencing allow for the characterization of repertoire properties, as well as the frequency and sharing of specific TCRs. However, there is no efficient measure for the local density of a given TCR. TCRs are often described either through their complementarity-determining region 3 (CDR3) sequences, or their V/J usage, or their clone size. We here show that the local repertoire density can be estimated using a combined representation of these components through distance-conserving autoencoders and Kernel Density Estimates (KDE). We present ELATE, an Encoder-based LocAl Tcr dEnsity, and show that the resulting density of a sample can be used as a novel measure to study repertoire properties. The cross-density between two samples can be used as a similarity matrix to fully characterize samples from the same host. Finally, the same projection in combination with machine learning algorithms can be used to predict TCR-peptide binding through the local density of known TCRs binding a specific target.
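
One plausible reading of the cross-density mentioned in this abstract is the average density that a KDE fitted on one sample assigns to the points of another sample. The sketch below shows that reading only; it is not the ELATE code. The autoencoder step is omitted, the points are assumed to be already embedded as numeric vectors, and the names `kde_density` and `cross_density` as well as the isotropic Gaussian bandwidth `h` are hypothetical.

```python
import math


def kde_density(x, sample, h):
    """Isotropic Gaussian KDE of `sample`, evaluated at the point x."""
    dim = len(x)
    norm = len(sample) * (h * math.sqrt(2.0 * math.pi)) ** dim
    total = 0.0
    for p in sample:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, p))
        total += math.exp(-0.5 * sq_dist / (h * h))
    return total / norm


def cross_density(sample_a, sample_b, h=1.0):
    """Mean density of sample_b's points under the KDE fitted on sample_a."""
    return sum(kde_density(x, sample_a, h) for x in sample_b) / len(sample_b)


# Toy embedded samples (hypothetical 2D coordinates).
a = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1)]
near = [(0.05, 0.05), (0.1, 0.1)]
far = [(5.0, 5.0), (6.0, 5.0)]

similar = cross_density(a, near)
dissimilar = cross_density(a, far)
```

With a Gaussian kernel and a shared bandwidth this quantity is symmetric in the two samples, which is convenient when it is used as a pairwise similarity score.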


2021 ◽  
Vol 2 (3) ◽  
pp. 351-369
Author(s):  
Shauna McBride-Kebert ◽  
Christina N. Toms

Common bottlenose dolphins, Tursiops truncatus, can suffer health complications from prolonged freshwater exposure; however, little is known about how dolphins behaviorally respond to flood events. We investigated whether dolphins mitigated their freshwater exposure by moving south towards the estuary mouth and/or towards deeper areas with higher salinities in response to a record-breaking flood in Pensacola Bay, Florida. In total, 144 dolphin groups observed during 45 population dynamic surveys were analyzed across two flood-impacted sampling sessions and their respective seasonal control sessions. Kernel density estimates demonstrated southern movement towards the estuary mouth during flood-impacted sessions, but this distribution change was limited. Species distribution models showed that dolphins did not move to deeper areas after the flood and dolphin distribution was not substantially altered by flood-induced salinity changes. The estuary system exhibits strongly stratified waters with broad salinity ranges even during the flood. Dolphins may have mitigated the severity of freshwater exposure by capitalizing on these stratified areas as they continued to use habitat affected by the flood. A lack of avoidance of low salinity could result in this dolphin population being at greater risk for health problems, which should be considered in future population management and conservation.


2021 ◽  
Author(s):  
Shike Gao ◽  
Bin Xie ◽  
Wenwen Yu ◽  
Chengyu Huang ◽  
Xiao Zhang ◽  
...  

Abstract: The successful construction of marine protected areas (MPAs) in temperate waters largely depends on our understanding of the distribution and coexistence of organisms with varying habitat preferences, which helps us to better understand the community patterns mediated by connectivity in coastal areas. This study was conducted to examine the connectivity of nekton assemblages in artificial reefs and adjacent waters, which included five habitats: the artificial reef area (AR), aquaculture area (AA), natural area (NA), estuary area (EA) and comprehensive effect area (CEA), in Haizhou Bay in October 2020. Analysis of variance (ANOVA) showed that there were significant differences in the characteristics and abundances of nekton in each habitat (P<0.05). Approximately 38.2% of the individuals were found in at least three habitats, and very few species were present in only a single habitat. Several highly abundant nekton species were selected according to the kernel density estimates (KDEs), and their body lengths varied gradationally among habitats, potentially indicating migration and diffusion during their life history. The results showed that artificial reefs and adjacent waters in Haizhou Bay are related by similar nekton assemblages and ontogenetic variation. Finally, this study has implications for the conservation and monitoring of nekton assemblages in artificial reefs and adjacent waters, highlighting that the principle of connectivity should be taken into consideration in the design of MPAs and MPA networks that can be applied in different stages of implementation and in different combinations of scenarios.


Author(s):  
Adam Brown ◽  
Omer Bobrowski ◽  
Elizabeth Munch ◽  
Bei Wang

Abstract: We study the probabilistic convergence between the mapper graph and the Reeb graph of a topological space $\mathbb{X}$ equipped with a continuous function $f: \mathbb{X} \rightarrow \mathbb{R}$. We first give a categorification of the mapper graph and the Reeb graph by interpreting them in terms of cosheaves and stratified covers of the real line $\mathbb{R}$. We then introduce a variant of the classic mapper graph of Singh et al. (in: Eurographics symposium on point-based graphics, 2007), referred to as the enhanced mapper graph, and demonstrate that such a construction approximates the Reeb graph of $(\mathbb{X}, f)$ when it is applied to points randomly sampled from a probability density function concentrated on $(\mathbb{X}, f)$. Our techniques are based on the interleaving distance of constructible cosheaves and topological estimation via kernel density estimates. Following Munch and Wang (in: 32nd international symposium on computational geometry, volume 51 of Leibniz international proceedings in informatics (LIPIcs), Dagstuhl, Germany, pp 53:1–53:16, 2016), we first show that the mapper graph of $(\mathbb{X}, f)$, a constructible $\mathbb{R}$-space (with a fixed open cover), approximates the Reeb graph of the same space. We then construct an isomorphism between the mapper of $(\mathbb{X}, f)$ and the mapper of a super-level set of a probability density function concentrated on $(\mathbb{X}, f)$. Finally, building on the approach of Bobrowski et al. (Bernoulli 23(1):288–328, 2017b), we show that, with high probability, we can recover the mapper of the super-level set given a sufficiently large sample. Our work is the first to consider the mapper construction using the theory of cosheaves in a probabilistic setting. It is part of an ongoing effort to combine sheaf theory, probability, and statistics, to support topological data analysis with random data.


Author(s):  
Shirit Dvorkin ◽  
Reut Levi ◽  
Yoram Louzoun

Abstract: Recent advances in T cell receptor (TCR) repertoire sequencing allow for characterization of repertoire properties, as well as the frequency and sharing of specific TCRs. However, there is no efficient measure for the local density of a given TCR. TCRs are often described either through their complementarity-determining region 3 (CDR3) sequences, or their V/J usage, or their clone size. We here show that the local repertoire density can be estimated using a combined representation of these components through distance-conserving autoencoders and Kernel Density Estimates (KDE). We present ELATE, an Encoder-based LocAl Tcr dEnsity, and show that the resulting density of a sample can be used as a novel measure to study repertoire properties. The cross-density between two samples can be used as a similarity matrix to fully characterize samples from the same host. Finally, the same projection in combination with machine learning algorithms can be used to predict TCR-peptide binding through the local density of known TCRs binding a specific target. Code availability: https://github.com/louzounlab/Autoencoder

