coupon collector
Recently Published Documents


TOTAL DOCUMENTS

51
(FIVE YEARS 9)

H-INDEX

9
(FIVE YEARS 0)

Author(s):  
Pawan Poojary ◽  
Sharayu Moharir ◽  
Krishna Jagannathan
Keyword(s):  

2021 ◽  
Vol 5 (ICFP) ◽  
pp. 1-30
Author(s):  
Martin Avanzini ◽  
Gilles Barthe ◽  
Ugo Dal Lago

We define a continuation-passing style (CPS) translation for a typed λ-calculus with probabilistic choice, unbounded recursion, and a tick operator — for modeling cost. The target language is a (non-probabilistic) λ-calculus, enriched with a type of extended positive reals and a fixpoint operator. We then show that applying the CPS transform of an expression M to the continuation λ v . 0 yields the expected cost of M . We also introduce a formal system for higher-order logic, called EHOL, prove it sound, and show it can derive tight upper bounds on the expected cost of classic examples, including Coupon Collector and Random Walk. Moreover, we relate our translation to Kaminski et al.’s ert-calculus, showing that the latter can be recovered by applying our CPS translation to (a generalization of) the classic embedding of imperative programs into λ-calculus. Finally, we prove that the CPS transform of an expression can also be used to compute pre-expectations and to reason about almost sure termination.


2021 ◽  
Author(s):  
Kirsten Van Huffel ◽  
Michiel Stock ◽  
Bernard De Baets

In combinatorial biotechnology, it is crucial for screening experiments to sufficiently cover the design space. In the BioCCP.jl package, we provide functions for minimum sample size determination based on the mathematical framework coined the Coupon Collector Problem. BioCCP.jl, including source code, documentation and a Pluto notebook is available at https://github.com/kirstvh/BioCCP.


Informatics ◽  
2021 ◽  
Vol 18 (1) ◽  
pp. 25-42
Author(s):  
V. N. Yarmolik ◽  
V. A. Levantsevich ◽  
D. V. Demenkovets ◽  
I. Mrozek

The urgency of the problem of testing storage devices of modern computer systems is shown. The mathematical models of their faults and the methods used for testing the most complex cases by classical march tests are investigated. Passive pattern sensitive faults (PNPSFk) are allocated, in which arbitrary k from N memory cells participate, where k << N, and N is the memory capacity in bits. For these faults, analytical expressions are given for the minimum and maximum fault coverage that is achievable within the march tests. The concept of a primitive is defined, which describes in terms of march test elements the conditions for activation and fault detection of PNPSFk of storage devices. Examples of march tests with maximum fault coverage, as well as march tests with a minimum time complexity equal to 18N are given. The efficiency of a single application of tests such as MATS ++, March C− and March PS is investigated for different number of k ≤ 9 memory cells involved in PNPSFk fault. The applicability of multiple testing with variable address sequences is substantiated, when the use of random sequences of addresses is proposed. Analytical expressions are given for the fault coverage of complex PNPSFk faults depending on the multiplicity of the test. In addition, the estimates of the mean value of the multiplicity of the MATS++, March C− and March PS tests, obtained on the basis of a mathematical model describing the problem of the coupon collector, and ensuring the detection of all k2k PNPSFk faults are given. The validity of analytical estimates is experimentally shown and the high efficiency of PNPSFk fault detection is confirmed by tests of the March PS type.


Informatics ◽  
2020 ◽  
Vol 17 (2) ◽  
pp. 54-70
Author(s):  
V. N. Yarmolik ◽  
I. Mrozek ◽  
S. V. Yarmolik

The relevance of testing of memory devices of modern computing systems is shown. The methods and algorithms for implementing test procedures based on classical March tests are analyzed. Multiple March tests are highlighted to detect complex pattern-sensitive memory faults. To detect them, the necessary condition that test procedures must satisfy to deal complex faults, is substantiated. This condition is in the formation of a pseudo-exhaustive test for a given number of arbitrary memory cells. We study the effectiveness of single and double application of tests like MATS ++, March C– and March A, and also give its analytical estimates for a different number of k ≤ 10 memory cells participating in a malfunction. The applicability of the mathematical model of the combinatorial problem of the coupon collector for describing multiple memory testing is substantiated. The values of the average, minimum, and maximum multiplicity of multiple tests are presented to provide an exhaustive set of binary combinations for a given number of arbitrary memory cells. The validity of analytical estimates is experimentally shown and the high efficiency of the formation of a pseudo-exhaustive coverage by tests of the March A type is confirmed.


Author(s):  
Marc Beunardeau ◽  
Fatima-Ezzahra El Orche ◽  
Diana Maimuţ ◽  
David Naccache ◽  
Peter B. Rønne ◽  
...  

mSystems ◽  
2019 ◽  
Vol 4 (5) ◽  
Author(s):  
Taylor M. Royalty ◽  
Andrew D. Steen

ABSTRACT We applied theoretical and simulation-based approaches to characterize how microbial community structure influences the amount of sequencing effort to reconstruct metagenomes that are assembled from short-read sequences. First, a coupon collector equation was proposed as an analytical model for predicting sequencing effort as a function of microbial community structure. Characterization was performed by varying community structure properties such as richness, evenness, and genome size. Simulations demonstrated that while community richness and evenness influenced the sequencing effort required to sequence a community metagenome to exhaustion, the effort necessary to sequence an individual genome to a target fraction of exhaustion depended only on the relative abundance of the genome and its genome size. A second analysis evaluated the quantity, completion, and contamination of metagenome-assembled genomes (MAGs) as a function of sequencing effort on four preexisting sequence read data sets from different environments. These data sets were subsampled to various degrees of completeness to simulate the effect of sequencing effort on MAG retrieval. Modeling suggested that sequencing efforts beyond what is typical in published experiments (1 to 10 Gbp) would generate diminishing returns in terms of MAG binning. A software tool, Genome Relative Abundance to Sequencing Effort (GRASE), was created to assist investigators to further explore this relationship. Reevaluation of the relationship between sequencing effort and binning success in the context of genome relative abundance, as opposed to base pairs, provides a constraint on sequencing experiments based on the relative abundance of microbes in an environment rather than arbitrary levels of sequencing effort. IMPORTANCE Short-read sequencing with Illumina sequencing technology provides an accurate, high-throughput method for characterizing the metabolic potential of microbial communities. Short-read sequences can be assembled and binned into metagenome-assembled genomes, thus shedding light on the function of microbial ecosystems that are important for health, agriculture, and Earth system processes. The work presented here provides an analytical framework for selecting sequencing effort as a function of genome relative abundance. As such, experimental goals in metagenome-assembled genome creation projects can select sequencing effort based on the rarest target genome as a constrained threshold. We hope that the results presented here, as well as GRASE, will be valuable to researchers planning sequencing experiments.


2019 ◽  
Vol 23 ◽  
pp. 739-769
Author(s):  
Paweł Lorek

For a given absorbing Markov chain X* on a finite state space, a chain X is a sharp antidual of X* if the fastest strong stationary time (FSST) of X is equal, in distribution, to the absorption time of X*. In this paper, we show a systematic way of finding such an antidual based on some partial ordering of the state space. We use a theory of strong stationary duality developed recently for Möbius monotone Markov chains. We give several sharp antidual chains for Markov chain corresponding to a generalized coupon collector problem. As a consequence – utilizing known results on the limiting distribution of the absorption time – we indicate separation cutoffs (with their window sizes) in several chains. We also present a chain which (under some conditions) has a prescribed stationary distribution and its FSST is distributed as a prescribed mixture of sums of geometric random variables.


2018 ◽  
Author(s):  
Josep Arús-Pous ◽  
Thomas Blaschke ◽  
Silas Ulander ◽  
Jean-Louis Reymond ◽  
Hongming Chen ◽  
...  

Recent applications of Recurrent Neural Networks enable training models that sample the chemical space. In this study we train RNN with molecular string representations (SMILES) with a subset of the enumerated database GDB-13 (975 million molecules). We show that a model trained with 1 million structures (0.1 % of the database) reproduces 68.9 % of the entire database after training, when sampling 2 billion molecules. We also developed a method to assess the quality of the training process using log-likelihood plots. Furthermore, we use a mathematical model based on the “coupon collector problem” that compares the trained model to an upper bound, which shows that complex molecules with many rings and heteroatoms are more difficult to sample. We also suggest that the metrics obtained from this analysis can be used as a tool to benchmark any molecular generative model.<br>


Sign in / Sign up

Export Citation Format

Share Document