Estimating the undetected infections in the Covid-19 outbreak by harnessing capture-recapture methods

Author(s):  
Dankmar Böhning ◽  
Irene Rocchetti ◽  
Antonello Maruotti ◽  
Heinz Holling

Abstract
A major open question, affecting policy makers' decisions, is the estimation of the true size of COVID-19 infections. Most of them go undetected because of a large number of asymptomatic cases. We provide an efficient, easy-to-compute and robust lower-bound estimator for the number of undetected cases. A "modified" version of the Chao estimator is proposed, based on the cumulative time-series distribution of cases and deaths. Heterogeneity has been accounted for by assuming a geometric distribution underlying the data-generation process. An (approximate) analytical variance formula has been derived to compute reliable 95% confidence intervals. An application to the Austrian situation is provided, and results from other European countries are mentioned in the discussion.
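The classical Chao estimator that the paper modifies can be sketched in a few lines: it adds f1²/(2·f2) hidden units to the observed count, where f1 and f2 are the numbers of units detected exactly once and exactly twice. This is a minimal illustration of the underlying idea, not the authors' modified, geometric-heterogeneity version:

```python
def chao_lower_bound(n_observed, f1, f2):
    """Classical Chao1 lower bound on total population size.

    n_observed -- number of distinct units detected at least once
    f1, f2     -- counts of units detected exactly once / exactly twice
    """
    if f2 > 0:
        hidden = f1 * f1 / (2.0 * f2)
    else:  # bias-corrected form when no unit was seen twice
        hidden = f1 * (f1 - 1) / 2.0
    return n_observed + hidden

# e.g. 100 detected units, 10 singletons, 5 doubletons
estimate = chao_lower_bound(100, 10, 5)  # 100 + 10*10/(2*5) = 110.0
```

The bound is conservative by construction: it never claims more hidden units than the singleton/doubleton frequencies justify.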

2002 ◽  
Vol 18 (2) ◽  
pp. 313-348 ◽  
Author(s):  
Pentti Saikkonen ◽  
Helmut Lütkepohl

Unit root tests for time series with level shifts of general form are considered when the timing of the shift is unknown. It is proposed to estimate the nuisance parameters of the data generation process including the shift date in a first step and apply standard unit root tests to the residuals. The estimation of the nuisance parameters is done in such a way that the unit root tests on the residuals have the same limiting distributions as for the case of a known break date. Simulations are performed to investigate the small sample properties of the tests, and empirical examples are discussed to illustrate the procedure.
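The two-step idea can be illustrated with a simplified variant: estimate the shift date by least squares over all candidate break points, remove the fitted level shift, and then apply a standard unit-root test to the residuals. This is only a sketch of the procedure's structure, not the authors' estimation of the full data-generation process:

```python
import numpy as np

def estimate_break_and_residuals(y):
    """Grid-search the level-shift date that minimizes the residual sum of
    squares of a mean-plus-step fit, then return the detrended residuals."""
    n = len(y)
    best = None
    for tb in range(1, n):                   # candidate break dates
        d = np.zeros(n)
        d[tb:] = 1.0                         # step dummy
        X = np.column_stack([np.ones(n), d])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        ssr = np.sum((y - X @ beta) ** 2)
        if best is None or ssr < best[0]:
            best = (ssr, tb, y - X @ beta)
    return best[1], best[2]
```

In a real application, the residuals would then be passed to a standard unit-root test such as `statsmodels.tsa.stattools.adfuller`.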


1994 ◽  
Vol 10 (3-4) ◽  
pp. 764-773 ◽  
Author(s):  
Jae-Young Kim

Asymptotic normality of the Bayesian posterior is a well-known result for stationary dynamic models or nondynamic models. This paper extends the analysis to a time series model with a possible nonstationary process. We spell out conditions under which asymptotic normality of the posterior is obtained even if the true data-generation process is a nonstationary process.


2021 ◽  
Vol 15 (6) ◽  
pp. 1-22
Author(s):  
Yashen Wang ◽  
Huanhuan Zhang ◽  
Zhirun Liu ◽  
Qiang Zhou

For guiding natural language generation, many semantic-driven methods have been proposed. While clearly improving the performance of end-to-end training tasks, these existing semantic-driven methods still have clear limitations: (i) they utilize only shallow semantic signals (e.g., from topic models) with a single stochastic hidden layer in their data-generation process, which suffer easily from noise (especially on short texts) and lack interpretability; (ii) they ignore sentence order and document context, as they treat each document as a bag of sentences, and thus fail to capture the long-distance dependencies and global semantic meaning of a document. To overcome these problems, we propose a novel semantic-driven language-modeling framework that learns a Hierarchical Language Model and a Recurrent Conceptualization-enhanced Gamma Belief Network simultaneously. For scalable inference, we develop auto-encoding Variational Recurrent Inference, allowing efficient end-to-end training while capturing global semantics from a text corpus. In particular, this article introduces concept information derived from the high-quality lexical knowledge graph Probase, which lends strong interpretability and anti-noise capability to the proposed model. Moreover, the proposed model captures not only intra-sentence word dependencies but also temporal transitions between sentences and inter-sentence concept dependencies. Experiments conducted on several NLP tasks validate the superiority of the proposed approach, which can effectively infer meaningful hierarchical concept structures of documents and hierarchical multi-scale structures of sequences, even compared with the latest state-of-the-art Transformer-based models.


Processes ◽  
2021 ◽  
Vol 9 (7) ◽  
pp. 1115
Author(s):  
Gilseung Ahn ◽  
Hyungseok Yun ◽  
Sun Hur ◽  
Si-Yeong Lim

Accurate prediction of the remaining useful life (RUL) of equipment, using machine learning (ML) or deep learning (DL) models trained on run-to-failure data, is crucial for maintenance scheduling. Because such data are unavailable until the equipment actually fails, collecting enough of them to train a model without overfitting can be challenging. Here, we propose a method of generating time-series data for RUL models to resolve the problems posed by insufficient data. The proposed method converts every training time series into a sequence of alphabetical strings by symbolic aggregate approximation (SAX) and identifies occurrence patterns in the converted sequences. The method then generates a new sequence and inversely transforms it into a new time series. Experiments with various RUL prediction datasets and ML/DL models verified that the proposed data-generation method helps avoid overfitting in RUL prediction models.
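The symbolic aggregate approximation step the method relies on can be sketched as follows: z-normalize the series, average it over equal-width segments (piecewise aggregate approximation), and map each segment mean to a letter using breakpoints that cut the standard normal into equiprobable regions. This is a minimal sketch with a fixed alphabet of size 4; the pattern-mining and inverse-transform steps of the proposed method are omitted:

```python
import numpy as np

# breakpoints splitting N(0, 1) into four equiprobable regions
BREAKPOINTS_4 = np.array([-0.6745, 0.0, 0.6745])

def sax_transform(series, n_segments=4):
    """Convert a time series to a SAX word over the alphabet 'abcd'.

    Requires len(series) to be divisible by n_segments.
    """
    x = np.asarray(series, dtype=float)
    x = (x - x.mean()) / (x.std() + 1e-12)        # z-normalize
    paa = x.reshape(n_segments, -1).mean(axis=1)  # segment means (PAA)
    symbols = np.digitize(paa, BREAKPOINTS_4)     # map to alphabet indices
    return "".join("abcd"[i] for i in symbols)

word = sax_transform(np.arange(16.0))  # a steadily rising series -> "abcd"
```

A rising series maps to letters in alphabetical order, which is what makes occurrence patterns in the symbolic domain easy to mine.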


Sensors ◽  
2021 ◽  
Vol 21 (6) ◽  
pp. 2144
Author(s):  
Stefan Reitmann ◽  
Lorenzo Neumann ◽  
Bernhard Jung

Common Machine-Learning (ML) approaches for scene classification require a large amount of training data. However, for classification of depth sensor data, in contrast to image data, relatively few databases are publicly available, and manual generation of semantically labeled 3D point clouds is an even more time-consuming task. To simplify the training data generation process for a wide range of domains, we have developed the BLAINDER add-on package for the open-source 3D modeling software Blender, which enables a largely automated generation of semantically annotated point-cloud data in virtual 3D environments. In this paper, we focus on the classical depth-sensing techniques Light Detection and Ranging (LiDAR) and Sound Navigation and Ranging (Sonar). Within the BLAINDER add-on, different depth sensors can be loaded from presets, customized sensors can be implemented, and different environmental conditions (e.g., influence of rain, dust) can be simulated. The semantically labeled data can be exported to various 2D and 3D formats and are thus optimized for different ML applications and visualizations. In addition, semantically labeled images can be exported using the rendering functionalities of Blender.
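The kind of output such a pipeline automates can be illustrated with a toy, pure-NumPy stand-in: rays cast from a virtual sensor are intersected with scene geometry (here a single sphere), and each returned point carries a semantic label. This is only a conceptual sketch, not BLAINDER's actual implementation, which uses Blender's Python API and far richer sensor and environment models:

```python
import numpy as np

def lidar_scan_sphere(origin, directions, center, radius, label=1):
    """Cast rays against a sphere; return (x, y, z, label) hit points.

    A toy stand-in for simulated depth sensing in a virtual scene.
    """
    o = np.asarray(origin, dtype=float)
    hits = []
    for d in directions:
        d = np.asarray(d, dtype=float)
        d = d / np.linalg.norm(d)
        oc = o - np.asarray(center, dtype=float)
        b = 2.0 * (d @ oc)
        c = oc @ oc - radius ** 2
        disc = b * b - 4.0 * c                 # ray-sphere discriminant
        if disc >= 0:
            t = (-b - np.sqrt(disc)) / 2.0     # nearest intersection
            if t > 0:
                hits.append((*(o + t * d), label))
    return np.array(hits)
```

Sweeping the ray directions over a spherical grid and labeling each object in the scene differently yields exactly the kind of semantically annotated point cloud described above.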


2019 ◽  
Vol 5 (2) ◽  
pp. 76-82
Author(s):  
Cornelius Mellino Sarungu ◽  
Liliana Liliana

Project management practice uses many tools to support recording and tracking the data generated throughout a project. Project analytics provides deeper insights for decision making. To conduct project analytics, one should explore the required tools and techniques. The most common tool is Microsoft Excel; its simplicity and flexibility let a project manager or project team members use it for almost any kind of activity. We combine MS Excel with RStudio to bring data analytics into the project management process. While data input still follows the workflow the project manager is already familiar with, the analytic engine extracts data from it and creates visualizations of the needed parameters in a single output report file. This approach delivers a low-cost project-analytics solution for the organization: it can be implemented with relatively low-cost technology, some of it free, while maintaining a simple data-generation process. The solution can also be proposed to raise project management process maturity to the next stage, such as CMMI level 4, which promotes project analytics. Index Terms—project management, project analytics, data analytics.


Author(s):  
Muhammad Salih Memon ◽  
Raheem Bux Soomro ◽  
Sajid Hussain Mirani ◽  
Mansoor Ahmed Soomro

Economic stability remains a top priority of every country, and different measures have been suggested by researchers worldwide. Along the same lines, this study was carried out to predict the factors behind currency valuation. Data were collected from the Export Promotion Bureau, the State Bank of Pakistan, and the Ministry of Finance for 25 years (1989-2013). Using linear regression with currency valuation as the dependent variable and exports, changes in external debt, and total reserves as independent variables, the study concludes that only Pakistan's exports are a valid predictor of the country's currency valuation, which policy makers should incorporate when forming economic policies and setting targets ahead of fiscal policy.
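The regression set-up described above (currency valuation regressed on exports, changes in external debt, and total reserves) can be sketched generically; the data below are synthetic placeholders, not the study's actual series:

```python
import numpy as np

def ols_fit(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ...]."""
    Z = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return beta

# synthetic stand-ins: 25 yearly observations of three regressors
rng = np.random.default_rng(0)
X = rng.normal(size=(25, 3))   # columns: exports, debt change, reserves
# construct y so that only the "exports" column truly matters
y = 1.0 + 2.0 * X[:, 0] + rng.normal(scale=0.1, size=25)
beta = ols_fit(X, y)           # beta[1] recovers ~2, beta[2:] stay near 0
```

A finding like the study's, that only one regressor predicts the outcome, would show up here as one large, significant coefficient alongside near-zero ones.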


2021 ◽  
Vol 22 (4) ◽  
pp. 1-30
Author(s):  
Sam Buss ◽  
Dmitry Itsykson ◽  
Alexander Knop ◽  
Artur Riazanov ◽  
Dmitry Sokolov

This article is motivated by seeking lower bounds on OBDD(∧, w, r) refutations, namely, OBDD refutations that allow weakening and arbitrary reorderings. We first work with 1-NBP(∧) refutations based on read-once nondeterministic branching programs. These generalize OBDD(∧, r) refutations. There are polynomial-size 1-NBP(∧) refutations of the pigeonhole principle, hence 1-NBP(∧) is strictly stronger than OBDD(∧, r). There are also formulas that have polynomial-size tree-like resolution refutations but require exponential-size 1-NBP(∧) refutations. As a corollary, OBDD(∧, r) does not simulate tree-like resolution, answering a previously open question. The system 1-NBP(∧, ∃) uses projection inferences instead of weakening. 1-NBP(∧, ∃_k) is the system restricted to projection on at most k distinct variables. We construct explicit constant-degree graphs G_n on n vertices and an ε > 0 such that 1-NBP(∧, ∃_{εn}) refutations of the Tseitin formula for G_n require exponential size. Second, we study the proof system OBDD(∧, w, r_ℓ), which allows ℓ different variable orders in a refutation. We prove an exponential lower bound on the complexity of tree-like OBDD(∧, w, r_ℓ) refutations for ℓ = ε log n, where n is the number of variables and ε > 0 is a constant. The lower bound is based on multiparty communication complexity.


2020 ◽  
pp. 002234332096215
Author(s):  
Sophia Dawkins

This article examines what scholars can learn about civilian killings from newswire data in situations of non-random missingness. It contributes to this understanding by offering a unique view of the data-generation process in the South Sudanese civil war. Drawing on 40 hours of interviews with 32 human rights advocates, humanitarian workers, and journalists who produce ACLED and UCDP-GED’s source data, the article illustrates how non-random missingness leads to biases of inconsistent magnitude and direction. The article finds that newswire data for contexts like South Sudan suffer from a self-fulfilling narrative bias, where journalists select stories and human rights investigators target incidents that conform to international views of what a conflict is about. This is compounded by the way agencies allocate resources to monitor specific locations and types of violence to fit strategic priorities. These biases have two implications: first, in the most volatile conflicts, point estimates about violence using newswire data may be impossible, and most claims of precision may be false; secondly, body counts reveal little if divorced from circumstance. The article presents a challenge to political methodologists by asking whether social scientists can build better cross-national fatality measures given the biases inherent in the data-generation process.

