Statistical Algorithms
Recently Published Documents

TOTAL DOCUMENTS: 137 (FIVE YEARS: 25)
H-INDEX: 17 (FIVE YEARS: 2)

2021 ◽  
pp. 1-18
Author(s):  
Wesley Yung ◽  
Siu-Ming Tam ◽  
Bart Buelens ◽  
Hugh Chipman ◽  
Florian Dumpert ◽  
...  

As national statistical offices (NSOs) modernize, interest in integrating machine learning (ML) into the official statistician’s toolbox is growing. Two challenges to such an integration are the potential loss of transparency from using “black boxes” and the need to develop a quality framework. In 2019, the High-Level Group for the Modernisation of Official Statistics (HLG-MOS) launched a project on machine learning, one objective of which was to address these two challenges. One of the outputs of the HLG-MOS project is a Quality Framework for Statistical Algorithms (QF4SA). While many quality frameworks exist, they were conceived with traditional methods in mind and tend to target statistical outputs. Machine learning methods, by contrast, are currently being considered for processes that produce intermediate outputs leading to a final statistical output. The QF4SA therefore does not replace existing quality frameworks; because it targets intermediate outputs rather than the final statistical output, it should be used in conjunction with them to ensure that high-quality outputs are produced. This paper presents the QF4SA, as well as some recommendations for NSOs considering the use of machine learning in the production of official statistics.
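The QF4SA itself is a framework document rather than code, but the intermediate-output idea it targets can be illustrated with a small hedged sketch: assuming an ML text classifier auto-codes survey responses (an intermediate production step) and scikit-learn is available, a quality check on that intermediate output before it feeds a published statistic might look like the following. The data, labels, and threshold are hypothetical and are not part of the QF4SA.

```python
# Hypothetical sketch, not part of the QF4SA itself: an ML classifier used for
# an intermediate production step (auto-coding free-text survey responses to an
# industry code) whose quality is checked before the coded records feed a final
# statistic. Data, labels, and the accuracy threshold are all illustrative.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

train_text = ["bakery sales", "software consulting", "bread and pastry shop",
              "custom web development"]
train_code = ["food", "tech", "food", "tech"]
test_text = ["bread shop", "software services"]     # held-out, manually coded records
test_code = ["food", "tech"]

vec = TfidfVectorizer().fit(train_text)
clf = LogisticRegression().fit(vec.transform(train_text), train_code)

# Quality gate on the intermediate output, in the spirit of checking an
# ML step before its results flow into the published statistic.
acc = accuracy_score(test_code, clf.predict(vec.transform(test_text)))
print(f"intermediate coding accuracy: {acc:.2f}")
if acc < 0.90:   # illustrative quality threshold
    print("below target; route records to manual coding before release")
```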


Author(s):  
John N. Bernal ◽  
Johanna P. Rodriguez ◽  
Jorge Portella

Databases are by far the most valuable asset of companies. The first databases were born from the need not only to count but also to keep records of elements such as crops, animals, money, and property, and to consult and modify those records as circumstances changed. Such records cannot remain disorganized; they must be managed and administered under established standards that make them understandable not only to their creators but also to the people who administer them later. Databases and database management systems (DBMS) have an interesting evolutionary history that deserves analysis, and that is the objective of this document. Alongside databases and their management systems arises data mining, which, in brief, is the task of finding common patterns across data sources and using them to predict situations or the outcomes of various circumstances. We also focus on Oracle Data Mining, which, roughly speaking, merges data mining with Oracle, making it a powerful tool for obtaining information and predicting results based on statistics. In this article we study and analyze the ideas, concepts, and basic examples that make up DBMS and data mining, and we examine in more depth the use of decision techniques such as advanced statistical algorithms. We also present a fictitious example of the application of these techniques, predicting which products can be sold based on their relationship with others, as sketched below. Finally, we give a brief explanation of association rules, the data mining cycle, the types of learning, and the evolution that data mining has undergone.
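The product example maps onto classic association-rule (market-basket) mining. Below is a minimal, self-contained sketch of that idea on a toy transaction list; the data, items, and thresholds are hypothetical and this is not Oracle Data Mining itself, just the underlying support/confidence calculation.

```python
# Minimal association-rule sketch on hypothetical transactions: which products
# tend to sell together, expressed as rules A -> B with support and confidence.
from itertools import combinations

transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk", "coffee"},
    {"bread", "milk", "coffee"},
]

def support(itemset):
    """Fraction of transactions containing every item in the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

# Enumerate simple single-item rules and keep those above illustrative thresholds.
items = set().union(*transactions)
for a, b in combinations(sorted(items), 2):
    for antecedent, consequent in ((a, b), (b, a)):
        supp = support({antecedent, consequent})
        conf = supp / support({antecedent})
        if supp >= 0.4 and conf >= 0.6:
            print(f"{antecedent} -> {consequent}: support={supp:.2f}, confidence={conf:.2f}")
```

A rule such as "bread -> milk" with high confidence is the statistical basis for predicting which products can be sold based on their relationship with others.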


2021 ◽  
Author(s):  
Han Jing ◽  
Shijie C. Zheng ◽  
Charles E. Breeze ◽  
Stephan Beck ◽  
Andrew E. Teschendorff

The accurate detection of cell-type-specific DNA methylation alterations in the context of general epigenome studies is an important task for improving our understanding of epigenomics in disease development. Although a number of statistical algorithms designed to address this problem have emerged, the task remains challenging. Here we show that a recent commentary by Rahmani et al., which aims to address misconceptions and best practices in the field, continues to suffer from critical misconceptions about how statistical algorithms should be compared and evaluated. In addition, we report contradictory results on real EWAS datasets.
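For orientation, one common strategy in this class of algorithms is to regress a CpG's methylation on phenotype-by-cell-fraction interaction terms, so that each interaction coefficient captures a cell-type-specific effect. The sketch below is a hedged illustration of that general idea on simulated data; it is not the authors' algorithm or the specific methods debated in the commentary.

```python
# Hedged sketch (simulated data, not the authors' algorithm): test for a
# cell-type specific methylation effect via phenotype-by-cell-fraction
# interaction terms in a linear model.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200
pheno = rng.integers(0, 2, n)                 # case/control indicator
fracs = rng.dirichlet([5, 3, 2], n)           # estimated fractions of 3 cell types

# Simulate one CpG with an effect confined to cell type 0 in cases.
beta = fracs @ np.array([0.3, 0.6, 0.5]) + 0.2 * pheno * fracs[:, 0] \
       + rng.normal(0, 0.02, n)

# Design matrix: cell fractions plus phenotype-by-fraction interactions
# (no intercept, since the fractions sum to one).
X = np.column_stack([fracs, pheno[:, None] * fracs])
fit = sm.OLS(beta, X).fit()
for k in range(3):
    print(f"cell type {k}: interaction p-value = {fit.pvalues[3 + k]:.3g}")
```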


2021 ◽  
Vol 8 ◽  
Author(s):  
Masaki Kobayashi ◽  
Hiromi Morita ◽  
Masaki Matsubara ◽  
Nobuyuki Shimizu ◽  
Atsuyuki Morishima

Self-correction for crowdsourced tasks is a two-stage setting in which a crowd worker reviews the task results of other workers and is then given a chance to update their own results according to the review. Self-correction was proposed as a complementary approach to statistical algorithms in which workers independently perform the same task, and it can provide higher-quality results at low additional cost. However, thus far its effects have only been demonstrated in simulations, and empirical evaluations are required. In addition, because self-correction provides feedback to workers, an interesting question arises: is perceptual learning observed in self-correction tasks? This paper reports our experimental results on self-correction with a real-world crowdsourcing service. We found that: (1) self-correction is effective for making workers reconsider their judgments; (2) self-correction is more effective if workers are shown the task results of higher-quality workers during the second stage; (3) a perceptual learning effect is observed in some cases, so self-correction can provide feedback that shows workers how to give high-quality answers in future tasks; and (4) a perceptual learning effect is observed particularly among workers who moderately change their answers in the second stage, which suggests that we can measure the learning potential of workers. These findings imply that requesters and crowdsourcing services can construct a positive loop for improved task results through the self-correction approach. However, (5) no long-term effects of the self-correction task transferred to other similar tasks in two different settings.
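The two-stage setting can be made concrete with a small simulation; this is not the authors' experimental pipeline or data, just a minimal sketch comparing majority voting alone against majority voting after a self-correction stage, with worker accuracies and revision behavior as illustrative assumptions.

```python
# Minimal simulation sketch of the two-stage self-correction setting
# (illustrative assumptions, not the authors' experiment): workers answer
# independently, then see a higher-quality worker's answer and may revise,
# and results are aggregated by majority vote.
import random
from collections import Counter

random.seed(0)
TRUTH = 1                        # ground-truth label of a binary task
N_WORKERS, N_TASKS = 7, 500

def answer(p_correct):
    return TRUTH if random.random() < p_correct else 1 - TRUTH

def run(self_correction):
    correct = 0
    for _ in range(N_TASKS):
        stage1 = [answer(0.65) for _ in range(N_WORKERS)]
        reference = answer(0.85)             # shown result of a higher-quality worker
        if self_correction:
            # Each worker keeps their answer, or switches to the reference
            # with probability 0.5 when the two disagree (assumed behavior).
            stage2 = [a if a == reference or random.random() > 0.5 else reference
                      for a in stage1]
        else:
            stage2 = stage1
        majority = Counter(stage2).most_common(1)[0][0]
        correct += (majority == TRUTH)
    return correct / N_TASKS

print("majority vote only:   ", run(False))
print("with self-correction: ", run(True))
```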


2021 ◽  
Vol 10 ◽  
Author(s):  
Lin Shui ◽  
Haoyu Ren ◽  
Xi Yang ◽  
Jian Li ◽  
Ziwei Chen ◽  
...  

With the rapid development of new technologies, including artificial intelligence and genome sequencing, radiogenomics has emerged as a state-of-the-art discipline in the field of individualized medicine. Radiogenomics combines a large volume of quantitative data extracted from medical images with individual genomic phenotypes and constructs prediction models through deep learning to stratify patients, guide therapeutic strategies, and evaluate clinical outcomes. Recent studies of various types of tumors demonstrate the predictive value of radiogenomics, and we present some of the issues in radiogenomic analysis together with solutions proposed in prior work. Although workflow criteria and internationally agreed guidelines for statistical methods still need to be established, radiogenomics represents a repeatable and cost-effective approach to detecting continuous changes and is a promising surrogate for invasive interventions. Radiogenomics could therefore facilitate computer-aided diagnosis, treatment, and prediction of prognosis for patients with tumors in the routine clinical setting. Here, we summarize the integrated process of radiogenomics and introduce the crucial strategies and statistical algorithms involved in current studies.
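The basic radiogenomic recipe, combining image-derived features with genomic features in a single prediction model, can be sketched as below. This is not any particular study's pipeline: the data are simulated, the feature blocks and sizes are hypothetical, and a random forest stands in for the deep-learning models the abstract mentions.

```python
# Hedged sketch of the radiogenomic recipe on simulated data: concatenate
# quantitative image features with genomic features and evaluate a prediction
# model by cross-validation. Not any specific study's pipeline.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)
n_patients = 120
radiomic = rng.normal(size=(n_patients, 30))   # e.g. texture/shape features from images
genomic = rng.normal(size=(n_patients, 50))    # e.g. expression or mutation scores
outcome = (radiomic[:, 0] + genomic[:, 0] + rng.normal(0, 1, n_patients) > 0).astype(int)

X = np.hstack([radiomic, genomic])             # the combined radiogenomic feature matrix
model = RandomForestClassifier(n_estimators=200, random_state=0)
scores = cross_val_score(model, X, outcome, cv=5, scoring="roc_auc")
print("cross-validated AUC: %.2f +/- %.2f" % (scores.mean(), scores.std()))
```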


Author(s):  
Ekaterina Borevich ◽  
Serg Mescheryakov ◽  
Victor Yanchus

The goal of this work is to study the visual perception of graphic compositions in various styles. An original experimental method developed by the authors covers the preparation of stimulus material, data collection, and the statistical algorithms used to analyze the parametric data. The stimulus material was based on graphic images in the cubism and abstractionism styles as well as on photorealistic images. Eye-tracking equipment was used to record eye-movement activity and collect the experimental data. Statistical analysis of the parametric data describing observers' viewing patterns revealed that observers with an art education perceive visual information more effectively. The results are important for developing effective training and testing systems for operators, users, GUI developers, and others.
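As an illustration only (not the authors' method), the kind of parametric group comparison such an analysis might involve could look like the sketch below: an independent-samples t-test on a viewing-pattern metric, here simulated mean fixation durations for observers with and without an art education.

```python
# Illustrative sketch on simulated data (not the authors' analysis): compare a
# viewing-pattern metric between observer groups with an independent-samples t-test.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
fix_art_edu = rng.normal(loc=230, scale=40, size=25)      # mean fixation duration, ms (simulated)
fix_no_art_edu = rng.normal(loc=260, scale=40, size=25)   # ms (simulated)

t, p = stats.ttest_ind(fix_art_edu, fix_no_art_edu)
print(f"t = {t:.2f}, p = {p:.3f}")
```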


PLoS ONE ◽  
2020 ◽  
Vol 15 (11) ◽  
pp. e0226861
Author(s):  
Timothy Haas

Models of political-ecological systems can inform policies for managing ecosystems that contain endangered species. To increase the credibility of these models, massive computation is needed to statistically estimate the model's parameters, compute confidence intervals for these parameters, determine the model's prediction error rate, and assess its sensitivity to parameter misspecification. To meet this statistical and computational challenge, this article delivers statistical algorithms and a method for constructing ecosystem management plans that are coded as distributed computing applications. These applications can run on cluster computers, the cloud, or a collection of in-house workstations. This downloadable code is used to address the challenge of conserving the East African cheetah (Acinonyx jubatus), a demonstration that establishes the standard of credibility given herein as the one that any political-ecological model needs to meet.
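The general idea of distributing heavy statistical computation can be sketched in a few lines; this is not the article's downloadable code, just a minimal example that spreads a bootstrap for a confidence interval across worker processes, with simulated data and the sample mean as a stand-in statistic.

```python
# Minimal sketch (not the article's code): distribute bootstrap replicates over
# worker processes to obtain a confidence interval for a parameter estimate.
import numpy as np
from multiprocessing import Pool

rng = np.random.default_rng(7)
data = rng.normal(loc=5.0, scale=2.0, size=1_000)   # simulated observations

def one_bootstrap(seed):
    r = np.random.default_rng(seed)
    sample = r.choice(data, size=data.size, replace=True)
    return sample.mean()

if __name__ == "__main__":
    with Pool() as pool:                             # spreads replicates across cores
        estimates = pool.map(one_bootstrap, range(2_000))
    lo, hi = np.percentile(estimates, [2.5, 97.5])
    print(f"95% bootstrap CI for the mean: ({lo:.3f}, {hi:.3f})")
```

The same pattern scales from a pool of local cores to a cluster or cloud back end by swapping the process pool for a distributed task queue.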

