Reasoning about outliers by modelling noisy data

Author(s):  
John X Wu ◽  
Gongxian Cheng ◽  
Xiaohui Liu
2014 ◽  
Vol 2 (1) ◽  
pp. 1
Author(s):  
Richard Schwartz
Entropy ◽  
2021 ◽  
Vol 23 (6) ◽  
pp. 727
Author(s):  
Eric J. Ma ◽  
Arkadij Kummer

We present a case study applying hierarchical Bayesian estimation on high-throughput protein melting-point data measured across the tree of life. We show that the model is able to impute reasonable melting temperatures even in the face of unreasonably noisy data. Additionally, we demonstrate how to use the variance in melting-temperature posterior-distribution estimates to enable principled decision-making in common high-throughput measurement tasks, and contrast the decision-making workflow against simple maximum-likelihood curve-fitting. We conclude with a discussion of the relative merits of each workflow.


2021 ◽  
Vol 15 ◽  
pp. 174830262110084
Author(s):  
Bishnu P Lamichhane ◽  
Elizabeth Harris ◽  
Quoc Thong Le Gia

We compare a recently proposed multivariate spline based on mixed partial derivatives with two other standard splines for the scattered data smoothing problem. Each spline is defined as the minimiser of a penalised least squares functional. The penalties are based on partial differential operators, and are integrated using the finite element method. We apply the three methods to two problems: removing a mixture of Gaussian and impulsive noise from an image, and recovering a continuous function from a set of noisy observations.


2021 ◽  
Vol 11 (11) ◽  
pp. 5123
Author(s):  
Maiada M. Mahmoud ◽  
Nahla A. Belal ◽  
Aliaa Youssif

Transcription factors (TFs) are proteins that control the transcription of a gene from DNA to messenger RNA (mRNA). TFs bind to a specific DNA sequence called a binding site. Transcription factor binding sites have not yet been completely identified, and identifying them is a challenge that can be approached computationally as a classification problem in machine learning. In this paper, the prediction of transcription factor binding sites of SP1 on human chromosome 1 is presented using different classification techniques, and a model using voting is proposed. The highest Area Under the Curve (AUC) achieved is 0.97 using K-Nearest Neighbors (KNN), and 0.95 using the proposed voting technique. However, the proposed voting technique is more efficient with noisy data. This study highlights the applicability and significance of the voting technique for the prediction of binding sites, as well as the strong performance of KNN on this type of data.
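The abstract above does not say which voting scheme is used. As a rough illustration only, assuming plain hard (majority) voting over the label predictions of several already-trained classifiers, the idea can be sketched as follows. The function name `majority_vote` and the toy predictions are hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import Hashable, List, Sequence


def majority_vote(predictions: Sequence[Sequence[Hashable]]) -> List[Hashable]:
    """Combine per-classifier label predictions by hard majority voting.

    `predictions[k][i]` is the label that classifier k assigns to sample i.
    All classifiers must predict for the same number of samples.
    """
    if not predictions:
        raise ValueError("at least one classifier is required")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all classifiers must predict the same samples")

    combined = []
    # zip(*predictions) yields, for each sample, the tuple of votes it received.
    for votes in zip(*predictions):
        label, _count = Counter(votes).most_common(1)[0]
        combined.append(label)
    return combined


# Toy example: three hypothetical classifiers labelling four candidate sites
# as binding (1) or non-binding (0).
toy_predictions = [
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
]
print(majority_vote(toy_predictions))  # [1, 0, 1, 1]
```

With an odd number of binary classifiers there are no ties. With an even number, a tie-breaking rule (or soft voting over predicted probabilities) would have to be chosen explicitly.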


2021 ◽  
Vol 11 (9) ◽  
pp. 3876
Author(s):  
Weiming Mai ◽  
Raymond S. T. Lee

Chart patterns are significant for financial market behavior analysis. Many approaches have been proposed to detect specific patterns in financial time series data; most of them can be categorized as distance-based or training-based. In this paper, we applied a trainable continuous Hopfield Neural Network for financial time series pattern matching. The Perceptually Important Points (PIP) segmentation method is used as the data preprocessing procedure to reduce fluctuation. We conducted a synthetic data experiment on data with both high and low noise levels. The results show that our proposed method outperforms the Template Based (TB) and Euclidean Distance (ED) methods and has an advantage over Dynamic Time Warping (DTW) in terms of processing time. This indicates that the Hopfield network has a potential advantage over other distance-based matching methods.
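The abstract names PIP segmentation but gives no details. The sketch below shows one common formulation: keep the first and last points, then repeatedly add the point that lies farthest, in vertical distance, from the straight line joining its neighbouring selected points. The choice of vertical distance, the function name `pip_indices`, and the toy series are all assumptions made for illustration. The paper may use a different distance measure or stopping rule.

```python
from typing import List, Sequence


def pip_indices(series: Sequence[float], n_points: int) -> List[int]:
    """Return the sorted indices of `n_points` perceptually important points.

    Uses vertical distance to the chord between adjacent selected points.
    """
    n = len(series)
    if n < 2 or n_points < 2:
        raise ValueError("need at least two data points and n_points >= 2")
    n_points = min(n_points, n)  # cannot select more points than exist

    selected = [0, n - 1]  # the endpoints are always kept
    while len(selected) < n_points:
        best_idx, best_dist = -1, -1.0
        # Scan every gap between neighbouring selected points.
        for left, right in zip(selected, selected[1:]):
            for i in range(left + 1, right):
                t = (i - left) / (right - left)
                chord = series[left] + t * (series[right] - series[left])
                dist = abs(series[i] - chord)
                if dist > best_dist:
                    best_idx, best_dist = i, dist
        if best_idx < 0:  # no unselected points remain
            break
        selected.append(best_idx)
        selected.sort()
    return selected


# Toy example: a single spike in an otherwise small zig-zag.
prices = [0.0, 1.0, 0.0, 5.0, 0.0, 1.0, 0.0]
print(pip_indices(prices, 3))  # [0, 3, 6]
```

The reduced series `[series[i] for i in pip_indices(series, k)]` can then be passed to whichever pattern matcher is in use. This version rescans all gaps on every iteration, which is simple but quadratic in the series length.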


AIChE Journal ◽  
2021 ◽  
Author(s):  
Zhe Wu ◽  
David Rincon ◽  
Junwei Luo ◽  
Panagiotis D. Christofides
