Statistical Model Fitting: A State-of-the-Art Review

1978 ◽  
Vol 23 (11) ◽  
pp. 937-938
Author(s):  
JAMES R. KLUEGEL
2016 ◽  
Vol 2016 ◽  
pp. 1-10 ◽  
Author(s):  
Huaping Guo ◽  
Weimei Zhi ◽  
Hongbing Liu ◽  
Mingliang Xu

In recent years, the imbalanced learning problem has attracted increasing attention from both academia and industry; it concerns the performance of learning algorithms in the presence of data with severe class-distribution skew. In this paper, we apply the well-known statistical model of logistic discrimination to this problem and propose a novel method to improve its performance. To fully account for the class imbalance, we design a new cost function that takes into account the accuracies of both the positive and negative classes as well as the precision of the positive class. Unlike traditional logistic discrimination, the proposed method learns its parameters by maximizing this cost function. Experimental results show that, compared with other state-of-the-art methods, the proposed one performs significantly better on measures of recall, g-mean, f-measure, AUC, and accuracy.
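The abstract does not give the exact cost function, but the spirit of cost-sensitive parameter learning can be sketched as a class-weighted logistic regression in plain Python. Everything below (the synthetic data, the weight of 10, the weighted cross-entropy objective) is an illustrative stand-in for the paper's cost, not its actual method:

```python
import math
import random

random.seed(0)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Synthetic imbalanced 1-D data: 200 negatives near -1, 20 positives near +1.
X = [random.gauss(-1.0, 1.0) for _ in range(200)] + \
    [random.gauss(+1.0, 1.0) for _ in range(20)]
y = [0] * 200 + [1] * 20

def fit_logistic(X, y, pos_weight=1.0, lr=0.2, epochs=1500):
    """Gradient descent on class-weighted cross-entropy: errors on the
    positive class cost `pos_weight` times more than on the negative class."""
    w, b = 0.0, 0.0
    n = len(X)
    for _ in range(epochs):
        gw = gb = 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(w * xi + b)
            c = pos_weight if yi == 1 else 1.0
            gw += c * (p - yi) * xi
            gb += c * (p - yi)
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b

def recall_pos(w, b):
    """Fraction of true positives classified as positive."""
    tp = sum(1 for xi, yi in zip(X, y) if yi == 1 and sigmoid(w * xi + b) > 0.5)
    return tp / sum(y)

w0, b0 = fit_logistic(X, y)                    # plain logistic regression
w1, b1 = fit_logistic(X, y, pos_weight=10.0)   # cost-sensitive variant
```

With the skewed class prior, the plain fit pushes the decision boundary toward the minority class; weighting positive errors more heavily moves it back and raises positive-class recall, which is the behaviour the paper's cost function is designed to encourage.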


2020 ◽  
Vol 493 (3) ◽  
pp. 4078-4093 ◽  
Author(s):  
Samuel R Hinton ◽  
Cullan Howlett ◽  
Tamara M Davis

ABSTRACT We compare the performance of four state-of-the-art models for extracting isotropic measurements of the baryon acoustic oscillation (BAO) scale. To do this, we created a new, public, modular code barry, which contains data sets, model fitting tools, and model implementations incorporating different descriptions of non-linear physics and algorithms for isolating the BAO feature. These are then evaluated for bias, correlation, and fitting strength using mock power spectra and correlation functions developed for the Sloan Digital Sky Survey Data Release 12. Our main findings are as follows: (1) all of the models can recover unbiased constraints when fit to the pre- and post-reconstruction simulations. (2) Models that provide physical descriptions of the damping of the BAO feature (using e.g. standard perturbation or effective-field theory arguments) report smaller errors on average, although the distribution of mock χ² values indicates these are underestimated. (3) Allowing the BAO damping scale to vary can provide tighter constraints for some mocks, but is an artificial improvement that only arises when noise randomly sharpens the BAO peak. (4) Unlike recent claims in the literature when utilizing a BAO Extractor technique, we find no improvement in the accuracy of the recovered BAO scale. (5) We implement a procedure for combining all models into a single consensus result that improves over the standard method without obviously underestimating the uncertainties. Overall, barry provides a framework for performing the cosmological analyses for upcoming surveys, and for rapidly testing and validating new models.
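The abstract does not detail the consensus procedure itself. As a generic illustration of the underlying idea, the sketch below applies the standard best linear unbiased estimator (BLUE) to combine two correlated fits of the same quantity (such as a BAO dilation parameter); all numbers are illustrative, not from the paper:

```python
# Best linear unbiased estimator (BLUE) for two correlated measurements
# a1 +/- s1 and a2 +/- s2 of the same quantity, with correlation rho:
# weights w ∝ C⁻¹·1, combined = (w·a) / Σw, variance = 1 / Σw.

def blue_combine(a1, s1, a2, s2, rho):
    c11, c22 = s1 * s1, s2 * s2       # covariance matrix entries
    c12 = rho * s1 * s2
    det = c11 * c22 - c12 * c12
    w1 = (c22 - c12) / det            # row sums of C⁻¹
    w2 = (c11 - c12) / det
    wsum = w1 + w2
    return (w1 * a1 + w2 * a2) / wsum, (1.0 / wsum) ** 0.5

# Two hypothetical model fits to the same mock, strongly correlated because
# they share the data (values chosen for illustration only):
alpha, err = blue_combine(1.002, 0.012, 0.998, 0.015, 0.70)
```

Because the two fits share most of their information, the combined uncertainty shrinks only slightly below the better individual error, which is why a naive uncorrelated average would understate the true uncertainty.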


Sensors ◽  
2020 ◽  
Vol 20 (11) ◽  
pp. 3037
Author(s):  
Xi Zhao ◽  
Yun Zhang ◽  
Shoulie Xie ◽  
Qianqing Qin ◽  
Shiqian Wu ◽  
...  

Geometric model fitting is a fundamental issue in computer vision, and fitting accuracy is affected by outliers. To eliminate the impact of outliers, an inlier threshold or a scale estimator is usually adopted. However, a single inlier threshold cannot accommodate multiple models in the data, and scale estimators tied to a particular noise-distribution model work poorly in geometric model fitting. It can be observed that the residuals of outliers are large with respect to every true model in the data; this constitutes a consensus among the outliers. Based on this observation, we propose a preference-analysis method based on residual histograms that exploits the outlier consensus for outlier detection. We find that the outlier consensus makes the outliers gather away from the inliers in the designed residual-histogram preference space, which makes it convenient to separate outliers from inliers through linkage clustering. After the outliers are detected and removed, a linkage clustering with permutation preference is introduced to segment the inliers. In addition, to make the linkage-clustering process stable and robust, an alternative sampling-and-clustering framework is proposed for both the outlier-detection and inlier-segmentation stages. Experimental results show that the outlier-detection scheme based on residual-histogram preference detects most of the outliers in the data sets, and that the fitting results are better than those of most state-of-the-art methods in geometric multi-model fitting.
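For intuition, here is a toy version of the outlier-consensus observation, not the paper's full histogram-preference pipeline. For clarity it evaluates residuals against the true models, whereas the method builds preferences over randomly sampled hypotheses; all data and thresholds are synthetic and illustrative:

```python
import random

random.seed(1)

# Two true line models y = a*x + b, each with 30 noisy inliers,
# plus 10 gross outliers.
models = [(2.0, 1.0), (-1.0, 4.0)]
points = []
for a, b in models:
    for _ in range(30):
        x = random.uniform(0, 10)
        points.append((x, a * x + b + random.gauss(0, 0.1)))
points += [(random.uniform(0, 10), random.uniform(-30, 30)) for _ in range(10)]

def residual(pt, model):
    x, y = pt
    a, b = model
    return abs(y - (a * x + b))

# Outlier consensus: an outlier's residual is large for *every* model, so
# even its smallest residual across models stays large, while an inlier has
# a near-zero residual to at least one model.
min_res = [min(residual(p, m) for m in models) for p in points]
detected = [r > 1.0 for r in min_res]
```

In the residual-histogram preference space of the paper, this same gap is what makes the outliers form their own cluster, separable from the inliers by linkage clustering.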


Robotica ◽  
1999 ◽  
Vol 17 (6) ◽  
pp. 649-660 ◽  
Author(s):  
Alireza Bab-Hadiashar ◽  
David Suter

A method of data segmentation, based upon robust least K-th order statistical model fitting (LKS), is proposed and applied to image-motion and range-data segmentation. The estimation method differs from other approaches using versions of LKS in a number of important ways. Firstly, the value of K is not determined by a complex optimization routine. Secondly, having chosen a fit, the estimate of the noise scale is not based upon the K-th order statistic of the residuals. Other aspects of the full segmentation scheme include the use of segment contiguity to (a) reduce the number of random sample fits used in the LKS stage, and (b) "fill in" holes caused by isolated misclassified data.
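A minimal sketch of least K-th order residual fitting with random two-point hypotheses follows. It is illustrative only: the paper's scheme chooses K and the noise scale differently and adds the contiguity machinery described above; the line, noise levels, and K below are synthetic assumptions:

```python
import random

random.seed(2)

# 70 points on the line y = 3x - 2 with small noise, plus 30 gross outliers.
true_a, true_b = 3.0, -2.0
data = []
for _ in range(70):
    x = random.uniform(0, 10)
    data.append((x, true_a * x + true_b + random.gauss(0, 0.05)))
data += [(random.uniform(0, 10), random.uniform(-40, 40)) for _ in range(30)]

def kth_residual(a, b, k):
    """K-th smallest absolute residual of the line y = a*x + b over the data."""
    res = sorted(abs(y - (a * x + b)) for x, y in data)
    return res[k - 1]

K = 60  # assume at least 60 of the 100 points belong to the structure
best = None
for _ in range(200):  # random minimal (two-point) hypotheses, as in RANSAC
    (x1, y1), (x2, y2) = random.sample(data, 2)
    if abs(x2 - x1) < 1e-9:
        continue  # vertical pair: cannot define a slope
    a = (y2 - y1) / (x2 - x1)
    b = y1 - a * x1
    score = kth_residual(a, b, K)
    if best is None or score < best[0]:
        best = (score, a, b)

score, a_hat, b_hat = best
```

Minimizing the K-th order residual (rather than, say, the sum of squares) makes the score insensitive to the 100 − K worst points, which is what gives the estimator its tolerance to gross outliers.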


2013 ◽  
Vol 39 (2) ◽  
pp. 301-353 ◽  
Author(s):  
Ekaterina Shutova ◽  
Simone Teufel ◽  
Anna Korhonen

Metaphor is highly frequent in language, which makes its computational processing indispensable for real-world NLP applications addressing semantic tasks. Previous approaches to metaphor modeling rely on task-specific hand-coded knowledge and operate on a limited domain or a subset of phenomena. We present the first integrated open-domain statistical model of metaphor processing in unrestricted text. Our method first identifies metaphorical expressions in running text and then replaces them with their literal paraphrases. Such a text-to-text model of metaphor interpretation is compatible with other NLP applications that can benefit from metaphor resolution. Our approach is minimally supervised, relies on state-of-the-art parsing and lexical-acquisition technologies (distributional clustering and selectional-preference induction), and operates with high accuracy.


Author(s):  
Shuyuan Lin ◽  
Guobao Xiao ◽  
Yan Yan ◽  
David Suter ◽  
Hanzi Wang

Recently, some hypergraph-based methods have been proposed to deal with the problem of model fitting in computer vision, mainly due to the superior capability of hypergraphs to represent the complex relationships between data points. However, a hypergraph becomes extremely complicated when the input includes a large number of data points (usually contaminated with noise and outliers), which significantly increases the computational burden. To overcome this problem, we propose a novel hypergraph-optimization-based model fitting (HOMF) method to construct a simple but effective hypergraph. Specifically, HOMF includes two main parts: an adaptive inlier estimation algorithm for vertex optimization and an iterative hyperedge optimization algorithm for hyperedge optimization. The proposed method is highly efficient, and it can obtain accurate model-fitting results within a few iterations. Moreover, HOMF can directly apply spectral clustering to achieve good fitting performance. Extensive experimental results show that HOMF outperforms several state-of-the-art model fitting methods on both synthetic data and real images, especially in sampling efficiency and in handling data with severe outliers.


2021 ◽  
Author(s):  
D Bottino ◽  
G Hather ◽  
L Yuan ◽  
M Stoddard ◽  
L White ◽  
...  

Abstract The duration of natural immunity in response to SARS-CoV-2 is currently a matter of some debate in the literature. For example, in a recent publication characterizing SARS-CoV-2 immunity over time, the authors fit pooled longitudinal data and used the fitted slopes to infer the duration of SARS-CoV-2 immunity. In fact, such approaches can lead to misleading conclusions as a result of statistical model-fitting artifacts. To exemplify this phenomenon, we reanalyzed one of the markers (pseudovirus neutralizing titer) in the publication using mixed-effects modeling, a methodology better suited to longitudinal datasets such as these. Our findings showed that the half-life was both longer and more variable than reported by the authors. The example selected here illustrates the utility of mixed-effects modeling in providing more accurate estimates of the duration and heterogeneity of half-lives of molecular and cellular biomarkers of SARS-CoV-2 immunity.
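The pooled-fitting artifact is easy to reproduce with toy numbers. In the sketch below (illustrative values, not the paper's data), two subjects decay at the same true rate but differ in baseline titer and sampling window; a pooled regression badly overestimates the decay rate, while per-subject slopes, the simplest analogue of a mixed-effects random-slope estimate, recover it exactly:

```python
# Two subjects with the same true decay rate (-0.01 log-units/day) but
# different baselines, sampled over different time windows (days, log-titer).
subjects = {
    "A": [(10, 2.9), (20, 2.8), (30, 2.7)],   # intercept 3.0, slope -0.01
    "B": [(40, 0.6), (50, 0.5), (60, 0.4)],   # intercept 1.0, slope -0.01
}

def ols_slope(pairs):
    """Ordinary least-squares slope of y on x."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    num = sum((x - mx) * (y - my) for x, y in pairs)
    den = sum((x - mx) ** 2 for x, _ in pairs)
    return num / den

# Pooling all points confounds the between-subject baseline difference
# with the within-subject decay, exaggerating the slope.
pooled = ols_slope([p for pts in subjects.values() for p in pts])
per_subject = [ols_slope(pts) for pts in subjects.values()]
mean_subject = sum(per_subject) / len(per_subject)
```

Here the pooled slope is roughly six times steeper than the true rate, so a half-life inferred from it would be several times too short, which is the direction of the error described above.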


2021 ◽  
Vol 8 ◽  
Author(s):  
Wei Yin ◽  
Hanjin Wen ◽  
Zhengtong Ning ◽  
Jian Ye ◽  
Zhiqiang Dong ◽  
...  

Reliable and robust fruit-detection algorithms in nonstructural environments are essential for the efficient use of harvesting robots. The pose of the fruit is crucial for guiding a robot to approach the target fruit for collision-free picking. To achieve accurate picking, this study investigates an approach to detect fruit and estimate its pose. First, the state-of-the-art mask region convolutional neural network (Mask R-CNN) is deployed to segment binocular images and output a mask image of the target fruit. Next, the grape point cloud extracted from the images is filtered and denoised to obtain an accurate point cloud. Finally, the denoised grape point cloud is used with the RANSAC algorithm to fit a cylinder model, and the axis of the cylinder is used to estimate the pose of the grape. A dataset was acquired in a vineyard to evaluate the performance of the proposed approach in a nonstructural environment. The fruit-detection results on 210 test images show that the average precision, recall, and intersection over union (IOU) are 89.53%, 95.33%, and 82.00%, respectively. Detection and point-cloud segmentation took approximately 1.7 s per grape. The demonstrated performance indicates that the developed method can be applied to grape-harvesting robots.
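Cylinder fitting itself takes some 3-D geometry, but the RANSAC consensus idea the pipeline relies on is easier to see in 2D. The sketch below fits a circle (think of it as a cylinder cross-section) to points contaminated with outliers, hypothesizing from minimal 3-point samples and keeping the largest consensus set; the data, iteration count, and inlier threshold are all illustrative assumptions:

```python
import math
import random

random.seed(3)

# 60 points on a circle of radius 2 centred at (5, 5), plus 20 outliers.
cx, cy, r = 5.0, 5.0, 2.0
pts = []
for _ in range(60):
    t = random.uniform(0, 2 * math.pi)
    pts.append((cx + r * math.cos(t) + random.gauss(0, 0.02),
                cy + r * math.sin(t) + random.gauss(0, 0.02)))
pts += [(random.uniform(0, 10), random.uniform(0, 10)) for _ in range(20)]

def circle_from(p1, p2, p3):
    """Circumcircle of three points; None if they are (nearly) collinear."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    d = 2 * (x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2))
    if abs(d) < 1e-9:
        return None
    ux = ((x1**2 + y1**2) * (y2 - y3) + (x2**2 + y2**2) * (y3 - y1)
          + (x3**2 + y3**2) * (y1 - y2)) / d
    uy = ((x1**2 + y1**2) * (x3 - x2) + (x2**2 + y2**2) * (x1 - x3)
          + (x3**2 + y3**2) * (x2 - x1)) / d
    return ux, uy, math.hypot(x1 - ux, y1 - uy)

best, best_inliers = None, -1
for _ in range(500):  # RANSAC: minimal samples, keep largest consensus set
    model = circle_from(*random.sample(pts, 3))
    if model is None:
        continue
    mx, my, mr = model
    inliers = sum(abs(math.hypot(x - mx, y - my) - mr) < 0.1 for x, y in pts)
    if inliers > best_inliers:
        best, best_inliers = model, inliers

ux, uy, ur = best
```

In the paper's 3-D setting the minimal sample parameterizes a cylinder instead of a circle, but the hypothesize-and-verify loop is the same, and the recovered axis then gives the grape's pose.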


2020 ◽  
Vol 6 (2) ◽  
pp. 135-145
Author(s):  
Chao Zhang ◽  
Xuequan Lu ◽  
Katsuya Hotta ◽  
Xi Yang

Abstract In this paper we address the problem of geometric multi-model fitting using a few weakly annotated data points, which has been little studied so far. In weak annotating (WA), most manual annotations are supposed to be correct yet are inevitably mixed with incorrect ones. Such WA data can naturally arise through interaction in various tasks. For example, in the case of homography estimation, one can easily annotate points on the same plane or object with a single label by observing the image. Motivated by this, we propose a novel method to make full use of WA data to boost multi-model fitting performance. Specifically, a graph for model-proposal sampling is first constructed using the WA data, given the prior that WA data annotated with the same weak label have a high probability of belonging to the same model. By incorporating this prior knowledge into the calculation of edge probabilities, vertices (i.e., data points) lying on or near a latent model are likely to be associated and to further form a subset or cluster for effective proposal generation. Having generated proposals, α-expansion is used for labeling, and our method in return updates the proposals. This procedure works iteratively. Extensive experiments validate our method and show that it produces noticeably better results than state-of-the-art techniques in most cases.
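As a toy illustration of why the weak-label prior helps (this is not the paper's actual edge-probability formulation): pairs drawn within one weak label are far more likely to lie on the same latent model than uniformly sampled pairs, so minimal-sample proposals built from them are purer. The label counts and 85% annotation accuracy below are invented for the demonstration:

```python
import random

random.seed(4)

# 100 points belonging to two latent models (0 and 1). Weak annotations
# agree with the true model for ~85% of points; the rest are mislabelled.
true_label = [i % 2 for i in range(100)]
weak_label = [t if random.random() < 0.85 else 1 - t for t in true_label]

def sample_pair(use_weak):
    """Draw a two-point minimal sample; with weak guidance, force the pair
    to share a weak label."""
    if not use_weak:
        return random.sample(range(100), 2)
    lab = random.choice([0, 1])
    pool = [i for i in range(100) if weak_label[i] == lab]
    return random.sample(pool, 2)

def same_model_rate(use_weak, trials=20000):
    """Fraction of sampled pairs whose points share the same latent model."""
    hits = sum(true_label[i] == true_label[j]
               for i, j in (sample_pair(use_weak) for _ in range(trials)))
    return hits / trials

uniform_rate = same_model_rate(False)   # ~0.5 with two balanced models
guided_rate = same_model_rate(True)     # noticeably higher
```

Purer minimal samples mean more all-inlier proposals per unit of sampling, which is the mechanism behind the improved proposal generation described above.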

