Getting the model right: an information criterion for spectroscopy

2020 ◽  
Vol 501 (2) ◽  
pp. 2268-2278
Author(s):  
John K Webb ◽  
Chung-Chi Lee ◽  
Robert F Carswell ◽  
Dinko Milaković

ABSTRACT Robust model-fitting to spectroscopic transitions is a requirement across many fields of science. The corrected Akaike and Bayesian information criteria (AICc and BIC) are most frequently used to select the optimal number of fitting parameters. In general, AICc is thought to overfit (too many model parameters) and BIC to underfit. For spectroscopic modelling, both AICc and BIC fall short in two important respects: (a) no penalty distinction is made according to line strength, so that parameters of weak lines close to the detection threshold are treated with the same importance as those of strong lines, and (b) no account is taken of the way in which a narrow spectral line impacts only a very small section of the overall data. In this paper, we introduce a new information criterion that addresses these shortcomings, the Spectral Information Criterion (SpIC). Spectral simulations are used to compare performances. The main findings are: (i) SpIC clearly outperforms AICc for high signal-to-noise data; (ii) SpIC and AICc work equally well for lower signal-to-noise data, although SpIC achieves this with fewer parameters; and (iii) BIC does not perform well (for this application) and should be avoided. The new method should be of broader applicability (beyond spectroscopy) wherever different model parameters influence separated small ranges within a larger data set and/or have widely varying sensitivities.
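For reference, a minimal Python sketch of the two standard criteria the SpIC is benchmarked against, assuming a Gaussian least-squares fit; the SpIC itself is defined in the paper and is not reproduced here.

```python
import numpy as np

def aicc_bic(rss, n, k):
    """Corrected Akaike and Bayesian information criteria for a
    least-squares fit with Gaussian errors.

    rss : residual sum of squares of the fitted model
    n   : number of data points
    k   : number of free parameters
    """
    # Gaussian log-likelihood up to an additive constant
    loglike = -0.5 * n * np.log(rss / n)
    aic = 2 * k - 2 * loglike
    aicc = aic + 2 * k * (k + 1) / (n - k - 1)  # small-sample correction
    bic = k * np.log(n) - 2 * loglike
    return aicc, bic

# The candidate model with the lowest criterion value is preferred.
for k, rss in [(3, 120.0), (6, 95.0), (9, 93.5)]:
    aicc, bic = aicc_bic(rss, n=500, k=k)
    print(f"k={k}: AICc={aicc:.1f}, BIC={bic:.1f}")
```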

2014 ◽  
Vol 41 (4) ◽  
pp. 334-341 ◽  
Author(s):  
Jun Peng ◽  
Zhibao Dong ◽  
Fengqing Han ◽  
Yuanhong Han ◽  
Xueling Dai

Abstract The optically stimulated luminescence (OSL) decay curve is assumed to consist of a number of first-order exponential components. Improper estimation of the number of components leads to under- or over-fitting of the curve under consideration. Hence, correct estimation of the number of components is important to accurately analyze an OSL decay curve. In this study, we investigated the possibility of using the Bayesian Information Criterion to estimate the optimal number of components in an OSL decay curve. We tested the reliability of this method using several hundred measured decay curves and three simulation scenarios. Our results demonstrate that the quality of the identification can be influenced by several factors: the measurement time and the number of channels; the variability of the decay constants; and the signal-to-noise ratio of each decaying component. The results also suggest that the Bayesian Information Criterion has great potential to estimate the number of components in an OSL decay curve with a moderate to high signal-to-noise ratio.
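A minimal sketch of the selection procedure the abstract describes, assuming Gaussian noise and using scipy's curve_fit; the function names, starting values, and synthetic curve are illustrative, not the authors' code.

```python
import numpy as np
from scipy.optimize import curve_fit

def multi_exp(t, *params):
    """Sum of first-order components: sum_i a_i * exp(-lambda_i * t)."""
    y = np.zeros_like(t, dtype=float)
    for a, lam in zip(params[0::2], params[1::2]):
        y += a * np.exp(-lam * t)
    return y

def bic_per_component_count(t, y, max_components=4):
    """Fit 1..max_components exponentials; return {m: BIC} for each fit."""
    n, bics = len(t), {}
    for m in range(1, max_components + 1):
        p0 = []
        for i in range(m):
            p0 += [y.max() / m, 10.0 ** (-i)]  # crude starting values
        try:
            popt, _ = curve_fit(multi_exp, t, y, p0=p0, maxfev=20000)
        except RuntimeError:
            continue  # fit failed to converge; skip this model order
        rss = np.sum((y - multi_exp(t, *popt)) ** 2)
        k = 2 * m  # amplitudes plus decay constants
        bics[m] = n * np.log(rss / n) + k * np.log(n)
    return bics

# Synthetic two-component curve; the minimum-BIC model should recover m = 2.
t = np.linspace(0.0, 40.0, 400)
rng = np.random.default_rng(0)
y = 100 * np.exp(-2.0 * t) + 20 * np.exp(-0.1 * t) + rng.normal(0, 0.5, t.size)
bics = bic_per_component_count(t, y)
print(min(bics, key=bics.get), bics)
```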


2008 ◽  
Vol 06 (02) ◽  
pp. 261-282 ◽  
Author(s):  
AO YUAN ◽  
WENQING HE

Clustering is a major tool for microarray gene expression data analysis. Existing clustering methods fall mainly into two categories: parametric and nonparametric. Parametric methods generally assume a mixture of parametric subdistributions. When the mixture distribution approximately fits the true data-generating mechanism, parametric methods perform well, but not when there is nonnegligible deviation between them. On the other hand, nonparametric methods, which usually make no distributional assumptions, are robust but pay the price of efficiency loss. In an attempt to use the known mixture form to increase efficiency, and to avoid assumptions about the unknown subdistributions to enhance robustness, we propose a semiparametric method for clustering. The proposed approach has the form of a parametric mixture but makes no assumptions about the subdistributions, which are estimated nonparametrically with constraints imposed only on the modes. An expectation-maximization (EM) algorithm along with a classification step is invoked to cluster the data, and a modified Bayesian information criterion (BIC) is employed to guide the determination of the optimal number of clusters. Simulation studies are conducted to assess the performance and robustness of the proposed method. The results show that the proposed method yields a reasonable partition of the data. As an illustration, the proposed method is applied to a real microarray data set to cluster genes.
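The authors' semiparametric EM with a modified BIC is not reproduced here; as a stand-in, the sketch below shows the analogous fully parametric workflow, choosing the number of clusters of a Gaussian mixture by standard BIC with scikit-learn. The toy data are an assumption for illustration.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
# Toy "expression" data: three groups in two dimensions.
X = np.vstack([
    rng.normal(loc, 0.5, size=(60, 2))
    for loc in ([0, 0], [3, 3], [0, 4])
])

# Fit mixtures with 1..6 components and keep the BIC of each.
bics = {}
for k in range(1, 7):
    gm = GaussianMixture(n_components=k, n_init=5, random_state=0).fit(X)
    bics[k] = gm.bic(X)

best_k = min(bics, key=bics.get)  # lowest BIC wins
labels = GaussianMixture(n_components=best_k, n_init=5,
                         random_state=0).fit_predict(X)
print("chosen number of clusters:", best_k)
```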


2014 ◽  
Vol 556-562 ◽  
pp. 6328-6331
Author(s):  
Su Zhen Shi ◽  
Yi Chen Zhao ◽  
Li Biao Yang ◽  
Yao Tang ◽  
Juan Li

LIFT technology has been applied in the denoising stage of 3D coalfield seismic processing to ensure the imaging precision of minor faults and structures. This paper focuses on the denoising workflow in two study areas where LIFT was used. First, signal and noise are separated; denoising is then applied to the noise data. The weak effective signal recovered from the noise data is blended with the original effective signal to reconstruct the denoised data, yielding a result with a high signal-to-noise ratio and preserved amplitudes. These cases show that LIFT is an effective denoising method for 3D coalfield seismic data and could be used widely in other work areas.
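LIFT is a proprietary workflow, so the following is only a schematic sketch of the two-pass idea described above (separate signal from noise, recover weak signal leaked into the noise estimate, blend it back); the median filter is an assumed stand-in for the actual signal-modelling step.

```python
import numpy as np
from scipy.ndimage import median_filter

def lift_style_denoise(traces, size=(1, 9)):
    """Schematic two-pass denoise on a 2-D gather
    (trace index x time sample). Not the actual LIFT algorithm."""
    # Pass 1: crude signal/noise separation.
    signal = median_filter(traces, size=size)
    noise = traces - signal
    # Pass 2: pull weak coherent energy back out of the noise estimate.
    weak_signal = median_filter(noise, size=size)
    random_noise = noise - weak_signal
    # Blend the recovered weak events with the primary signal estimate.
    return signal + weak_signal, random_noise
```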


1993 ◽  
Vol 138 ◽  
pp. 27-41
Author(s):  
Saul J. Adelman

Abstract I review abundance determinations of normal B5-F4 and peculiar stars published since 1984. Several analyses performed with photographic spectrograms indicate interesting stars which should be analyzed with high signal-to-noise data. Studies of stars of known ages which belong to clusters, associations, and moving groups should lead to the most direct confrontations with theory. The increase in signal-to-noise ratio provided by electronic detectors with respect to photographic plates should allow accurate analyses of moderately rotating stars. High-resolution, high signal-to-noise ratio studies have revealed crucial information about the line profiles of Sirius, Vega, and other A stars. It would aid comparison of analyses if we could agree on a standard set of gf-values and line damping constants. A computer bulletin board would be a useful means to provide and maintain such data as well as model atmosphere codes.


Geophysics ◽  
2009 ◽  
Vol 74 (4) ◽  
pp. J35-J48 ◽  
Author(s):  
Bernard Giroux ◽  
Abderrezak Bouchedda ◽  
Michel Chouteau

We introduce two new traveltime picking schemes developed specifically for crosshole ground-penetrating radar (GPR) applications. The main objective is to automate, at least partially, the traveltime picking procedure and to provide first-arrival times that are closer in quality to those of manual picking approaches. The first scheme is an adaptation of a method based on crosscorrelation of radar traces collated in gathers according to their associated transmitter-receiver angle. A detector is added to isolate the first cycle of the radar wave and to suppress secondary arrivals that might be mistaken for first arrivals. To improve the accuracy of the arrival times obtained from the crosscorrelation lags, a time-rescaling scheme is implemented to resize the radar wavelets to a common time-window length. The second method is based on the Akaike information criterion (AIC) and the continuous wavelet transform (CWT). It is not tied to the restrictive criterion of waveform similarity that underlies crosscorrelation approaches, which is not guaranteed for traces sorted in common ray-angle gathers, and it has the advantage of being fully automated. Performances of the new algorithms are tested with synthetic and real data. In all tests, the approach that adds first-cycle isolation to the original crosscorrelation scheme improves the results. In contrast, the time-rescaling approach brings limited benefits, except when strong dispersion is present in the data. In addition, the performance of crosscorrelation picking schemes degrades for data sets with disparate waveforms despite the high signal-to-noise ratio of the data. In general, the AIC-CWT approach is more versatile and performs well on all data sets. Only with data showing low signal-to-noise ratios is the AIC-CWT superseded by the modified crosscorrelation picker.
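The AIC-based onset detection underlying the second scheme is commonly implemented directly on the trace (Maeda's variance form); a minimal single-trace sketch follows, without the CWT preprocessing stage the authors add. The synthetic trace is an illustrative assumption.

```python
import numpy as np

def aic_pick(trace):
    """First-arrival pick at the minimum of
    AIC(k) = k*log(var(x[:k])) + (N-k-1)*log(var(x[k:])).
    Returns the sample index of the estimated onset."""
    x = np.asarray(trace, dtype=float)
    n = len(x)
    aic = np.full(n, np.inf)
    for k in range(2, n - 2):  # leave room for both variance estimates
        v1, v2 = np.var(x[:k]), np.var(x[k:])
        if v1 > 0 and v2 > 0:
            aic[k] = k * np.log(v1) + (n - k - 1) * np.log(v2)
    return int(np.argmin(aic))

# Synthetic trace: noise followed by an arrival at sample 300.
rng = np.random.default_rng(2)
trace = rng.normal(0, 0.1, 600)
trace[300:] += np.sin(0.3 * np.arange(300)) * np.exp(-0.01 * np.arange(300))
print(aic_pick(trace))  # expected to be close to 300
```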


2011 ◽  
Vol 7 (S279) ◽  
pp. 325-326 ◽  
Author(s):  
Franz E. Bauer ◽  
Paula Zelaya ◽  
Alejandro Clocchiatti ◽  
Justyn Maund

Abstract We report results for two epochs of spectropolarimetry on the luminous Type IIn SN 2010jl, taken at ≈36 and 85 days post-explosion with VLT FORS2-PMOS. The high signal-to-noise data demonstrate distinct evolution in the continuum and the broad lines, pointing to a complex origin for the various emission components and to a potentially common polarization signal for the Type IIn class, even over 1-2 orders of magnitude in luminosity output.


2018 ◽  
Author(s):  
Katharina Renner-Martin ◽  
Norbert Brunner ◽  
Manfred Kühleitner ◽  
Werner-Georg Nowak ◽  
Klaus Scheicher

The Bertalanffy-Pütter growth model describes mass m at age t by means of the differential equation dm/dt = p·m^a − q·m^b. The special case using the Bertalanffy exponent pair a = 2/3 and b = 1 is most common (it corresponds to the von Bertalanffy growth function VBGF for length in the fishery literature). For data fitting with general exponents, five model parameters need to be optimized: the pair a < b of non-negative exponents, the non-negative constants p and q, and a positive initial value m0 for the differential equation. For the case b = 1 it is known that for most fish data any exponent a < 1 could be used to model growth without significantly affecting the fit to the data (when the other parameters p, q, m0 were optimized). Data fitting used the method of least squares, minimizing the sum of squared errors (SSE). It was conjectured that optimizing both exponents would result in a significantly better fit of the optimal growth function to the data and thereby reduce SSE. This conjecture was tested on a data set for the mass growth of Walleye (Sander vitreus), a fish from Lake Erie, USA. Compared to the Bertalanffy exponent pair, the optimal exponent pair achieved a reduction of SSE by 10%. However, when the optimization of additional parameters was penalized using the Akaike information criterion (AIC), the optimal exponent-pair model had a higher (worse) AIC than the Bertalanffy exponent pair. SSE and AIC are thus different ways to compare models: SSE is used when predictive power alone is needed, whereas AIC is used when simplicity of the model and explanatory power are also needed.
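A sketch of the comparison described above, under stated assumptions: the age/mass values are hypothetical stand-ins for the Walleye data, and only two candidate exponent pairs are evaluated rather than a full search over (a, b).

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import minimize

# Hypothetical age (years) / mass (g) values standing in for the real data.
age = np.array([1., 2., 3., 4., 5., 6., 7., 8.])
mass = np.array([80., 250., 520., 840., 1150., 1420., 1640., 1810.])

def predicted_mass(p, q, m0, a, b, t):
    """Integrate dm/dt = p*m^a - q*m^b from t = 0 with m(0) = m0."""
    sol = solve_ivp(lambda _, m: [p * m[0]**a - q * m[0]**b],
                    (0.0, t.max()), [m0], t_eval=t, rtol=1e-8)
    if not sol.success or sol.y.shape[1] != len(t):
        return None
    return sol.y[0]

def sse(theta, a, b):
    """Sum of squared errors for fixed exponents a, b."""
    p, q, m0 = theta
    if min(p, q, m0) <= 0:
        return np.inf
    pred = predicted_mass(p, q, m0, a, b, age)
    if pred is None or not np.all(np.isfinite(pred)):
        return np.inf
    return float(np.sum((mass - pred) ** 2))

def sse_and_aic(a, b, n_free):
    """Optimize p, q, m0 for fixed exponents; n_free counts all optimized
    parameters, including the exponents when they are fitted too."""
    res = minimize(sse, x0=[1.0, 0.05, 50.0], args=(a, b),
                   method="Nelder-Mead")
    n = len(mass)
    return res.fun, n * np.log(res.fun / n) + 2 * n_free

print(sse_and_aic(2/3, 1.0, n_free=3))  # classic Bertalanffy pair
print(sse_and_aic(0.8, 1.1, n_free=5))  # one candidate general pair
```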


Geophysics ◽  
2016 ◽  
Vol 81 (2) ◽  
pp. KS71-KS91 ◽  
Author(s):  
Jubran Akram ◽  
David W. Eaton

We have evaluated arrival-time picking algorithms for downhole microseismic data. The picking algorithms that we considered may be classified as window-based single-level methods (e.g., energy-ratio [ER] methods), nonwindow-based single-level methods (e.g., Akaike information criterion), multilevel or array-based methods (e.g., crosscorrelation approaches), and hybrid methods that combine a number of single-level methods (e.g., Akazawa's method). We have determined the key parameters for each algorithm and developed recommendations for optimal parameter selection based on our analysis and experience. We evaluated the performance of these algorithms with the use of field examples from a downhole microseismic data set recorded in western Canada as well as with pseudo-synthetic microseismic data generated by adding 100 realizations of Gaussian noise to high signal-to-noise ratio microseismic waveforms. ER-based algorithms were found to be more efficient in terms of computational speed and are therefore recommended for real-time microseismic data processing. Based on performance on the pseudo-synthetic and field data sets, we found statistical, hybrid, and multilevel crosscorrelation methods to be more effective in terms of accuracy and precision. Pick errors for S-waves are reduced significantly when data are preconditioned by applying a transformation into ray-centered coordinates.
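As an illustration of the window-based energy-ratio (ER) family recommended above for real-time use, a minimal single-trace sketch; the window length and synthetic trace are illustrative assumptions.

```python
import numpy as np

def energy_ratio_pick(trace, window=50):
    """Window-based ER picker: the ratio of trailing-window to
    leading-window energy peaks near the signal onset."""
    x = np.asarray(trace, dtype=float) ** 2
    n = len(x)
    er = np.zeros(n)
    for i in range(window, n - window):
        pre = x[i - window:i].sum()
        post = x[i:i + window].sum()
        er[i] = post / (pre + 1e-12)  # guard against division by zero
    return int(np.argmax(er))

# Synthetic trace: noise with an arrival at sample 400.
rng = np.random.default_rng(3)
trace = rng.normal(0, 0.05, 1000)
trace[400:] += np.exp(-0.005 * np.arange(600)) * np.sin(0.25 * np.arange(600))
print(energy_ratio_pick(trace))  # expected near sample 400
```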


2021 ◽  
Author(s):  
C Massiot ◽  
John Townend ◽  
A Nicol ◽  
DD McNamara

Acoustic borehole televiewer (BHTV) logs provide measurements of fracture attributes (orientations, thickness, and spacing) at depth. Orientation, censoring, and truncation sampling biases similar to those described for one-dimensional outcrop scanlines, and other logging or drilling artifacts specific to BHTV logs, can affect the interpretation of fracture attributes from BHTV logs. K-means, fuzzy K-means, and agglomerative clustering methods provide transparent means of separating fracture groups on the basis of their orientation. Fracture spacing is calculated for each of these fracture sets. Maximum likelihood estimation using truncated distributions permits the fitting of several probability distributions to the fracture attribute data sets within truncation limits, which can then be extrapolated over the entire range where they naturally occur. The Akaike Information Criterion (AIC) and Schwarz Bayesian Criterion (SBC) rank the distributions by how well they fit the data. We demonstrate these attribute analysis methods with a data set derived from three BHTV logs acquired from the high-temperature Rotokawa geothermal field, New Zealand. Varying BHTV log quality reduces the number of input data points, but careful selection of the quality levels at which fractures are deemed fully sampled increases the reliability of the analysis. Spacing data sets comprising up to 300 data points and spanning three orders of magnitude can be approximated similarly well (similar AIC rankings) by several distributions. Several clustering configurations and probability distributions can often characterize the data at similar levels of statistical criteria. Thus, several scenarios should be considered when using BHTV log data to constrain numerical fracture models.
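A sketch of the truncated-distribution MLE and AIC/SBC ranking step, under stated assumptions: two illustrative candidate distributions (exponential and lognormal) with the location parameter fixed at zero; the candidate set, starting values, and synthetic spacing data are assumptions, not the authors' choices.

```python
import numpy as np
from scipy import stats
from scipy.optimize import minimize

# Candidate distributions, parameterized without a location shift.
CANDIDATES = {
    "exponential": {
        "x0": lambda d: [np.mean(d)],                       # scale
        "logpdf": lambda x, th: stats.expon.logpdf(x, scale=th[0]),
        "cdf": lambda x, th: stats.expon.cdf(x, scale=th[0]),
    },
    "lognormal": {
        "x0": lambda d: [1.0, np.median(d)],                # shape s, scale
        "logpdf": lambda x, th: stats.lognorm.logpdf(x, th[0], scale=th[1]),
        "cdf": lambda x, th: stats.lognorm.cdf(x, th[0], scale=th[1]),
    },
}

def rank_truncated_fits(data, lo, hi):
    """MLE fit of each candidate within truncation limits [lo, hi];
    return AIC and SBC (BIC) so the distributions can be ranked."""
    n, results = len(data), {}
    for name, c in CANDIDATES.items():
        def nll(theta, c=c):
            if np.any(np.asarray(theta) <= 0):
                return np.inf
            norm = c["cdf"](hi, theta) - c["cdf"](lo, theta)
            if not np.isfinite(norm) or norm <= 0:
                return np.inf
            # Truncated log-likelihood: pdf renormalized over [lo, hi].
            return -np.sum(c["logpdf"](data, theta) - np.log(norm))
        res = minimize(nll, c["x0"](data), method="Nelder-Mead")
        k = len(c["x0"](data))
        results[name] = {"AIC": 2 * res.fun + 2 * k,
                         "SBC": 2 * res.fun + k * np.log(n)}
    return results

# Hypothetical spacing data (meters), observed only between 0.05 and 10 m.
rng = np.random.default_rng(4)
spacing = rng.exponential(0.8, 300)
spacing = spacing[(spacing > 0.05) & (spacing < 10.0)]
print(rank_truncated_fits(spacing, 0.05, 10.0))
```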



