Advances in Adaptive Data Analysis
Latest Publications

Total documents: 157 (five years: 0)
H-index: 25 (five years: 0)
Published by: World Scientific
ISSN: 1793-7175, 1793-5369

2015, Vol 07 (01n02), pp. 1550002
Author(s): Stan Lipovetsky, Michael Conklin

Maximum difference (MaxDiff) is a discrete choice modeling approach widely used in marketing research to find utilities and preference probabilities among multiple alternatives. It can be seen as an extension of the paired-comparison Thurstone and Bradley–Terry techniques to the simultaneous presentation of three, four, or more items to respondents. A respondent identifies the best and the worst items, so the remaining ones are deemed intermediate in preference. Estimation of individual utilities is usually performed with hierarchical Bayesian (HB) multinomial-logit (MNL) modeling. The MNL model can be reduced to a logit model on data composed of two specially constructed design matrices of prevalence from the best and the worst sides. The composed data can be large, which makes logistic modeling less precise and very demanding in computer time and memory. This paper describes how results for utilities and choice probabilities can be obtained from the raw data, applying empirical Bayes techniques instead of HB methods. This approach enriches MaxDiff and is useful for estimation on large data sets. The results of the analytical approach are compared with HB-MNL and several other techniques.
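The counting idea behind analytical MaxDiff scoring can be sketched as follows. This is a common count-based shortcut (smoothed log best/worst ratio fed through a logit share), not the paper's exact empirical Bayes estimator; the function name and the +0.5 smoothing constant are illustrative assumptions.

```python
import math

def maxdiff_count_scores(tasks, n_items):
    """Estimate MaxDiff utilities from best/worst counts.

    tasks: list of (shown_items, best_index, worst_index) tuples.
    Returns (utilities, choice_probabilities); the utility is the
    smoothed log-ratio of best to worst prevalence for each item.
    """
    best = [0] * n_items
    worst = [0] * n_items
    for items, b, w in tasks:
        best[b] += 1
        worst[w] += 1
    # +0.5 smoothing keeps never-chosen items finite.
    util = [math.log((best[i] + 0.5) / (worst[i] + 0.5)) for i in range(n_items)]
    # Logit (softmax) share turns utilities into preference probabilities.
    z = sum(math.exp(u) for u in util)
    probs = [math.exp(u) / z for u in util]
    return util, probs
```

An item always picked as best gets a large positive utility, an item always picked as worst a negative one, and an untouched item sits at zero.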


2015, Vol 07 (01n02), pp. 1550005
Author(s): Leonard J. Pietrafesa, Shaowu Bao, Tingzhuang Yan, Michael Slattery, Paul T. Gayes

Significant portions of the United States (U.S.) property, commerce, and ecosystem assets are located at or near the coast, making them vulnerable to sea level variability and change, especially relative rises. Although global mean sea level (MSL) and sea level rise (SLR) are fundamental considerations, regional mean sea level (RSL) variability along the U.S. boundaries of the two ocean basins is critical, particularly if the amplitude of seasonal, annual, and inter-annual variability is high. The conventional wisdom of the U.S. agencies, the National Aeronautics and Space Administration (NASA) and the National Oceanic and Atmospheric Administration (NOAA), is that the sources of sea level rise are related principally to heat absorption and release by the ocean(s) to the atmosphere and vice versa, and to polar glacier melting and freshwater input into the ocean(s). While these phenomena are of great importance to SLR and sea level variability (SLV), we assess a suite of climate factors and the Gulf Stream for evidence of correlations and thus possible influences, though causality is beyond the scope of this study. In this study, climate factors related to oceanic and atmospheric heat purveyors and reservoirs are analyzed and assessed for possible correlations with sea level variability and overall trends on actionable scales (localized as opposed to global).
The results confirm that oceanic and atmospheric temperature variability, and the disposition of heat accumulation or the lack thereof, are important players in sea level variability and rise, but also that the Atlantic Multi-Decadal Oscillation, the El Niño-Southern Oscillation, the Pacific Decadal Oscillation, the Arctic Oscillation, the Quasi-Biennial Oscillation, the North Atlantic Oscillation, Solar Irradiance, the Western Boundary Current-Gulf Stream, and other climate factors can have strong correlative, and perhaps even causal, modulating effects on the monthly to seasonal to annual to inter-annual to decadal to multi-decadal sea level variability at the community level.
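The correlation screening described above can be illustrated with a small helper that correlates a climate index against a sea level series at several lead times. The function names and the synthetic data are purely illustrative, not taken from the study.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return sxy / (sx * sy)

def lagged_correlations(index, sea_level, max_lag):
    """Correlate a climate index with sea level when the index
    leads by 0..max_lag samples (e.g. months)."""
    return {lag: pearson(index[:len(index) - lag], sea_level[lag:])
            for lag in range(max_lag + 1)}
```

Scanning the returned dictionary for its peak suggests at which lead time a given oscillation is most strongly associated with local sea level, with no claim of causality.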


2015, Vol 07 (01n02), pp. 1550003
Author(s): Adam Huang, Min-Yin Liu, Wei-Te Yu

We propose using a rolling ball algorithm, which moves a ball along a time series signal, to sort the local extrema within the signal according to a geometric tangibility criterion. Letting the ball always roll above or below the signal, we can classify the signal's extrema by their tangibility: touched or not touched by the ball. Applying this ball-tangibility information to extrema selection in the empirical mode decomposition (EMD) algorithm, we are able to prevent the mode-mixing problem when analyzing intermittent signals and to decompose mode functions satisfying bandpass filtering properties.
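The tangibility test can be sketched for a ball rolling above the signal. This is a simplified tangent-at-peak check (the ball centered directly above each local maximum must clear the signal within its horizontal reach), an assumption standing in for the authors' full rolling construction.

```python
import math

def local_maxima(y):
    """Indices of interior local maxima of a sampled signal."""
    return [i for i in range(1, len(y) - 1) if y[i - 1] < y[i] >= y[i + 1]]

def ball_touched_maxima(y, r):
    """Local maxima touchable by a ball of radius r rolling above
    the signal (simplified: ball center placed straight above the peak;
    the peak is 'touched' only if no nearby sample pierces the ball)."""
    touched = []
    for i in local_maxima(y):
        cy = y[i] + r  # ball center directly above the peak
        ok = True
        for j in range(max(0, math.ceil(i - r)),
                       min(len(y), math.floor(i + r) + 1)):
            dx = j - i
            if dx * dx < r * r:
                # height of the ball's lower surface at offset dx
                surface = cy - math.sqrt(r * r - dx * dx)
                if y[j] > surface + 1e-12:
                    ok = False
                    break
        if ok:
            touched.append(i)
    return touched
```

A small maximum sitting in a narrow valley between taller neighbors is not touched (the ball bridges the valley), which is exactly the intermittency information exploited when selecting extrema for EMD sifting.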


2015, Vol 07 (01n02), pp. 1550004
Author(s): Gregori J. Clarke, Samuel S. P. Shen

This study uses the Hilbert–Huang transform (HHT), a signal analysis method for nonlinear and non-stationary processes, to separate signals of varying frequencies in a nonlinear system governed by the Lorenz equations. Similar to the Fourier series expansion, HHT decomposes a data time series into a sum of intrinsic mode functions (IMFs) using empirical mode decomposition (EMD). Unlike the infinite number of Fourier series terms, the EMD always yields a finite number of IMFs, whose sum equals the original time series exactly. Using the HHT approach, the properties of the Lorenz attractor are interpreted in a time–frequency frame. This frame shows that: (i) the attractor is symmetric for [Formula: see text] (i.e. invariant for [Formula: see text]), even though the signs on [Formula: see text] and [Formula: see text] are changed; (ii) the attractor is sensitive to initial conditions, even to a small perturbation, as measured by the divergence of the trajectories over time; (iii) the Lorenz system goes through “windows” of chaos and periodicity; and (iv) at times, a system can be both chaotic and periodic for a given [Formula: see text] value. IMFs are a finite collection of decomposed quasi-periodic signals, ordered from the highest to the lowest frequencies, enabling detection of lower frequency signals that may otherwise have been “hidden” by their higher frequency counterparts. EMD decomposes the original signal into a family of distinct IMF signals; the Hilbert spectra are a “family portrait” of the time–frequency–amplitude interplay of all IMF members. Together with viewing the IMF energy, it is easy to discern where each IMF resides in the spectra in relation to the others. In this study, the majority of high-amplitude signals appear at low frequencies, approximately 0.5–1.5.
Although our numerical experiments are limited to only two specific cases, our HHT analyses of time–frequency, marginal spectra, and energy and quasi-periodicity of each IMF provide a novel approach to exploring the profound and phenomena-rich Lorenz system.
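Two of the attractor properties above, the sign symmetry and the sensitivity to initial conditions, can be probed with a minimal fixed-step RK4 integrator. The standard parameters sigma = 10, rho = 28, beta = 8/3 are an assumption here, since the paper's specific parameter values are not recoverable from this abstract.

```python
def lorenz_rk4(state, steps, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Integrate the Lorenz equations x'=sigma(y-x), y'=x(rho-z)-y,
    z'=xy-beta*z with a fixed-step 4th-order Runge-Kutta scheme."""
    def f(s):
        x, y, z = s
        return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)
    traj = [state]
    for _ in range(steps):
        s = traj[-1]
        k1 = f(s)
        k2 = f(tuple(si + 0.5 * dt * ki for si, ki in zip(s, k1)))
        k3 = f(tuple(si + 0.5 * dt * ki for si, ki in zip(s, k2)))
        k4 = f(tuple(si + dt * ki for si, ki in zip(s, k3)))
        traj.append(tuple(si + dt / 6.0 * (a + 2 * b + 2 * c + d)
                          for si, a, b, c, d in zip(s, k1, k2, k3, k4)))
    return traj
```

Flipping the signs of the initial x and y yields the mirrored trajectory (the symmetry in property (i)), while a 1e-6 perturbation of the initial state produces trajectories that diverge to attractor-scale separation (property (ii)).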


2015, Vol 07 (01n02), pp. 1550001
Author(s): Dong Mao, Yang Wang, Qiang Wu

In this paper, we develop a new approach to the analysis of physiological time series. An iterative convolution filter is used to decompose the time series into various components. Statistics of these components are extracted as features to characterize the mechanisms underlying the time series. Studies have shown that many normal physiological systems involve irregularity, while a decrease of irregularity usually implies abnormality. This motivates the use of statistics of “outliers” in the components as features measuring irregularity. Support vector machines are used to select the most relevant features, those able to differentiate the time series of normal and abnormal systems. This new approach is successfully applied to the study of congestive heart failure using heartbeat interval time series.
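The decomposition step can be sketched with a generic iterative filtering scheme: at each stage a moving-average convolution splits the current signal into a fast component (signal minus its smoothed version) and a slow remainder that is passed to the next stage. The specific kernel and widths here are illustrative assumptions; the paper's actual filter is not specified in this abstract.

```python
def smooth(x, width):
    """Moving-average convolution with reflected edges (width must be odd)."""
    half = width // 2
    padded = x[half:0:-1] + x + x[-2:-half - 2:-1]
    return [sum(padded[i:i + width]) / width for i in range(len(x))]

def iterative_filter_decompose(x, widths):
    """Peel off components from fine to coarse: at each stage the
    component is (signal - smoothed) and the smoothed part is carried on.
    Returns the components plus the final trend; they sum back to x."""
    components, residual = [], list(x)
    for w in widths:
        low = smooth(residual, w)
        components.append([a - b for a, b in zip(residual, low)])
        residual = low
    components.append(residual)  # final trend
    return components
```

Per-component statistics (e.g. counts of values beyond a few standard deviations, the "outliers" mentioned above) would then serve as the irregularity features fed to the classifier.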


2014, Vol 06 (04), pp. 1499001

2014, Vol 06 (04), pp. 1450012
Author(s): Xianfeng Hu, Yang Wang, Qiang Wu

Inspired by the authorship controversy of Dream of the Red Chamber and the application of machine learning in the study of literary stylometry, we develop a rigorous new method for the mathematical analysis of authorship by testing for a so-called chrono-divide in writing styles. Our method incorporates some of the latest advances in the study of authorship attribution, particularly techniques from support vector machines. By introducing the notion of relative frequency as a feature ranking metric, our method proves to be highly effective and robust. Applying our method to the Cheng–Gao version of Dream of the Red Chamber has led to convincing, if not irrefutable, evidence that the first 80 chapters and the last 40 chapters of the book were written by two different authors. Furthermore, our analysis has unexpectedly provided strong support for the hypothesis that Chapter 67 was not the work of Cao Xueqin either. We have also tested our method on the other three Great Classical Novels in Chinese. As expected, no chrono-divides were found. This provides further evidence of the robustness of our method.
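The chrono-divide idea can be sketched in miniature: represent each chapter chunk by relative frequencies of tracked function words, then score each candidate split point by how far apart the mean style vectors of the two sides lie. The nearest-centroid distance used here is a simple stand-in for the paper's SVM-based separation test, and the vocabulary and scoring are illustrative assumptions.

```python
import math

def freq_vector(text, vocab):
    """Relative frequencies of tracked function words in a text chunk."""
    words = text.lower().split()
    n = max(len(words), 1)
    return [words.count(w) / n for w in vocab]

def chrono_divide_score(chunks, split, vocab):
    """Distance between the mean style vectors before and after a
    candidate split; a pronounced peak over split positions hints at
    a change of author (centroid stand-in for an SVM margin test)."""
    a = [freq_vector(c, vocab) for c in chunks[:split]]
    b = [freq_vector(c, vocab) for c in chunks[split:]]
    ca = [sum(col) / len(a) for col in zip(*a)]
    cb = [sum(col) / len(b) for col in zip(*b)]
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(ca, cb)))
```

Scanning all split positions and finding a single sharp maximum corresponds to detecting a chrono-divide; a flat score profile, as for a single-author novel, corresponds to none.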


2014, Vol 06 (04), pp. 1450011
Author(s): Pietro Bonizzi, Joël M. H. Karel, Olivier Meste, Ralf L. M. Peeters

This study introduces singular spectrum decomposition (SSD), a new adaptive method for decomposing nonlinear and nonstationary time series into narrow-band components. The method takes its origin from singular spectrum analysis (SSA), a nonparametric spectral estimation method used for analysis and prediction of time series. Unlike SSA, SSD is a decomposition method in which the choice of the fundamental parameters has been completely automated by focusing on the frequency content of the signal. In particular, this holds for the choice of the window length used to generate the trajectory matrix of the data and for the selection of the principal components used in reconstructing a specific component series. Moreover, a new definition of the trajectory matrix with respect to standard SSA allows the oscillatory content of the data to be enhanced and guarantees a decrease in the energy of the residual. Through numerical examples and simulations, the SSD method is shown to accurately retrieve different components concealed in the data while minimizing the generation of spurious components. Applications to time series from both the biological and the physical domains are also presented, highlighting the capability of SSD to yield physically meaningful components.
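The SSA machinery that SSD builds on can be sketched as: embed the series in a Hankel trajectory matrix, take the leading singular pair, and diagonal-average the rank-1 approximation back into a series. This shows only the basic SSA step with the window length supplied by hand; SSD's contribution, automating the window choice and the component grouping from the signal's frequency content, is not reproduced here.

```python
def ssa_first_component(x, L, iters=200):
    """Leading SSA component of series x with window length L,
    via power iteration for the top singular pair of the trajectory
    matrix, followed by diagonal averaging."""
    N = len(x)
    K = N - L + 1
    X = [[x[i + j] for j in range(K)] for i in range(L)]  # Hankel trajectory matrix
    # Power iteration on X^T X for the top right singular vector.
    v = [1.0] * K
    for _ in range(iters):
        Xv = [sum(X[i][j] * v[j] for j in range(K)) for i in range(L)]
        w = [sum(X[i][j] * Xv[i] for i in range(L)) for j in range(K)]
        norm = sum(t * t for t in w) ** 0.5
        v = [t / norm for t in w]
    Xv = [sum(X[i][j] * v[j] for j in range(K)) for i in range(L)]  # = sigma * u
    # Diagonal averaging of the rank-1 approximation (sigma u) v^T.
    comp, cnt = [0.0] * N, [0] * N
    for i in range(L):
        for j in range(K):
            comp[i + j] += Xv[i] * v[j]
            cnt[i + j] += 1
    return [c / n for c, n in zip(comp, cnt)]
```

For a signal whose trajectory matrix is exactly rank 1 (a constant or a geometric decay), this first component reconstructs the signal itself; for richer signals, successive components would be extracted from the residual.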


2014, Vol 06 (02n03), pp. 1450007
Author(s): Raymond K. W. Wong

The estimation and significance testing of the first-order autoregressive (AR1) coefficient in short time series with trends are examined. The purpose is to identify the difficulties to which analysis procedures need to adjust for better results. The delta recursive AR1 estimator r_δ and the Sen–Theil trend estimator are viable for short-sequence application. Significance testing for r_δ has low power, but the existence of a trend has negligible influence on estimation and testing. The common practice of trend removal before AR1 estimation gives poorer results. Application to air quality data showed that this could greatly change conclusions. Implications for analysis are discussed.
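The two quantities under study can be illustrated with their textbook forms: the Sen–Theil slope is the median of all pairwise slopes, and a plain lag-1 autocorrelation stands in for the AR1 coefficient (the paper's recursive estimator r_δ is a specialized variant not reproduced here).

```python
from statistics import median

def theil_sen_slope(y):
    """Sen-Theil trend estimator: median of all pairwise slopes
    (robust to outliers, usable on short sequences)."""
    n = len(y)
    return median((y[j] - y[i]) / (j - i)
                  for i in range(n) for j in range(i + 1, n))

def ar1_coefficient(y):
    """Ordinary lag-1 autocorrelation estimate of the AR1 coefficient
    (not the paper's recursive r_delta)."""
    n = len(y)
    m = sum(y) / n
    num = sum((y[t] - m) * (y[t + 1] - m) for t in range(n - 1))
    den = sum((v - m) ** 2 for v in y)
    return num / den
```

On a short series with a trend, one can compute both with and without prior detrending to see how trend removal changes the AR1 estimate, the comparison at the heart of the paper.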


2014, Vol 06 (02n03), pp. 1450008
Author(s): Stan Lipovetsky

Discrete choice modeling (DCM) is widely used in economics, social studies, and marketing research for estimating utilities and preference probabilities of multiple alternatives. Data for the model are elicited from respondents who are presented with several sets of items characterized by various attributes, and each respondent chooses the best alternative in each set. Estimation of utilities is usually performed via multinomial-logit (MNL) modeling, and hierarchical Bayesian (HB) software is usually applied to find individual utilities by iterative estimation. This paper describes an easy and convenient empirical Bayes way to construct priors and combine them with the likelihood on individual-level data, allowing the modeler to obtain posterior estimates of MNL utilities in noniterative evaluations. Logistic modeling for the posterior frequencies is performed using the linear link of their logarithm of odds, which clarifies the results of DCM modeling. The problem of overfitting is considered, and an optimum balance between signal and noise, trading the precision of individual prediction against the smoothing of the overall data, is suggested. Actual market research data are used and the results are discussed.
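The MNL link underlying the above is standard and easy to state: choice probabilities are a softmax of the utilities of the alternatives shown, and the log of the odds between two alternatives is linear in the utility difference. The helper names below are illustrative; the paper's empirical Bayes prior construction is not reproduced.

```python
import math

def mnl_choice_probs(utilities, shown):
    """Multinomial-logit choice probabilities over the shown alternatives:
    P(i) = exp(u_i) / sum_j exp(u_j)."""
    exps = {i: math.exp(utilities[i]) for i in shown}
    z = sum(exps.values())
    return {i: e / z for i, e in exps.items()}

def log_odds(p, q):
    """Log of the odds ratio of two choice probabilities; under MNL this
    equals the corresponding utility difference (the linear link)."""
    return math.log(p / q)
```

For example, with utilities (1, 0, -1) the log-odds of the first alternative over the second recovers the utility gap of 1 exactly, which is what makes the logit link convenient for interpreting DCM results.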

