Detection of hydrogeochemical seismic precursors by a statistical learning model

2008 ◽  
Vol 8 (6) ◽  
pp. 1207-1216 ◽  
Author(s):  
L. Castellana ◽  
P. F. Biagi

Abstract. The problem of detecting the occurrence of an earthquake precursor is addressed in the general framework of statistical learning theory. The aim of this work is both to build models able to detect seismic precursors in time series of different geochemical signals and to provide an estimate of the number of false positives. The model we used is a k-Nearest-Neighbor classifier discriminating "undisturbed signal", "seismic precursor" and "co-post-seismic precursor" in time series of thirteen different hydrogeochemical parameters collected in water samples from a natural spring on the Kamchatka peninsula (Russia). The measurements are ion contents (Na, Cl, Ca, HCO3, H3BO3), physical parameters (pH, Q, T) and gases (N2, CO2, CH4, O2, Ar). The classification error is measured by a Leave-K-Out cross-validation procedure. Our study shows that the most discriminative ions for detecting seismic precursors are Cl and Na, with error rates of 15%. Moreover, the most discriminative parameter and gas are Q and CH4 respectively, with error rates of 21%. The ions prove to be the most informative hydrogeochemicals for detecting seismic precursors, owing to the mechanisms involved in earthquake preparation. Finally, we show that information collected some months before the event under analysis is necessary to improve the classification accuracy.
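The classifier and error estimate described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the two-dimensional "ion" feature vectors and the class labels in the test data are invented stand-ins for the paper's thirteen hydrogeochemical parameters.

```python
import math

def knn_predict(train, query, k=3):
    """Majority vote among the k training points nearest to query.

    train: list of (feature_vector, label) pairs; Euclidean distance.
    """
    neighbors = sorted(train, key=lambda fl: math.dist(fl[0], query))[:k]
    labels = [label for _, label in neighbors]
    return max(set(labels), key=labels.count)

def leave_k_out_error(data, k_nn=3, k_out=1):
    """Estimate classification error by repeatedly holding out k_out samples."""
    errors = 0
    for start in range(0, len(data), k_out):
        held_out = data[start:start + k_out]
        train = data[:start] + data[start + k_out:]
        errors += sum(knn_predict(train, feats, k_nn) != label
                      for feats, label in held_out)
    return errors / len(data)
```

With k_out=1 this reduces to leave-one-out cross-validation; larger k_out trades variance of the error estimate against training-set size.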

Mathematics ◽  
2020 ◽  
Vol 8 (3) ◽  
pp. 413 ◽  
Author(s):  
Chris Lytridis ◽  
Anna Lekova ◽  
Christos Bazinas ◽  
Michail Manios ◽  
Vassilis G. Kaburlasos

Our interest is in time series classification regarding cyber–physical systems (CPSs), with emphasis on human-robot interaction. We propose an extension of the k-nearest-neighbor (kNN) classifier to time series classification using intervals' numbers (INs). More specifically, we partition a time series into windows of equal length, and from each window's data we induce a distribution that is represented by an IN. This preserves the time dimension in the representation. All-order data statistics, represented by an IN, are employed implicitly as features; moreover, parametric non-linearities are introduced in order to tune the geometrical relationship (i.e., the distance) between signals and consequently tune classification performance. In conclusion, we introduce the windowed IN kNN (WINkNN) classifier, whose application is demonstrated comparatively on two benchmark datasets regarding, first, electroencephalography (EEG) signals and, second, audio signals. The results of WINkNN are superior on both problems; in addition, no ad hoc data preprocessing is required. Potential future work is discussed.
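The windowing idea can be sketched with a strong simplification: here each window's distribution is summarized only by its quartiles, whereas the paper's INs encode all-order statistics, and the parametric non-linearities used to tune the distance are omitted entirely.

```python
import math
import statistics

def window_features(series, win=4):
    """Split a series into equal windows; summarize each window by its quartiles.

    (Quartiles are a crude stand-in for the intervals' numbers of the paper.)
    """
    feats = []
    for i in range(0, len(series) - win + 1, win):
        feats.extend(statistics.quantiles(series[i:i + win], n=4))
    return feats

def winknn_classify(train, series, win=4):
    """1-NN over windowed distribution features; train is (series, label) pairs."""
    q = window_features(series, win)
    return min(train, key=lambda sl: math.dist(window_features(sl[0], win), q))[1]
```

Because the features are computed per window, the representation keeps a coarse time dimension: two signals with identical overall histograms but different temporal ordering yield different feature vectors.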


2009 ◽  
Vol 19 (12) ◽  
pp. 4197-4215 ◽  
Author(s):  
ANGELIKI PAPANA ◽  
DIMITRIS KUGIUMTZIS

We study some of the most commonly used mutual information estimators, based on histograms of fixed or adaptive bin size, k-nearest neighbors and kernels, and focus on optimal selection of their free parameters. We examine the consistency of the estimators (convergence to a stable value with increasing time series length) and the degree of deviation among the estimators. The optimization of parameters is assessed by quantifying the deviation of the estimated mutual information from its true or asymptotic value as a function of the free parameter. Moreover, some commonly used criteria for parameter selection are evaluated for each estimator. The comparative study is based on Monte Carlo simulations on time series from several linear and nonlinear systems of different lengths and noise levels. The results show that the k-nearest-neighbor estimator is the most stable and the least affected by its method-specific parameter. A data-adaptive criterion for optimal binning is suggested for linear systems, but it is found to be rather conservative for nonlinear systems. It turns out that the binning and kernel estimators give the least deviation in identifying the lag of the first minimum of mutual information for nonlinear systems, and are stable in the presence of noise.
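The simplest of the compared estimators, the fixed-bin histogram, can be sketched as below; the bin count is exactly the free parameter whose choice the study evaluates. (The k-NN and kernel estimators would expose the same interface with k or bandwidth as the parameter.)

```python
import math
from collections import Counter

def binned_mi(x, y, bins=8):
    """Fixed-bin histogram estimate of mutual information I(X;Y), in nats.

    The number of bins is the method-specific free parameter.
    """
    def digitize(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0
        return [min(int((v - lo) / width), bins - 1) for v in values]

    bx, by = digitize(x), digitize(y)
    n = len(x)
    pxy, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    return sum((c / n) * math.log(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())
```

On identical inputs the estimate approaches log(bins), its maximum; on independent inputs it approaches zero, though with finite samples the histogram estimator is known to be biased upward, which is one reason the parameter study matters.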


2017 ◽  
Vol 52 (3) ◽  
pp. 2019-2037 ◽  
Author(s):  
Francisco Martínez ◽  
María Pilar Frías ◽  
María Dolores Pérez ◽  
Antonio Jesús Rivera

2008 ◽  
Vol 02 (03) ◽  
pp. 403-423 ◽  
Author(s):  
NICOLA FANIZZI ◽  
CLAUDIA D'AMATO ◽  
FLORIANA ESPOSITO

This work concerns non-parametric approaches to statistical learning applied to the standard knowledge representation languages adopted in the Semantic Web context. We present methods based on epistemic inference that are able to elicit and exploit the semantic similarity of individuals in OWL knowledge bases. Specifically, a totally semantic and language-independent semi-distance function is introduced, from which an epistemic kernel function for Semantic Web representations is also derived. Both the measure and the kernel function are embedded in non-parametric statistical learning algorithms customized for coping with Semantic Web representations. In particular, the measure is embedded in a k-Nearest Neighbor algorithm and the kernel function is embedded in a Support Vector Machine. The implemented algorithms are used to perform inductive concept retrieval and query answering. Experimentation on real ontologies shows that the methods can be effectively employed for performing the target tasks, and moreover that it is possible to induce new assertions that are not logically derivable.
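The flavor of a committee-based semi-distance can be sketched in a toy form: individuals are compared by how often a set of concept tests disagrees about them. This is a heavy simplification of the paper's measure, which works over OWL knowledge bases with three-valued (epistemic) membership obtained by reasoning; here the "individuals" are plain dictionaries and the "concepts" are boolean predicates, all invented for illustration.

```python
def committee_semidistance(a, b, concepts):
    """Semi-distance between two individuals: the fraction of a committee
    of concepts (membership tests) on which the two individuals disagree."""
    return sum(1 for c in concepts if c(a) != c(b)) / len(concepts)

def nn_retrieve(known, query, concepts):
    """Classify a query individual by its nearest neighbor under the semi-distance.

    known: list of (individual, class) pairs with asserted classes.
    """
    return min(known,
               key=lambda ic: committee_semidistance(ic[0], query, concepts))[1]
```

Because the distance is defined purely through concept membership, it is independent of the particular representation language, which is the property the paper exploits.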


2016 ◽  
Vol 328 ◽  
pp. 42-59 ◽  
Author(s):  
Mabel González ◽  
Christoph Bergmeir ◽  
Isaac Triguero ◽  
Yanet Rodríguez ◽  
José M Benítez

2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Lijun Hao ◽  
Min Zhang ◽  
Gang Huang

Feature optimization, the theme of this paper, is the selection of input variables when building a predictive model. An improved feature optimization algorithm for breath signals based on Pearson-BPSO was proposed and applied to distinguish hepatocellular carcinoma by electronic nose (eNose). First, the multidimensional features of the breath curves of hepatocellular carcinoma patients and healthy controls in the training samples were extracted; then, the features with low relevance to the classification were removed according to the Pearson correlation coefficient; next, the fitness function was constructed based on K-Nearest Neighbor (KNN) classification error and feature dimension, and the feature optimization transformation matrix was obtained based on BPSO. Furthermore, the transformation matrix was applied to optimize the test samples' features. Finally, the performance of the optimization algorithm was evaluated by the classifier. The experimental results show that the Pearson-BPSO algorithm effectively improved classification performance compared with the BPSO and PCA optimization methods. The accuracy of the SVM and RF classifiers was 86.03% and 90%, respectively, and the sensitivity and specificity were about 90% and 80%. Consequently, applying the Pearson-BPSO feature optimization algorithm will help improve the accuracy of hepatocellular carcinoma detection by eNose and promote the clinical application of intelligent detection.
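The first stage of the pipeline, the Pearson pre-filter, can be sketched as below; the subsequent BPSO search (whose fitness combines KNN error and feature dimension) is omitted, and the threshold value is an illustrative assumption, not taken from the paper.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy) if vx and vy else 0.0

def pearson_filter(X, labels, threshold=0.3):
    """Keep indices of feature columns whose |correlation| with the class
    label meets the threshold -- the pre-filtering step before BPSO."""
    kept = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        if abs(pearson(column, labels)) >= threshold:
            kept.append(j)
    return kept
```

Discarding weakly correlated columns first shrinks the binary search space that BPSO must explore, which is the practical point of combining the two steps.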


2018 ◽  
Vol 2018 ◽  
pp. 1-8
Author(s):  
John Mashford

Three methods of temporal data upscaling, which may collectively be called the generalized k-nearest neighbor (GkNN) method, are considered. The accuracy of the GkNN simulation of month-by-month yield is considered (where the term yield denotes the dependent variable). The notion of an eventually well-distributed time series is introduced, and on the basis of this assumption some properties of the average annual yield and its variance for a GkNN simulation are computed. The total yield over a planning period is determined. A general framework for the GkNN algorithm, based on the notion of stochastically dependent time series, is described, and it is shown that for a sufficiently large training set the GkNN simulation has the same statistical properties as the training data. An example of the application of the methodology is given for the problem of simulating the yield of a rainwater tank from monthly climatic data.
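One simple instance of a kNN-based simulator in this spirit can be sketched as follows: each simulated month's yield is drawn from the yields of the k historical months whose predictors are closest. This is an illustrative resampler, not the paper's specific GkNN variants, and the climate features are invented.

```python
import math
import random

def gknn_sample(history, query, k=3, rng=None):
    """Simulate one month's yield: draw uniformly among the yields of the k
    historical months whose predictor vectors are nearest to the query.

    history: list of (predictor_vector, yield) pairs, e.g. monthly climate data.
    """
    rng = rng or random.Random(0)
    nearest = sorted(history, key=lambda py: math.dist(py[0], query))[:k]
    return rng.choice([y for _, y in nearest])
```

Because each simulated value is an actual historical yield drawn from a local neighborhood, a long simulation over a representative set of queries reproduces the empirical distribution of the training data, which is the statistical property the paper establishes under its well-distribution assumption.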


2020 ◽  
Vol 10 (2) ◽  
pp. 152-158
Author(s):  
Iswanto ◽  
Yuliana Melita Pranoto ◽  
Reddy Alexandro Harianto

Abstract. Even with sophisticated applications, traders often have difficulty deciding whether to BUY or SELL in forex trading. Time series predictions frequently swing between high and low values, so a recommendation system is needed to overcome this problem. Applying a classification algorithm in a recommendation system that supports BUY-SELL decisions is one appropriate alternative. The K-Nearest Neighbor (K-NN) algorithm was chosen because it can classify data based on the closest distance. This system is designed to assist traders in making BUY-SELL decisions based on predictive data. In ten trials using ARIMA predictions, the recommendations, compared with market prices, achieved an average profit exceeding the target of 7% per week.
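The decision step can be sketched as a kNN vote over historical situations; the ARIMA forecasting stage is omitted, and the feature choice (two forecast returns per situation) and all numbers are illustrative assumptions, not taken from the paper.

```python
import math

def knn_recommend(history, predicted, k=3):
    """Recommend BUY or SELL by majority vote of the k historical situations
    nearest to the predicted feature vector (e.g. forecast price changes).

    history: list of (feature_vector, "BUY" | "SELL") pairs.
    """
    nearest = sorted(history, key=lambda fd: math.dist(fd[0], predicted))[:k]
    decisions = [d for _, d in nearest]
    return max(set(decisions), key=decisions.count)
```

An odd k avoids ties in the two-class vote; in practice the feature vectors would come from the forecasting model's output for the upcoming period.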

