Automatic 3D illumination-diagnosis method for large-N arrays: Robust data scanner and machine-learning feature provider

Geophysics ◽  
2019 ◽  
Vol 84 (3) ◽  
pp. Q13-Q25 ◽  
Author(s):  
Michał Chamarczuk ◽  
Michał Malinowski ◽  
Yohei Nishitsuji ◽  
Jan Thorbecke ◽  
Emilia Koivisto ◽  
...  

The main issues related to passive-source reflection imaging with seismic interferometry (SI) are inadequate acquisition parameters for sufficient spatial wavefield sampling and vulnerability of surface arrays to the dominant influence of the omnipresent surface-wave sources. Additionally, long recordings provide large data volumes that require robust and efficient processing methods. We address these problems by developing a two-step wavefield evaluation and event detection (TWEED) method of body waves in recorded ambient noise. TWEED evaluates the spatiotemporal characteristics of noise recordings by simultaneous analysis of adjacent receiver lines. We test our method on synthetic data representing transient ambient-noise sources at the surface and in the deeper subsurface. We discriminate between basic types of seismic events by using three adjacent receiver lines. Subsequently, we apply TWEED to 600 h of ambient noise acquired with an approximately 1000-receiver array deployed over an active underground mine in Eastern Finland. We develop the detection of body-wave events related to mine blasts and other routine mining activities using a representative 1 h noise panel. Using TWEED, we successfully detect 1093 body-wave events in the full data set. To increase the computational efficiency, we use slowness parameters derived from the first step of TWEED as input to a support vector machine (SVM) algorithm. Using this approach, we detect 94% of the TWEED-evaluated body-wave events, indicating that the illumination analysis could be limited to only one step, thereby increasing the time efficiency at the price of a lower detection rate. However, TWEED on a small volume of the recorded data followed by SVM on the rest of the data could be efficiently used for quick and robust (real-time) scanning for body-wave energy in large data volumes for subsequent application of SI for retrieval of reflections.

Geophysics ◽  
2020 ◽  
Vol 85 (1) ◽  
pp. KS29-KS38 ◽  
Author(s):  
Guoli Wu ◽  
Hefeng Dong ◽  
Ganpan Ke ◽  
Junqiang Song

Accurate approximations of Green’s functions retrieved from the correlations of ambient noise require a homogeneous distribution of random and uncorrelated noise sources. In the real world, the existence of highly coherent, strong directional noise generated by ships, earthquakes, and other human activities can result in biases in the ambient-noise crosscorrelations (NCCs). We have developed an adapted eigenvalue-based filter to attenuate the interference of strong directional sources. The filter is based on the statistical model of the sample covariance matrix and can separate different components of the data covariance matrix in the eigenvalue spectrum. To improve the effectiveness and make it adaptable for different data sets, a weight is introduced to the filter. Then, the NCCs can be calculated directly from the filtered data covariance matrix. This approach is applied to a 1.02 h data set of ambient noise recorded by a permanent reservoir monitoring receiver array installed on the seabed. The power spectral density indicates that the noise recordings were contaminated by strong directional noise over nearly half of the whole observation period. Beamforming and crosscorrelation results indicate that the interference still exists even after applying traditional temporal and spectral normalization techniques, whereas the adapted eigenvalue-based filter can significantly attenuate it and help to obtain improved crosscorrelations. The approach makes it possible to retrieve reliable approximations of Green’s functions over a much shorter recording time.


2016 ◽  
Vol 4 (3) ◽  
pp. SJ55-SJ65 ◽  
Author(s):  
Pascal Edme ◽  
David F. Halliday

We have introduced a workflow that allows subsurface imaging using upcoming body-wave arrivals extracted from ambient-noise land seismic data. Rather than using the conventional seismic interferometry approach based on correlation, we have developed a deconvolution technique to extract the earth response from the observed periodicity in the seismic traces. The technique consists of iteratively applying a gapped spiking deconvolution, providing multiple-free images with higher resolution than conventional correlation. We have validated the workflow for zero-offset traces with simple synthetic data and real data recorded during a small point-receiver land seismic survey.


Geophysics ◽  
2021 ◽  
pp. 1-58
Author(s):  
Deepankar Dangwal ◽  
Michael Behm

Interferometric retrieval of body waves from ambient noise recorded at surface stations is usually challenged by the dominance of surface-wave energy, in particular in settings dominated by anthropogenic activities (e.g., natural resource exploitation, traffic, infrastructure construction). As a consequence, ambient noise imaging of shallow structures such as sedimentary layers remains a difficult task for sparse and irregularly distributed receiver networks. We demonstrate how polarization filtering can be used to automatically extract steeply inclined P-waves from continuous three-component recordings and in turn improve passive body-wave imaging. Being a single-station approach, the technique does not rely on a dense receiver array and is therefore well suited for data collected during surveillance monitoring for tasks such as reservoir hydraulic stimulation, CO2 sequestration, and wastewater disposal injection. We apply the method to a continuous dataset acquired in the Wellington oilfield (Kansas, US), where local and regional seismicity and other forms of ambient noise provide an abundant source of both surface- and body-wave energy recorded at 15 short-period receivers. We use autocorrelation to derive the shallow (< 1 km) reflectivity structure below the receiver array and validate our workflow and results with well logs and active seismic data. Raytracing analysis and waveform modeling indicate that converted shear waves need to be taken into account for realistic ambient noise body-wave source distributions, as they can be projected on the vertical component and might lead to misinterpretation of the P-wave reflectivity structure. Overall, our study suggests that polarization filtering significantly improves passive body-wave imaging on both autocorrelation and interstation crosscorrelation. It reduces the impact of time-varying noise source distributions and is therefore also potentially useful for time-lapse ambient noise interferometry.


2019 ◽  
Vol 21 (9) ◽  
pp. 662-669 ◽  
Author(s):  
Junnan Zhao ◽  
Lu Zhu ◽  
Weineng Zhou ◽  
Lingfeng Yin ◽  
Yuchen Wang ◽  
...  

Background: Thrombin is the central protease of the vertebrate blood coagulation cascade, which is closely related to cardiovascular diseases. The inhibitory constant Ki is the most significant property of thrombin inhibitors. Method: This study was carried out to predict Ki values of thrombin inhibitors based on a large data set by using machine learning methods. Because machine learning can find non-intuitive regularities in high-dimensional data sets, it can be used to build effective predictive models. A total of 6554 descriptors for each compound were collected and an efficient descriptor selection method was chosen to find the appropriate descriptors. Four different methods, including multiple linear regression (MLR), K Nearest Neighbors (KNN), Gradient Boosting Regression Tree (GBRT), and Support Vector Machine (SVM), were implemented to build prediction models with these selected descriptors. Results: The SVM model was the best one among these methods, with R² = 0.84, MSE = 0.55 for the training set and R² = 0.83, MSE = 0.56 for the test set. Several validation methods, such as the Y-randomization test and applicability domain evaluation, were adopted to assess the robustness and generalization ability of the model. The final model shows excellent stability and predictive ability and can be employed for rapid estimation of the inhibitory constant, which is very helpful for designing novel thrombin inhibitors.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Ruolan Zeng ◽  
Jiyong Deng ◽  
Limin Dang ◽  
Xinliang Yu

A three-descriptor quantitative structure–activity/toxicity relationship (QSAR/QSTR) model was developed for the skin permeability of a sufficiently large data set consisting of 274 compounds, by applying support vector machine (SVM) together with genetic algorithm. The optimal SVM model possesses a coefficient of determination R² of 0.946 and a root mean square (rms) error of 0.253 for the training set of 139 compounds, and an R² of 0.872 and rms error of 0.302 for the test set of 135 compounds. Compared with other models reported in the literature, our SVM model shows better statistical performance in a model that deals with more samples in the test set. Therefore, an SVM algorithm was successfully applied to develop a nonlinear QSAR model for skin permeability.


Geophysics ◽  
2020 ◽  
Vol 85 (6) ◽  
pp. G129-G141
Author(s):  
Diego Takahashi ◽  
Vanderlei C. Oliveira Jr. ◽  
Valéria C. F. Barbosa

We have developed an efficient and very fast equivalent-layer technique for gravity data processing by modifying an iterative method grounded on an excess mass constraint that does not require the solution of linear systems. Taking advantage of the symmetric block-Toeplitz Toeplitz-block (BTTB) structure of the sensitivity matrix that arises when regular grids of observation points and equivalent sources (point masses) are used to set up a fictitious equivalent layer, we develop an algorithm that greatly reduces the computational complexity and RAM necessary to estimate a 2D mass distribution over the equivalent layer. The structure of the symmetric BTTB matrix consists of the elements of the first column of the sensitivity matrix, which, in turn, can be embedded into a symmetric block-circulant with circulant-block (BCCB) matrix. Likewise, only the first column of the BCCB matrix is needed to reconstruct the full sensitivity matrix completely. From the first column of the BCCB matrix, its eigenvalues can be calculated using the 2D fast Fourier transform (2D FFT), which can be used to readily compute the matrix-vector product of the forward modeling in the fast equivalent-layer technique. As a result, our method is efficient for processing very large data sets. Tests with synthetic data demonstrate the ability of our method to satisfactorily upward- and downward-continue gravity data. Our results show very small border effects and noise amplification compared to those produced by the classic approach in the Fourier domain. In addition, they show that, whereas the running time of our method is [Formula: see text] s for processing [Formula: see text] observations, the fast equivalent-layer technique used [Formula: see text] s with [Formula: see text]. A test with field data from the Carajás Province, Brazil, illustrates the low computational cost of our method to process a large data set composed of [Formula: see text] observations.
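The embedding idea described above can be sketched in one dimension, where a symmetric Toeplitz matrix plays the role of the BTTB matrix and a circulant matrix plays the role of the BCCB matrix. This is only an illustrative 1D analogue under stated assumptions, not the authors' implementation: the function names `fft` and `toeplitz_matvec` are hypothetical, the small radix-2 FFT stands in for a library FFT, and the input length is assumed to be a power of two.

```python
import cmath

def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out

def toeplitz_matvec(first_col, v):
    """Multiply a symmetric Toeplitz matrix, given by its first column, by v."""
    n = len(first_col)
    # Embed into a circulant of size 2n with first column
    # [t0, ..., t_{n-1}, 0, t_{n-1}, ..., t1].
    c = list(first_col) + [0.0] + list(first_col[:0:-1])
    padded = list(v) + [0.0] * n
    eig = fft(c)          # eigenvalues of the circulant matrix
    spec = fft(padded)    # spectrum of the zero-padded vector
    prod = fft([e * s for e, s in zip(eig, spec)], inverse=True)
    # Normalize the inverse transform and keep the first n entries.
    return [(p / (2 * n)).real for p in prod[:n]]

# Hypothetical first column and vector, for illustration only.
t = [4.0, 1.0, 0.5, 0.25]
v = [1.0, 2.0, 3.0, 4.0]
result = toeplitz_matvec(t, v)
```

Only the first column is stored, so the full matrix is never formed. The 2D case in the abstract would apply the same embedding block by block and use a 2D FFT in place of the 1D transform.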


Endocrinology ◽  
2019 ◽  
Vol 160 (10) ◽  
pp. 2395-2400 ◽  
Author(s):  
David J Handelsman ◽  
Lam P Ly

Hormone assay results below the assay detection limit (DL) can introduce bias into quantitative analysis. Although complex maximum likelihood estimation methods exist, they are not widely used, whereas simple substitution methods are often used ad hoc to replace the undetectable (UD) results with numeric values to facilitate data analysis with the full data set. However, the bias of substitution methods for steroid measurements is not reported. Using a large data set (n = 2896) of serum testosterone (T), DHT, and estradiol (E2) concentrations from healthy men, we created modified data sets with increasing proportions of UD samples (≤40%) to which we applied five different substitution methods (deleting UD samples as missing and substituting UD samples with DL, DL/√2, DL/2, or 0) to calculate univariate descriptive statistics (mean, SD) or bivariate correlations. For all three steroids and for univariate as well as bivariate statistics, bias increased progressively with increasing proportion of UD samples. Bias was worst when UD samples were deleted or substituted with 0 and least when UD samples were substituted with DL/√2, whereas the other methods (DL or DL/2) displayed intermediate bias. Similar findings were replicated in randomly drawn small subsets of 25, 50, and 100. Hence, we propose that in steroid hormone data with ≤40% UD samples, substituting UD with DL/√2 is a simple, versatile, and reasonably accurate method to minimize left censoring bias, allowing for data analysis with the full data set.
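The five substitution methods compared above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `substitute_undetectable`, the method labels, and the sample values and detection limit are all hypothetical.

```python
import math
import statistics

def substitute_undetectable(values, detection_limit, method="dl_sqrt2"):
    """Handle values below the detection limit (DL) according to `method`.

    Any value below `detection_limit` is treated as undetectable (UD).
    """
    fill = {
        "dl": detection_limit,
        "dl_sqrt2": detection_limit / math.sqrt(2),
        "dl_half": detection_limit / 2,
        "zero": 0.0,
    }
    if method == "delete":
        # Drop UD samples entirely, treating them as missing.
        return [v for v in values if v >= detection_limit]
    if method not in fill:
        raise ValueError(f"unknown method: {method}")
    return [v if v >= detection_limit else fill[method] for v in values]

# Hypothetical concentrations and detection limit, for illustration only.
sample = [0.2, 0.5, 1.0, 2.0, 4.0]
dl = 0.6
means = {
    m: statistics.mean(substitute_undetectable(sample, dl, m))
    for m in ("delete", "dl", "dl_sqrt2", "dl_half", "zero")
}
```

In this toy sample, deleting UD values gives the highest mean and substituting 0 gives the lowest, with the three DL-based substitutions in between. Which method is least biased on real data is the empirical question the study addresses.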


2008 ◽  
Vol 07 (04) ◽  
pp. 721-736 ◽  
Author(s):  
HSIAO-FAN WANG ◽  
ZU-WEN CHAN

In this study, we proposed a general pruning procedure to reduce the dimension of a large database so that the properties of the extracted subset can be well defined. Since learning functions have been widely applied, we take this group of functions as an example to demonstrate the proposed procedure. Based on the concept of Support Vector Machine (SVM), three major stages of preliminary pruning, fitting function, and refining are proposed to discover a subset that possesses the characteristics of some learning function from the given large data set. Three models were used to illustrate and evaluate the proposed pruning procedure, and the results have been shown to be promising in application.


2007 ◽  
Vol 19 (3) ◽  
pp. 816-855 ◽  
Author(s):  
Hyunjung Shin ◽  
Sungzoon Cho

The support vector machine (SVM) has been spotlighted in the machine learning community because of its theoretical soundness and practical performance. When applied to a large data set, however, it requires a large memory and a long time for training. To cope with the practical difficulty, we propose a pattern selection algorithm based on neighborhood properties. The idea is to select only the patterns that are likely to be located near the decision boundary. Those patterns are expected to be more informative than the randomly selected patterns. The experimental results provide promising evidence that it is possible to successfully employ the proposed algorithm ahead of SVM training.
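The idea of keeping only patterns near the decision boundary can be illustrated with a simple neighborhood rule. The sketch below is an assumption-laden simplification, not the authors' algorithm: it keeps a pattern when its k nearest neighbors include at least one pattern of a different class, and the function name `select_boundary_patterns` and the toy data are hypothetical.

```python
import math

def select_boundary_patterns(points, labels, k=3):
    """Return indices of points whose k nearest neighbors include another class."""
    selected = []
    for i, p in enumerate(points):
        # Distances from point i to every other point, sorted ascending.
        others = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        neighbor_labels = {labels[j] for _, j in others[:k]}
        # Mixed-class neighborhoods suggest proximity to the class boundary.
        if neighbor_labels != {labels[i]}:
            selected.append(i)
    return selected

# Hypothetical 1D data: class 0 on the left, class 1 on the right.
points = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,), (5.0,), (6.0,), (7.0,)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
kept = select_boundary_patterns(points, labels, k=2)
```

Only the selected subset would then be passed to SVM training. This brute-force neighbor search is quadratic in the number of patterns, so a practical version would need a faster neighbor search.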

