Theoretical characterisation of strand cross-correlation in ChIP-seq

2020 ◽  
Author(s):  
Hayato Anzawa ◽  
Hitoshi Yamagata ◽  
Kengo Kinoshita

Abstract Background: Strand cross-correlation profiles are used for both peak calling pre-analysis and quality control (QC) in chromatin immunoprecipitation followed by sequencing (ChIP-seq) analysis. Despite their potential for robust and accurate assessment of the signal-to-noise ratio (S/N), owing to their independence from peak calling, it remains unclear what aspects of quality such strand cross-correlation profiles actually measure. Results: We introduced a simple model to simulate the mapped read-density of ChIP-seq and then derived the theoretical maximum and minimum of cross-correlation coefficients between strands. The results suggest that the maximum coefficient of typical ChIP-seq samples is directly proportional to the number of total mapped reads and the square of the ratio of signal reads, and inversely proportional to the number of peaks and the length of read-enriched regions. Simulation analysis supported our results, and evaluation using 790 ChIP-seq datasets obtained from a public database demonstrated high consistency between calculated cross-correlation coefficients and coefficients estimated from the theoretical relations and peak calling results. In addition, we found that mappability-bias correction improved sensitivity, enabling differentiation of maximum coefficients from the noise level. Based on these insights, we proposed virtual S/N (VSN), a novel peak call-free metric for S/N assessment. We also developed PyMaSC, a tool to calculate strand cross-correlation and VSN efficiently. VSN achieved the most consistent S/N estimation for various ChIP targets and sequencing read depths. Furthermore, we demonstrated that a combination of VSN and pre-existing peak calling results enables the estimation of the number of detectable peaks for posterior experiments and the assessment of peak calling results.
Conclusions: We present the first theoretical insights into the strand cross-correlation, and the results reveal the potential and the limitations of strand cross-correlation analysis. Our quality assessment framework using VSN provides peak call-independent QC and will help in the evaluation of peak call analysis in ChIP-seq experiments.
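As a rough illustration of what a strand cross-correlation profile is, the sketch below correlates forward-strand read counts with reverse-strand read counts shifted by d bases, for a range of d. It is a minimal toy example, not PyMaSC's implementation: the function name `strand_cross_correlation`, the synthetic genome, and the assumed fragment length of 150 bp are all hypothetical, and real tools additionally handle read length, mappability, and multiple chromosomes.

```python
import numpy as np

def strand_cross_correlation(forward, reverse, max_shift):
    """Pearson correlation between forward-strand counts at position x and
    reverse-strand counts at position x + d, for d = 0..max_shift."""
    forward = np.asarray(forward, dtype=float)
    reverse = np.asarray(reverse, dtype=float)
    n = forward.size
    coeffs = np.empty(max_shift + 1)
    for d in range(max_shift + 1):
        f = forward[: n - d]   # forward counts at x
        r = reverse[d:]        # reverse counts at x + d
        coeffs[d] = np.corrcoef(f, r)[0, 1]
    return coeffs

# Toy data (assumption): every fragment contributes one forward read at its
# start and one reverse read exactly fragment_len bases downstream.
rng = np.random.default_rng(0)
genome_len, fragment_len = 5000, 150
sites = rng.integers(0, genome_len - fragment_len, size=200)
forward = np.zeros(genome_len)
reverse = np.zeros(genome_len)
np.add.at(forward, sites, 1)
np.add.at(reverse, sites + fragment_len, 1)

profile = strand_cross_correlation(forward, reverse, max_shift=300)
estimated_shift = int(np.argmax(profile))  # shift with the highest coefficient
```

In this idealised setting the profile peaks at the simulated fragment length; in real data the height of that peak relative to the background is what QC metrics based on strand cross-correlation try to interpret.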

2019 ◽  
Author(s):  
Hayato Anzawa ◽  
Hitoshi Yamagata ◽  
Kengo Kinoshita

Abstract Background: Strand cross-correlation profiles are used for both peak calling pre-analysis and quality control in chromatin immunoprecipitation followed by sequencing (ChIP-seq) analysis. Despite their potential for robust and accurate assessment of the signal-to-noise ratio (S/N) of ChIP-seq samples, it remains unclear what aspects of quality such strand cross-correlation profiles actually measure. Results: We introduced a simple model to simulate the mapped read-density of ChIP-seq and then derived the theoretical maximum and minimum of cross-correlation coefficients between strands. The results suggest that the maximum coefficient of typical ChIP-seq samples is directly proportional to the number of total mapped reads and the square of the ratio of signal reads, and inversely proportional to the number of peaks and the length of read-enriched regions. We also developed PyMaSC to efficiently generate strand cross-correlation profiles. Simulation analysis supported our results, and evaluation using 790 ChIP-seq datasets obtained from a public database demonstrated high consistency between calculated cross-correlation coefficients and coefficients estimated from the theoretical relations and peak calling results. In addition, we found that mappability-bias correction improved sensitivity, enabling differentiation of maximum coefficients from the noise level. Conclusions: We present the first theoretical insights into strand cross-correlation, and the results reveal the potential and the limitations of strand cross-correlation analysis. Our work will help in the establishment of better QC metrics using strand cross-correlation.
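The proportionality stated in this abstract can be written compactly. The symbols are labels chosen here for illustration and are not taken from the paper: $cc_{\max}$ is the maximum cross-correlation coefficient, $N$ the number of total mapped reads, $r$ the ratio of signal reads, $n_{p}$ the number of peaks, and $\ell$ the length of read-enriched regions.

```latex
\[
  cc_{\max} \;\propto\; \frac{N\, r^{2}}{n_{p}\, \ell}
\]
```

Read this way, and only for the typical samples the abstract refers to, doubling $N$ at fixed $r$ would double the maximum coefficient, while halving $r$ would reduce it fourfold.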


2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Hayato Anzawa ◽  
Hitoshi Yamagata ◽  
Kengo Kinoshita

Abstract Background Strand cross-correlation profiles are used for both peak calling pre-analysis and quality control (QC) in chromatin immunoprecipitation followed by sequencing (ChIP-seq) analysis. Despite their potential for robust and accurate assessment of the signal-to-noise ratio (S/N), owing to their independence from peak calling, it remains unclear what aspects of quality such strand cross-correlation profiles actually measure. Results We introduced a simple model to simulate the mapped read-density of ChIP-seq and then derived the theoretical maximum and minimum of cross-correlation coefficients between strands. The results suggest that the maximum coefficient of typical ChIP-seq samples is directly proportional to the number of total mapped reads and the square of the ratio of signal reads, and inversely proportional to the number of peaks and the length of read-enriched regions. Simulation analysis supported our results, and evaluation using 790 ChIP-seq datasets obtained from a public database demonstrated high consistency between calculated cross-correlation coefficients and coefficients estimated from the theoretical relations and peak calling results. In addition, we found that mappability-bias correction improved sensitivity, enabling differentiation of maximum coefficients from the noise level. Based on these insights, we proposed virtual S/N (VSN), a novel peak call-free metric for S/N assessment. We also developed PyMaSC, a tool to calculate strand cross-correlation and VSN efficiently. VSN achieved the most consistent S/N estimation for various ChIP targets and sequencing read depths. Furthermore, we demonstrated that a combination of VSN and pre-existing peak calling results enables the estimation of the number of detectable peaks for posterior experiments and the assessment of peak calling results.
Conclusions We present the first theoretical insights into the strand cross-correlation, and the results reveal the potential and the limitations of strand cross-correlation analysis. Our quality assessment framework using VSN provides peak call-independent QC and will help in the evaluation of peak call analysis in ChIP-seq experiments.


2020 ◽  
Author(s):  
Pieter Smets ◽  
Kees Weemstra ◽  
Läslo Evers

Hydroacoustic activity of the submarine Monowai Volcanic Centre (MVC) is repeatedly observed at two distant triplet hydrophone stations, south of Juan Fernandez Islands (H03S, 9,159 km) and north of Ascension Island (H10N, 15,823 km). T-phase converted energy recorded at the broadband seismic station Rarotonga on Cook Island (RAR, 1,845 km) is used as a reference for the cross-correlation analysis. A detailed processing scheme for the calculation of the daily cross-correlation functions (CCFs) of the hydroacoustic and seismic data is provided. Preprocessing is essential to account for the non-identical measurements and sensitivities as well as the different sample rates. Further postprocessing by systematic data selection has to be applied before stacking CCFs in order to account for the non-continuous activity of the MVC source. Daily volcanic activity is determined for the period from 2006 until 2018 using the signal-to-noise ratio of the CCFs, assuming sound propagation in the SOFAR channel. Monthly stacked CCFs with clear volcanic activity are used to study seasonal variations in sound propagation between the MVC and the hydrophone stations. In winter, however, a faster than expected signal is observed at H10N, which is hypothesized to result from (partial) propagation through the sea ice formed along the path near Antarctica.


2016 ◽  
Vol 15 (02) ◽  
pp. 1650012 ◽  
Author(s):  
Guangxi Cao ◽  
Cuiting He ◽  
Wei Xu

This study investigates the correlation between weather and agricultural futures markets on the basis of detrended cross-correlation analysis (DCCA) cross-correlation coefficients and [Formula: see text]-dependent cross-correlation coefficients. In addition, detrended fluctuation analysis (DFA) is used to measure extreme weather and thus to analyze further the effect of this condition on agricultural futures markets. Cross-correlation exists between weather and agricultural futures markets on certain time scales. There are some correlations between temperature and soybean return associated with medium amplitudes. Under extreme weather conditions, weather exerts different influences on different agricultural products; for instance, soybean return is greatly influenced by temperature, whereas weather variables exhibit no effect on corn return. The results based on the detrending moving-average cross-correlation analysis (DMCA) coefficient and on DFA regression are similar to those based on the DCCA coefficient.


2021 ◽  
Vol 65 (11) ◽  
pp. 1136-1144
Author(s):  
A. E. Rodin ◽  
V. V. Oreshko ◽  
V. A. Fedorova

Abstract We have developed a model for the time delay of pulse arrival between stations on the Moon and Earth. Comparison of the lunar and terrestrial time scales is proposed to be carried out by comparing the arrival time moments of giant pulses from pulsars. A method for such a comparison has been developed based on the cross-correlation analysis of the received pulses. Using the example of giant pulses from the pulsar PSR 0531+21, we showed that the error of comparing scales in the case of a high signal-to-noise ratio reaches a sub-discrete level and, thus, is determined by the reception band of the recording equipment.


2017 ◽  
Author(s):  
Ryuichiro Nakato ◽  
Katsuhiko Shirahige

Abstract Chromatin immunoprecipitation followed by sequencing (ChIP-seq) can detect read-enriched DNA loci for point-source (e.g., transcription factor binding) and broad-source factors (e.g., various histone modifications). Although numerous quality metrics for ChIP-seq data have been developed, the ‘peaks’ thus obtained are still difficult to assess with respect to signal-to-noise ratio (S/N) and the percentage of false positives. We developed a quality-assessment tool for ChIP-seq data, SSP (strand-shift profile), that quantifies S/N and peak reliability without peak calling. We validated SSP in-depth using ≥ 1,000 publicly available ChIP-seq datasets along with virtual data to demonstrate that SSP is quantifiable and sensitive to different S/Ns for both point- and broad-source factors. Moreover, SSP is consistent among cell types and with respect to variance of sequencing depth, and identifies low-quality samples that cannot be identified by quality metrics currently available. Finally, we show that “hidden-duplicate reads” cause aberrantly high S/Ns, and SSP provides an additional metric to avoid them, which can also contribute to estimation of peak mode (point- or broad-source) of samples. Availability: https://github.com/rnakato/SSP


Geophysics ◽  
1954 ◽  
Vol 19 (4) ◽  
pp. 660-683 ◽  
Author(s):  
Hal J. Jones ◽  
John A. Morrison

Correlation analysis techniques may be applied to seismic data already subjected to standard recording and analysis procedure in an effort to extract additional information, or to raw data as an alternative filtering method. These techniques involve determination of certain parameters which provide a quantitative measure of the correlation between two sets of data. Among the most useful of these parameters are the auto‐ and cross‐correlation coefficients and functions long used by statisticians in time series analysis and recently applied to filtering and prediction problems in the field of communications. This paper discusses some applications of correlation analysis in interpretation of seismograms. The use of cross‐correlation analysis to identify weak reflections masked by high noise is illustrated for several problems. Equivalence of correlation analysis procedures to filtering operations is stressed. Special analog computing equipment facilitating computation of correlation coefficients and power spectra directly from oscillograms or graphs is described. A brief discussion of modern optimum filter theory is presented.


2021 ◽  
Vol 2094 (3) ◽  
pp. 032048
Author(s):  
I A Zavedevkin ◽  
A A Shakirova ◽  
P P Firstov

Abstract The DrumCorr program, based on cross-correlation detection, has been developed to identify multiplets of volcanic earthquakes. The program is implemented in Python 3 and reads the ASCII and MiniSEED seismic data formats. The article presents the algorithm of the program, describing the cross-correlation detector, the selection of weak volcanic earthquakes by the STA/LTA method, and the primary analysis of the correlation coefficients, including the calculation of their standard errors for different signal-to-noise ratios. An example of subsequent processing of seismic data is also given. The program was applied to volcanic earthquakes of the «drumbeats» seismic regime and made it possible to identify earthquake multiplets characterized by various waveforms.
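The STA/LTA method mentioned in this abstract compares a short-term average of signal energy with a long-term average and flags samples where the ratio exceeds a threshold. The sketch below is a generic textbook version, not DrumCorr's code: the function name `sta_lta`, the window lengths, the threshold of 3, and the synthetic trace are all assumptions made for illustration.

```python
import numpy as np

def sta_lta(signal, n_sta, n_lta):
    """Ratio of short-term to long-term average energy.
    Both windows end at the current sample; samples without a full
    long-term window keep a ratio of 0."""
    energy = np.asarray(signal, dtype=float) ** 2
    csum = np.concatenate(([0.0], np.cumsum(energy)))
    ratio = np.zeros(energy.size)
    k = np.arange(n_lta, energy.size + 1)          # exclusive window ends
    sta = (csum[k] - csum[k - n_sta]) / n_sta
    lta = (csum[k] - csum[k - n_lta]) / n_lta
    ratio[k - 1] = np.divide(sta, lta, out=np.zeros_like(sta), where=lta > 0)
    return ratio

# Synthetic trace (assumption): unit-amplitude background with a
# ten-times-stronger burst occupying samples 600..649.
n_samples = 1000
trace = np.where(np.arange(n_samples) % 2 == 0, 1.0, -1.0)
trace[600:650] *= 10.0

ratio = sta_lta(trace, n_sta=20, n_lta=200)
threshold = 3.0                                   # hypothetical trigger level
onset = int(np.argmax(ratio > threshold))         # first sample above threshold
strongest = int(np.argmax(ratio))                 # sample with the largest ratio
```

On this trace the detector triggers at the first burst sample, and the ratio is largest once the short window lies entirely inside the burst. On real seismograms the windows and threshold would need tuning to the event duration and noise level.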


2020 ◽  
Vol 12 (4) ◽  
pp. 1620
Author(s):  
Paulo Ferreira ◽  
Éder J. A. L. Pereira ◽  
Hernane B. B. Pereira

Oil is one of the most important products in the world, being used for fuel production but also as an input in several industries. After the oil shocks of the 1970s, which caused great turbulence, the interest in the analysis of this particular product grew. The analysis of the comovements between oil and other assets became a hot topic. In this study, we propose an analysis of how oil price correlates with several industry indexes. The detrended cross-correlation analysis coefficient (ρ_DCCA) is used, with data from 1992 to 2019, and we analyze not only the correlation between oil and several Euro Stoxx indexes during the whole sample, but also how that correlation evolved for the different decades (1990s, 2000s and 2010s). Naturally, oil and gas are the sectors that correlate the most with crude oil, with correlation coefficients reaching levels higher than 0.6 in some cases. However, the results also indicate that all sectors are now more exposed to oil price variations than in the past, with the financial sector as one of the sectors with the greatest increase in correlation.
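For readers unfamiliar with the detrended cross-correlation coefficient, the sketch below shows one common way to compute it for a single box size n: integrate both series, remove a linear trend in each overlapping box of n + 1 points, and divide the detrended covariance by the product of the detrended standard deviations. It is an illustrative sketch under those assumptions, not the authors' code; the function name `dcca_coefficient` and the synthetic series are hypothetical.

```python
import numpy as np

def dcca_coefficient(x, y, n):
    """Detrended cross-correlation coefficient for box size n, using
    overlapping boxes of n + 1 points and linear detrending."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Integrated (profile) series
    px = np.cumsum(x - x.mean())
    py = np.cumsum(y - y.mean())
    t = np.arange(n + 1)
    f_xy = f_xx = f_yy = 0.0
    for start in range(px.size - n):
        bx = px[start:start + n + 1]
        by = py[start:start + n + 1]
        # Residuals after removing the local linear trend in this box
        rx = bx - np.polyval(np.polyfit(t, bx, 1), t)
        ry = by - np.polyval(np.polyfit(t, by, 1), t)
        f_xy += np.mean(rx * ry)
        f_xx += np.mean(rx * rx)
        f_yy += np.mean(ry * ry)
    return f_xy / np.sqrt(f_xx * f_yy)

# Synthetic series (assumption): two independent noise series.
rng = np.random.default_rng(1)
a = rng.normal(size=500)
b = rng.normal(size=500)

rho_same = dcca_coefficient(a, a, n=10)       # a series against itself
rho_opposite = dcca_coefficient(a, -a, n=10)  # a series against its negation
rho_indep = dcca_coefficient(a, b, n=10)      # two unrelated series
```

The coefficient lies between -1 and 1, and repeating the calculation over several box sizes gives the scale-dependent picture that studies of this kind report.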

