robustica: customizable robust independent component analysis

2021 ◽  
Author(s):  
Miquel Anglada-Girotto ◽  
Samuel Miravet-Verde ◽  
Luis Serrano ◽  
Sarah A. Head

Motivation: Independent Component Analysis (ICA) allows the dissection of omic datasets into modules that help to interpret global molecular signatures. The inherent randomness of this algorithm can be overcome by clustering many iterations of ICA together to obtain robust components. Existing algorithms for robust ICA are dependent on the choice of clustering method and on computing a potentially biased and large Pearson distance matrix. Results: We present robustica, a Python-based package to compute robust independent components with a fully customizable clustering algorithm and distance metric. Here, we exploited its customizability to revisit and optimize robust ICA systematically. From the 6 popular clustering algorithms considered, DBSCAN performed the best at clustering independent components across ICA iterations. After confirming the bias introduced with Pearson distances, we created a subroutine that infers and corrects the components' signs across ICA iterations to enable using Euclidean distance. Our subroutine effectively corrected the bias while simultaneously increasing the precision, robustness, and memory efficiency of the algorithm. Finally, we show the applicability of robustica by dissecting over 500 tumor samples from low-grade glioma (LGG) patients, where we define a new gene expression module with the key modulators of tumor aggressiveness downregulated upon IDH1 mutation. Availability and implementation: robustica is written in Python under the open-source BSD 3-Clause license. The source code and documentation are freely available at https://github.com/CRG-CNAG/robustica. Additionally, all scripts to reproduce the work presented are available at https://github.com/MiqG/publication_robustica.
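The sign problem mentioned in this abstract can be illustrated with a small sketch. ICA recovers each component only up to sign, so two runs may return the same component with opposite signs, which makes their Euclidean distance large. The heuristic below (flip each component so that its largest-magnitude weight is positive) is an assumption chosen for illustration and is not necessarily the rule that robustica implements; the function name `correct_signs` and the toy data are hypothetical.

```python
import numpy as np

def correct_signs(components):
    """Flip each column so that its largest-magnitude entry is positive.

    `components` has shape (n_features, n_components); each column is one
    component from one ICA run. This is an illustrative heuristic only.
    """
    components = np.asarray(components, dtype=float)
    # Row index of the largest absolute weight in every column.
    idx = np.argmax(np.abs(components), axis=0)
    signs = np.sign(components[idx, np.arange(components.shape[1])])
    signs[signs == 0] = 1.0  # leave all-zero columns untouched
    return components * signs

# Toy data: the same underlying component seen in two runs, once sign-flipped.
source = np.array([0.1, -0.2, 3.0, 0.05, -0.4, 0.3])
run_a = source + 0.01 * np.array([1, -1, 1, -1, 1, -1])
run_b = -source + 0.01 * np.array([-1, 1, 1, -1, -1, 1])
stacked = np.column_stack([run_a, run_b])

dist_before = np.linalg.norm(stacked[:, 0] - stacked[:, 1])
fixed = correct_signs(stacked)
dist_after = np.linalg.norm(fixed[:, 0] - fixed[:, 1])
```

After the correction the two columns point the same way, so a Euclidean distance matrix over many runs could be passed to a clustering algorithm without the sign ambiguity dominating the distances.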

2016 ◽  
Vol 37 (1) ◽  
Author(s):  
Klaus Nordhausen ◽  
Hannu Oja ◽  
Esa Ollila

Oja, Sirkiä, and Eriksson (2006) and Ollila, Oja, and Koivunen (2007) showed that, under general assumptions, any two scatter matrices with the so-called independent components property can be used to estimate the unmixing matrix for independent component analysis (ICA). The method is a generalization of Cardoso's (Cardoso, 1989) FOBI estimate, which uses the regular covariance matrix and a scatter matrix based on fourth moments. Different choices of the two scatter matrices are compared in a simulation study. Based on the study, we recommend always using two robust scatter matrices. For possibly asymmetric independent components, symmetrized versions of the scatter matrix estimates should be used.
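As a rough illustration of the two-scatter idea, the sketch below implements the classical FOBI special case named in the abstract: whiten with the covariance matrix, then diagonalize a fourth-moment scatter matrix of the whitened data. It is a minimal sketch on synthetic data and does not use the robust scatter matrices the authors recommend; the function name `fobi_unmixing`, the mixing matrix, and the source distributions are assumptions made for the example. FOBI also requires the sources to have distinct kurtoses, which holds here by construction.

```python
import numpy as np

def fobi_unmixing(x):
    """Estimate an unmixing matrix from x of shape (n_samples, n_features)."""
    xc = x - x.mean(axis=0)
    # First scatter matrix: the regular covariance, used for whitening.
    cov = np.cov(xc, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)
    cov_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    z = xc @ cov_inv_sqrt  # whitened data (cov_inv_sqrt is symmetric)
    # Second scatter matrix: fourth moments, mean of ||z||^2 * z z^T.
    r2 = np.sum(z ** 2, axis=1)
    scatter4 = (z * r2[:, None]).T @ z / z.shape[0]
    _, u = np.linalg.eigh(scatter4)
    return u.T @ cov_inv_sqrt

rng = np.random.default_rng(0)
n = 20000
# Unit-variance sources with different kurtoses (uniform and Laplace).
sources = np.column_stack([
    rng.uniform(-np.sqrt(3), np.sqrt(3), n),
    rng.laplace(scale=1 / np.sqrt(2), size=n),
])
mixing = np.array([[1.0, 0.5], [0.3, 1.0]])  # hypothetical mixing matrix
x = sources @ mixing.T

unmixing = fobi_unmixing(x)
gain = unmixing @ mixing  # close to a signed permutation if separation worked
```

If the separation succeeds, each row of `gain` has one dominant entry, reflecting that ICA identifies sources only up to order and sign.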


2019 ◽  
Vol 7 (3) ◽  
pp. SE19-SE42 ◽  
Author(s):  
David Lubo-Robles ◽  
Kurt J. Marfurt

During the past two decades, the number of volumetric seismic attributes has increased to the point at which interpreters are overwhelmed and cannot analyze all of the information that is available. Principal component analysis (PCA) is one of the best-known multivariate analysis techniques that decompose the input data into second-order statistics by maximizing the variance, thus obtaining mathematically uncorrelated components. Unfortunately, projecting the information in the multiple input data volumes onto an orthogonal basis often mixes rather than separates geologic features of interest. To address this issue, we have implemented and evaluated a relatively new unsupervised multiattribute analysis technique called independent component analysis (ICA), which is based on higher order statistics. We evaluate our algorithm to study the internal architecture of turbiditic channel complexes present in the Moki A sands Formation, Taranaki Basin, New Zealand. We input 12 spectral magnitude components ranging from 25 to 80 Hz into the ICA algorithm and we plot 3 of the resulting independent components against a red-green-blue color scheme to generate a single volume in which the colored independent components correspond to different seismic facies. The results obtained using ICA proved to be superior to those obtained using PCA. Specifically, ICA provides improved resolution and separates geologic features from noise. Moreover, with ICA, we can geologically analyze the different seismic facies and relate them to sand- and mud-prone seismic facies associated with axial and off-axis deposition and cut-and-fill architectures.
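The red-green-blue blending step described above can be sketched as follows. This is only an assumed, minimal version of such a display: each of three component volumes is min-max scaled to [0, 1] and assigned to one color channel. The function name `components_to_rgb` and the small synthetic grid are hypothetical stand-ins, not the authors' workflow or data.

```python
import numpy as np

def components_to_rgb(ic_r, ic_g, ic_b):
    """Min-max scale three component arrays to [0, 1] and stack them as RGB."""
    channels = []
    for ic in (ic_r, ic_g, ic_b):
        ic = np.asarray(ic, dtype=float)
        span = ic.max() - ic.min()
        if span > 0:
            scaled = (ic - ic.min()) / span
        else:
            # A constant component carries no contrast; map it to mid-gray.
            scaled = np.full(ic.shape, 0.5)
        channels.append(scaled)
    return np.stack(channels, axis=-1)

# Hypothetical 4x5 slice through three independent-component volumes.
grid = np.arange(20, dtype=float).reshape(4, 5)
rgb = components_to_rgb(grid, grid[::-1], np.sin(grid))
```

The resulting array has a trailing axis of length 3 and can be handed to any image viewer that accepts floating-point RGB values in [0, 1].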


1996 ◽  
Vol 07 (06) ◽  
pp. 671-687 ◽  
Author(s):  
AAPO HYVÄRINEN ◽  
ERKKI OJA

Recently, several neural algorithms have been introduced for Independent Component Analysis. Here we approach the problem from the point of view of a single neuron. First, simple Hebbian-like learning rules are introduced for estimating one of the independent components from sphered data. Some of the learning rules can be used to estimate an independent component which has a negative kurtosis, and the others estimate a component of positive kurtosis. Next, a two-unit system is introduced to estimate an independent component of any kurtosis. The results are then generalized to estimate independent components from non-sphered (raw) mixtures. To separate several independent components, a system of several neurons with linear negative feedback is used. The convergence of the learning rules is rigorously proven without any unnecessary hypotheses on the distributions of the independent components.
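A simplified sketch of a one-unit, Hebbian-like rule on sphered data is given below. It is not the exact learning rule from the paper: it uses a batch-averaged cubic update followed by explicit normalization, with a sign parameter that selects whether a positive-kurtosis or a negative-kurtosis component is sought. The function name, step size, iteration count, and synthetic sources are all assumptions made for illustration.

```python
import numpy as np

def one_unit_kurtosis_rule(z, sign, n_iter=200, step=0.1):
    """Batch Hebbian-like update on sphered data z of shape (n_samples, n_features).

    sign=+1 seeks a positive-kurtosis component, sign=-1 a negative-kurtosis one.
    """
    w = np.ones(z.shape[1]) / np.sqrt(z.shape[1])
    for _ in range(n_iter):
        y = z @ w  # output of the single linear neuron
        # Hebbian-like term: input times a cubic function of the output.
        w = w + sign * step * (z * (y ** 3)[:, None]).mean(axis=0)
        w /= np.linalg.norm(w)  # keep the weight vector on the unit sphere
    return w

rng = np.random.default_rng(0)
n = 20000
uniform_src = rng.uniform(-np.sqrt(3), np.sqrt(3), n)    # negative kurtosis
laplace_src = rng.laplace(scale=1 / np.sqrt(2), size=n)  # positive kurtosis
x = np.column_stack([uniform_src, laplace_src]) @ np.array([[1.0, 0.3], [0.5, 1.0]])

# Sphere (whiten) the mixtures with the symmetric inverse square root of the covariance.
xc = x - x.mean(axis=0)
evals, evecs = np.linalg.eigh(np.cov(xc, rowvar=False))
z = xc @ (evecs / np.sqrt(evals)) @ evecs.T

w_pos = one_unit_kurtosis_rule(z, sign=+1)
w_neg = one_unit_kurtosis_rule(z, sign=-1)
corr_pos = abs(np.corrcoef(z @ w_pos, laplace_src)[0, 1])
corr_neg = abs(np.corrcoef(z @ w_neg, uniform_src)[0, 1])
```

With the positive sign the neuron's output should track the Laplace source, and with the negative sign the uniform source, mirroring the split between the two families of learning rules described in the abstract.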


2014 ◽  
Vol 664 ◽  
pp. 148-152
Author(s):  
Shuang Xi Jing ◽  
Song Tao Guo ◽  
Jun Fa Leng ◽  
Xing Yu Zhao

Constrained independent component analysis (cICA) is a new theory and method derived from independent component analysis (ICA). It can extract the desired independent components (ICs) from the data based on some prior information, thus overcoming the uncertainty of traditional ICA. Early gearbox fault signals are often very weak and are characterized by non-Gaussianity and a low signal-to-noise ratio (SNR), which restricts the application of existing methods to early fault diagnosis. In this paper, the cICA algorithm is applied to gear fault diagnosis. Case studies verify the feasibility of this method for extracting the desired ICs, indicating its applicability and effectiveness.


2009 ◽  
Vol 10 (2) ◽  
pp. 85-115 ◽  
Author(s):  
M. P. S. Chawla

Independent component analysis (ICA) is a new technique suitable for separating independent components from complex electrocardiogram (ECG) signals. The basic idea of using multidimensional independent component analysis (MICA) is to find stable higher dimensional source signal subspaces and to decompose each rotation into elementary rotations within all two-dimensional planes spanned by the coordinate axes, which is useful for extracting diagnostic information about the heart. In this paper, the ability of ICA to parameterize ECG signals is used to reduce the amount of redundant ECG data. This work aims at finding an independent subspace analysis (ISA) model for ECG analysis that is applicable to any random vectors available in an ECG data set. For the common standards for electrocardiography (CSE) based ECG data sets, the joint approximate diagonalization of eigen matrices (Jade) algorithm is used to find smaller subspaces. The extracted independent components are further cleaned by statistical measures. In this study, it is also observed that the value of kurtosis coefficients for the independent components, which represents the noise component, can be further reduced using the parameterized multidimensional ICA (PMICA) technique. The indeterminacies, if present in the ECG data, are also analysed using a modified version of the Jade algorithm with PMICA and parameterized standard ICA (PsICA) for comparative studies. These indeterminacies are reduced more effectively by PMICA than by PsICA. The simulation results obtained indicate that ICA improves the signal-to-noise ratio (SNR), like other higher order digital filtering methods such as Kalman and Butterworth filters, with minimum reconstruction errors. Here, it is also confirmed that re-parameterization of the standard ICA model results in a 'component model' using the MICA technique, which is geometric in spirit and free of the indeterminacies existing in the sICA model.


2020 ◽  
Vol 52 ◽  
pp. 19-28
Author(s):  
Paola Cusano ◽  
Simona Petrosino ◽  
Enza De Lauro ◽  
Salvatore De Martino ◽  
Mariarosaria Falanga

Abstract. This work is devoted to the study of both earthquakes and background seismic noise at Ischia Island (Italy) recorded before and after the Md 4.0 earthquake that occurred on 21 August 2017 (18:57 UTC). We compare and characterize noise and earthquakes in terms of Independent Component Analysis, energy and polarization properties. The earthquakes' waveforms and the background noise are decomposed into a few independent components with two main common signals peaked around 1–2 and 3–4 Hz, respectively. A slight increase in the energy of the background seismic noise is observed when comparing samples recorded in 2016 and 2017, whereas no variations are detected in 2017 before and after the main earthquake. The polarization analysis, performed in the frequency bands identified by Independent Component Analysis and applied to the background seismic noise, indicates a shallow propagation, and the azimuthal pattern is mainly controlled by the local structural features. These results suggest that noise and earthquakes are ascribable to a common phenomenon of fluid-solid interaction in the hydrothermal system of Ischia Island.


2014 ◽  
Vol 1073-1076 ◽  
pp. 2508-2511
Author(s):  
Hui Ping Li ◽  
Li Wei Fan ◽  
Peng Zhou

This study adopted independent component analysis (ICA) to explore the underlying driving factors that affect international crude oil prices. Three original benchmark crude oil spot prices were first preprocessed into normalized form by centering and whitening. Three independent components were then estimated by the FastICA algorithm. We find that the three independent components vary in their fluctuation amplitude and clearly indicate different hidden factors: a dominant long-term trend, the medium-term influence of extreme events, and frequent short-term irregular events such as weather and speculation. This shows that ICA is a powerful tool for finding common hidden driving factors of international parallel crude oil prices.
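The centering and whitening preprocessing mentioned in this abstract can be sketched in a few lines. The example below is an assumed, minimal PCA-whitening routine applied to synthetic correlated series that merely stand in for three price series; the function name `center_and_whiten` and the data are hypothetical, and the subsequent FastICA step is not shown.

```python
import numpy as np

def center_and_whiten(x):
    """Center the columns of x and rescale them to identity covariance."""
    x = np.asarray(x, dtype=float)
    xc = x - x.mean(axis=0)  # centering: remove each series' mean
    cov = np.cov(xc, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)
    # PCA whitening matrix: eigenvectors scaled by 1 / sqrt(eigenvalue).
    whitening = evecs / np.sqrt(evals)
    return xc @ whitening, whitening

rng = np.random.default_rng(0)
base = rng.normal(size=(500, 3))
# Synthetic correlated series standing in for three benchmark prices.
series = base @ np.array([[1.0, 0.8, 0.6],
                          [0.0, 0.5, 0.4],
                          [0.0, 0.0, 0.3]])
whitened, _ = center_and_whiten(series)
whitened_cov = np.cov(whitened, rowvar=False)
```

After this step the series are uncorrelated with unit variance, so an ICA algorithm only has to find the remaining rotation that makes them as independent as possible.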


2021 ◽  
Author(s):  
Marc C. Paulus ◽  
Anja Paulus ◽  
Rüdiger-A. Eichel ◽  
Josef Granwehr

The use of independent component analysis (ICA) for the analysis of two-dimensional (2D) spin-alignment echo–T1 7Li NMR correlation data with transient echo detection as a third dimension is demonstrated for the superionic conductor Li10GeP2S12 (LGPS). ICA was combined with Laplace inversion, or discrete inverse Laplace transform (ILT), to obtain spectrally resolved 2D correlation maps. Robust results were obtained with the spectra as well as the vectorized correlation maps as independent components. It was also shown that the order of ICA and ILT steps can be swapped. While performing the ILT step before ICA provided better contrast, a substantial data compression can be achieved if ICA is executed first. Thereby the overall computation time could be reduced by one to two orders of magnitude, since the number of computationally expensive ILT steps is limited to the number of retained independent components. For LGPS, it was demonstrated that physically meaningful independent components and mixing matrices are obtained, which could be correlated with previously investigated material properties yet provided a clearer, better separation of features in the data. LGPS from two different batches was investigated, which showed substantial differences in their spectral and relaxation behavior. While in both cases this could be attributed to ionic mobility, the presented analysis may also clear the way for a more in-depth theoretical analysis based on numerical simulations. The presented method appears to be particularly suitable for samples with at least partially resolved static quadrupolar spectra, such as alkali metal ions in superionic conductors. The good stability of the ICA analysis makes this a prospective algorithm for preprocessing of data for a subsequent automated analysis using machine learning concepts.


2011 ◽  
Vol 28 (3) ◽  
pp. 247-261 ◽  
Author(s):  
YELDA ALKAN ◽  
BHARAT B. BISWAL ◽  
PAUL A. TAYLOR ◽  
TARA L. ALVAREZ

Abstract. Purpose: Cortical and subcortical functional activity stimulated via saccade and vergence eye movements were investigated to examine the similarities and differences between networks and regions of interest (ROIs). Methods: Blood oxygenation level-dependent (BOLD) signals from stimulus-induced functional Magnetic Resonance Imaging (MRI) experiments were analyzed studying 16 healthy subjects. Six types of oculomotor experiments were conducted using a block design to study both saccade and vergence circuits. The experiments included a simple eye movement task and a more cognitively demanding prediction task. A hierarchical independent component analysis (ICA) process began by analyzing individual subject data sets with spatial ICA to extract spatial independent components (sIC), which resulted in three ROIs. Using the time series from each of the three ROIs per subject, per oculomotor experiment, a temporal ICA was used to compute individual temporal independent components (tICs). For each of the three ROIs, the individual tICs from multiple subjects were entered into a second temporal ICA to compute group-level tICs for comparison. Results: Two independent spatial maps were observed for each subject (one sIC showing activity in the frontoparietal regions and another sIC in the cerebellum) during the six oculomotor tasks. Analysis of group-level tICs revealed an increased latency in the cerebellar region when compared to the frontoparietal region. Conclusion: Shared neuronal behavior has been reported in the frontal and parietal lobes, which may in part explain the segregation of frontoparietal functional activity into one sIC. The cerebellum uses multiple time scales for motor learning. This may result in an increased latency observed in the BOLD signal of the cerebellar group-level tIC when compared to the frontal and parietal group-level tICs. The increased latency offers a possible explanation for why ICA dissects the cerebellar activity into an sIC. The hierarchical ICA process used to calculate group-level tICs can yield insight into functional connectivity within complex neural networks.

