Dataset Information

Memory Efficient PCA Methods for Large Group ICA.

ABSTRACT: Principal component analysis (PCA) is widely used for data reduction in group independent component analysis (ICA) of fMRI data. Commonly, group-level PCA of temporally concatenated datasets is computed prior to ICA of the group principal components. This work focuses on reducing very high dimensional temporally concatenated datasets into their group PCA space. Existing randomized PCA methods can determine the PCA subspace with minimal memory requirements and, thus, are ideal for solving large PCA problems. Since the number of dataloads is not typically optimized, we extend one of these methods to compute PCA of very large datasets with a minimal number of dataloads. This method is coined multi power iteration (MPOWIT). The key idea behind MPOWIT is to estimate a subspace larger than the desired one, while checking for convergence of only the smaller subset of interest. The number of iterations is reduced considerably (as well as the number of dataloads), accelerating convergence without loss of accuracy. More importantly, in the proposed implementation of MPOWIT, the memory required for successful recovery of the group principal components becomes independent of the number of subjects analyzed. Highly efficient subsampled eigenvalue decomposition techniques are also introduced, furnishing excellent PCA subspace approximations that can be used for intelligent initialization of randomized methods such as MPOWIT. Together, these developments enable efficient estimation of accurate principal components, as we illustrate by solving a 1600-subject group-level PCA of fMRI with standard acquisition parameters, on a regular desktop computer with only 4 GB RAM, in just a few hours. MPOWIT is also highly scalable and could realistically solve group-level PCA of fMRI on thousands of subjects, or more, using standard hardware, limited only by time, not memory. Also, the MPOWIT algorithm is highly parallelizable, which would enable fast, distributed implementations ideal for big data analysis. Implications for other methods such as expectation maximization PCA (EM PCA) are also presented. Based on our results, general recommendations for efficient application of PCA methods are given according to problem size and available computational resources. MPOWIT and all other methods discussed here are implemented and readily available in the open source GIFT software.
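
The key idea described in the abstract (iterate on a subspace larger than the one wanted, test convergence only on the leading part, and touch one subject's data at a time) can be illustrated with a generic block power iteration. The sketch below is not the GIFT implementation of MPOWIT; it is a minimal NumPy illustration under stated assumptions. The function name `block_power_pca`, the `load_subject` callback, the `oversample` size, and the tolerance are all hypothetical choices, each subject is assumed to be a preprocessed voxels-by-timepoints array, and covariance scaling is omitted.

```python
import numpy as np

def block_power_pca(load_subject, n_subjects, n_voxels, k,
                    oversample=5, tol=1e-8, max_iter=200, seed=0):
    """Top-k eigenpairs of C = sum_i X_i X_i^T without forming C.

    load_subject(i) is a hypothetical callback returning subject i as a
    (n_voxels, n_timepoints) array. Only the iterate, the accumulator and
    one subject are held in memory at any time.
    """
    rng = np.random.default_rng(seed)
    m = k + oversample  # iterate on a larger subspace than requested
    Q, _ = np.linalg.qr(rng.standard_normal((n_voxels, m)))
    prev = np.full(k, np.inf)
    for n_iter in range(1, max_iter + 1):
        # One pass over the data: Y = C @ Q, accumulated subject by subject.
        Y = np.zeros((n_voxels, m))
        for i in range(n_subjects):
            X = load_subject(i)
            Y += X @ (X.T @ Q)
        # Rayleigh-Ritz step on the small m x m projection Q^T C Q.
        T = Q.T @ Y
        evals, evecs = np.linalg.eigh((T + T.T) / 2)
        order = np.argsort(evals)[::-1][:k]
        top = evals[order]
        comps = Q @ evecs[:, order]
        # Convergence is checked on the k leading values only.
        if np.all(np.abs(top - prev) <= tol * np.abs(top)):
            break
        prev = top
        Q, _ = np.linalg.qr(Y)  # re-orthonormalize for the next pass
    return comps, top, n_iter

# Synthetic stand-in for four "subjects" (60 voxels, 30 timepoints each).
rng = np.random.default_rng(1)
mixing = rng.standard_normal((60, 5)) * np.array([5.0, 4.0, 3.0, 2.0, 1.0])
subjects = [mixing @ rng.standard_normal((5, 30))
            + 0.01 * rng.standard_normal((60, 30)) for _ in range(4)]
comps, evals, n_iter = block_power_pca(lambda i: subjects[i],
                                       len(subjects), 60, k=3)
print(comps.shape, n_iter)
```

Because each pass accumulates `X @ (X.T @ Q)` one subject at a time, the working memory in this sketch depends on the number of voxels and the block size `m`, not on how many subjects are loaded. In practice `load_subject` would read from disk rather than index an in-memory list.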

SUBMITTER: Rachakonda S

PROVIDER: S-EPMC4735350 | biostudies-literature | 2016

REPOSITORIES: biostudies-literature


Publications

Memory Efficient PCA Methods for Large Group ICA.

Srinivas Rachakonda, Rogers F. Silva, Jingyu Liu, Vince D. Calhoun

Frontiers in Neuroscience, 2016-02-02


PMID: 26869874

Similar Datasets

Project description: Non-adaptive signal processing methods have been successfully applied to extract fetal electrocardiograms (fECGs) from maternal abdominal electrocardiograms (aECGs), and initial tests to evaluate the efficacy of these methods have been carried out using synthetic data. Nevertheless, performance evaluation of such methods using real data is a much more challenging task and has neither been fully undertaken nor reported in the literature. Therefore, in this investigation, we aimed to compare the effectiveness of two popular non-adaptive methods (ICA and PCA) for the non-invasive (NI) extraction (separation) of fECGs, also known as NI-fECGs, from aECGs. The performance of these well-known methods was enhanced by an adaptive algorithm, compensating for amplitude difference and time shift between the estimated components. We used real signals compiled in 12 recordings (real01-real12). Five of the recordings were from the publicly available database (PhysioNet-Abdominal and Direct Fetal Electrocardiogram Database), which included data recorded by multiple abdominal electrodes. Seven more recordings were acquired by measurements performed at the Institute of Medical Technology and Equipment, Zabrze, Poland. In total, we used 60 min of data (i.e., around 88,000 R waves) for our experiments. This dataset covers different gestational ages, fetal positions, maternal body mass indices (BMI), etc. Such a unique heterogeneous dataset of sufficient length, combining continuous Fetal Scalp Electrode (FSE) acquired and abdominal ECG recordings, allows for robust testing of the applied ICA and PCA methods. The performance of these signal separation methods was then comprehensively evaluated by comparing the fetal Heart Rate (fHR) values determined from the extracted fECGs with those calculated from the fECG signals recorded directly by means of a reference FSE. Additionally, we tested the possibility of non-invasive ST analysis (NI-STAN) by determining the T/QRS ratio. Our results demonstrated that even though these advanced signal processing methods are suitable for the non-invasive estimation and monitoring of the fHR information from maternal aECG signals, their utility for further morphological analysis of the extracted fECG signals remains questionable and warrants further work.

Project description: A variety of preprocessing techniques are available to correct subject-dependent artifacts in fMRI, caused by head motion and physiological noise. Although it has been established that the chosen preprocessing steps (or "pipeline") may significantly affect fMRI results, it is not well understood how preprocessing choices interact with other parts of the fMRI experimental design. In this study, we examine how two experimental factors interact with preprocessing: between-subject heterogeneity, and strength of task contrast. Two levels of cognitive contrast were examined in an fMRI adaptation of the Trail-Making Test, with data from young, healthy adults. The importance of standard preprocessing with motion correction, physiological noise correction, motion parameter regression and temporal detrending was examined for the two task contrasts. We also tested subspace estimation using Principal Component Analysis (PCA) and Independent Component Analysis (ICA). Results were obtained for Penalized Discriminant Analysis, and model performance was quantified with reproducibility (R) and prediction (P) metrics. Simulation methods were also used to test for potential biases from individual-subject optimization. Our results demonstrate that (1) individual pipeline optimization is not significantly more biased than fixed preprocessing. In addition, (2) when applying a fixed pipeline across all subjects, the task contrast significantly affects pipeline performance; in particular, the effects of PCA and ICA models vary with contrast, and are not by themselves optimal preprocessing steps. Also, (3) selecting the optimal pipeline for each subject improves within-subject (P,R) and between-subject overlap, with the weaker cognitive contrast being more sensitive to pipeline optimization. These results demonstrate that sensitivity of fMRI results is influenced not only by preprocessing choices, but also by interactions with other experimental design factors. This paper outlines a quantitative procedure to denoise data that would otherwise be discarded due to artifact; this is particularly relevant for weak signal contrasts in single-subject, small-sample and clinical datasets.
