Dataset Information

Robust and Fast Markov Chain Monte Carlo Sampling of Diffusion MRI Microstructure Models.

ABSTRACT: In diffusion MRI analysis, advances in biophysical multi-compartment modeling have gained popularity over the conventional Diffusion Tensor Imaging (DTI), because they can obtain a greater specificity in relating the dMRI signal to underlying cellular microstructure. Biophysical multi-compartment models require a parameter estimation, typically performed using either the Maximum Likelihood Estimation (MLE) or the Markov Chain Monte Carlo (MCMC) sampling. Whereas, the MLE provides only a point estimate of the fitted model parameters, the MCMC recovers the entire posterior distribution of the model parameters given in the data, providing additional information such as parameter uncertainty and correlations. MCMC sampling is currently not routinely applied in dMRI microstructure modeling, as it requires adjustment and tuning, specific to each model, particularly in the choice of proposal distributions, burn-in length, thinning, and the number of samples to store. In addition, sampling often takes at least an order of magnitude, more time than non-linear optimization. Here we investigate the performance of the MCMC algorithm variations over multiple popular diffusion microstructure models, to examine whether a single, well performing variation could be applied efficiently and robustly to many models. Using an efficient GPU-based implementation, we showed that run times can be removed as a prohibitive constraint for the sampling of diffusion multi-compartment models. Using this implementation, we investigated the effectiveness of different adaptive MCMC algorithms, burn-in, initialization, and thinning. Finally we applied the theory of the Effective Sample Size, to the diffusion multi-compartment models, as a way of determining a relatively general target for the number of samples needed to characterize parameter distributions for different models and data sets. We conclude that adaptive Metropolis methods increase MCMC performance and select the Adaptive Metropolis-Within-Gibbs (AMWG) algorithm as the primary method. We furthermore advise to initialize the sampling with an MLE point estimate, in which case 100 to 200 samples are sufficient as a burn-in. Finally, we advise against thinning in most use-cases and as a relatively general target for the number of samples, we recommend a multivariate Effective Sample Size of 2,200.

SUBMITTER: Harms RL

PROVIDER: S-EPMC6305549 | biostudies-literature | 2018

REPOSITORIES: biostudies-literature

ACCESS DATA

Publications

Robust and Fast Markov Chain Monte Carlo Sampling of Diffusion MRI Microstructure Models.

Harms Robbert L RL Roebroeck Alard A

Frontiers in neuroinformatics 20181218

In diffusion MRI analysis, advances in biophysical multi-compartment modeling have gained popularity over the conventional Diffusion Tensor Imaging (DTI), because they can obtain a greater specificity in relating the dMRI signal to underlying cellular microstructure. Biophysical multi-compartment models require a parameter estimation, typically performed using either the Maximum Likelihood Estimation (MLE) or the Markov Chain Monte Carlo (MCMC) sampling. Whereas, the MLE provides only a point es ...[more]

PMID: 30618702

Similar Datasets

Project description:Advances in biophysical multi-compartment modeling for diffusion MRI (dMRI) have gained popularity because of greater specificity than DTI in relating the dMRI signal to underlying cellular microstructure. A large range of these diffusion microstructure models have been developed and each of the popular models comes with its own, often different, optimization algorithm, noise model and initialization strategy to estimate its parameter maps. Since data fit, accuracy and precision is hard to verify, this creates additional challenges to comparability and generalization of results from diffusion microstructure models. In addition, non-linear optimization is computationally expensive leading to very long run times, which can be prohibitive in large group or population studies. In this technical note we investigate the performance of several optimization algorithms and initialization strategies over a few of the most popular diffusion microstructure models, including NODDI and CHARMED. We evaluate whether a single well performing optimization approach exists that could be applied to many models and would equate both run time and fit aspects. All models, algorithms and strategies were implemented on the Graphics Processing Unit (GPU) to remove run time constraints, with which we achieve whole brain dataset fits in seconds to minutes. We then evaluated fit, accuracy, precision and run time for different models of differing complexity against three common optimization algorithms and three parameter initialization strategies. Variability of the achieved quality of fit in actual data was evaluated on ten subjects of each of two population studies with a different acquisition protocol. We find that optimization algorithms and multi-step optimization approaches have a considerable influence on performance and stability over subjects and over acquisition protocols. The gradient-free Powell conjugate-direction algorithm was found to outperform other common algorithms in terms of run time, fit, accuracy and precision. Parameter initialization approaches were found to be relevant especially for more complex models, such as those involving several fiber orientations per voxel. For these, a fitting cascade initializing or fixing parameter values in a later optimization step from simpler models in an earlier optimization step further improved run time, fit, accuracy and precision compared to a single step fit. This establishes and makes available standards by which robust fit and accuracy can be achieved in shorter run times. This is especially relevant for the use of diffusion microstructure modeling in large group or population studies and in combining microstructure parameter maps with tractography results.

Project description:BackgroundRunning multiple-chain Markov Chain Monte Carlo (MCMC) provides an efficient parallel computing method for complex Bayesian models, although the efficiency of the approach critically depends on the length of the non-parallelizable burn-in period, for which all simulated data are discarded. In practice, this burn-in period is set arbitrarily and often leads to the performance of far more iterations than required. In addition, the accuracy of genomic predictions does not improve after the MCMC reaches equilibrium.ResultsAutomatic tuning of the burn-in length for running multiple-chain MCMC was proposed in the context of genomic predictions using BayesA and BayesCπ models. The performance of parallel computing versus sequential computing and tunable burn-in MCMC versus fixed burn-in MCMC was assessed using simulation data sets as well by applying these methods to genomic predictions of a Chinese Simmental beef cattle population. The results showed that tunable burn-in parallel MCMC had greater speedups than fixed burn-in parallel MCMC, and both had greater speedups relative to sequential (single-chain) MCMC. Nevertheless, genomic estimated breeding values (GEBVs) and genomic prediction accuracies were highly comparable between the various computing approaches. When applied to the genomic predictions of four quantitative traits in a Chinese Simmental population of 1217 beef cattle genotyped by an Illumina Bovine 770 K SNP BeadChip, tunable burn-in multiple-chain BayesCπ (TBM-BayesCπ) outperformed tunable burn-in multiple-chain BayesCπ (TBM-BayesA) and Genomic Best Linear Unbiased Prediction (GBLUP) in terms of the prediction accuracy, although the differences were not necessarily caused by computational factors and could have been intrinsic to the statistical models per se.ConclusionsAutomatically tunable burn-in multiple-chain MCMC provides an accurate and cost-effective tool for high-performance computing of Bayesian genomic prediction models, and this algorithm is generally applicable to high-performance computing of any complex Bayesian statistical model.

Project description:BackgroundLong-range interactions between regulatory DNA elements such as enhancers, insulators and promoters play an important role in regulating transcription. As chromatin contacts have been found throughout the human genome and in different cell types, spatial transcriptional control is now viewed as a general mechanism of gene expression regulation. Chromosome Conformation Capture Carbon Copy (5C) and its variant Hi-C are techniques used to measure the interaction frequency (IF) between specific regions of the genome. Our goal is to use the IF data generated by these experiments to computationally model and analyze three-dimensional chromatin organization.ResultsWe formulate a probabilistic model linking 5C/Hi-C data to physical distances and describe a Markov chain Monte Carlo (MCMC) approach called MCMC5C to generate a representative sample from the posterior distribution over structures from IF data. Structures produced from parallel MCMC runs on the same dataset demonstrate that our MCMC method mixes quickly and is able to sample from the posterior distribution of structures and find subclasses of structures. Structural properties (base looping, condensation, and local density) were defined and their distribution measured across the ensembles of structures generated. We applied these methods to a biological model of human myelomonocyte cellular differentiation and identified distinct chromatin conformation signatures (CCSs) corresponding to each of the cellular states. We also demonstrate the ability of our method to run on Hi-C data and produce a model of human chromosome 14 at 1Mb resolution that is consistent with previously observed structural properties as measured by 3D-FISH.ConclusionsWe believe that tools like MCMC5C are essential for the reliable analysis of data from the 3C-derived techniques such as 5C and Hi-C. By integrating complex, high-dimensional and noisy datasets into an easy to interpret ensemble of three-dimensional conformations, MCMC5C allows researchers to reliably interpret the result of their assay and contrast conformations under different conditions.Availabilityhttp://Dostielab.biochem.mcgill.ca.

Dataset Information

Robust and Fast Markov Chain Monte Carlo Sampling of Diffusion MRI Microstructure Models.

Publications

Robust and Fast Markov Chain Monte Carlo Sampling of Diffusion MRI Microstructure Models.

Similar Datasets

OmicsDI is part of the ELIXIR infrastructure

Tweets