Dataset Information

Branch-recombinant Gaussian processes for analysis of perturbations in biological time series.


ABSTRACT: Motivation: A common class of behaviour encountered in the biological sciences involves branching and recombination. During branching, a statistical process bifurcates, resulting in two or more potentially correlated processes that may undergo further branching; the contrary is true during recombination, where two or more statistical processes converge. A key objective is to identify the time of this bifurcation (branch or recombination time) from time series measurements, e.g. by comparing a control time series with perturbed time series. Gaussian processes (GPs) represent an ideal framework for such analysis, allowing for nonlinear regression that includes a rigorous treatment of uncertainty. Currently, however, GP models only exist for two-branch systems. Here, we highlight how arbitrarily complex branching processes can be built using the correct composition of covariance functions within a GP framework, thus outlining a general framework for the treatment of branching and recombination in the form of branch-recombinant Gaussian processes (B-RGPs). Results: We first benchmark the performance of B-RGPs compared to a variety of existing regression approaches, and demonstrate robustness to model misspecification. B-RGPs are then used to investigate the branching patterns of Arabidopsis thaliana gene expression following inoculation with the hemibiotrophic bacterium Pseudomonas syringae DC3000 and a disarmed mutant strain, hrpA. By grouping genes according to the number of branches, we could naturally separate out genes involved in basal immune response from those subverted by the virulent strain, and show enrichment for targets of pathogen protein effectors. Finally, we identify two early-branching genes, WRKY11 and WRKY17, and show that genes that branched at similar times to WRKY11/17 were enriched for W-box binding motifs, and overrepresented for genes differentially expressed in WRKY11/17 knockouts, suggesting that branch time could be used for identifying direct and indirect binding targets of key transcription factors. Availability and implementation: https://github.com/cap76/BranchingGPs. Supplementary information: Supplementary data are available at Bioinformatics online.
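The abstract's idea of building a branching process by composing covariance functions can be illustrated with a minimal two-branch sketch. This is not the authors' implementation (that is in the repository linked above); it is a simplified construction under stated assumptions: a control series follows a shared squared-exponential GP, and the perturbed series adds a deviation GP that is switched on by a sigmoid gate around a branch time. The function names (`rbf`, `gate`, `two_branch_cov`), the kernel choice, and all hyperparameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def rbf(t1, t2, variance=1.0, lengthscale=1.0):
    """Squared-exponential covariance between two vectors of time points."""
    d = t1[:, None] - t2[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def gate(t, branch_time, steepness=10.0):
    """Sigmoid gate: close to 0 before the branch time, close to 1 after it."""
    return 1.0 / (1.0 + np.exp(-steepness * (t - branch_time)))

def two_branch_cov(t1, b1, t2, b2, branch_time):
    """Joint covariance over (time, branch label) pairs.

    Label 0 is the control series, label 1 the perturbed series.
    Both series share a base GP; the perturbed series additionally
    carries a gated deviation GP (assumed variance 0.5).
    """
    k_shared = rbf(t1, t2)
    k_dev = (gate(t1, branch_time)[:, None]
             * rbf(t1, t2, variance=0.5)
             * gate(t2, branch_time)[None, :])
    # The deviation only contributes when both points lie on the perturbed branch.
    both_perturbed = (b1[:, None] == 1) & (b2[None, :] == 1)
    return k_shared + both_perturbed * k_dev

# Example: 21 time points observed on each branch, branching at t = 5.
t = np.linspace(0.0, 10.0, 21)
times = np.concatenate([t, t])
labels = np.concatenate([np.zeros(t.size, dtype=int), np.ones(t.size, dtype=int)])
K = two_branch_cov(times, labels, times, labels, branch_time=5.0)

# Draw one joint sample of (control, perturbed); a small jitter keeps the
# Cholesky factorisation numerically stable.
rng = np.random.default_rng(0)
L = np.linalg.cholesky(K + 1e-6 * np.eye(K.shape[0]))
sample = L @ rng.standard_normal(K.shape[0])
control, perturbed = sample[:t.size], sample[t.size:]
```

Under this construction the prior variance of the difference between the two series at time t is the gated deviation variance alone, so the series coincide before the branch time and separate after it. In a setup like this, the branch time would be treated as a hyperparameter to infer from data; further branches could in principle be added by introducing more gated deviation terms.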

SUBMITTER: Penfold CA 

PROVIDER: S-EPMC6129282 | biostudies-literature | 2018 Sep

REPOSITORIES: biostudies-literature

Publications

Branch-recombinant Gaussian processes for analysis of perturbations in biological time series.

Penfold CA, Sybirna A, Reid JE, Huang Y, Wernisch L, Ghahramani Z, Grant M, Surani MA

Bioinformatics (Oxford, England), 2018-09-01, issue 17

Similar Datasets

| S-EPMC7341595 | biostudies-literature
| S-EPMC5444866 | biostudies-literature
| S-EPMC4480927 | biostudies-literature
| S-EPMC5159892 | biostudies-literature
| S-EPMC7394483 | biostudies-literature
2017-10-08 | GSE104714 | GEO
| S-EPMC5390911 | biostudies-literature
| S-EPMC3748037 | biostudies-literature
2019-05-11 | GSE131032 | GEO
| S-EPMC5786324 | biostudies-literature