Dataset Information

SynTReN: a generator of synthetic gene expression data for design and analysis of structure learning algorithms.

ABSTRACT:

Background

The development of algorithms to infer the structure of gene regulatory networks based on expression data is an important subject in bioinformatics research. Validation of these algorithms requires benchmark data sets for which the underlying network is known. Since experimental data sets of the appropriate size and design are usually not available, there is a clear need to generate well-characterized synthetic data sets that allow thorough testing of learning algorithms in a fast and reproducible manner.

Results

In this paper we describe a network generator that creates synthetic transcriptional regulatory networks and produces simulated gene expression data that approximates experimental data. Network topologies are generated by selecting subnetworks from previously described regulatory networks. Interaction kinetics are modeled by equations based on Michaelis-Menten and Hill kinetics. Our results show that the statistical properties of these topologies more closely approximate those of genuine biological networks than do those of different types of random graph models. Several user-definable parameters adjust the complexity of the resulting data set with respect to the structure learning algorithms.
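The abstract names Michaelis-Menten and Hill kinetics but does not give the equations the generator uses. As a rough illustration only, the standard textbook forms can be sketched as below. The function names and the parameters `k` (half-saturation constant) and `n` (Hill coefficient) are assumptions for this sketch, not taken from SynTReN itself.

```python
def hill_activation(x, k, n):
    """Textbook Hill function: fraction of maximal transcription rate
    driven by an activator at concentration x (assumed form, not SynTReN's)."""
    return x**n / (k**n + x**n)


def hill_repression(x, k, n):
    """Textbook Hill function for a repressor: the rate falls as x rises."""
    return k**n / (k**n + x**n)


def michaelis_menten(x, k):
    """Michaelis-Menten saturation is the Hill activation form with n = 1."""
    return hill_activation(x, k, 1)


if __name__ == "__main__":
    # At x == k, activation and repression are both half-maximal.
    print(hill_activation(1.0, 1.0, 2))
    print(hill_repression(1.0, 1.0, 2))
    print(michaelis_menten(3.0, 1.0))
```

A larger `n` gives a steeper, more switch-like response around `x == k`, which is one way that interaction parameters can make a simulated data set harder or easier for a structure learning algorithm.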

Conclusion

This network generation technique offers a valid alternative to existing methods. The topological characteristics of the generated networks more closely resemble the characteristics of real transcriptional networks. Simulation of the network scales well to large networks. The generator models different types of biological interactions and produces biologically plausible synthetic gene expression data.

SUBMITTER: Van den Bulcke T

PROVIDER: S-EPMC1373604 | biostudies-literature | 2006 Jan

REPOSITORIES: biostudies-literature

Publications

SynTReN: a generator of synthetic gene expression data for design and analysis of structure learning algorithms.

Van den Bulcke Tim, Van Leemput Koenraad, Naudts Bart, van Remortel Piet, Ma Hongwu, Verschoren Alain, De Moor Bart, Marchal Kathleen

BMC Bioinformatics, 2006-01-26

PMID: 16438721

Similar Datasets

A Synthetic Kinome Microarray Data Generator.
| S-EPMC4996406 | biostudies-literature

Adapting machine-learning algorithms to design gene circuits.
| S-EPMC6487017 | biostudies-literature

A comparative analysis of biclustering algorithms for gene expression data.
| S-EPMC3659300 | biostudies-literature

SEED-G: Simulated EEG Data Generator for Testing Connectivity Algorithms.
| S-EPMC8197139 | biostudies-literature

Combining Multiple RNA-Seq Data Analysis Algorithms Using Machine Learning Improves Differential Isoform Expression Analysis.
| S-EPMC8544431 | biostudies-literature

Design and deep learning of synthetic B-cell-specific promoters
2023-05-16 | GSE232161 | GEO

A comparative study of machine learning and deep learning algorithms to classify cancer types based on microarray gene expression data.
| S-EPMC7924492 | biostudies-literature

Clustering Algorithms: Their Application to Gene Expression Data.
| S-EPMC5135122 | biostudies-other

Impact of RNA-seq data analysis algorithms on gene expression estimation and downstream prediction.
| S-EPMC7578822 | biostudies-literature

Training data composition affects performance of protein structure analysis algorithms.
| S-EPMC8669736 | biostudies-literature