Dataset Information

VCGDB: a dynamic genome database of the Chinese population.

ABSTRACT: BACKGROUND: The data released by the 1000 Genomes Project contain an increasing number of genome sequences from different nations and populations with a large number of genetic variations. As a result, the focus of human genome studies is changing from single and static to complex and dynamic. The currently available human reference genome (GRCh37) is based on sequencing data from 13 anonymous Caucasian volunteers, which might limit the scope of genomics, transcriptomics, epigenetics, and genome wide association studies. DESCRIPTION: We used the massive amount of sequencing data published by the 1000 Genomes Project Consortium to construct the Virtual Chinese Genome Database (VCGDB), a dynamic genome database of the Chinese population based on the whole genome sequencing data of 194 individuals. VCGDB provides dynamic genomic information, which contains 35 million single nucleotide variations (SNVs), 0.5 million insertions/deletions (indels), and 29 million rare variations, together with genomic annotation information. VCGDB also provides a highly interactive user-friendly virtual Chinese genome browser (VCGBrowser) with functions like seamless zooming and real-time searching. In addition, we have established three population-specific consensus Chinese reference genomes that are compatible with mainstream alignment software. CONCLUSIONS: VCGDB offers a feasible strategy for processing big data to keep pace with the biological data explosion by providing a robust resource for genomics studies; in particular, studies aimed at finding regions of the genome associated with diseases.

SUBMITTER: Ling Y

PROVIDER: S-EPMC4028056 | biostudies-literature | 2014

REPOSITORIES: biostudies-literature

Publications

VCGDB: a dynamic genome database of the Chinese population.

Ling Yunchao, Jin Zhong, Su Mingming, Zhong Jun, Zhao Yongbing, Yu Jun, Wu Jiayan, Xiao Jingfa

BMC Genomics, 2014-04-05

PMID: 24708222

Similar Datasets

Project description: Smoking is a risk factor for many human diseases. DNA methylation has been related to smoking, but genome-wide methylation data for smoking in Chinese populations is limited. We aimed to investigate epigenome-wide methylation in relation to smoking in a Chinese population. We measured the methylation levels at > 485,000 CpG sites (CpGs) in DNA from leukocytes using a methylation array and conducted a genome-wide meta-analysis of DNA methylation and smoking in a total of 596 Chinese participants. We further evaluated the associations of smoking-related CpGs with internal polycyclic aromatic hydrocarbon (PAH) biomarkers and their correlations with the expression of corresponding genes. We identified 318 CpGs whose methylation levels were associated with smoking at a genome-wide significance level (false discovery rate < 0.05), among which 161 CpGs annotated to 123 genes were not associated with smoking in recent studies of Europeans and African Americans. Of these smoking-related CpGs, methylation levels at 80 CpGs showed significant correlations with the expression of corresponding genes (including RUNX3, IL6R, PTAFR, ANKRD11, CEP135 and CDH23), and methylation at 15 CpGs was significantly associated with urinary 2-hydroxynaphthalene, the most representative internal monohydroxy-PAH biomarker for smoking. We identified DNA methylation markers associated with smoking in a Chinese population, including some markers that were also correlated with gene expression. Exposure to naphthalene, a byproduct of tobacco smoke, may contribute to smoking-related methylation.

Citation: Zhu X, Li J, Deng S, Yu K, Liu X, Deng Q, Sun H, Zhang X, He M, Guo H, Chen W, Yuan J, Zhang B, Kuang D, He X, Bai Y, Han X, Liu B, Li X, Yang L, Jiang H, Zhang Y, Hu J, Cheng L, Luo X, Mei W, Zhou Z, Sun S, Zhang L, Liu C, Guo Y, Zhang Z, Hu FB, Liang L, Wu T. 2016. Genome-wide analysis of DNA methylation and cigarette smoking in Chinese. Environ Health Perspect 124:966-973; http://dx.doi.org/10.1289/ehp.1509834.
