Dataset Information

DbVOR: a database system for importing pedigree, phenotype and genotype data and exporting selected subsets.

ABSTRACT: When studying the genetics of a human trait, we typically have to manage both genome-wide and targeted genotype data. There can be overlap of both people and markers from different genotyping experiments; the overlap can introduce several kinds of problems. Most times the overlapping genotypes are the same, but sometimes they are different. Occasionally, the lab will return genotypes using a different allele labeling scheme (for example 1/2 vs A/C). Sometimes, the genotype for a person/marker index is unreliable or missing. Further, over time some markers are merged and bad samples are re-run under a different sample name. We need a consistent picture of the subset of data we have chosen to work with even though there might possibly be conflicting measurements from multiple data sources.We have developed the dbVOR database, which is designed to hold data efficiently for both genome-wide and targeted experiments. The data are indexed for fast retrieval by person and marker. In addition, we store pedigree and phenotype data for our subjects. The dbVOR database allows us to select subsets of the data by several different criteria and to merge their results into a coherent and consistent whole. Data may be filtered by: family, person, trait value, markers, chromosomes, and chromosome ranges. The results can be presented in columnar, Mega2, or PLINK format.dbVOR serves our needs well. It is freely available from https://watson.hgen.pitt.edu/register . Documentation for dbVOR can be found at https://watson.hgen.pitt.edu/register/docs/dbvor.html .

SUBMITTER: Baron RV

PROVIDER: S-EPMC4407391 | biostudies-literature | 2015 Mar

REPOSITORIES: biostudies-literature

ACCESS DATA

Publications

dbVOR: a database system for importing pedigree, phenotype and genotype data and exporting selected subsets.

Baron Robert V RV Conley Yvette P YP Gorin Michael B MB Weeks Daniel E DE

BMC bioinformatics 20150318

<h4>Background</h4>When studying the genetics of a human trait, we typically have to manage both genome-wide and targeted genotype data. There can be overlap of both people and markers from different genotyping experiments; the overlap can introduce several kinds of problems. Most times the overlapping genotypes are the same, but sometimes they are different. Occasionally, the lab will return genotypes using a different allele labeling scheme (for example 1/2 vs A/C). Sometimes, the genotype for ...[more]

PMID: 25887129

Similar Datasets

Project description:BackgroundThe ability to discover genetic variants in a patient runs far ahead of the ability to interpret them. Databases with accurate descriptions of the causal relationship between the variants and the phenotype are valuable since these are critical tools in clinical genetic diagnostics. Here, we introduce a comprehensive and global genotype-phenotype database focusing on rare diseases.MethodsThis database (CentoMD ®) is a browser-based tool that enables access to a comprehensive, independently curated system utilizing stringent high-quality criteria and a quickly growing repository of genetic and human phenotype ontology (HPO)-based clinical information. Its main goals are to aid the evaluation of genetic variants, to enhance the validity of the genetic analytical workflow, to increase the quality of genetic diagnoses, and to improve evaluation of treatment options for patients with hereditary diseases. The database software correlates clinical information from consented patients and probands of different geographical backgrounds with a large dataset of genetic variants and, when available, biomarker information. An automated follow-up tool is incorporated that informs all users whenever a variant classification has changed. These unique features fully embedded in a CLIA/CAP-accredited quality management system allow appropriate data quality and enhanced patient safety.ResultsMore than 100,000 genetically screened individuals are documented in the database, resulting in more than 470 million variant detections. Approximately, 57% of the clinically relevant and uncertain variants in the database are novel. Notably, 3% of the genetic variants identified and previously reported in the literature as being associated with a particular rare disease were reclassified, based on internal evidence, as clinically irrelevant.ConclusionsThe database offers a comprehensive summary of the clinical validity and causality of detected gene variants with their associated phenotypes, and is a valuable tool for identifying new disease genes through the correlation of novel genetic variants with specific, well-defined phenotypes.

Dataset Information

DbVOR: a database system for importing pedigree, phenotype and genotype data and exporting selected subsets.

Publications

dbVOR: a database system for importing pedigree, phenotype and genotype data and exporting selected subsets.

Similar Datasets

OmicsDI is part of the ELIXIR infrastructure

Tweets