
Dataset Information


A pipeline for high throughput detection and mapping of SNPs from EST databases.


ABSTRACT: Single nucleotide polymorphisms (SNPs) represent the most abundant type of genetic variation that can be used as molecular markers. The SNPs that are hidden in sequence databases can be unlocked using bioinformatic tools. For efficient application of these SNPs, the sequence set should be as error-free as possible, target single loci, and be suitable for the SNP scoring platform of choice. We have developed a pipeline to effectively mine SNPs from public EST databases, with or without quality information, using QualitySNP software, select reliable SNPs, and prepare the loci for analysis on the Illumina GoldenGate genotyping platform. The applicability of the pipeline was demonstrated using publicly available potato EST data, genotyping individuals from two diploid mapping populations and subsequently mapping the SNP markers (putative genes) in both populations. Over 7000 reliable SNPs were identified that met the criteria for genotyping on the GoldenGate platform. Of the 384 SNPs on the SNP array, approximately 12% dropped out. For the two potato mapping populations, 165 and 185 segregating SNP loci could be mapped on the respective genetic maps, illustrating the effectiveness of our pipeline for SNP selection and validation.

ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1007/s11032-009-9377-5) contains supplementary material, which is available to authorized users.

SUBMITTER: Anithakumari AM 

PROVIDER: S-EPMC2869401 | biostudies-literature | 2010 Jun

REPOSITORIES: biostudies-literature


Publications

A pipeline for high throughput detection and mapping of SNPs from EST databases.

Anithakumari AM, Tang J, van Eck HJ, Visser RGF, Leunissen JAM, Vosman B, van der Linden CG

Molecular breeding: new strategies in plant improvement, 2010-01-20, 1


Similar Datasets

| S-EPMC3079667 | biostudies-literature
| S-EPMC7532165 | biostudies-literature
| S-EPMC8561967 | biostudies-literature
| S-EPMC6384052 | biostudies-literature
| S-EPMC10082157 | biostudies-literature
| S-EPMC1626092 | biostudies-literature
2020-06-12 | PXD017450 | Pride
2009-12-09 | GSE16651 | GEO
| S-EPMC2268656 | biostudies-literature
2009-12-09 | E-GEOD-16651 | biostudies-arrayexpress