Project description: The ideal genome sequence for medical interpretation is complete and diploid, capturing the full spectrum of genetic variation. Toward this end, there has been progress in the discovery of single nucleotide polymorphisms (SNPs) and small (<10 bp) insertions/deletions (indels), but annotation of larger structural variation (SV), including copy number variation (CNV), has been less comprehensive, even with available diploid sequence assemblies. We applied a multi-step sequence- and microarray-based analysis to identify numerous previously unknown SVs within the first genome sequence reported from an individual. An Affymetrix SNP array experiment was performed according to the manufacturer's directions on DNA extracted from a lymphoblastoid cell line (HuRef). Copy number analysis of the Affymetrix 6.0 SNP array data was performed for the HuRef sample. The HuRef sample was run in a batch of 50 control samples, which were used as the baseline for calling CNVs.
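
The description does not say which software or thresholds were used for CNV calling, so the following is only a minimal sketch of the general idea of using control samples as a baseline. It assumes per-probe intensities are already normalized, takes the per-probe median across controls as the reference, and flags runs of consecutive probes whose log2 ratio passes a cutoff. The function names, the 0.3 cutoffs, and the minimum run length are all hypothetical choices for illustration, not the method applied to HuRef.

```python
import math
from statistics import median


def log2_ratios(sample, controls):
    """Per-probe log2(sample / control median).

    sample   -- list of probe intensities for the test sample
    controls -- list of lists, one intensity list per control sample
    """
    ratios = []
    for i, value in enumerate(sample):
        baseline = median(c[i] for c in controls)  # control baseline for probe i
        ratios.append(math.log2(value / baseline))
    return ratios


def call_segments(ratios, gain=0.3, loss=-0.3, min_probes=3):
    """Return (start, end, state) for runs of >= min_probes consecutive probes.

    The gain/loss cutoffs and min_probes are illustrative assumptions only.
    """
    calls = []
    start, state = None, None
    # A trailing neutral sentinel closes any run that reaches the last probe.
    for i, r in enumerate(ratios + [0.0]):
        current = "gain" if r >= gain else "loss" if r <= loss else None
        if current != state:
            if state is not None and i - start >= min_probes:
                calls.append((start, i - 1, state))
            start, state = i, current
    return calls


if __name__ == "__main__":
    # Toy data: three identical controls and one sample with a 3-probe gain.
    controls = [[100.0] * 8 for _ in range(3)]
    sample = [100.0, 100.0, 150.0, 150.0, 150.0, 100.0, 50.0, 50.0]
    print(call_segments(log2_ratios(sample, controls)))
```

Real array pipelines typically add normalization, segmentation across noisy probes, and per-chromosome handling; this sketch omits all of that to keep the baseline comparison visible.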