Hierarchical clustering of human papilloma virus genotype patterns in the ASCUS-LSIL triage study.
ABSTRACT: Anogenital cancers are associated with ~13 carcinogenic human papilloma virus (HPV) types in a broader group that cause cervical intraepithelial neoplasia (CIN). Multiple concurrent cervical HPV infections are common, which complicates the attribution of HPV types to different grades of CIN. Here we report the analysis of HPV genotype patterns in the atypical squamous cells of undetermined significance-low-grade squamous intraepithelial lesion triage study with the use of unsupervised hierarchical clustering. Women who underwent colposcopy at baseline (n = 2,780) were grouped into 20 disease categories based on histology and cytology. Disease groups and HPV genotypes were clustered with the use of complete linkage. Risk of 2-year cumulative CIN3+, viral load, colposcopic impression, and age were compared between disease groups and major clusters. Hierarchical clustering yielded four major disease clusters: cluster 1 included all CIN3 histology with abnormal cytology; cluster 2 included CIN3 histology with normal cytology and combinations with either CIN2 or high-grade squamous intraepithelial lesion cytology; cluster 3 included older women with normal or low-grade histology/cytology and low viral load; and cluster 4 included younger women with low-grade histology/cytology, multiple infections, and the highest viral load. Three major groups of HPV genotypes were identified: group 1 included only HPV16; group 2 included nine carcinogenic types, plus noncarcinogenic HPV53 and HPV66; and group 3 included noncarcinogenic types, plus carcinogenic HPV33 and HPV45. Clustering results suggested that colposcopy missed a prevalent precancer in many women with no biopsy/normal histology and high-grade squamous intraepithelial lesion. This result was confirmed by an elevated 2-year risk of CIN3+ in these groups. Our novel approach to study multiple genotype infections in cervical disease with the use of unsupervised hierarchical clustering can address complex genotype distributions on a population level.
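As an illustration of the two-way clustering the abstract describes, the following is a minimal sketch of complete-linkage hierarchical clustering applied to both the rows (disease groups) and the columns (HPV genotypes) of a prevalence matrix. It is not the authors' code. The matrix is random toy data, not study data; the number of genotype columns, the Euclidean distance metric, and the helper name `cluster_rows` are assumptions made for this example. Only the linkage method (complete), the 20 disease categories, and the cluster counts (four and three) come from the abstract.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def cluster_rows(matrix, n_clusters):
    """Complete-linkage clustering of the rows of `matrix`.

    Returns one 1-based cluster label per row. The Euclidean metric
    is an assumption; the abstract does not state the distance used.
    """
    tree = linkage(matrix, method="complete", metric="euclidean")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Toy stand-in for a genotype prevalence table (NOT study data):
# rows = 20 disease categories, columns = a hypothetical 8 HPV genotypes.
rng = np.random.default_rng(0)
n_groups, n_types = 20, 8
prevalence = rng.random((n_groups, n_types))

# Cluster disease groups by their genotype profiles (rows),
# then cluster genotypes by their distribution across groups (columns).
disease_labels = cluster_rows(prevalence, 4)
genotype_labels = cluster_rows(prevalence.T, 3)

print("disease cluster per group:", disease_labels)
print("genotype cluster per type:", genotype_labels)
```

Cutting each tree with `criterion="maxclust"` caps the number of flat clusters at the requested value, which mirrors reading off a fixed number of major clusters from a dendrogram.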
SUBMITTER: Wentzensen N
PROVIDER: S-EPMC2970748 | biostudies-literature | 2010 Nov
REPOSITORIES: biostudies-literature