
Dataset Information


Quantification of private information leakage from phenotype-genotype data: linking attacks.


ABSTRACT: Studies on genomic privacy have traditionally focused on identifying individuals using DNA variants. In contrast, molecular phenotype data, such as gene expression levels, are generally assumed to be free of such identifying information. Although there is no explicit genotypic information in phenotype data, adversaries can statistically link phenotypes to genotypes using publicly available genotype-phenotype correlations such as expression quantitative trait loci (eQTLs). This linking can be accurate when high-dimensional data (i.e., many expression levels) are used, and the resulting links can then reveal sensitive information (for example, the fact that an individual has cancer). Here we develop frameworks for quantifying the leakage of characterizing information from phenotype data sets. These frameworks can be used to estimate the leakage from large data sets before release. We also present a general three-step procedure for practically instantiating linking attacks and a specific attack using outlier gene expression levels that is simple yet accurate. Finally, we describe the effectiveness of this outlier attack under different scenarios.
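The linking idea in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the abstract does not spell out the three steps, so the sketch assumes they are (1) choosing a set of eQTLs, (2) predicting genotypes from unusually high or low expression levels, and (3) matching each predicted genotype profile to the closest entry in a genotype data set. The function names, the `threshold` parameter, the 0/2 genotype coding, and all data values below are hypothetical.

```python
from statistics import median

def predict_genotypes(expression, eqtls, threshold=0.4):
    """Assumed step 2: guess genotypes from outlier expression levels.

    expression: {anonymous_id: {gene: level}}
    eqtls: list of (gene, snp, sign), where sign is +1 or -1 for the
           direction of the genotype-expression correlation (assumed input).
    """
    predictions = {}
    for gene, snp, sign in eqtls:
        levels = [profile[gene] for profile in expression.values()]
        mid = median(levels)
        spread = (max(levels) - min(levels)) or 1.0  # avoid division by zero
        for person, profile in expression.items():
            extremity = (profile[gene] - mid) / spread
            if abs(extremity) < threshold:
                continue  # not an outlier for this gene, so make no guess
            high = extremity > 0
            # High expression with a positive eQTL (or low with a negative one)
            # is taken to suggest two alternate alleles, otherwise none.
            guess = 2 if high == (sign > 0) else 0
            predictions.setdefault(person, {})[snp] = guess
    return predictions

def link(predictions, genotypes):
    """Assumed step 3: match each prediction to the closest genotype record."""
    links = {}
    for person, predicted in predictions.items():
        def mismatch(candidate):
            return sum(abs(genotypes[candidate][snp] - g)
                       for snp, g in predicted.items())
        links[person] = min(genotypes, key=mismatch)
    return links

# Toy data, invented for illustration only.
eqtls = [("G1", "rs1", +1), ("G2", "rs2", -1), ("G3", "rs3", +1)]
expression = {
    "e1": {"G1": 9.0, "G2": 1.0, "G3": 5.0},
    "e2": {"G1": 1.0, "G2": 5.0, "G3": 9.0},
    "e3": {"G1": 5.0, "G2": 9.0, "G3": 1.0},
}
genotypes = {
    "alice": {"rs1": 2, "rs2": 2, "rs3": 1},
    "bob":   {"rs1": 0, "rs2": 1, "rs3": 2},
    "carol": {"rs1": 1, "rs2": 0, "rs3": 0},
}

predicted = predict_genotypes(expression, eqtls)
links = link(predicted, genotypes)
```

In this toy case each anonymous expression profile maps to a single named genotype record. With real data the match would be noisier, which is why the abstract stresses that linking becomes accurate when many expression levels are available.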

SUBMITTER: Harmanci A 

PROVIDER: S-EPMC4834871 | biostudies-literature | 2016 Mar

REPOSITORIES: biostudies-literature


Publications

Quantification of private information leakage from phenotype-genotype data: linking attacks.

Harmanci A, Gerstein M

Nature Methods, 2016-02-01, issue 3



Similar Datasets

| S-EPMC7672785 | biostudies-literature
| S-EPMC8278664 | biostudies-literature
| S-EPMC8147460 | biostudies-literature
| S-EPMC5896589 | biostudies-literature
| S-EPMC3927890 | biostudies-literature
| S-EPMC6090647 | biostudies-literature
| S-EPMC6746860 | biostudies-literature
| S-EPMC8696111 | biostudies-literature
| S-EPMC3965098 | biostudies-literature
| S-EPMC4326710 | biostudies-literature