ABSTRACT:
Motivation: The majority of genome analysis tools and pipelines require data to be decrypted for access. This potentially leaves sensitive genetic data exposed, either because the unencrypted data is not removed after analysis, or because the data leaves traces on the permanent storage medium.
Results: We defined a file container specification enabling direct byte-level compatible random access to encrypted genetic data stored in community standards such as SAM/BAM/CRAM/VCF/BCF. By standardizing this format, we show how it can be added as a native file format to genomic libraries, enabling direct analysis of encrypted data without the need to create a decrypted copy.
Availability and implementation: The Crypt4GH specification can be found at: http://samtools.github.io/hts-specs/crypt4gh.pdf.
Supplementary information: Supplementary data are available at Bioinformatics online.
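ILLUSTRATION (not part of the published abstract): Byte-level random access to an encrypted container generally depends on encrypting the payload in fixed-size, independently decryptable segments, so that a plaintext offset maps arithmetically to a position in the encrypted file. The sketch below shows only that offset arithmetic. The segment size (64 KiB), the per-segment overhead (12-byte nonce plus 16-byte authentication tag), and the `header_len` parameter are assumptions made for illustration; consult the Crypt4GH specification linked above for the authoritative layout. No decryption is performed.

```python
# Minimal sketch of offset mapping for a segment-encrypted container.
# All constants are illustrative assumptions, not taken from the abstract.

SEGMENT_PLAIN = 65536          # assumed plaintext bytes per segment
NONCE_LEN = 12                 # assumed per-segment nonce size
MAC_LEN = 16                   # assumed per-segment authentication tag size
SEGMENT_CIPHER = SEGMENT_PLAIN + NONCE_LEN + MAC_LEN


def locate(plain_offset: int, header_len: int) -> tuple[int, int, int]:
    """Map a plaintext offset to (segment index, byte position of that
    segment in the encrypted file, offset inside the decrypted segment).

    header_len is the hypothetical size of whatever header precedes the
    encrypted segments.
    """
    if plain_offset < 0 or header_len < 0:
        raise ValueError("offsets must be non-negative")
    index, within = divmod(plain_offset, SEGMENT_PLAIN)
    cipher_start = header_len + index * SEGMENT_CIPHER
    return index, cipher_start, within


def segments_for_range(start: int, length: int) -> range:
    """Return the indices of the segments that must be decrypted to serve
    `length` plaintext bytes beginning at plaintext offset `start`."""
    if start < 0:
        raise ValueError("start must be non-negative")
    if length <= 0:
        return range(0)
    first = start // SEGMENT_PLAIN
    last = (start + length - 1) // SEGMENT_PLAIN
    return range(first, last + 1)
```

Under these assumptions, a reader that needs a small region of a BAM or VCF file decrypts only the segments returned by `segments_for_range`, which is what allows indexed access without producing a decrypted copy of the whole file.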
SUBMITTER: Senf A
PROVIDER: S-EPMC8522443 | biostudies-literature | 2021 Sep
REPOSITORIES: biostudies-literature
Senf Alexander, Davies Robert, Haziza Frédéric, Marshall John, Troncoso-Pastoriza Juan, Hofmann Oliver, Keane Thomas M
Bioinformatics (Oxford, England) 20210901 17