ABSTRACT:
Objective
Advancements in human genomics have generated a surge of available data, fueling the growth and accessibility of databases for more comprehensive, in-depth genetic studies.
Methods
We provide a straightforward and innovative methodology to optimize cloud configuration in order to conduct genome-wide association studies. We utilized Spark clusters on both Google Cloud Platform and Amazon Web Services, as well as Hail (http://doi.org/10.5281/zenodo.2646680), for analysis and exploration of genomic variant datasets.
Results
Comparative evaluation of numerous cloud-based cluster configurations demonstrates a successful and unprecedented compromise between speed and cost for performing genome-wide association studies on 4 distinct whole-genome sequencing datasets. Results are consistent across the 2 cloud providers and could be highly useful for accelerating research in genetics.
Conclusions
We present a timely piece on one of the most frequently asked questions when moving to the cloud: what is the trade-off between speed and cost?
SUBMITTER: Krissaane I
PROVIDER: S-EPMC7534581 | biostudies-literature | 2020 Sep
REPOSITORIES: biostudies-literature
Inès Krissaane, Carlos De Niz, Alba Gutiérrez-Sacristán, Gabor Korodi, Nneka Ede, Ranjay Kumar, Jessica Lyons, Arjun Manrai, Chirag Patel, Isaac Kohane, Paul Avillach
Journal of the American Medical Informatics Association : JAMIA 20200901 9