
Dataset Information


Trellis for efficient data and task management in the VA Million Veteran Program.


ABSTRACT: Biomedical studies have become larger in size and yielded large quantities of data, yet efficient data processing remains a challenge. Here we present Trellis, a cloud-based data and task management framework that completely automates the process from data ingestion to result presentation, while tracking data lineage, facilitating information query, and supporting fault-tolerance and scalability. Using a graph database to coordinate the state of the data processing workflows and a scalable microservice architecture to perform bioinformatics tasks, Trellis has enabled efficient variant calling on 100,000 human genomes collected in the VA Million Veteran Program.

SUBMITTER: Ross PB 

PROVIDER: S-EPMC8636485 | biostudies-literature | 2021 Dec

REPOSITORIES: biostudies-literature


Publications

Trellis for efficient data and task management in the VA Million Veteran Program.

Paul Billing Ross, Jina Song, Philip S. Tsao, Cuiping Pan

Scientific Reports, 2021-12-01, issue 1



Similar Datasets

| S-EPMC10327290 | biostudies-literature
| S-EPMC10732342 | biostudies-literature
| S-EPMC6710266 | biostudies-literature
| S-EPMC9132265 | biostudies-literature
| S-EPMC8529838 | biostudies-literature
| S-EPMC7118558 | biostudies-literature
| S-EPMC9583190 | biostudies-literature
| phs001672 | dbGaP
| S-EPMC8864634 | biostudies-literature
| S-EPMC9997524 | biostudies-literature