Dataset Information

The Essential Component in DNA-Based Information Storage System: Robust Error-Tolerating Module.


ABSTRACT: The size of digital data is ever increasing and is expected to grow to 40,000 EB by 2020, yet the estimated global information storage capacity in 2011 was <300 EB, indicating that most of the data are transient. DNA, as a very stable nano-molecule, is an ideal massive storage device for long-term data archiving. The two most notable illustrations are from Church et al. and Goldman et al., whose approaches are well-optimized for most sequencing platforms - short synthesized DNA fragments without homopolymers. Here, we suggest improvements to the error-handling methodology that could enable the integration of DNA-based computational processes, e.g., algorithms based on self-assembly of DNA. As a proof of concept, a picture of size 438 bytes was encoded to DNA with a low-density parity-check error-correction code. We salvaged a significant portion of the sequencing reads carrying mutations generated during DNA synthesis and sequencing, and successfully reconstructed the entire picture. A modular programming framework, DNAcodec, with an eXtensible Markup Language-based data format was also introduced. Our experiments demonstrated the practicability of long DNA message recovery with high error tolerance, which opens the field to biocomputing and synthetic biology.
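To make the "without homopolymer" constraint concrete, here is a minimal Python sketch of a rotating base-3 code in the general style attributed to Goldman et al. It is an illustration only, not the DNAcodec implementation described in the abstract: the function names, the fixed six-trits-per-byte width (used here in place of a variable-length code), and the virtual starting base "A" are all assumptions made for this example, and no low-density parity-check error correction is included.

```python
# Illustrative sketch only: homopolymer-free encoding via a rotating base-3 code.
# All names and parameters here are assumptions, not part of DNAcodec.

BASES = "ACGT"
TRITS_PER_BYTE = 6  # 3**6 = 729 >= 256, so six base-3 digits cover one byte


def bytes_to_trits(data):
    """Expand each byte into six base-3 digits, most significant first."""
    trits = []
    for byte in data:
        digits = []
        for _ in range(TRITS_PER_BYTE):
            digits.append(byte % 3)
            byte //= 3
        trits.extend(reversed(digits))
    return trits


def trits_to_bytes(trits):
    """Inverse of bytes_to_trits for a well-formed trit stream."""
    out = bytearray()
    for i in range(0, len(trits), TRITS_PER_BYTE):
        value = 0
        for t in trits[i:i + TRITS_PER_BYTE]:
            value = value * 3 + t
        out.append(value)
    return bytes(out)


def encode(data, prev="A"):
    """Map each trit to one of the three bases that differ from the previous base."""
    seq = []
    for t in bytes_to_trits(data):
        choices = [b for b in BASES if b != prev]
        prev = choices[t]
        seq.append(prev)
    return "".join(seq)


def decode(seq, prev="A"):
    """Recover the bytes; a repeated base raises ValueError (it cannot be valid)."""
    trits = []
    for base in seq:
        choices = [b for b in BASES if b != prev]
        trits.append(choices.index(base))
        prev = base
    return trits_to_bytes(trits)


message = b"DNA"
strand = encode(message)
restored = decode(strand)
```

Because every base is chosen from the three that differ from its predecessor, no two adjacent bases in `strand` are equal. A substitution that creates a repeat is detectable in this scheme but not correctable, which is the gap an error-correction layer is meant to fill.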

SUBMITTER: Yim AK 

PROVIDER: S-EPMC4222239 | biostudies-literature | 2014

REPOSITORIES: biostudies-literature

Publications

The Essential Component in DNA-Based Information Storage System: Robust Error-Tolerating Module.

Yim Aldrin Kay-Yuen, Yu Allen Chi-Shing, Li Jing-Woei, Wong Ada In-Chun, Loo Jacky F C, Chan King Ming, Kong S K, Yip Kevin Y, Chan Ting-Fung

Frontiers in Bioengineering and Biotechnology, 2014-11-06


Similar Datasets

| S-EPMC7582880 | biostudies-literature
| S-EPMC5503945 | biostudies-literature
| S-EPMC9116035 | biostudies-literature
| PRJEB32885 | ENA
| S-EPMC10294226 | biostudies-literature
| S-EPMC7293219 | biostudies-literature
| S-EPMC7513150 | biostudies-literature
| S-EPMC10776348 | biostudies-literature
| S-EPMC7414044 | biostudies-literature
| S-EPMC4699846 | biostudies-literature