ABSTRACT: storage of 5 computer files by coding into synthetic DNA oligos, and verification of encoding/decoding process via high-throughput sequencing
Project description:storage of 5 computer files (739 kB) by coding into synthetic DNA oligos, and recovery of original information via high-throughput sequencing
Project description:We describe "Aird", an open-source, computation-oriented format with controllable precision, flexible indexing strategies, and a high compression rate. Aird provides a novel compressor called Zlib-Diff-PforDelta (ZDPD) for m/z data. Compared with Zlib alone, m/z data size is about 55% lower in Aird on average. Owing to the fast encoding and decoding enabled by the Single Instruction Multiple Data (SIMD) technology used in ZDPD, Aird takes only 33% of the decoding time of Zlib. We used the open dataset HYE, which contains 48 raw files from SCIEX TripleTOF 5600 and TripleTOF 6600. The total file size is 206 GB in the vendor format and increases to 854 GB after conversion to mzML with 32-bit encoding precision, whereas it is only 189 GB in Aird. Aird uses JavaScript Object Notation (JSON) for metadata storage. Aird-SDK is written in Java, and AirdPro, a GUI client for converting vendor files, is written in C#. They are freely available at https://github.com/CSi-Studio/Aird-SDK and https://github.com/CSi-Studio/AirdPro.
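The idea of combining fixed precision, difference coding, and Zlib can be illustrated with a minimal sketch. This is not the ZDPD compressor from Aird-SDK: it omits the PforDelta integer packing and the SIMD acceleration, and the function names and the `scale` parameter are hypothetical. It only shows why sorted m/z values compress well once they are quantized to integers and stored as differences.

```python
import struct
import zlib


def encode_mz(mz_values, scale=10000):
    """Quantize sorted m/z values, delta-encode them, then zlib-compress.

    `scale` is a hypothetical precision knob: 10000 keeps 4 decimal places.
    """
    ints = [round(v * scale) for v in mz_values]
    # Store the first value, then the difference between each pair of neighbors.
    deltas = []
    previous = 0
    for value in ints:
        deltas.append(value - previous)
        previous = value
    raw = struct.pack(f"<{len(deltas)}q", *deltas)  # little-endian int64
    return zlib.compress(raw)


def decode_mz(blob, scale=10000):
    """Invert encode_mz: decompress, cumulatively sum the deltas, rescale."""
    raw = zlib.decompress(blob)
    deltas = struct.unpack(f"<{len(raw) // 8}q", raw)
    values = []
    total = 0
    for d in deltas:
        total += d
        values.append(total / scale)
    return values


if __name__ == "__main__":
    spectrum = [100.0012, 100.0534, 250.5, 1200.75]
    restored = decode_mz(encode_mz(spectrum))
    print(restored)
```

The round trip is lossy only up to the chosen precision, which is the trade-off that "controllable precision" refers to in the description above.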
Project description:Transcriptional profiling of MDA-MB-231 comparing the regulatory consequences of transfecting synthetic decoys carrying sRSM1 motifs versus synthetic scrambled oligos
Project description:The Supplementary files (appended below) contain the mapping for the decoding of blinded samples. This SuperSeries is composed of the SubSeries listed below.
Project description:Transcriptional profiling of MDA-MB-231 comparing the regulatory consequences of transfecting synthetic decoys carrying sRSM1 motifs versus synthetic scrambled oligos. Two decoy/scrambled sets, each with two biological replicates.
Project description:In addition to identifying possible diagnostic and predictive peptides for lupus and CNS-lupus, we used our microarray technology together with the Guitope computer program to identify possible natural protein matches for five monoclonal autoantibodies that were created using an autoimmune MRL/lpr mouse. Submitter states: "We have no processed data to submit. We have no gpr files to submit."
Project description:This SuperSeries is composed of the following subset Series: GSE5593: Acetaminophen (APAP) Rat Blood Training Gene Expression Data Set; GSE5594: Acetaminophen (APAP) Rat Blood Test Gene Expression Data Set; GSE5595: Acetaminophen (APAP) Rat Liver Test Gene Expression Data Set. The Supplementary files (appended below) contain the mapping for the decoding of blinded samples. Experiment Overall Design: Refer to individual Series.
Project description:Boolean approaches and extensions thereof are becoming increasingly popular to model signaling and regulatory networks, including those controlling cell differentiation, pattern formation and embryonic development. Here, we describe a logical modeling framework relying on three steps: the delineation of a regulatory graph, the specification of multilevel components, and the encoding of Boolean rules specifying the behavior of model components depending on the levels or activities of their regulators. Referring to a non-deterministic, asynchronous updating scheme, we present several complementary methods and tools enabling the computation of stable activity patterns, the verification of the reachability of such patterns, as well as the generation of mean temporal evolution curves and the computation of the probabilities to reach distinct activity patterns. We apply this logical framework to the regulatory network controlling T lymphocyte specification. This process involves cross-regulations between specific T cell regulatory factors and factors driving alternative differentiation pathways, which remain accessible during the early steps of thymocyte development. Many transcription factors needed for T cell specification are required in other hematopoietic differentiation pathways and are combined in a fine-tuned, time-dependent fashion to achieve T cell commitment. Using the software GINsim, we integrated current knowledge into a dynamical model, which recapitulates the main developmental steps from early progenitors entering the thymus up to T cell commitment, as well as the impact of various documented environmental and genetic perturbations. Our model analysis further enabled the identification of several knowledge gaps. The model, software and whole analysis workflow are provided in computer-readable and executable form to ensure reproducibility and ease extensions.
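The three ingredients named above (Boolean rules, stable activity patterns, and reachability under asynchronous updating) can be sketched on a toy example. The three-component network below is hypothetical and is not the T lymphocyte model; the code does not use GINsim or its API, and it handles only Boolean components, not multilevel ones. It enumerates all states to find stable patterns, then explores the asynchronous state transition graph, in which exactly one component is updated per step.

```python
from itertools import product

# Hypothetical toy rules: A and B repress each other, C follows A.
RULES = {
    "A": lambda s: not s["B"],
    "B": lambda s: not s["A"],
    "C": lambda s: s["A"],
}
NODES = sorted(RULES)  # fixed component order: A, B, C


def target(state):
    """Return the state each component is called to by its rule."""
    named = dict(zip(NODES, state))
    return tuple(int(RULES[n](named)) for n in NODES)


def stable_states():
    """A state is stable when every component already matches its rule."""
    return [s for s in product((0, 1), repeat=len(NODES)) if target(s) == s]


def async_successors(state):
    """Asynchronous updating: change one disagreeing component at a time."""
    goal = target(state)
    return [
        state[:i] + (goal[i],) + state[i + 1:]
        for i in range(len(state))
        if goal[i] != state[i]
    ]


def reachable(start):
    """Collect every state reachable from `start` by asynchronous steps."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in async_successors(stack.pop()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


if __name__ == "__main__":
    print("stable:", stable_states())
    print("reachable from (0, 0, 0):", sorted(reachable((0, 0, 0))))
```

Exhaustive enumeration is only practical for small networks; the description above refers to dedicated methods and tools for models of realistic size.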