Project description:Whole-transcriptome gene-expression analyses are commonly performed in species that have a sequenced genome and for which microarrays are commercially available. To perform such analyses in species with little or no genome data, i.e. non-model organisms, the necessary transcriptomics resources, i.e. an annotated transcriptome and a validated gene-expression microarray, must first be developed. The aim of the present study was to establish an advanced approach for developing transcriptomics resources for non-model organisms by combining next-generation sequencing (NGS) and microarray technology. We applied our approach to the non-biting midge Chironomus riparius, an ecologically relevant species that is widely used in sediment ecotoxicity testing. We sampled extensively, covering all C. riparius developmental stages as well as toxicant-exposed larvae, and obtained 1.5 M NGS reads totalling 501 Mbp from a normalized cDNA library. Using the NGS data, we developed transcriptomics resources in several steps. First, we designed 844 k probes directly on the NGS reads, as well as 76 k probes targeting expressed sequence tags of related species. These probes were tested for their affinity to C. riparius DNA and mRNA by performing two biological experiments with a 1 M probe-selection microarray that contained the entire probe library. Subsequently, the 1.5 M NGS reads were assembled into 23,709 isotigs and 135,082 singletons, which were associated with ~55 k and ~61 k gene ontology terms, respectively, and which together corresponded to 22,593 unique protein accessions. An algorithm was developed that took the assembly and the probe affinities to DNA and mRNA into account, which resulted in 59 k highly reliable probes that uniquely targeted 95% of the isotigs and 18% of the singletons. In conclusion, our approach allowed the development of high-quality transcriptomics resources for C. riparius and is applicable to any non-model organism.
It is expected that these resources will advance ecotoxicity testing with C. riparius, as whole-transcriptome gene-expression analyses are now possible with this species. 1x 1M CGH array with Cy3-labeled C. riparius gDNA and Cy5-labeled A. gambiae gDNA. The microarray was designed against C. riparius mRNA sequencing reads and has been used to identify trustworthy sequencing reads for designing an expression array. This 1M array is therefore not functionally annotated.
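The probe-selection step described above (combining the assembly with probe affinities to DNA and mRNA) can be illustrated with a minimal sketch. The description does not give the actual algorithm, so everything here is an assumption made for illustration: the `Probe` fields, the signal thresholds, the ranking by summed signal, and the `per_target` limit are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Probe:
    probe_id: str
    targets: tuple      # IDs of assembled sequences (isotigs/singletons) the probe matches
    dna_signal: float   # hybridization signal with genomic DNA (hypothetical units)
    mrna_signal: float  # hybridization signal with mRNA (hypothetical units)


def select_probes(probes, dna_min=1.0, mrna_min=1.0, per_target=2):
    """Keep probes that match exactly one assembled sequence and pass both
    signal thresholds, then keep the strongest `per_target` probes per sequence.
    Thresholds and ranking are illustrative assumptions, not the study's method."""
    by_target = defaultdict(list)
    for probe in probes:
        if len(probe.targets) != 1:
            continue  # ambiguous probe: matches zero or several sequences
        if probe.dna_signal < dna_min or probe.mrna_signal < mrna_min:
            continue  # weak affinity to DNA or mRNA
        by_target[probe.targets[0]].append(probe)

    selected = {}
    for target, candidates in by_target.items():
        # Rank by combined signal, strongest first (an assumed criterion).
        candidates.sort(key=lambda p: p.dna_signal + p.mrna_signal, reverse=True)
        selected[target] = [p.probe_id for p in candidates[:per_target]]
    return selected


if __name__ == "__main__":
    demo = [
        Probe("p1", ("isotig1",), 2.0, 3.0),
        Probe("p2", ("isotig1", "isotig2"), 5.0, 5.0),  # not unique, dropped
        Probe("p3", ("isotig1",), 0.5, 4.0),            # weak DNA signal, dropped
        Probe("p4", ("isotig2",), 1.5, 1.5),
    ]
    print(select_probes(demo, per_target=1))
```

In this sketch a probe counts as uniquely targeting a sequence only if it matches exactly one isotig or singleton, which mirrors the "targeted uniquely" wording of the description without claiming to reproduce the published criteria.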
Project description:Development of an alternative method to ChIP for the identification of DNA bound by transcriptional complexes, assayed using next-generation sequencing. Next-generation sequencing data from sites identified by different Notch complexes using SpDamID-seq, compared against FAIRE and ChIP data.
Project description:Asian salamanders (Hynobiidae) are commonly observed in Far East Asia, including Korea, Japan, China, and the eastern region of Russia. Four Hynobiidae species are known to live in Korea: Hynobius leechii, Hynobius quelpaertensis, Hynobius yangi, and the recently reported Hynobius unisacculus. However, even H. leechii, which is widely distributed across the Korean Peninsula, appears to include a candidate new species with distinctive genetic and phenotypic characteristics. Genomic resources are essential for understanding the current status of these species, but their large genomes (about 16 to 20 Gb) make them difficult to analyze. To reveal the genomic characteristics of these species, we constructed more than ten thousand protein-coding gene sequences from multiple samples of each species, using de novo transcriptome assembly from RNA-Seq data, and confirmed the taxonomic relationships previously reported on the basis of mitochondrial DNA and marker genes. In addition, by comparison with the previously reported transcriptomes of Hynobius chinensis and Hynobius retardatus, which live in China and Japan, respectively, we found that the Korean species have unique genetic signatures. By comparison with genes of vertebrate model organisms, we report Hynobiidae-specific proteins. These data will be a useful resource for studying other Caudata species in the future. This research was supported by the National Institute of Biological Resources, Republic of Korea, under the project "Genetic diversity of animal resources" (NIBR201703203 and NIBR201803101).
Project description:Performing proteomic studies on non-model organisms with little or no genomic information is still difficult. However, many specific processes and biochemical pathways occur only in species that are poorly characterized at the genomic level. For example, many plants can reproduce both sexually and asexually, the former allowing the generation of new genotypes and the latter their fixation. Thus, both modes of reproduction are of great agronomic value. However, the molecular basis of asexual reproduction is not understood in any plant. In ferns, it combines the production of unreduced spores (diplospory) and the formation of sporophytes from somatic cells (apogamy). To lay the basis for studying these processes, we performed transcriptomics by next-generation sequencing (NGS) and shotgun proteomics by tandem mass spectrometry in the apogamous fern D. affinis ssp. affinis. For protein identification we used the public Viridiplantae database (VPDB) to identify orthologous proteins from other plant species, and new transcriptomics data to generate a “species-specific transcriptome database” (SSTDB). In total, 1397 protein clusters with 5865 unique peptide sequences were identified (13 decoy proteins out of 1410, protFDR 0.93% at the protein cluster level). We show that using a “species-specific transcriptome database” for protein identification increases the number of identified peptides almost fourfold compared to using only the publicly available Viridiplantae database. We identified homologs of proteins involved in reproduction of higher plants, including proteins with a potential role in apogamy.
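The reported protein-level FDR can be checked with a short calculation. The description does not state the formula, so this sketch assumes the common target-decoy estimate, decoy hits divided by target hits, where target hits are the 1410 total identifications minus the 13 decoys (1397 protein clusters). The function name `protein_fdr` is hypothetical.

```python
def protein_fdr(decoy_hits: int, total_hits: int) -> float:
    """Target-decoy FDR estimate: decoys / targets.
    Assumes total_hits counts targets and decoys together."""
    target_hits = total_hits - decoy_hits
    if target_hits <= 0:
        raise ValueError("total_hits must exceed decoy_hits")
    return decoy_hits / target_hits


if __name__ == "__main__":
    fdr = protein_fdr(decoy_hits=13, total_hits=1410)
    print(f"{fdr:.2%}")  # 13 / 1397, prints 0.93%
```

Under this assumption the figures are mutually consistent: 13 decoys against 1397 target clusters give about 0.93%.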
Project description:Here, we present new functional genomic resources for the amphipod crustacean Parhyale hawaiensis, facilitating the exploration of gene regulatory evolution using this emerging research organism. We use Omni-ATAC-Seq, an improved form of the Assay for Transposase-Accessible Chromatin coupled with next-generation sequencing (ATAC-Seq), to identify accessible chromatin genome-wide across a broad time course of Parhyale embryonic development. This time course encompasses many major morphological events, including segmentation, body regionalization, gut morphogenesis, and limb development. In addition, we use short- and long-read RNA-Seq to generate an improved Parhyale genome annotation, enabling deeper classification of identified regulatory elements. We leverage a variety of bioinformatic tools to discover differential accessibility, predict nucleosome positioning, infer transcription factor binding, cluster peaks based on accessibility dynamics, classify biological functions, and correlate gene expression with accessibility.
Project description:Microarrays have become a powerful tool for high-throughput gene-expression studies and the discovery of novel biomarker genes. Developed for a large number of organisms, including plants, microarrays are commonly used in species with available sequence data for gene-expression analysis, miRNA profiling, comparative genomic hybridization (CGH), ChIP-on-chip and SNP analysis. Genomic resources are still very limited for chickpea, a very important food legume crop. Here, we report the design and comprehensive validation of Next Generation Sequencing transcriptome data for chickpea through microarray technology, to develop a high-throughput resource for studying the expression of all transcripts in different biological samples in support of functional genomics and breeding programs. This microarray design was developed and validated jointly by Genotypic Technology Private Limited and the National Institute of Plant Genome Research. First, we designed 400k probes using reads covering 35k assembled contigs and 100k singleton chickpea transcripts. The 400k chip was hybridized with DNA and RNA samples of chickpea and microarray analysis was carried out. A total of 73,922 probes were found to be specific to chickpea transcripts. The best probes were filtered from the analyzed data, and a total of 61,659 probes were selected to develop the final microarray design in a 60k gene-expression microarray format. The probes represented 51,444 unique transcripts. The probes were annotated based on their corresponding chickpea transcript and on similarity with other plant species. Microarray results were concordant with previous results from the NGS studies. The design of custom oligonucleotide probes for microarrays has varied functional genomic applications, and this approach represents a valuable resource for chickpea.