Project description:Enhancer RNAs (eRNAs) are a class of non-coding RNAs transcribed from enhancers. As the markers of active enhancers, eRNAs play important roles in gene regulation and are associated with various complex traits and characteristics. With increasing attention to eRNAs, numerous eRNAs have been identified in different human tissues. However, the expression landscape, regulatory network and potential functions of eRNAs in animals have not been fully elucidated. Here, we systematically characterized 185 177 eRNAs from 5085 samples across 10 species by mapping the RNA sequencing data to the regions of known enhancers. To explore their potential functions based on evolutionary conservation, we investigated the sequence similarity of eRNAs among multiple species. In addition, we identified the possible associations between eRNAs and transcription factors (TFs) or nearby genes to decipher their possible regulators and target genes, as well as characterized trait-related eRNAs to explore their potential functions in biological processes. Based on these findings, we further developed Animal-eRNAdb (http://gong_lab.hzau.edu.cn/Animal-eRNAdb/), a user-friendly database for data searching, browsing and downloading. With the comprehensive characterization of eRNAs in various tissues of different species, Animal-eRNAdb may greatly facilitate the exploration of functions and mechanisms of eRNAs.
Project description:The Plant snoRNA database (http://www.scri.sari.ac.uk/plant_snoRNA/) provides information on small nucleolar RNAs from Arabidopsis and eighteen other plant species. Information includes sequences, expression data, methylation and pseudouridylation target modification sites, initial gene organization (polycistronic, single gene and intronic) and the number of gene variants. The Arabidopsis information is divided into box C/D and box H/ACA snoRNAs, and within each of these groups, by target sites in rRNA, snRNA or unknown. Alignments of orthologous genes and gene variants from different plant species are available for many snoRNA genes. Plant snoRNA genes have been given a standard nomenclature, designed wherever possible, to provide a consistent identity with yeast and human orthologues.
Project description:Retrocopies of protein-coding genes, which are copies of mature RNA that have been reverse transcribed and inserted into the genome, have commonly been categorized as pseudogenes with no biological importance. However, recent studies have shown that they play an important role in genome evolution and in shaping interspecies differences. Here, we present RetrogeneDB, a database of retrocopies in 62 animal genomes. RetrogeneDB contains information about retrocopies, their genomic localization, parental genes, ORF conservation, and expression. To the best of our knowledge, this is the most complete retrocopy database, providing information for dozens of species never previously analyzed in the context of protein-coding gene retroposition. The database is available at http://retrogenedb.amu.edu.pl.
Project description:MicroRNAs (miRNAs) are approximately 21-nucleotide-long non-coding small RNAs that function as post-transcriptional regulators in eukaryotes. miRNAs play essential roles in regulating plant growth and development. In recent years, research into the mechanism and consequences of miRNA action has made great progress. With whole-genome sequences available for plants such as Arabidopsis thaliana, Oryza sativa, Populus trichocarpa and Glycine max, it is desirable to develop a plant miRNA database that integrates the large amount of publicly deposited miRNA data. The plant miRNA database (PMRD) integrates available plant miRNA data deposited in public databases, gleaned from the recent literature, and generated in-house. This database contains sequence information, secondary structure, target genes, expression profiles and a genome browser. In total, there are 8433 miRNAs collected from 121 plant species in PMRD, including model plants and major crops such as Arabidopsis, rice, wheat, soybean, maize, sorghum and barley. For Arabidopsis, rice, poplar, soybean, cotton, medicago and maize, we included the possible target genes for each miRNA with a predicted interaction site in the database. Furthermore, we provided miRNA expression profiles in the PMRD, including our local rice oxidative stress related microarray data (LC Sciences miRPlants_10.1) and the recently published microarray data for poplar, Arabidopsis, tomato, maize and rice. The PMRD was built with open-source technology and offers a user-friendly web interface and multiple search tools. The PMRD is freely available at http://bioinformatics.cau.edu.cn/PMRD. We expect PMRD to be a useful tool for scientists studying the function of miRNAs and their target genes, especially in model plants and major crops.
Project description:Two transcribed retrocopies of the fibroblast growth factor 4 (FGF4) gene have previously been described in the domestic dog. An FGF4 retrocopy on chr18 is associated with disproportionate dwarfism, while an FGF4 retrocopy on chr12 is associated with both disproportionate dwarfism and intervertebral disc disease (IVDD). In this study, whole-genome sequencing data were queried to identify other FGF4 retrocopies that could be contributing to phenotypic diversity in canids. Additionally, dogs with surgically confirmed IVDD were assayed for novel FGF4 retrocopies. Five additional and distinct FGF4 retrocopies were identified in canids including a copy unique to red wolves (Canis rufus). The FGF4 retrocopies identified in domestic dogs were identical to domestic dog FGF4 haplotypes, which are distinct from modern wolf FGF4 haplotypes, indicating that these retrotransposition events likely occurred after domestication. The identification of multiple, full length FGF4 retrocopies with open reading frames in canids indicates that gene retrotransposition events occur much more frequently than previously thought and provide a mechanism for continued genetic and phenotypic diversity in canids.
Project description:BACKGROUND: Sireviruses are an ancient genus of the Copia superfamily of LTR retrotransposons, and the only one that has exclusively proliferated within plant genomes. Based on experimental data and phylogenetic analyses, Sireviruses have successfully infiltrated many branches of the plant kingdom, extensively colonizing the genomes of grass species. Notably, it was recently shown that they have been a major force in the make-up and evolution of the maize genome, where they currently occupy ~21% of the nuclear content and ~90% of the Copia population. It is highly likely, therefore, that their life dynamics have been fundamental in the genome composition and organization of a plethora of plant hosts. To assist studies into their impact on plant genome evolution and also facilitate accurate identification and annotation of transposable elements in sequencing projects, we developed MASiVEdb (Mapping and Analysis of SireVirus Elements Database), a collective and systematic resource of Sireviruses in plants. DESCRIPTION: Taking advantage of the increasing availability of plant genomic sequences, and using an updated version of MASiVE, an algorithm specifically designed to identify Sireviruses based on their highly conserved genome structure, we populated MASiVEdb (http://bat.infspire.org/databases/masivedb/) with data on 16,243 intact Sireviruses (total length >158Mb) discovered in 11 fully-sequenced plant genomes. MASiVEdb is unlike any other transposable element database, providing a multitude of highly curated and detailed information on a specific genus across its hosts, such as complete set of coordinates, insertion age, and an analytical breakdown of the structure and gene complement of each element. All data are readily available through basic and advanced query interfaces, batch retrieval, and downloadable files. 
A purpose-built system is also offered for detecting and visualizing similarity between user sequences and Sireviruses, as well as for coding domain discovery and phylogenetic analysis. CONCLUSION: MASiVEdb is currently the most comprehensive directory of Sireviruses, and as such complements other efforts in cataloguing plant transposable elements and elucidating their role in host genome evolution. Such insights will gradually deepen, as we plan to further improve MASiVEdb by phylogenetically mapping Sireviruses into families, by including data on fragments and solo LTRs, and by incorporating elements from newly-released genomes.
Project description:BACKGROUND: Long noncoding RNAs (lncRNAs) have attracted significant attention in recent years due to their important roles in many biological processes. Domestic animals constitute a unique resource for understanding the genetic basis of phenotypic variation and are ideal models relevant to diverse areas of biomedical research. With improving sequencing technologies, numerous domestic-animal lncRNAs are now available. Thus, there is an immediate need for a database resource that can assist researchers to store, organize, analyze and visualize domestic-animal lncRNAs. RESULTS: The domestic-animal lncRNA database, named ALDB, is the first comprehensive database with a focus on the domestic-animal lncRNAs. It currently archives 12,103 pig intergenic lncRNAs (lincRNAs), 8,923 chicken lincRNAs and 8,250 cow lincRNAs. In addition to the annotations of lincRNAs, it offers related data that is not available yet in existing lncRNA databases (lncRNAdb and NONCODE), such as genome-wide expression profiles and animal quantitative trait loci (QTLs) of domestic animals. Moreover, a collection of interfaces and applications, such as the Basic Local Alignment Search Tool (BLAST), the Generic Genome Browser (GBrowse) and flexible search functionalities, are available to help users effectively explore, analyze and download data related to domestic-animal lncRNAs. CONCLUSIONS: ALDB enables the exploration and comparative analysis of lncRNAs in domestic animals. A user-friendly web interface, integrated information and tools make it valuable to researchers in their studies. ALDB is freely available from http://res.xaut.edu.cn/aldb/index.jsp.
Project description:Pre-clinical research builds on a large variety of in vivo and ex vivo tools such as non-invasive imaging, microscopy, and analysis of gene expression. To work efficiently with multimodal data and correlate results across scales, it is of particular importance to have easy access to all data points from different specimens, e.g. the magnetic resonance imaging (MRI) data from different time points and the post-mortem histology. That requires efficient data management, which is customizable and designed to relate all applied methods, raw data and analyses to one specific animal. Despite increasing demands to handle such complex data, most pre-clinical labs have not yet established such an electronic database. Here, we present a novel cloud-based relational database for multimodal animal data, which operates on commercial software. We have implemented data fields for various pre-clinical features such as MRI, histology and behaviour. Automated procedures replace manual and recurrent calculations. Pre-set plotting and printing features provide efficient analysis and documentation. The database template is useful for all labs working with laboratory animals, and adaptation to specific research projects requires no prior scripting expertise. The database runs in the web browser independently of the operating system and allows multiple users to work simultaneously. Data entry is monitored and restricted for particular tests according to the user management, for example to keep users blinded to the experimental group during the experiment. The database improves data accessibility, standardization of data recording and data handling efficiency in pre-clinical research.
Project description:Individual animals can often move more safely or more efficiently as members of a group. This can be as simple as safety in numbers or as sophisticated as aerodynamic or hydrodynamic cooperation. Here, we show that individual plant-animal worms (Symsagittifera roscoffensis) can move to safety more quickly through flocculation. Flocs form in response to turbulence that might otherwise carry these beach-dwelling worms out to sea. They allow the worms to descend much more quickly to the safety of the substrate than single worms could swim. Descent speed increases with floc size such that larger flocs can catch up with smaller ones and engulf them to become even larger and faster. To our knowledge, this is the first demonstration of social flocculation in a wild, multicellular organism. It is also remarkable that such effective flocculation occurs where the components are comparatively large multicellular organisms organized as entangled ensembles.
Project description:The gene encoding the ubiquitous DNA repair protein, Ku70p, has undergone extensive copy number expansion during primate evolution. Gene duplications of KU70 have the hallmarks of long interspersed element-1 mediated retrotransposition, with evidence of target-site duplications, poly-A tails, and the absence of introns. Evolutionary analysis of this expanded family of KU70-derived "NUKU" retrocopies reveals that these genes are both ancient and also actively being created in extant primate species. NUKU retrocopies show evidence of functional divergence away from KU70, as evinced by their altered pattern of tissue expression and possible tissue-specific translation. Molecular modeling predicted that amino acid changes in Nuku2p at the interaction interface with Ku80p would prevent the assembly of the Ku heterodimer. The lack of Nuku2p-Ku80p interaction was confirmed by yeast two-hybrid assay, in contrast to the robust interaction of Ku70p-Ku80p. While several NUKU retrocopies appear to have been degraded by mutation, NUKU2 shows evidence of positive natural selection, suggesting that this retrocopy is undergoing neofunctionalization. Although Nuku proteins do not appear to antagonize retrovirus transduction in cell culture, the observed expansion and rapid evolution of NUKUs could be driven by alternative selective pressures related to infectious disease or an undefined role in primate physiology.