Project description: Black corals, ecologically important cnidarians found from shallow to deep ocean depths, form a strong yet flexible skeleton of sclerotized chitin and other biomolecules, including proteins. The structure and mechanical properties of the chitin component of the skeleton have been well characterized. However, the protein component has remained a mystery. Here we used liquid chromatography-tandem mass spectrometry to sequence proteins extracted from two species of common Red Sea black corals following either one or two cleaning steps. We detected hundreds of proteins between the two corals, nearly 70 of which are each other's reciprocal best BLAST hits. Unlike in stony corals, only a few of the detected proteins were moderately acidic (biased toward aspartic and/or glutamic acid residues), suggesting a smaller role for these types of proteins in black coral skeleton formation than in stony corals. No distinct chitin-binding domains were found in the proteins, but proteins annotated as having a role in protein and chitin modifications were detected. Our results support the integral role of proteins in black coral skeleton formation, structure, and function.
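The reciprocal best BLAST hit criterion mentioned above can be illustrated with a minimal sketch. It assumes the BLAST searches have already been run in both directions and parsed into dictionaries of (subject ID, bit score) pairs. The protein IDs and scores below are hypothetical toy values, not data from this project, and the function names are illustrative rather than the authors' pipeline.

```python
from typing import Dict, List, Tuple

# query ID -> [(subject ID, bit score), ...]; assumed to be parsed from BLAST output
Hits = Dict[str, List[Tuple[str, float]]]


def best_hit(hits: Hits) -> Dict[str, str]:
    """Return the top-scoring subject for each query that has at least one hit.

    Ties are broken by list order (the first maximal entry wins).
    """
    return {
        query: max(candidates, key=lambda pair: pair[1])[0]
        for query, candidates in hits.items()
        if candidates
    }


def reciprocal_best_hits(a_vs_b: Hits, b_vs_a: Hits) -> List[Tuple[str, str]]:
    """Return (a, b) pairs where b is a's best hit and a is b's best hit."""
    best_ab = best_hit(a_vs_b)
    best_ba = best_hit(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Hypothetical toy input: species A proteins searched against species B, and vice versa.
a_vs_b = {
    "A1": [("B1", 250.0), ("B2", 90.0)],
    "A2": [("B2", 120.0)],
    "A3": [("B2", 300.0)],
}
b_vs_a = {
    "B1": [("A1", 245.0)],
    "B2": [("A3", 310.0), ("A2", 115.0)],
}

pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

In this toy input, A2's best hit is B2, but B2's best hit is A3, so A2 is excluded. Only pairs that agree in both directions are kept.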
Project description: The surprising observation that virtually the entire human genome is transcribed means we know very little about the function of many emerging classes of RNAs, except their astounding diversity. Traditional RNA function prediction methods rely on sequence or alignment information, which limits their ability to classify non-coding RNAs (ncRNAs). To address this, we developed CoRAL, a machine learning-based approach for classification of RNA molecules. CoRAL uses biologically interpretable features, including fragment length, cleavage specificity, and antisense transcription, to distinguish between different ncRNA classes. We evaluated CoRAL using genome-wide small RNA sequencing (smRNA-seq) datasets from two human tissue types (brain and skin [GSE31037]) and were able to classify six different types of RNA transcripts with 79-80% accuracy in cross-validation experiments, and with 71-73% accuracy when CoRAL uses one tissue type for training and the other for validation. Analysis by CoRAL revealed that long intergenic ncRNAs, small cytoplasmic RNAs, and small nuclear RNAs show more tissue specificity, while microRNAs, small nucleolar RNAs, and transposon-derived RNAs are highly discernible and consistent across the two tissue types. The ability to consistently annotate loci across tissue types demonstrates the potential of CoRAL to characterize ncRNAs using smRNA-seq data in less well-characterized organisms.
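The cross-tissue evaluation described above (train on one tissue, validate on the other) can be sketched as follows. The description does not specify CoRAL's classifier, so this sketch substitutes a simple nearest-centroid classifier as a stand-in. The three-element feature vectors (fragment length, cleavage specificity, antisense fraction) and all values are made-up toy numbers, and the function names are hypothetical.

```python
from collections import defaultdict
from math import dist
from typing import Dict, List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]  # (feature vector, ncRNA class label)


def fit_centroids(train: List[Sample]) -> Dict[str, List[float]]:
    """Compute the mean feature vector of each class in the training tissue."""
    grouped = defaultdict(list)
    for features, label in train:
        grouped[label].append(features)
    return {
        label: [sum(column) / len(rows) for column in zip(*rows)]
        for label, rows in grouped.items()
    }


def predict(centroids: Dict[str, List[float]], features: Sequence[float]) -> str:
    """Assign the class whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


def accuracy(centroids: Dict[str, List[float]], test: List[Sample]) -> float:
    """Fraction of held-out samples whose predicted class matches the label."""
    correct = sum(predict(centroids, features) == label for features, label in test)
    return correct / len(test)


# Made-up toy loci: (fragment length, cleavage specificity, antisense fraction)
brain = [
    ((22.0, 0.90, 0.05), "miRNA"),
    ((21.0, 0.80, 0.10), "miRNA"),
    ((30.0, 0.40, 0.30), "snoRNA"),
    ((32.0, 0.50, 0.20), "snoRNA"),
]
skin = [
    ((22.5, 0.85, 0.05), "miRNA"),
    ((31.0, 0.45, 0.25), "snoRNA"),
    ((29.0, 0.50, 0.30), "snoRNA"),
]

# Train on one tissue, validate on the other, in both directions.
brain_to_skin = accuracy(fit_centroids(brain), skin)
skin_to_brain = accuracy(fit_centroids(skin), brain)
print(brain_to_skin, skin_to_brain)
```

Because the features here are unscaled, fragment length dominates the distance. A real analysis would likely normalize the features and use a more capable classifier.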