Dataset Information

Gene Model Annotations for Drosophila melanogaster: Impact of High-Throughput Data.

ABSTRACT: We report the current status of the FlyBase annotated gene set for Drosophila melanogaster and highlight improvements based on high-throughput data. The FlyBase annotated gene set consists entirely of manually annotated gene models, with the exception of some classes of small non-coding RNAs. All gene models have been reviewed using evidence from high-throughput datasets, primarily from the modENCODE project. These datasets include RNA-Seq coverage data, RNA-Seq junction data, transcription start site profiles, and translation stop-codon read-through predictions. New annotation guidelines were developed to take into account the use of the high-throughput data. We describe how this flood of new data was incorporated into thousands of new and revised annotations. FlyBase has adopted a philosophy of excluding low-confidence and low-frequency data from gene model annotations; we also do not attempt to represent all possible permutations for complex and modularly organized genes. This has allowed us to produce a high-confidence, manageable gene annotation dataset that is available at FlyBase (http://flybase.org). Interesting aspects of new annotations include new genes (coding, non-coding, and antisense), many genes with alternative transcripts with very long 3' UTRs (up to 15-18 kb), and a stunning mismatch in the number of male-specific genes (approximately 13% of all annotated gene models) vs. female-specific genes (less than 1%). The number of identified pseudogenes and mutations in the sequenced strain also increased significantly. We discuss remaining challenges, for instance, identification of functional small polypeptides and detection of alternative translation starts.

SUBMITTER: Matthews BB

PROVIDER: S-EPMC4528329 | biostudies-literature | 2015 Jun

REPOSITORIES: biostudies-literature

Publications

Gene Model Annotations for Drosophila melanogaster: Impact of High-Throughput Data.

Matthews BB, Dos Santos G, Crosby MA, Emmert DB, St Pierre SE, Gramates LS, Zhou P, Schroeder AJ, Falls K, Strelets V, Russo SM, Gelbart WM

G3 (Bethesda, Md.), 2015 Jun 24, issue 8

PMID: 26109357

Similar Datasets

Project description: Drosophila melanogaster can be used to identify genes with novel functional roles in neuronal plasticity induced by repeated consumption of addictive drugs. Behavioral sensitization is a relatively simple behavioral output of plastic changes that occur in the brain after repeated exposures to drugs of abuse. The development of screening procedures for genes that control behavioral sensitization has stalled due to a lack of high-throughput behavioral tests that can be used in genetically tractable organisms, such as Drosophila. We have developed a new behavioral test, FlyBong, which combines delivery of volatilized cocaine (vCOC) to individually housed flies with objective quantification of their locomotor activity. FlyBong has two main advantages: it is high-throughput, and it allows comparison of the locomotor activity of individual flies before and after single or multiple exposures. At the population level, exposure to vCOC leads to a transient and concentration-dependent increase in locomotor activity, representing sensitivity to an acute dose. A second exposure leads to a further increase in locomotion, representing locomotor sensitization. We validate FlyBong by showing that locomotor sensitization at either the population or individual level is absent in mutants for the circadian genes period (per), Clock (Clk), and cycle (cyc). The locomotor sensitization that is present in timeless (tim) and pigment dispersing factor (pdf) mutant flies is in large part not cocaine-specific but derives from increased sensitivity to warm air. Circadian genes are not only an integral part of the neural mechanism required for the development of locomotor sensitization; they also modulate the intensity of locomotor sensitization as a function of the time of day. Motor-activating effects of cocaine are sexually dimorphic and require a functional dopaminergic transporter.
FlyBong is a new and improved method for inducing and measuring locomotor sensitization to cocaine in individual Drosophila. Because of its high-throughput nature, FlyBong can be used in genetic screens or in selection experiments aimed at the unbiased identification of functional genes involved in acute or chronic effects of volatilized psychoactive substances.
