Dataset Information

Alternative gene form discovery and candidate gene selection from gene indexing projects.

ABSTRACT: Several efforts are under way to partition single-read expressed sequence tag (EST), as well as full-length transcript data, into large-scale gene indices, where transcripts are in common index classes if and only if they share a common progenitor gene. Accurate gene indexing facilitates gene expression studies, as well as inexpensive and early gene sequence discovery through assembly of ESTs that are derived from genes that have not been sequenced by classical methods. We extend, correct, and enhance the information obtained from index groups by splitting index classes into subclasses based on sequence dissimilarity (diversity). Two applications of this are highlighted in this report. First it is shown that our method can ameliorate the damage that artifacts, such as chimerism, inflict on index integrity. Additionally, we demonstrate how the organization imposed by an effective subpartition can greatly increase the sensitivity of gene expression studies by accounting for the existence and tissue- or pathology-specific regulation of novel gene isoforms and polymorphisms. We apply our subpartitioning treatment to the UniGene gene indexing project to measure a marked increase in information quality and abundance (in terms of assembly length and insertion/deletion error) after treatment and demonstrate cases where new levels of information concerning differential expression of alternate gene forms, such as regulated alternative splicing, are discovered. [Tables 2 and 3 can be viewed in their entirety as Online Supplements at http://www.genome.org.]
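
The subpartitioning idea in the abstract, splitting an index class into subclasses by sequence dissimilarity, can be illustrated with a toy sketch. The abstract does not state which dissimilarity measure or grouping procedure the authors used, so everything here is an assumption made for illustration: the mismatch-fraction measure, the single-linkage grouping, the 0.2 threshold, and the names `dissimilarity` and `subpartition` are all hypothetical.

```python
from itertools import combinations


def dissimilarity(a: str, b: str) -> float:
    """Toy measure (assumed): fraction of mismatches over the shorter length."""
    n = min(len(a), len(b))
    if n == 0:
        return 1.0
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    return mismatches / n


def subpartition(index_class: dict, threshold: float = 0.2) -> list:
    """Split one index class into subclasses by single-linkage grouping.

    Two sequences end up in the same subclass if a chain of pairwise
    comparisons, each at or below `threshold`, connects them.
    """
    parent = {name: name for name in index_class}

    def find(x):
        # Follow parent links to the representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(index_class, 2):
        if dissimilarity(index_class[a], index_class[b]) <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in index_class:
        groups.setdefault(find(name), set()).add(name)
    # Sort subclasses by their member names for a stable result.
    return sorted(groups.values(), key=lambda g: sorted(g))


# Hypothetical index class: two near-identical ESTs and one divergent member.
example_class = {
    "est1": "ACGTACGTAC",
    "est2": "ACGTACGTAA",
    "est3": "TTTTGGGGCC",
}
subclasses = subpartition(example_class)
```

In this toy input the divergent member is separated into its own subclass, which is the kind of outcome one would want for a chimeric or alternatively spliced transcript. A real pipeline would use alignment-based comparison rather than position-by-position mismatches.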

SUBMITTER: Burke J

PROVIDER: S-EPMC310695 | biostudies-literature | 1998 Mar

REPOSITORIES: biostudies-literature


Publications

Alternative gene form discovery and candidate gene selection from gene indexing projects.

Burke J, Wang H, Hide W, Davison DB

Genome Research, 1998-03-01, issue 3


PMID: 9521931

Similar Datasets

Project description: Schizophrenia is a chronic psychiatric disorder that affects about 1% of the population globally. A tremendous amount of effort has been expended in the past decade, including more than 2400 association studies, to identify genes influencing susceptibility to the disorder. However, few genes or markers have been reliably replicated. The wealth of this information calls for an integration of gene association data, evidence-based gene ranking, and follow-up replication in large samples. The objective of this study is to develop and evaluate evidence-based gene ranking methods and to examine the features of top-ranking candidate genes for schizophrenia. We proposed a gene-based approach for selecting and prioritizing candidate genes by combining odds ratios (ORs) of multiple markers in each association study and then combining ORs in multiple studies of a gene. We named it the combination-combination OR method (CCOR). CCOR is similar to our recently published method, which first selects the largest OR of the markers in each study and then combines these ORs in multiple studies (i.e., selection-combination OR method, SCOR), but differs in how the representative OR in each study is selected. Features of top-ranking genes were examined by Gene Ontology terms and gene expression in tissues. Our evaluation suggested that the SCOR method overall outperforms the CCOR method. Using the SCOR, a list of 75 top-ranking genes was selected as schizophrenia candidate genes (SZGenes). We found that SZGenes had strong correlation with neuro-related functional terms and were highly expressed in brain-related tissues. The scientific landscape for schizophrenia genetics and other complex disease studies is expected to change dramatically in the next few years; thus, the gene-based combined OR method is useful in candidate gene selection for follow-up association studies and in further artificial intelligence in medicine. This method for prioritization of candidate genes can be applied to other complex diseases such as depression, anxiety, nicotine dependence, alcohol dependence, and cardiovascular diseases.
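
The difference between the two ranking schemes described above can be sketched in a few lines. The description does not say how ORs are combined, so the unweighted geometric mean used in `combine` is purely an assumption for illustration (a real meta-analysis would typically weight by study precision). The function names and the sample numbers are likewise hypothetical.

```python
import math


def combine(odds_ratios):
    """Combine ORs by unweighted geometric mean (assumed rule, not from the source)."""
    logs = [math.log(o) for o in odds_ratios]
    return math.exp(sum(logs) / len(logs))


def scor(studies):
    """Selection-combination: largest marker OR per study, then combine across studies."""
    return combine([max(markers) for markers in studies])


def ccor(studies):
    """Combination-combination: combine marker ORs within each study, then across studies."""
    return combine([combine(markers) for markers in studies])


# Hypothetical gene with two association studies, each reporting two marker ORs.
studies = [[1.0, 4.0], [2.0, 2.0]]
scor_score = scor(studies)
ccor_score = ccor(studies)
```

Because SCOR keeps only the strongest marker in each study, its score for a gene is never lower than the CCOR score under this combining rule. Genes would then be ranked by the resulting per-gene score.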

Project description: Schizophrenia (SZ) is a heritable, complex mental disorder. We have seen limited success in finding causal genes for schizophrenia from numerous conventional studies. Protein interaction network and pathway-based analysis may provide us with an alternative and effective approach to investigating the molecular mechanisms of schizophrenia. We selected a list of schizophrenia candidate genes (SZGenes) using a multi-dimensional evidence-based approach. The global network properties of proteins encoded by these SZGenes were explored in the context of the human protein interactome, while local network properties were investigated by comparing SZ-specific and cancer-specific networks that were extracted from the human interactome. Relative to cancer genes, we observed that SZGenes tend to have an intermediate degree and an intermediate efficiency on a perturbation spreading throughout the human interactome. This suggested that schizophrenia might have different pathological mechanisms from cancer even though both are complex diseases. We conducted pathway analysis using Ingenuity System and constructed the first schizophrenia molecular network (SMN) based on protein interaction networks, pathways and literature survey. We identified 24 pathways overrepresented in SZGenes and examined their interactions and crosstalk. We observed that these pathways were related to neurodevelopment, immune system, and retinoic X receptor (RXR). Our examination of SMN revealed that schizophrenia is a dynamic process caused by dysregulation of the multiple pathways. Finally, we applied the network/pathway approach to identify novel candidate genes, some of which could be verified by experiments. This study provides the first comprehensive review of the network and pathway characteristics of schizophrenia candidate genes. Our preliminary results suggest that this systems biology approach might prove promising for selection of candidate genes for complex diseases. Our findings have important implications for the molecular mechanisms for schizophrenia and, potentially, other psychiatric disorders.
