Dataset Information

Inferring higher functional information for RIKEN mouse full-length cDNA clones with FACTS.


ABSTRACT: FACTS (Functional Association/Annotation of cDNA Clones from Text/Sequence Sources) is a semiautomated knowledge discovery and annotation system that integrates molecular function information derived from sequence analysis results (sequence inferred) with functional information extracted from text. Text-inferred information was extracted from keyword-based retrievals of MEDLINE abstracts and by matching of gene or protein names to OMIM, BIND, and DIP database entries. Using FACTS, we found that 47.5% of the 60,770 RIKEN mouse cDNA FANTOM2 clone annotations were informative for text searches. MEDLINE queries yielded molecular interaction-containing sentences for 23.1% of the clones. When disease MeSH and GO terms were matched with retrieved abstracts, 22.7% of clones were associated with potential diseases, and 32.5% with GO identifiers. A significant number (23.5%) of disease MeSH-associated clones were also found to have a hereditary disease association (OMIM Morbidmap). Inferred neoplastic and nervous system disease represented 49.6% and 36.0% of disease MeSH-associated clones, respectively. A comparison of sequence-based GO assignments with informative text-based GO assignments revealed that for 78.2% of clones, identical GO assignments were provided for that clone by either method, whereas for 21.8% of clones, the assignments differed. In contrast, for OMIM assignments, only 28.5% of clones had identical sequence-based and text-based OMIM assignments. Sequence, sentence, and term-based functional associations are included in the FACTS database (http://facts.gsc.riken.go.jp/), which permits results to be annotated and explored through web-accessible keyword and sequence search interfaces. The FACTS database will be a critical tool for investigating the functional complexity of the mouse transcriptome, cDNA-inferred interactome (molecular interactions), and pathome (pathologies).
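The GO comparison described in the abstract (identical versus differing assignments per clone) can be illustrated with a minimal sketch. It assumes that "identical" means the two methods yield exactly the same set of GO identifiers for a clone, which the abstract does not specify. The function name `go_concordance`, the clone labels, and the toy assignments are hypothetical and are not taken from the FACTS database.

```python
from typing import Dict, Set


def go_concordance(seq_go: Dict[str, Set[str]], text_go: Dict[str, Set[str]]) -> float:
    """Return the fraction of clones whose two GO sets are exactly equal.

    Only clones that have an assignment from both methods are compared.
    Exact set equality is an assumption made for this illustration.
    """
    shared = seq_go.keys() & text_go.keys()
    if not shared:
        raise ValueError("no clone has both a sequence-based and a text-based assignment")
    identical = sum(1 for clone in shared if seq_go[clone] == text_go[clone])
    return identical / len(shared)


# Toy data: hypothetical clone labels with illustrative GO identifiers.
seq_go = {
    "cloneA": {"GO:0005515"},
    "cloneB": {"GO:0003677"},
    "cloneC": {"GO:0016301"},
    "cloneD": {"GO:0005524"},
}
text_go = {
    "cloneA": {"GO:0005515"},
    "cloneB": {"GO:0003700"},  # differs from the sequence-based assignment
    "cloneC": {"GO:0016301"},
    "cloneD": {"GO:0005524"},
    "cloneE": {"GO:0005515"},  # text-only clone, excluded from the comparison
}

concordance = go_concordance(seq_go, text_go)
print(f"identical: {concordance:.1%}, differing: {1 - concordance:.1%}")
```

A looser criterion, such as counting any shared GO identifier as agreement, would give a higher figure, so the choice of matching rule matters when reading percentages like those reported above.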

SUBMITTER: Nagashima T 

PROVIDER: S-EPMC403704 | biostudies-literature | 2003 Jun

REPOSITORIES: biostudies-literature

Publications

Inferring higher functional information for RIKEN mouse full-length cDNA clones with FACTS.

Takeshi Nagashima, Diego G. Silva, Nikolai Petrovsky, Luis A. Socha, Harukazu Suzuki, Rintaro Saito, Takeya Kasukawa, Igor V. Kurochkin, Akihiko Konagaya, Christian Schönbach

Genome Research, 1 June 2003, issue 6B


Similar Datasets

| S-EPMC403723 | biostudies-literature
| S-EPMC403720 | biostudies-literature
| S-EPMC420239 | biostudies-literature
| S-EPMC2222646 | biostudies-literature
| S-EPMC30115 | biostudies-literature
| S-EPMC393292 | biostudies-literature
| S-EPMC2608845 | biostudies-literature
| S-EPMC1088967 | biostudies-literature
| S-EPMC403653 | biostudies-literature
| S-EPMC2866332 | biostudies-literature