Dataset Information

ORFer--retrieval of protein sequences and open reading frames from GenBank and storage into relational databases or text files.

ABSTRACT:

BACKGROUND: Functional genomics involves parallel experimentation with large sets of proteins. This requires management of large sets of open reading frames as a prerequisite for the cloning and recombinant expression of these proteins.

RESULTS: A Java program was developed for retrieval of protein and nucleic acid sequences and annotations from NCBI GenBank, using the XML sequence format. Annotations retrieved by ORFer include sequence name, organism and the completeness of the sequence. The program has a graphical user interface, but it can also be used in a non-interactive mode. For protein sequences, the program also extracts the open reading frame sequence, if available, and checks that it translates correctly. ORFer accepts user input in the form of single GenBank GI identifiers or accession numbers, or lists of them. It can be used to extract complete sets of open reading frames and protein sequences from any kind of GenBank sequence entry, including complete genomes or chromosomes. Sequences are either stored with their features in a relational database or exported as text files in Fasta or tab-delimited format. The ORFer program is freely available at http://www.proteinstrukturfabrik.de/orfer.

CONCLUSION: The ORFer program allows for fast retrieval of DNA sequences, protein sequences and their open reading frames and sequence annotations from GenBank. Furthermore, storage of sequences and features in a relational database is supported. Such a database can supplement a laboratory information system (LIMS) with appropriate sequence information.
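The translation check and Fasta export mentioned in the abstract can be illustrated with a minimal Python sketch. This is not ORFer's code (ORFer is a Java program and its internals are not described here). The function names `translate`, `orf_matches_protein` and `to_fasta` are hypothetical. The sketch assumes the standard genetic code and a complete coding sequence, whereas real GenBank entries may specify other translation tables or partial sequences.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
# Assumption: the entry uses the standard table; some GenBank records do not.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate a coding sequence up to (not including) the first stop codon."""
    cds = cds.upper().replace("U", "T")
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        amino_acid = CODON_TABLE.get(cds[i:i + 3], "X")  # X for ambiguous codons
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def orf_matches_protein(cds, protein):
    """Return True if the open reading frame translates to the given protein."""
    return len(cds) % 3 == 0 and translate(cds) == protein.upper()


def to_fasta(identifier, sequence, width=60):
    """Format one sequence as a Fasta record with fixed-width lines."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">" + identifier + "\n" + "\n".join(lines) + "\n"


if __name__ == "__main__":
    orf = "ATGGCCAAATAA"  # made-up example sequence, not taken from GenBank
    print(orf_matches_protein(orf, "MAK"))
    print(to_fasta("example_orf", orf), end="")
```

A mismatch between the translated open reading frame and the stored protein sequence would flag an entry for manual inspection before it is written to a database or text file.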

SUBMITTER: Bussow K

PROVIDER: S-EPMC139979 | biostudies-literature | 2002 Dec

REPOSITORIES: biostudies-literature

Publications

ORFer--retrieval of protein sequences and open reading frames from GenBank and storage into relational databases or text files.

Büssow K, Hoffmann S, Sievert V

BMC Bioinformatics, 2002-12-19

PMID: 12493080

Similar Datasets

NeuroExtract: facilitating neuroscience-oriented retrieval from broadly-focused bioscience databases using text-based query mediation.
| S-EPMC2244880 | biostudies-literature

OpenCustomDB: Integration of Unannotated Open Reading Frames and Genetic Variants to Generate More Comprehensive Customized Protein Databases.
| S-EPMC10167680 | biostudies-literature

Turning text into research networks: information retrieval and computational ontologies in the creation of scientific databases.
| S-EPMC3250392 | biostudies-literature

Identifying issue frames in text.
| S-EPMC3712954 | biostudies-literature

Identification of Arabidopsis thaliana upstream open reading frames encoding peptide sequences that cause ribosomal arrest.
| S-EPMC5587730 | biostudies-literature

slORFfinder: a tool to detect open reading frames resulting from trans-splicing of spliced leader sequences.
| S-EPMC9851317 | biostudies-literature

Selection pressure in alternative reading frames.
| S-EPMC4182739 | biostudies-literature

Can a relational mindset boost analogical retrieval?
| S-EPMC6923295 | biostudies-literature

Pervasive functional translation of noncanonical open reading frames
2020-03-14 | GSE131650 | GEO

The insertion sequences of Anabaena sp. strain PCC 7120 and their effects on its open reading frames.
| S-EPMC2950511 | biostudies-literature