Dataset Information

Mistaken identifiers: gene name errors can be introduced inadvertently when using Excel in bioinformatics.


ABSTRACT:

Background

When processing microarray data sets, we recently noticed that some gene names were being changed inadvertently to non-gene names.

Results

A little detective work traced the problem to default date format conversions and floating-point format conversions in the very useful Excel program package. The date conversions affect at least 30 gene names; the floating-point conversions affect at least 2,000 if Riken identifiers are included. These conversions are irreversible; the original gene names cannot be recovered.
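
As a rough illustration of the two conversion types described above, the following Python sketch flags identifiers that a spreadsheet might read as a date (a month-like prefix followed by a day number, such as SEPT2 or DEC1) or as a floating-point number (digits, the letter E, then digits, such as the Riken-style identifier 2310009E13). The function name `excel_risk`, the month-prefix list, and the regular expressions are assumptions made for illustration; they only approximate Excel's default behavior and are not the scripts provided by the authors.

```python
import re

# Month-like prefixes a spreadsheet may interpret as dates.
# Assumed list for illustration; not exhaustive.
MONTH_PREFIXES = (
    "JAN", "FEB", "MAR", "APR", "MAY", "JUN", "JUL",
    "AUG", "SEP", "SEPT", "OCT", "NOV", "DEC", "MARCH",
)

# Month prefix, optional hyphen, then a 1-2 digit number (e.g. SEPT2, DEC1).
DATE_LIKE = re.compile(
    r"^(?:%s)-?(\d{1,2})$" % "|".join(MONTH_PREFIXES), re.IGNORECASE
)

# Digits, 'E', digits (e.g. 2310009E13), which parses as scientific notation.
FLOAT_LIKE = re.compile(r"^\d+E\d+$", re.IGNORECASE)


def excel_risk(identifier):
    """Return 'date', 'float', or None for a single identifier string."""
    s = identifier.strip()
    m = DATE_LIKE.match(s)
    if m and 1 <= int(m.group(1)) <= 31:
        return "date"
    if FLOAT_LIKE.match(s):
        return "float"
    return None


if __name__ == "__main__":
    for name in ["SEPT2", "DEC1", "2310009E13", "TP53"]:
        print(name, excel_risk(name))
```

Running such a check on an identifier column before the file is opened in a spreadsheet gives a list of names to protect, for example by importing that column as text.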

Conclusions

Users of Excel for analyses involving gene names should be aware of this problem, which can cause genes, including medically important ones, to be lost from view and which has contaminated even carefully curated public databases. We provide work-arounds and scripts for circumventing the problem.
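
Because the conversions cannot be undone, a data set that has already passed through a spreadsheet can only be screened for suspicious values. The sketch below scans a list of identifier strings for values that look like converted dates (for example "1-Dec") or floating-point numbers (for example "2.31E+13"). The function name `find_converted` and both patterns are hypothetical; they assume English month abbreviations and the default display formats, so other locales or cell formats would need different patterns.

```python
import re

# Day number, hyphen, English three-letter month (assumed export format).
CONVERTED_DATE = re.compile(
    r"^\d{1,2}-(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)$",
    re.IGNORECASE,
)

# One leading digit, optional fraction, then E+exponent (assumed export format).
CONVERTED_FLOAT = re.compile(r"^\d(?:\.\d+)?E\+\d+$", re.IGNORECASE)


def find_converted(values):
    """Return (index, value, kind) for entries that look already converted."""
    hits = []
    for i, raw in enumerate(values):
        v = raw.strip()
        if CONVERTED_DATE.match(v):
            hits.append((i, v, "date"))
        elif CONVERTED_FLOAT.match(v):
            hits.append((i, v, "float"))
    return hits


if __name__ == "__main__":
    column = ["TP53", "1-Dec", "2.31E+13", "2-Sep"]
    for hit in find_converted(column):
        print(hit)
```

A hit only indicates that a value should be checked against the original source file; the original gene name cannot be reconstructed from the converted value itself.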

SUBMITTER: Zeeberg BR 

PROVIDER: S-EPMC459209 | biostudies-literature | 2004 Jun

REPOSITORIES: biostudies-literature

Publications

Mistaken identifiers: gene name errors can be introduced inadvertently when using Excel in bioinformatics.

Zeeberg BR, Riss J, Kane DW, Bussey KJ, Uchio E, Linehan WM, Barrett JC, Weinstein JN

BMC Bioinformatics, 2004-06-23


Similar Datasets

| S-EPMC8357140 | biostudies-literature
| S-EPMC5617173 | biostudies-literature
| S-EPMC5340976 | biostudies-literature
| S-EPMC2639076 | biostudies-literature
2023-06-01 | GSE218903 | GEO
| S-EPMC3169536 | biostudies-literature
| S-EPMC10927542 | biostudies-literature
| S-EPMC6964641 | biostudies-literature
2023-06-01 | GSE218899 | GEO
2023-06-01 | GSE218901 | GEO