Dataset Information

Identifying Possible False Matches in Anonymized Hospital Administrative Data without Patient Identifiers.


ABSTRACT:

Objective

To identify data linkage errors in the form of possible false matches, where two patients appear to share the same unique identification number.

Data source

Hospital Episode Statistics (HES) in England, United Kingdom.

Study design

Data on births and re-admissions for infants (April 1, 2011 to March 31, 2012; age 0-1 year) and adolescents (April 1, 2004 to March 31, 2011; age 10-19 years).

Data collection/extraction methods

Hospital records were pseudo-anonymized using an algorithm designed to link multiple records belonging to the same person. Six clinically implausible scenarios were treated as possible false matches: multiple births sharing a HESID, re-admission after death, two birth episodes sharing a HESID, simultaneous admission at different hospitals, infant episodes coded as deliveries, and adolescent episodes coded as births.
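Two of the scenarios listed above (simultaneous admission at different hospitals, and re-admission after death) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Episode` record, its field names (`hospital`, `admitted`, `discharged`, `died_on`), and the strict-overlap rule are all assumptions made for illustration, and real HES extracts use different field names and need more careful handling of transfers and day cases.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from itertools import combinations
from typing import Optional


@dataclass(frozen=True)
class Episode:
    """Hypothetical, simplified hospital episode (not the real HES schema)."""
    hesid: str                      # pseudonymized patient identifier
    hospital: str                   # hypothetical provider code
    admitted: date
    discharged: date
    died_on: Optional[date] = None  # hypothetical date-of-death field


def flag_possible_false_matches(episodes):
    """Return {hesid: set of scenario labels} for implausible patterns."""
    by_id = defaultdict(list)
    for ep in episodes:
        by_id[ep.hesid].append(ep)

    flags = defaultdict(set)
    for hesid, eps in by_id.items():
        # Scenario: simultaneous admission at different hospitals.
        # Strict inequalities mean a same-day transfer (discharge date equal
        # to the next admission date) is NOT flagged; this is an assumption.
        for a, b in combinations(eps, 2):
            overlap = a.admitted < b.discharged and b.admitted < a.discharged
            if overlap and a.hospital != b.hospital:
                flags[hesid].add("simultaneous_admission")

        # Scenario: re-admission after a recorded death.
        deaths = [ep.died_on for ep in eps if ep.died_on is not None]
        if deaths and any(ep.admitted > min(deaths) for ep in eps):
            flags[hesid].add("readmission_after_death")
    return dict(flags)


# Made-up example records, for illustration only.
episodes = [
    Episode("A1", "H01", date(2011, 5, 1), date(2011, 5, 10)),
    Episode("A1", "H02", date(2011, 5, 3), date(2011, 5, 4)),
    Episode("B2", "H01", date(2011, 6, 1), date(2011, 6, 2), died_on=date(2011, 6, 2)),
    Episode("B2", "H03", date(2011, 7, 1), date(2011, 7, 3)),
    Episode("C3", "H01", date(2011, 8, 1), date(2011, 8, 2)),
]
flags = flag_possible_false_matches(episodes)
```

Flagged identifiers are candidates for review at the data cleaning stage rather than confirmed linkage errors; a flag only indicates that the records sharing an identifier are clinically implausible for a single patient.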

Principal findings

Among 507,778 infants, possible false matches were relatively rare (n = 433, 0.1 percent). The most common scenario (simultaneous admission at two hospitals, n = 324) was more likely for infants with missing data, those born preterm, and for Asian infants. Among adolescents, this scenario (n = 320) was more common for males, younger patients, the Mixed ethnic group, and those re-admitted more frequently.

Conclusions

Researchers can identify clinically implausible scenarios and patients affected, at the data cleaning stage, to mitigate the impact of possible linkage errors.

SUBMITTER: Hagger-Johnson G 

PROVIDER: S-EPMC4545352 | biostudies-literature | 2015 Aug

REPOSITORIES: biostudies-literature

Publications

Identifying Possible False Matches in Anonymized Hospital Administrative Data without Patient Identifiers.

Gareth Hagger-Johnson, Katie Harron, Arturo Gonzalez-Izquierdo, Mario Cortina-Borja, Nirupa Dattani, Berit Muller-Pebody, Roger Parslow, Ruth Gilbert, Harvey Goldstein

Health Services Research, 2014-12-18, issue 4



Similar Datasets

| S-EPMC10205637 | biostudies-literature
| S-EPMC8171026 | biostudies-literature
| S-EPMC6662999 | biostudies-literature
| S-EPMC6695163 | biostudies-literature
| S-EPMC4659568 | biostudies-other
| S-EPMC3203614 | biostudies-literature
| S-EPMC5919706 | biostudies-literature
| S-EPMC7347222 | biostudies-literature
| S-EPMC7003968 | biostudies-literature
| S-EPMC9588002 | biostudies-literature