Dataset Information

Automated data extraction from historical city directories: The rise and fall of mid-century gas stations in Providence, RI.


ABSTRACT: The locations of defunct, environmentally hazardous businesses such as gas stations have many implications for modern American cities. To track down these locations, we present the directoreadr code (github.com/brown-ccv/directoreadr). Using scans of Polk city directories from Providence, RI, directoreadr extracts and parses business location data with a high degree of accuracy. The image processing pipeline ran without any human input for 94.4% of the pages we examined; the remaining 5.6% were processed with some human input. By hand-checking a sample of three years, we estimate that ~94.6% of historical gas stations are correctly identified and located, with historical street changes and non-standard address formats being the main drivers of errors. As an example use, we look at gas stations and find that they were most common early in the study period, in 1936, and began a sharp and steady decline around 1950. We are making the dataset produced by directoreadr publicly available. We hope it will be used to explore a range of important questions about socioeconomic patterns in Providence and cities like it during the transformations of the mid-1900s.

SUBMITTER: Bell S 

PROVIDER: S-EPMC7437912 | biostudies-literature | 2020

REPOSITORIES: biostudies-literature

Publications

Automated data extraction from historical city directories: The rise and fall of mid-century gas stations in Providence, RI.

Samuel Bell, Thomas Marlow, Kai Wombacher, Anina Hitt, Neev Parikh, Andras Zsom, Scott Frickel

PLoS ONE, 2020-08-19, issue 8



Similar Datasets

| S-EPMC4756436 | biostudies-literature
| S-EPMC6392263 | biostudies-literature
| PRJEB15541 | ENA
| S-EPMC5916506 | biostudies-literature
| S-EPMC3922928 | biostudies-literature
| S-EPMC8713757 | biostudies-literature
| S-EPMC9223313 | biostudies-literature
| S-EPMC4140013 | biostudies-literature
| PRJEB19217 | ENA
| S-EPMC3021526 | biostudies-literature