Use of generalized additive models and cokriging of spatial residuals to improve land-use regression estimates of nitrogen oxides in Southern California.
ABSTRACT: Land-use regression (LUR) models have been developed to estimate spatial distributions of traffic-related pollutants. Several studies have examined spatial autocorrelation among residuals in LUR models, but few have used spatial residual information in model prediction or examined the impact of modeling methods, monitoring site selection, or traffic data quality on LUR performance. This study aims to improve spatial models for traffic-related pollutants using generalized additive models (GAM) combined with cokriging of spatial residuals. Specifically, we developed spatial models for nitrogen dioxide (NO₂) and nitrogen oxides (NOₓ) concentrations in Southern California, separately for two seasons (summer and winter), based on over 240 sampling locations. Pollutant concentrations were disaggregated into three components: local means, spatial residuals, and normal random residuals. Local means were modeled by GAM. Spatial residuals were cokriged with global residuals at nearby sampling locations that were spatially autocorrelated. We compared this two-stage approach with four commonly used spatial models: universal kriging, multiple linear LUR, and GAM with and without a spatial smoothing term. Leave-one-out cross validation was conducted for model validation and comparison. The results show that our GAM plus cokriging models predicted summer and winter NO₂ and NOₓ concentration surfaces well, with cross-validation R² values ranging from 0.88 to 0.92. While local covariates accounted for part of the variance of the measured NO₂ and NOₓ concentrations, spatial autocorrelation accounted for about 20% of the variance. Our spatial GAM model improved R² considerably compared to the other four approaches. In conclusion, our two-stage model captured summer and winter differences in NO₂ and NOₓ spatial distributions in Southern California well. When sampling location selection cannot be optimized for the intended model and fewer covariates are available as predictors, the two-stage model is more robust than multiple linear regression models.
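The two-stage idea in the abstract (model a local mean, interpolate the spatially structured residuals, then score with leave-one-out cross validation) can be illustrated with a minimal sketch. This is not the authors' implementation: a one-covariate least-squares line stands in for the GAM, inverse-distance weighting stands in for cokriging, and the site data are synthetic. All function names and the data-generating formula below are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x (simplified stand-in for the GAM mean model)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def idw(target, coords, values, power=2.0):
    """Inverse-distance-weighted average (simplified stand-in for cokriging)."""
    num = den = 0.0
    for (cx, cy), v in zip(coords, values):
        d = math.hypot(target[0] - cx, target[1] - cy)
        if d == 0.0:
            return v  # target coincides with a sampling site
        w = d ** -power
        num += w * v
        den += w
    return num / den


def mean_only_predict(train, xy, cov):
    """Stage 1 only: predict from the covariate, ignoring location."""
    a, b = fit_line([t[1] for t in train], [t[2] for t in train])
    return a + b * cov


def two_stage_predict(train, xy, cov):
    """Stage 1 mean model plus stage 2 interpolation of its residuals."""
    coords = [t[0] for t in train]
    covs = [t[1] for t in train]
    obs = [t[2] for t in train]
    a, b = fit_line(covs, obs)
    residuals = [y - (a + b * c) for c, y in zip(covs, obs)]
    return a + b * cov + idw(xy, coords, residuals)


def loocv_r2(sites, predict):
    """Leave-one-out cross-validation R^2: each site is predicted from all others."""
    obs = [s[2] for s in sites]
    preds = [
        predict(sites[:i] + sites[i + 1:], sites[i][0], sites[i][1])
        for i in range(len(sites))
    ]
    mean = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, preds))
    ss_tot = sum((o - mean) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


# Synthetic sites on an 8 x 8 grid: (coordinates, covariate, concentration).
# The concentration combines a covariate effect with a smooth spatial pattern
# (an assumed formula, chosen only to give the residuals spatial structure).
sites = []
for i in range(8):
    for j in range(8):
        x, y = float(i), float(j)
        cov = float((7 * i + 3 * j) % 5)
        conc = 10.0 + 2.0 * cov + 3.0 * math.sin(x / 2.0) * math.cos(y / 2.0)
        sites.append(((x, y), cov, conc))

print("mean-only LOOCV R^2:", round(loocv_r2(sites, mean_only_predict), 3))
print("two-stage LOOCV R^2:", round(loocv_r2(sites, two_stage_predict), 3))
```

Comparing the two printed values on the same sites mirrors, in miniature, how the study compares models by cross-validation R². A faithful reproduction would replace the line fit with a GAM and the weighting step with cokriging based on a fitted variogram.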
SUBMITTER: Li L
PROVIDER: S-EPMC3579670 | biostudies-literature | 2012 Aug
REPOSITORIES: biostudies-literature