Dataset Information

Comparison of Methods to Reduce Bias From Clinical Prediction Models of Postpartum Depression.

ABSTRACT:

Importance

The lack of standards in methods to reduce bias for clinical algorithms presents various challenges in providing reliable predictions and in addressing health disparities.

Objective

To evaluate approaches for reducing bias in machine learning models using a real-world clinical scenario.

Design, setting, and participants

Health data for this cohort study were obtained from the IBM MarketScan Medicaid Database. Eligibility criteria were as follows: (1) Female individuals aged 12 to 55 years with a live birth record identified by delivery-related codes from January 1, 2014, through December 31, 2018; (2) greater than 80% enrollment through pregnancy to 60 days post partum; and (3) evidence of coverage for depression screening and mental health services. Statistical analysis was performed in 2020.

Exposures

Binarized race (Black individuals and White individuals).

Main outcomes and measures

Machine learning models (logistic regression [LR], random forest, and extreme gradient boosting) were trained for 2 binary outcomes: postpartum depression (PPD) and postpartum mental health service utilization. Risk-adjusted generalized linear models were used for each outcome to assess potential disparity in the cohort associated with binarized race (Black or White). Methods for reducing bias, including reweighing, Prejudice Remover, and removing race from the models, were examined by analyzing changes in fairness metrics compared with the base models. Baseline characteristics of female individuals at the top-predicted risk decile were compared for systematic differences. The fairness metrics were disparate impact (DI; 1 indicates fairness) and equal opportunity difference (EOD; 0 indicates fairness).
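The two fairness metrics named above can be illustrated with a short sketch. It assumes the commonly used definitions: DI is the ratio of positive-prediction rates (unprivileged group over privileged group), and EOD is the difference in true-positive rates (unprivileged minus privileged). The abstract does not state which implementation the authors used, so the function names, the group flag, and the toy data below are hypothetical.

```python
from typing import Sequence


def selection_rate(y_pred: Sequence[int], mask: Sequence[bool]) -> float:
    """Share of positive predictions among rows where mask is True."""
    selected = [p for p, m in zip(y_pred, mask) if m]
    return sum(selected) / len(selected)  # assumes the group is non-empty


def true_positive_rate(y_true: Sequence[int], y_pred: Sequence[int],
                       mask: Sequence[bool]) -> float:
    """Share of actual positives predicted positive, within the masked group."""
    hits = [p for t, p, m in zip(y_true, y_pred, mask) if m and t == 1]
    return sum(hits) / len(hits)  # assumes the group has at least one positive


def disparate_impact(y_pred: Sequence[int],
                     is_unprivileged: Sequence[bool]) -> float:
    """Ratio of selection rates; 1 indicates parity."""
    is_privileged = [not u for u in is_unprivileged]
    return (selection_rate(y_pred, is_unprivileged)
            / selection_rate(y_pred, is_privileged))


def equal_opportunity_difference(y_true: Sequence[int], y_pred: Sequence[int],
                                 is_unprivileged: Sequence[bool]) -> float:
    """Difference in true-positive rates; 0 indicates parity."""
    is_privileged = [not u for u in is_unprivileged]
    return (true_positive_rate(y_true, y_pred, is_unprivileged)
            - true_positive_rate(y_true, y_pred, is_privileged))


# Toy data (hypothetical): first four rows are the unprivileged group.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
is_unprivileged = [True, True, True, True, False, False, False, False]

di = disparate_impact(y_pred, is_unprivileged)
eod = equal_opportunity_difference(y_true, y_pred, is_unprivileged)
print(f"DI = {di:.2f}, EOD = {eod:.2f}")
```

In this toy example the unprivileged group is selected less often and has a lower true-positive rate, so DI falls below 1 and EOD falls below 0, the same direction as the base-model values reported in the Results.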

Results

Among 573 634 female individuals initially examined for this study, 314 903 were White (54.9%), 217 899 were Black (38.0%), and the mean (SD) age was 26.1 (5.5) years. The risk-adjusted odds ratio comparing White participants with Black participants was 2.06 (95% CI, 2.02-2.10) for clinically recognized PPD and 1.37 (95% CI, 1.33-1.40) for postpartum mental health service utilization. Taking the LR model for PPD prediction as an example, reweighing reduced bias as measured by improved DI and EOD metrics from 0.31 and -0.19 to 0.79 and 0.02, respectively. Removing race from the models had inferior performance for reducing bias compared with the other methods (PPD: DI = 0.61; EOD = -0.05; mental health service utilization: DI = 0.63; EOD = -0.04).
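Reweighing, as it is usually formulated, assigns each training record a weight equal to the expected frequency of its (group, label) combination under independence divided by its observed frequency, so that group and label become statistically independent in the weighted data. The sketch below shows that calculation under this assumption; the abstract does not describe the authors' implementation, and the function name, group labels, and toy data are hypothetical.

```python
from collections import Counter
from typing import Hashable, List, Sequence


def reweighing_weights(groups: Sequence[Hashable],
                       labels: Sequence[int]) -> List[float]:
    """Weight for each row: P(group) * P(label) / P(group, label)."""
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(groups, labels))
    return [
        group_counts[g] * label_counts[y] / (n * joint_counts[(g, y)])
        for g, y in zip(groups, labels)
    ]


def weighted_positive_rate(group: Hashable, groups: Sequence[Hashable],
                           labels: Sequence[int],
                           weights: Sequence[float]) -> float:
    """Weighted share of positive labels within one group."""
    total = sum(w for g, w in zip(groups, weights) if g == group)
    positive = sum(w for g, y, w in zip(groups, labels, weights)
                   if g == group and y == 1)
    return positive / total


# Toy data (hypothetical): group "A" has fewer positive labels than group "B".
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
labels = [1, 0, 0, 0, 1, 1, 1, 0]

weights = reweighing_weights(groups, labels)
rate_a = weighted_positive_rate("A", groups, labels, weights)
rate_b = weighted_positive_rate("B", groups, labels, weights)
print(f"weighted positive rate: A = {rate_a:.2f}, B = {rate_b:.2f}")
```

After weighting, both groups have the same weighted positive-label rate. Weights of this kind are typically passed to a learner that accepts per-sample weights, for example through the `sample_weight` argument of a scikit-learn estimator's `fit` method.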

Conclusions and relevance

Clinical prediction models trained on potentially biased data may produce unfair outcomes on the basis of the chosen metrics. This study's results suggest that the performance varied depending on the model, outcome label, and method for reducing bias. This approach toward evaluating algorithmic bias can be used as an example for the growing number of researchers who wish to examine and address bias in their data and models.

SUBMITTER: Park Y

PROVIDER: S-EPMC8050742 | biostudies-literature | 2021 Apr

REPOSITORIES: biostudies-literature


Publications

Comparison of Methods to Reduce Bias From Clinical Prediction Models of Postpartum Depression.

Yoonyoung Park, Jianying Hu, Moninder Singh, Issa Sylla, Irene Dankwa-Mullan, Eileen Koski, Amar K. Das

JAMA Network Open, 2021 Apr 1; 4


PMID: 33856478

Similar Datasets

Project description: Publication bias is prevalent within the scientific literature. Whilst there are multiple ideas on how to reduce publication bias, only a minority of journals have made substantive changes to address the problem. We aimed to explore the perceived feasibility of strategies to reduce publication bias by gauging opinions of journal editors (n = 73) and other academics/researchers (n = 160) regarding nine methods of publishing and peer-reviewing research: mandatory publication, negative results journals/articles, open reviewing, peer-review training and accreditation, post-publication review, pre-study publication of methodology, published rejection lists, research registration, and two-stage review. Participants completed a questionnaire asking both quantitative (multiple choice or Likert scales) and qualitative (open-ended) questions regarding the barriers to implementing each suggestion, and their strengths and limitations. Participants were asked to rate the nine suggestions, then choose the method they felt was most effective. Mandatory publication was most popularly selected as the 'most effective' method of reducing publication bias for editors (25%), and was the third most popular choice for academics/researchers (14%). The most common selection for academics/researchers was two-stage review (26%), but fewer editors prioritised this (11%). Negative results journals/articles were the second and third most common choices for academics/researchers (21%) and editors (16%), respectively. Editors more commonly chose research registration as 'most effective' (21%), which was favoured by only 6% of academics/researchers. Whilst mandatory publication was generally favoured by respondents, it is infeasible to trial at a journal level. Where suggestions have already been implemented (e.g. negative results journals/articles, trial registration), efforts should be made to objectively assess their efficacy. Two-stage review should be further trialled as its popularity amongst academics/researchers suggests it may be well received, though editors may be less receptive. Several underlying barriers to change also emerged, including scientific culture, impact factors, and researcher training; these should be further explored to reduce publication bias.

Project description: Background: Studies investigating the prevalence and risk factors for postpartum depression (PPD) have used different definitions. Some studies have used a high score on the Edinburgh Postnatal Depression Scale (EPDS) to define PPD, whereas others have used information on antidepressant medication use and/or diagnostic information on treatment for depression at a psychiatric hospital. We wanted to compare results using these two approaches to evaluate to what degree results can be compared. Moreover, we wanted to evaluate whether use of EPDS or PPAT (defined below) leads to identification of different risk factor profiles. Methods: We identified women who delivered a child between 1 January 2014 and 31 December 2016 in Copenhagen or in one of the municipalities that were part of the Danish Health Visitors' Child Health Database. The potential risk factors were demographic factors and pregnancy and obstetrical events. Outcomes of interest were an EPDS score ≥ 13, use of antidepressants (ATC: N06A) and/or a diagnosis of depression (F32) within six months after birth. Use of antidepressants and/or diagnosis of depression will be referred to as postpartum antidepressant treatment (PPAT). Agreement between EPDS ≥ 13 and PPAT was evaluated by the kappa coefficient. Associations between risk factors and the two outcomes (EPDS ≥ 13 and PPAT) were estimated by risk ratios (RR) using log-linear binomial regression. Presence of a systematic difference between RRs based on EPDS ≥ 13 (RR_EPDS≥13) and PPAT (RR_PPAT) was evaluated in a meta-regression approach weighted by inverse-variance and with logarithm of the RRs as outcome. Results: The estimated PPD prevalence was 3.2% using EPDS ≥ 13 and 0.4% using PPAT. The agreement between the two measures was small (Kappa = 0.08), but their risk factor profile was very similar with no systematic difference between them. Conclusions: Using the two different methods of case identification produced different prevalence estimates, but a similar risk factor profile. The differences in estimated prevalence and low agreement suggest that the two measures identify different potential PPD cases and using only one of the methods in defining PPD would underestimate PPD prevalence. The similar risk factor profile suggests that the considered risk factors are involved in the general development of PPD.

