Dataset Information

Comparing a novel machine learning method to the Friedewald formula and Martin-Hopkins equation for low-density lipoprotein estimation.

ABSTRACT: BACKGROUND: Low-density lipoprotein cholesterol (LDL-C) is a target for cardiovascular prevention. Contemporary equations for LDL-C estimation have limited accuracy in certain scenarios (high triglycerides [TG], very low LDL-C). OBJECTIVES: We derived a novel method for LDL-C estimation from the standard lipid profile using a machine learning (ML) approach utilizing random forests (the Weill Cornell model). We compared its correlation to direct LDL-C with the Friedewald and Martin-Hopkins equations for LDL-C estimation. METHODS: The study cohort comprised a convenience sample of standard lipid profile measurements (with the directly measured components of total cholesterol [TC], high-density lipoprotein cholesterol [HDL-C], and TG) as well as chemical-based direct LDL-C performed on the same day at the New York-Presbyterian Hospital/Weill Cornell Medicine (NYP-WCM). Subsequently, an ML algorithm was used to construct a model for LDL-C estimation. Results are reported on the held-out test set, with correlation coefficients and absolute residuals used to assess model performance. RESULTS: Between 2005 and 2019, there were 17,500 lipid profiles performed on 10,936 unique individuals (4,456 females; 40.8%) aged 1 to 103. Correlation coefficients between estimated and measured LDL-C values were 0.982 for the Weill Cornell model, compared to 0.950 for Friedewald and 0.962 for the Martin-Hopkins method. The Weill Cornell model was consistently better across subgroups stratified by LDL-C and TG values, including TG >500 and LDL-C <70. CONCLUSIONS: An ML model was found to have a better correlation with direct LDL-C than either the Friedewald formula or Martin-Hopkins equation, including in the setting of elevated TG and very low LDL-C.
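The abstract assesses performance with correlation coefficients and absolute residuals between estimated and directly measured LDL-C. A minimal sketch of those two metrics follows, assuming a Pearson correlation (the abstract does not name the coefficient type). The function names and sample values are illustrative assumptions, not study data.

```python
from math import sqrt
from statistics import mean, median


def pearson_r(estimated, measured):
    """Pearson correlation between estimated and directly measured LDL-C."""
    if len(estimated) != len(measured) or len(estimated) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_est, mean_meas = mean(estimated), mean(measured)
    cov = sum((e - mean_est) * (m - mean_meas) for e, m in zip(estimated, measured))
    var_est = sum((e - mean_est) ** 2 for e in estimated)
    var_meas = sum((m - mean_meas) ** 2 for m in measured)
    return cov / sqrt(var_est * var_meas)


def absolute_residuals(estimated, measured):
    """Per-sample absolute difference |estimated - measured| in mg/dL."""
    return [abs(e - m) for e, m in zip(estimated, measured)]


# Made-up values in mg/dL, for illustration only (NOT study data).
measured = [62.0, 95.0, 130.0, 171.0]
estimated = [58.0, 97.0, 127.0, 175.0]

r = pearson_r(estimated, measured)
residuals = absolute_residuals(estimated, measured)
median_residual = median(residuals)
```

In practice these would be computed only on the held-out test set, as the abstract describes, so that the reported numbers do not reflect data the model was fitted on.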

SUBMITTER: Singh G

PROVIDER: S-EPMC7526877 | biostudies-literature | 2020

REPOSITORIES: biostudies-literature


Publications

Comparing a novel machine learning method to the Friedewald formula and Martin-Hopkins equation for low-density lipoprotein estimation.

Singh G, Hussain Y, Xu Z, Sholle E, Michalak K, Dolan K, Lee BC, van Rosendael AR, Fatima Z, Peña JM, Wilson PWF, Gotto AM, Shaw LJ, Baskaran L, Al'Aref SJ

PLoS One, 2020-09-30, issue 9


PMID: 32997716

Similar Datasets

Project description: Importance: Recent studies have shown that Friedewald underestimates low-density lipoprotein cholesterol (LDL-C) at lower levels, which could result in undertreatment of high-risk patients. A novel method (Martin/Hopkins) using a patient-specific conversion factor provides more accurate LDL-C levels. However, this method has not been tested in proprotein convertase subtilisin/kexin type 9 (PCSK9) inhibitor-treated patients. Objective: To investigate accuracy of 2 different methods for estimating LDL-C levels (Martin/Hopkins and Friedewald) compared with gold standard preparative ultracentrifugation (PUC) in patients with low LDL-C levels in the Further Cardiovascular Outcomes Research With PCSK9 Inhibition in Patients With Elevated Risk (FOURIER) trial. Design, setting, and participants: The FOURIER trial was a randomized clinical trial of evolocumab vs placebo added to statin therapy in 27 564 patients with stable atherosclerotic cardiovascular disease. The patients' LDL-C levels were assessed at baseline, 4 weeks, 12 weeks, 24 weeks, and every 24 weeks thereafter, and measured directly by PUC when the level was less than 40 mg/dL per the Friedewald method (calculated as non-HDL-C level - triglycerides/5). In the Martin/Hopkins method, patient-specific ratios of triglycerides to very low-density lipoprotein cholesterol (VLDL-C) were determined and used to estimate VLDL-C, which was subtracted from the non-HDL-C level to obtain the LDL-C level. Main outcomes and measures: Low-density lipoprotein cholesterol calculated by the Friedewald and Martin/Hopkins methods, with PUC as the reference method. Results: For this analysis, the mean (SD) age was 62.7 (9.0) years; 2885 of the 12 742 patients were women (22.6%). A total of 56 624 observations from 12 742 patients had Friedewald, Martin/Hopkins, and PUC LDL-C measurements. The median difference from PUC LDL-C levels for Martin/Hopkins LDL-C levels was -2 mg/dL (interquartile range [IQR], -4 to 1 mg/dL) and for Friedewald LDL-C levels was -4 mg/dL (IQR, -8 to -1 mg/dL; P < .001). Overall, 22.9% of Martin/Hopkins LDL-C values were more than 5 mg/dL different than PUC values, and 2.6% were more than 10 mg/dL different than PUC levels. These were significantly less than respective proportions with Friedewald estimation (40.1% and 13.3%; P < .001), mainly because of underestimation by the Friedewald method. The correlation with PUC LDL-C was significantly higher for Martin/Hopkins vs Friedewald (ρ, 0.918 [95% CI 0.916-0.919] vs ρ, 0.867 [0.865-0.869], P < .001). Conclusions and relevance: In patients achieving low LDL-C with PCSK9 inhibition, the Martin/Hopkins method for LDL-C estimation more closely approximates gold standard PUC than Friedewald estimation does. The Martin/Hopkins method may prevent undertreatment because of LDL-C underestimation by the Friedewald method. Trial registration: ClinicalTrials.gov Identifier: NCT01764633.
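The two estimation methods described above differ only in the divisor applied to triglycerides: Friedewald fixes it at 5, while Martin/Hopkins uses a patient-specific ratio. The sketch below shows that shared structure and the "more than X mg/dL different" proportion reported in the results. All function names are hypothetical, the factor 6.0 is an arbitrary illustrative value rather than a published Martin/Hopkins ratio, and the sample numbers are made up.

```python
def ldl_c_with_factor(total_cholesterol, hdl_c, triglycerides, tg_to_vldl_ratio):
    """Estimate LDL-C (mg/dL) as non-HDL-C minus an estimated VLDL-C."""
    if tg_to_vldl_ratio <= 0:
        raise ValueError("TG:VLDL-C ratio must be positive")
    non_hdl_c = total_cholesterol - hdl_c
    vldl_c = triglycerides / tg_to_vldl_ratio
    return non_hdl_c - vldl_c


def friedewald_ldl_c(total_cholesterol, hdl_c, triglycerides):
    """Friedewald estimate: non-HDL-C minus triglycerides/5."""
    return ldl_c_with_factor(total_cholesterol, hdl_c, triglycerides, 5.0)


def share_differing_by_more_than(estimates, reference, threshold_mg_dl):
    """Fraction of estimates further than the threshold from the reference."""
    pairs = list(zip(estimates, reference))
    far = sum(1 for est, ref in pairs if abs(est - ref) > threshold_mg_dl)
    return far / len(pairs)


# Illustrative profile in mg/dL (made up, not trial data).
fixed = friedewald_ldl_c(120.0, 45.0, 150.0)            # 75 - 150/5
adjusted = ldl_c_with_factor(120.0, 45.0, 150.0, 6.0)   # 75 - 150/6, hypothetical factor

share = share_differing_by_more_than([45.0, 38.0, 30.0], [50.0, 39.0, 41.0], 5.0)
```

A ratio above 5 yields a smaller VLDL-C estimate and therefore a higher LDL-C estimate than Friedewald, which is consistent with the underestimation by Friedewald that the description reports.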

Project description: Importance: In clinical and research settings worldwide, low-density lipoprotein cholesterol (LDL-C) is typically estimated using the Friedewald equation. This equation assumes a fixed factor of 5 for the ratio of triglycerides to very low-density lipoprotein cholesterol (TG:VLDL-C); however, the actual TG:VLDL-C ratio varies significantly across the range of triglyceride and cholesterol levels. Objective: To derive and validate a more accurate method for LDL-C estimation from the standard lipid profile using an adjustable factor for the TG:VLDL-C ratio. Design, setting, and participants: We used a convenience sample of consecutive clinical lipid profiles obtained from 2009 through 2011 from 1,350,908 children, adolescents, and adults in the United States. Cholesterol concentrations were directly measured after vertical spin density-gradient ultracentrifugation, and triglycerides were directly measured. Lipid distributions closely matched the population-based National Health and Nutrition Examination Survey (NHANES). Samples were randomly assigned to derivation (n = 900,605) and validation (n = 450,303) data sets. Main outcomes and measures: Individual patient-level concordance in clinical practice guideline LDL-C risk classification using estimated vs directly measured LDL-C (LDL-CD). Results: In the derivation data set, the median TG:VLDL-C was 5.2 (IQR, 4.5-6.0). The triglyceride and non-high-density lipoprotein cholesterol (HDL-C) levels explained 65% of the variance in the TG:VLDL-C ratio. Based on strata of triglyceride and non-HDL-C values, a 180-cell table of median TG:VLDL-C values was derived and applied in the validation data set to estimate the novel LDL-C (LDL-CN). For patients with triglycerides lower than 400 mg/dL, overall concordance in guideline risk classification with LDL-CD was 91.7% (95% CI, 91.6%-91.8%) for LDL-CN vs 85.4% (95% CI, 85.3%-85.5%) for Friedewald LDL-C (LDL-CF) (P < .001). The greatest improvement in concordance occurred in classifying LDL-C lower than 70 mg/dL, especially in patients with high triglyceride levels. In patients with an estimated LDL-C lower than 70 mg/dL, LDL-CD was also lower than 70 mg/dL in 94.3% (95% CI, 93.9%-94.7%) for LDL-CN vs 79.9% (95% CI, 79.3%-80.4%) for LDL-CF in samples with triglyceride levels of 100 to 149 mg/dL; 92.4% (95% CI, 91.7%-93.1%) for LDL-CN vs 61.3% (95% CI, 60.3%-62.3%) for LDL-CF in samples with triglyceride levels of 150 to 199 mg/dL; and 84.0% (95% CI, 82.9%-85.1%) for LDL-CN vs 40.3% (95% CI, 39.4%-41.3%) for LDL-CF in samples with triglyceride levels of 200 to 399 mg/dL (P < .001 for each comparison). Conclusions and relevance: A novel method to estimate LDL-C using an adjustable factor for the TG:VLDL-C ratio provided more accurate guideline risk classification than the Friedewald equation. These findings require external validation, as well as assessment of their clinical importance. The implementation of these findings into clinical practice would be straightforward and at virtually no cost. Trial registration: ClinicalTrials.gov Identifier: NCT01698489.
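The adjustable-factor method described above amounts to a two-dimensional table lookup: triglyceride and non-HDL-C strata select a median TG:VLDL-C ratio, which then replaces the fixed factor of 5. The sketch below shows only the lookup mechanics. The stratum edges and the factors in the 3x2 table are PLACEHOLDERS chosen for illustration; they are not the published 180-cell table, and the names are hypothetical.

```python
from bisect import bisect_right

# PLACEHOLDER stratum edges in mg/dL (not the published strata).
TG_EDGES = [100, 200]     # three TG strata: <100, 100-199, >=200
NON_HDL_EDGES = [130]     # two non-HDL-C strata: <130, >=130

# PLACEHOLDER TG:VLDL-C factors, rows = TG stratum, columns = non-HDL-C stratum.
FACTOR_TABLE = [
    [4.5, 4.2],
    [5.2, 5.0],
    [6.0, 5.8],
]


def lookup_factor(triglycerides, non_hdl_c):
    """Return the table factor for the stratum containing this profile."""
    row = bisect_right(TG_EDGES, triglycerides)
    col = bisect_right(NON_HDL_EDGES, non_hdl_c)
    return FACTOR_TABLE[row][col]


def estimate_ldl_c(total_cholesterol, hdl_c, triglycerides):
    """LDL-C estimate (mg/dL) using a stratum-specific TG:VLDL-C factor."""
    non_hdl_c = total_cholesterol - hdl_c
    factor = lookup_factor(triglycerides, non_hdl_c)
    return non_hdl_c - triglycerides / factor


# Illustrative profile in mg/dL (made up): non-HDL-C = 130, TG = 150.
example = estimate_ldl_c(180.0, 50.0, 150.0)
```

With the real table, only `TG_EDGES`, `NON_HDL_EDGES`, and `FACTOR_TABLE` would change; the lookup itself stays the same, which fits the description's point that implementation would be straightforward.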

Project description: Importance: Low-density lipoprotein cholesterol (LDL-C), a key cardiovascular disease marker, is often estimated by the Friedewald or Martin equation, but calculating LDL-C is less accurate in patients with a low LDL-C level or hypertriglyceridemia (triglyceride [TG] levels ≥400 mg/dL). Objective: To design a more accurate LDL-C equation for patients with a low LDL-C level and/or hypertriglyceridemia. Design, setting, and participants: Data on LDL-C levels and other lipid measures from 8656 patients seen at the National Institutes of Health Clinical Center between January 1, 1976, and June 2, 1999, were analyzed by the β-quantification reference method (18 715 LDL-C test results) and were randomly divided into equally sized training and validation data sets. Using TG and non-high-density lipoprotein cholesterol as independent variables, multiple least squares regression was used to develop an equation for very low-density lipoprotein cholesterol, which was then used in a second equation for LDL-C. Equations were tested against the internal validation data set and multiple external data sets of either β-quantification LDL-C results (n = 28 891) or direct LDL-C test results (n = 252 888). Statistical analysis was performed from August 7, 2018, to July 18, 2019. Main outcomes and measures: Concordance between calculated and measured LDL-C levels by β-quantification, as assessed by various measures of test accuracy (correlation coefficient [R²], root mean square error [RMSE], mean absolute difference [MAD]), and percentage of patients misclassified at LDL-C treatment thresholds of 70, 100, and 190 mg/dL. Results: Compared with β-quantification, the new equation was more accurate than other LDL-C equations (slope, 0.964; RMSE = 15.2 mg/dL; R² = 0.9648; vs Friedewald equation: slope, 1.056; RMSE = 32 mg/dL; R² = 0.8808; vs Martin equation: slope, 0.945; RMSE = 25.7 mg/dL; R² = 0.9022), particularly for patients with hypertriglyceridemia (MAD = 24.9 mg/dL; vs Friedewald equation: MAD = 56.4 mg/dL; vs Martin equation: MAD = 44.8 mg/dL). The new equation calculates the LDL-C level in patients with TG levels up to 800 mg/dL as accurately as the Friedewald equation does for TG levels less than 400 mg/dL and was associated with 35% fewer misclassifications when patients with hypertriglyceridemia (TG levels, 400-800 mg/dL) were categorized into different LDL-C treatment groups. Conclusions and relevance: The new equation can be readily implemented by clinical laboratories with no additional costs compared with the standard lipid panel. It will allow for more accurate calculation of LDL-C level in patients with low LDL-C levels and/or hypertriglyceridemia (TG levels, ≤800 mg/dL) and thus should improve the use of LDL-C level in cardiovascular disease risk management.
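The accuracy measures named above (RMSE, MAD, and misclassification at the 70, 100, and 190 mg/dL treatment thresholds) can be sketched in a few lines. The function names are hypothetical, the sample values are made up, and assigning a value that sits exactly on a threshold to the higher group is an assumption; the description does not state how ties were handled.

```python
from bisect import bisect_right
from math import sqrt

# Treatment thresholds in mg/dL, as listed in the description above.
TREATMENT_THRESHOLDS = (70, 100, 190)


def rmse(calculated, measured):
    """Root mean square error between calculated and measured LDL-C."""
    squared = [(c - m) ** 2 for c, m in zip(calculated, measured)]
    return sqrt(sum(squared) / len(squared))


def mean_absolute_difference(calculated, measured):
    """Mean of |calculated - measured| in mg/dL."""
    diffs = [abs(c - m) for c, m in zip(calculated, measured)]
    return sum(diffs) / len(diffs)


def treatment_group(ldl_c):
    """Index of the treatment group; a value on a threshold goes to the higher group (assumption)."""
    return bisect_right(TREATMENT_THRESHOLDS, ldl_c)


def misclassified_share(calculated, measured):
    """Fraction of samples whose calculated and measured groups differ."""
    pairs = list(zip(calculated, measured))
    wrong = sum(1 for c, m in pairs if treatment_group(c) != treatment_group(m))
    return wrong / len(pairs)


# Made-up values in mg/dL, for illustration only (NOT study data).
measured = [65.0, 98.0, 150.0, 195.0]
calculated = [72.0, 95.0, 160.0, 185.0]

error = rmse(calculated, measured)
mad = mean_absolute_difference(calculated, measured)
misclassified = misclassified_share(calculated, measured)
```

Here two of the four samples cross a threshold (65 vs 72 around 70, and 195 vs 185 around 190), so a modest absolute error can still change the treatment group, which is why the description reports misclassification alongside RMSE and MAD.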
