Unknown

Dataset Information

0

Multi-TGDR, a multi-class regularization method, identifies the metabolic profiles of hepatocellular carcinoma and cirrhosis infected with hepatitis B or hepatitis C virus.


ABSTRACT: Over the last decade, metabolomics has evolved into a mainstream enterprise utilized by many laboratories globally. Like other "omics" data, metabolomics data has the characteristics of a smaller sample size compared to the number of features evaluated. Thus the selection of an optimal subset of features with a supervised classifier is imperative. We extended an existing feature selection algorithm, threshold gradient descent regularization (TGDR), to handle multi-class classification of "omics" data, and proposed two such extensions referred to as multi-TGDR. Both multi-TGDR frameworks were used to analyze a metabolomics dataset that compares the metabolic profiles of hepatocellular carcinoma (HCC) infected with hepatitis B (HBV) or C virus (HCV) with that of cirrhosis induced by HBV/HCV infection; the goal was to improve early-stage diagnosis of HCC.We applied two multi-TGDR frameworks to the HCC metabolomics data that determined TGDR thresholds either globally across classes, or locally for each class. Multi-TGDR global model selected 45 metabolites with a 0% misclassification rate (the error rate on the training data) and had a 3.82% 5-fold cross-validation (CV-5) predictive error rate. Multi-TGDR local selected 48 metabolites with a 0% misclassification rate and a 5.34% CV-5 error rate.One important advantage of multi-TGDR local is that it allows inference for determining which feature is related specifically to the class/classes. Thus, we recommend multi-TGDR local be used because it has similar predictive performance and requires the same computing time as multi-TGDR global, but may provide class-specific inference.

SUBMITTER: Tian S 

PROVIDER: S-EPMC4234477 | biostudies-literature | 2014 Apr

REPOSITORIES: biostudies-literature

altmetric image

Publications

Multi-TGDR, a multi-class regularization method, identifies the metabolic profiles of hepatocellular carcinoma and cirrhosis infected with hepatitis B or hepatitis C virus.

Tian Suyan S   Chang Howard H HH   Wang Chi C   Jiang Jing J   Wang Xiaomei X   Niu Junqi J  

BMC bioinformatics 20140404


<h4>Background</h4>Over the last decade, metabolomics has evolved into a mainstream enterprise utilized by many laboratories globally. Like other "omics" data, metabolomics data has the characteristics of a smaller sample size compared to the number of features evaluated. Thus the selection of an optimal subset of features with a supervised classifier is imperative. We extended an existing feature selection algorithm, threshold gradient descent regularization (TGDR), to handle multi-class classi  ...[more]

Similar Datasets

| S-EPMC3833980 | biostudies-literature
| S-EPMC5355740 | biostudies-literature
| S-EPMC3321022 | biostudies-literature
| S-EPMC4260413 | biostudies-literature
| S-EPMC3951426 | biostudies-literature
2013-01-22 | GSE15654 | GEO
| S-EPMC1774397 | biostudies-literature
| S-EPMC5757181 | biostudies-literature
| S-EPMC2755230 | biostudies-literature
| S-EPMC2760187 | biostudies-literature