Dataset Information

Representing and extracting lung cancer study metadata: study objective and study design.


ABSTRACT: This paper describes the information retrieval step in Casama (Contextualized Semantic Maps), a project that summarizes and contextualizes current research papers on driver mutations in non-small cell lung cancer. Casama's representation of lung cancer studies aims to capture elements that will assist an end-user in retrieving studies and, importantly, judging their strength. This paper focuses on two types of study metadata: study objective and study design. 430 abstracts on EGFR and ALK mutations in lung cancer were annotated manually. Casama's support vector machine (SVM) automatically classified the abstracts by study objective with as much as 129% higher F-scores compared to PubMed's built-in filters. A second SVM classified the abstracts by epidemiological study design, suggesting strength of evidence at a more granular level than in previous work. The classification results and the top features determined by the classifiers suggest that this scheme would be generalizable to other mutations in lung cancer, as well as studies on driver mutations in other cancer domains.
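The abstract reports its headline result as a relative gain in F-score. As a rough illustration of how such a figure is usually derived, the sketch below computes an F-score from confusion counts and then the percent by which one score exceeds another. The function names and all counts are hypothetical and chosen for easy arithmetic; they are not taken from the paper, and the abstract does not state which F-score variant was used (F1 is assumed here).

```python
def f_score(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """F-beta score from confusion counts; returns 0.0 when there are no true positives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def relative_gain_percent(new: float, baseline: float) -> float:
    """Percent by which `new` exceeds `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (new - baseline) / baseline


# Hypothetical counts, for illustration only (not results from the paper).
classifier_f = f_score(tp=40, fp=10, fn=10)  # precision = recall = 0.8
filter_f = f_score(tp=20, fp=30, fn=30)      # precision = recall = 0.4
print(round(relative_gain_percent(classifier_f, filter_f), 1))
```

Read this way, a "129% higher" F-score would mean the classifier's score is about 2.29 times the baseline filter's score, not 129 percentage points above it.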

SUBMITTER: Garcia-Gathright JI 

PROVIDER: S-EPMC4331232 | biostudies-literature | 2015 Mar

REPOSITORIES: biostudies-literature

Publications

Representing and extracting lung cancer study metadata: study objective and study design.

Garcia-Gathright, Jean I; Oh, Andrea; Abarca, Phillip A; Han, Mary; Sago, William; Spiegel, Marshall L; Wolf, Brian; Garon, Edward B; Bui, Alex A T; Aberle, Denise R

Computers in Biology and Medicine, 2015-01-13



Similar Datasets

S-EPMC5265225 | biostudies-literature
PRJEB40728 | ENA
S-EPMC4892825 | biostudies-other
S-EPMC4319840 | biostudies-other
S-EPMC4565530 | biostudies-literature
S-EPMC516245 | biostudies-literature
S-EPMC10373112 | biostudies-literature
GSE10445 | GEO (2009-01-01)
PRJEB22174 | ENA
S-ECPF-SMDB-810 | biostudies-other