
Dataset Information


Understanding disciplinary vocabularies using a full-text enabled domain-independent term extraction approach.


ABSTRACT: Publication metadata help deliver rich analyses of scholarly communication. However, research concepts and ideas are more effectively expressed through unstructured fields such as full texts. Thus, the goals of this paper are to employ a full-text enabled method to extract terms relevant to disciplinary vocabularies, and through them, to understand the relationships between disciplines. This paper uses an efficient, domain-independent term extraction method to extract disciplinary vocabularies from a large multidisciplinary corpus of PLoS ONE publications. It finds a power-law pattern in the frequency distributions of terms present in each discipline, indicating a semantic richness potentially sufficient for further study and advanced analysis. The salient relationships amongst these vocabularies become apparent in application of a principal component analysis. For example, Mathematics and Computer and Information Sciences were found to have similar vocabulary use patterns along with Engineering and Physics; while Chemistry and the Social Sciences were found to exhibit contrasting vocabulary use patterns along with the Earth Sciences and Chemistry. These results have implications to studies of scholarly communication as scholars attempt to identify the epistemological cultures of disciplines, and as a full text-based methodology could lead to machine learning applications in the automated classification of scholarly work according to disciplinary vocabularies.
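As a rough illustration of the power-law pattern the abstract mentions, the sketch below ranks term counts and fits a straight line to log(frequency) against log(rank). This is not the paper's extraction or fitting procedure: the function names `rank_frequency` and `loglog_slope` and the toy vocabulary are invented for this example, and a log-log least-squares fit is only a quick diagnostic (maximum-likelihood estimators are generally preferred for real data).

```python
import math
from collections import Counter


def rank_frequency(terms):
    """Return term counts sorted from most to least frequent."""
    return sorted(Counter(terms).values(), reverse=True)


def loglog_slope(frequencies):
    """Least-squares slope of log(frequency) against log(rank)."""
    xs = [math.log(rank) for rank in range(1, len(frequencies) + 1)]
    ys = [math.log(f) for f in frequencies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Toy vocabulary (hypothetical data): term i occurs exactly 120 / i times,
# so the rank-frequency curve is a perfect power law by construction.
toy_terms = []
for i in range(1, 7):
    toy_terms.extend([f"term{i}"] * (120 // i))

freqs = rank_frequency(toy_terms)
slope = loglog_slope(freqs)
print(freqs, round(slope, 2))
```

Because the toy counts follow 120 / rank exactly, the fitted slope is -1 up to floating-point error. On a real disciplinary vocabulary, a roughly straight log-log curve would be consistent with a power law, but would not by itself confirm one.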

SUBMITTER: Yan E 

PROVIDER: S-EPMC5706669 | biostudies-literature | 2017

REPOSITORIES: biostudies-literature


Publications

Understanding disciplinary vocabularies using a full-text enabled domain-independent term extraction approach.

Erjia Yan, Jake Williams, Zheng Chen

PLoS ONE, 2017-11-29, issue 11



Similar Datasets

S-EPMC1090555 | biostudies-literature
S-EPMC3441580 | biostudies-literature
S-EPMC4681986 | biostudies-literature
S-EPMC8256824 | biostudies-literature
S-EPMC2367623 | biostudies-literature
S-EPMC2939881 | biostudies-literature
S-EPMC5442348 | biostudies-literature
S-EPMC7454993 | biostudies-literature
S-EPMC3475109 | biostudies-literature
S-EPMC3179660 | biostudies-literature