
Dataset Information


Enriching representation learning using 53 million patient notes through human phenotype ontology embedding.


ABSTRACT: The Human Phenotype Ontology (HPO) is a dictionary of >15,000 clinical phenotypic terms with defined semantic relationships, developed to standardize phenotypic analysis. Over the last decade, the HPO has been used to accelerate the implementation of precision medicine into clinical practice. In addition, recent research in representation learning, specifically in graph embedding, has led to notable progress in automated prediction via learned features. Here, we present a novel approach to phenotype representation by incorporating phenotypic frequencies based on 53 million full-text health care notes from >1.5 million individuals. We demonstrate the efficacy of our proposed phenotype embedding technique by comparing our work to existing phenotypic similarity-measuring methods. Using phenotype frequencies in our embedding technique, we are able to identify phenotypic similarities that surpass current computational models. Furthermore, our embedding technique exhibits a high degree of agreement with domain experts' judgment. By transforming complex and multidimensional phenotypes from the HPO format into vectors, our proposed method enables efficient representation of these phenotypes for downstream tasks that require deep phenotyping. This is demonstrated in a patient similarity analysis and can further be applied to disease trajectory and risk prediction.

SUBMITTER: Daniali M 

PROVIDER: S-EPMC10782859 | biostudies-literature | 2023 May

REPOSITORIES: biostudies-literature


Publications

Enriching representation learning using 53 million patient notes through human phenotype ontology embedding.

Maryam Daniali, Peter D. Galer, David Lewis-Smith, Shridhar Parthasarathy, Edward Kim, Dario D. Salvucci, Jeffrey M. Miller, Scott Haag, Ingo Helbig

Artificial Intelligence in Medicine, 2023-02-28



Similar Datasets

S-EPMC5737094 | biostudies-literature
S-EPMC7779012 | biostudies-literature
S-EPMC4718658 | biostudies-literature
S-EPMC10889906 | biostudies-literature
S-EPMC8862729 | biostudies-literature
S-EPMC8367145 | biostudies-literature
S-EPMC10823585 | biostudies-literature
S-EPMC8048212 | biostudies-literature
S-EPMC10022456 | biostudies-literature
S-EPMC7085143 | biostudies-literature