Improving medical term embeddings using UMLS Metathesaurus
Yang, Ziyu
Genre
Journal article
Date
2022-04-29
Department
Computer and Information Sciences
DOI
http://dx.doi.org/10.1186/s12911-022-01850-5
Abstract
Background: Health providers create Electronic Health Records (EHRs) to describe the conditions and procedures used to treat their patients. Medical notes entered by medical staff in the form of free text are a particularly insightful component of EHRs. There is great interest in applying machine learning tools to medical notes in numerous medical informatics applications. Learning vector representations, or embeddings, of terms in the notes is an important pre-processing step in such applications. However, learning good embeddings is challenging because medical notes are rich in specialized terminology, and the number of available EHRs in practical applications is often very small.
Methods: In this paper, we propose a novel algorithm to learn embeddings of medical terms from a limited set of medical notes. The algorithm, called definition2vec, exploits external information in the form of medical term definitions. It is an extension of the skip-gram algorithm that incorporates textual definitions of medical terms provided by the Unified Medical Language System (UMLS) Metathesaurus.
Results: To evaluate the proposed approach, we used the publicly available Medical Information Mart for Intensive Care (MIMIC-III) EHR data set. We performed quantitative and qualitative experiments to measure the usefulness of the learned embeddings. The experimental results show that definition2vec keeps semantically similar medical terms together in the embedding vector space even when they are rare or unobserved in the corpus. We also demonstrate that the learned vector embeddings are helpful in downstream medical informatics applications.
Conclusion: This paper shows that medical term definitions can be helpful when learning embeddings of rare or previously unseen medical terms from a small corpus of specialized documents such as medical notes.
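The general idea in the Methods summary, skip-gram training supplemented with term definitions, can be illustrated with a small sketch. This is not the paper's implementation of definition2vec. It only shows one plausible way to add definition-derived (term, word) pairs to ordinary skip-gram (target, context) pairs. The function names `skipgram_pairs` and `definition_pairs`, the toy note, and the toy definition are all hypothetical and invented for illustration; they are not drawn from MIMIC-III or UMLS.

```python
from typing import Dict, List, Tuple


def skipgram_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Standard skip-gram (target, context) pairs within a symmetric window."""
    pairs = []
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs


def definition_pairs(definitions: Dict[str, str]) -> List[Tuple[str, str]]:
    """Assumed extension: pair each term with every word of its definition."""
    pairs = []
    for term, text in definitions.items():
        for word in text.lower().split():
            pairs.append((term, word))
    return pairs


# Toy data, invented for illustration only.
note = "patient admitted with sepsis and started on vancomycin".split()
toy_definitions = {"vancomycin": "antibiotic used to treat bacterial infection"}

corpus_pairs = skipgram_pairs(note, window=2)
extra_pairs = definition_pairs(toy_definitions)

# A rare term such as "vancomycin" now has additional training signal
# coming from its definition, not only from its few corpus contexts.
training_pairs = corpus_pairs + extra_pairs
```

In this sketch the combined `training_pairs` list could be fed to any skip-gram trainer. How definition2vec actually weights or encodes definitions is described in the paper itself and is not reproduced here.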
Citation
Chanda, A.K., Bai, T., Yang, Z. et al. Improving medical term embeddings using UMLS Metathesaurus. BMC Med Inform Decis Mak 22, 114 (2022). https://doi.org/10.1186/s12911-022-01850-5
Citation to related work
BMC
Has part
BMC Medical Informatics and Decision Making, Vol. 22