Programmatic Theme: Data Science

Abstract: Seamless sharing between imaging facilities of medical images obtained on the same patient is crucial in providing accurate and efficient care. However, the terminology used to describe semantically similar examinations can vary widely between facilities. Current practice is manual table-based mapping to a standard terminology, which has substantial potential for mislabelled and missing examinations. In this work, we establish several baseline methods for automating the mapping of radiology imaging procedure descriptions to a SNOMED CT-based standard terminology. Our best-performing baseline, consisting of a bag-of-words representation and a shallow neural network, achieved 96.3% accuracy. In addition, we explore an unsupervised clustering method that performs relevancy matching without the need for an intervening standard. Lastly, we make the procedure name dataset used in this work available to encourage extension of this application.

Learning Objective: Understand the purpose and function of Diagnostic Imaging Repositories (DIRs)
Understand the data format and organization of data exchange between sites and a DIR
Learn how Natural Language Processing and Machine Learning techniques perform on this dataset
Learn multiple methods to automate manual processes currently employed by most DIRs


Salaar Liaqat (Presenter)
University of Toronto

Joanna Pineda, University of Toronto
Jeevaa Velayutham, University of Toronto
Allen Lee, University of Toronto
Joshua Reicher, Stanford Health Care
Jason Nagels, Hospital Diagnostic Imaging Repository Services
Marzyeh Ghassemi, University of Toronto
Benjamin Fine, Trillium Health Partners

Keywords, Themes & Types