Programmatic Theme: Clinical Research Informatics

Abstract: Identifying patient characteristics that influence the rate of colorectal polyp recurrence can show which patients are at higher risk. We used natural language processing to extract polyp morphological characteristics from the electronic medical records of 953 patients presenting with polyps. Using subsequent colonoscopy reports, we examined how time to polyp recurrence (731 patients experienced recurrence) is influenced by these characteristics and by anthropometric features, applying Kaplan-Meier curves, Cox proportional hazards modeling, and random survival forest models. The rate of recurrence differed significantly by polyp size, number, and location, and by patient smoking status. Right-sided colon polyps increased recurrence risk by 30% compared to left-sided polyps, and a history of tobacco use increased recurrence risk by 20% compared to never-users. A random survival forest model achieved an AUC of 0.65 and identified several other predictive variables, which can inform development of personalized polyp surveillance plans.
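
To illustrate the Kaplan-Meier step named in the abstract, the following is a minimal sketch in plain Python of the product-limit estimate of recurrence-free probability. The function name `kaplan_meier` and the toy follow-up data are hypothetical and are not taken from the study, which does not state what software was used.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, recurrence_free_probability)] at each observed event time.

    durations: follow-up time per patient (toy units, e.g. months)
    events:    1 if recurrence was observed at that time, 0 if censored
    """
    recurrences = Counter(t for t, e in zip(durations, events) if e)
    exits = Counter(durations)  # every patient leaves the risk set at their own time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = recurrences.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction still recurrence-free
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical toy data (assumption for illustration only)
durations = [6, 12, 12, 18, 24, 30]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(durations, events)
for t, s in curve:
    print(f"t={t}: S(t)={s:.3f}")
```

Censored patients (event flag 0) contribute to the risk set up to their last follow-up but never trigger a drop in the curve, which is why this estimator suits colonoscopy follow-up data where not every patient has an observed recurrence.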

Learning Objective: Predicting colorectal polyp recurrence using time-to-event analysis of medical records


Lia Harrington (Presenter)
Dartmouth College

Jason Wei, Dartmouth College
Arief Suriawinata, Dartmouth-Hitchcock Medical Center
Todd MacKenzie, Dartmouth College
Saeed Hassanpour, Dartmouth College

Keywords, Themes & Types