Oral Presentations

GenomeForest: An Ensemble Machine Learning Classifier for Endometriosis

12:33 PM–12:51 PM Mar 24, 2020 (America - Chicago)



Programmatic Theme: Translational Bioinformatics

Abstract: Endometriosis is a complex, high-impact disease affecting 176 million women worldwide, with a diagnostic latency of 4 to 11 years due to the lack of a definitive clinical symptom or a minimally invasive diagnostic method. In this study, we developed a new ensemble machine learning classifier based on chromosomal partitioning, named GenomeForest, and applied it to classify endometriosis patients vs. control patients using 38 RNA-seq and 80 enrichment-based DNA-methylation (MBD-seq) datasets, assessing performance in six different experiments. The ensemble machine learning models provided an avenue for identifying several candidate biomarker genes and achieved a near-perfect F1 score (0.968) on the transcriptomics dataset and a very high F1 score (0.918) on the methylomics dataset. We hope that, in the future, findings from ensemble machine learning classifiers such as the one demonstrated in this study will allow endometriosis to be diagnosed with a less invasive biopsy.
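
As a rough illustration of the two ideas named in the abstract (an ensemble over chromosome-based partitions, and evaluation by F1 score), the following minimal Python sketch combines per-partition votes by simple majority and scores the result. It is not the GenomeForest implementation: the abstract does not describe how partition-level predictions are combined, so the majority vote, the function names, and the toy labels below are all assumptions made for illustration only.

```python
from collections import Counter


def majority_vote(votes):
    """Return the most common label among per-partition votes.

    Assumption: each chromosome partition has its own (hypothetical)
    model that casts one vote per patient. Use an odd number of
    partitions to avoid ties.
    """
    return Counter(votes).most_common(1)[0][0]


def f1_score(y_true, y_pred, positive="endometriosis"):
    """F1 = 2*TP / (2*TP + FP + FN) for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy, made-up votes from three hypothetical partition models per patient.
votes_per_patient = [
    ["endometriosis", "endometriosis", "control"],
    ["control", "control", "control"],
    ["endometriosis", "control", "control"],
    ["endometriosis", "endometriosis", "endometriosis"],
]
y_true = ["endometriosis", "control", "endometriosis", "endometriosis"]

y_pred = [majority_vote(v) for v in votes_per_patient]
score = f1_score(y_true, y_pred)
print(y_pred, score)
```

In this toy example one endometriosis patient is outvoted and misclassified, so the F1 score is below 1. The values are invented and bear no relation to the results reported in the abstract.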

Learning Objective:
1. Learn machine learning classification applications on transcriptomics data
2. Learn machine learning classification applications on methylomics data
3. Learn disease classification using multi-omics data


Sadia Akter (Presenter)
University of Missouri

Dong Xu, University of Missouri
Susan Nagel, University of Missouri
John Bromfield, University of Missouri
Katherine Pelch, University of Missouri
Gilbert Wilshire, Boone Hospital
Trupti Joshi, University of Missouri

Keywords, Themes & Types