Exploring Clinical Outcomes using Topological Data Analysis

Yara Skaf, BS, Osama Dasa, MD, MPH, Jason Cory Brunson, PhD, Reinhard Laubenbacher, PhD

Analysis and modeling of complex clinical data such as electronic health records (EHRs) remain challenging despite modern advances in biomedical informatics. Their large size, high dimensionality, and abundant heterogeneity make such data difficult to visualize or explore through conventional means. In this project, we aim to demonstrate the potential of topological data analysis (TDA) for addressing these challenges, specifically in the context of COVID-19 outcome prediction. We use a TDA algorithm called Mapper to build topological models of a population of COVID-19 patients from EHRs extracted from the OneFlorida database. After implementing several modifications to the Mapper algorithm to adapt it for use on this type of data, we use this topological approach to conduct population-level exploratory analyses, with an emphasis on identifying phenotypic subtypes at increased risk of adverse outcomes such as major adverse cardiovascular events (MACE), mechanical ventilation, and death.
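
For readers unfamiliar with Mapper, the following is a minimal sketch of the generic, textbook form of the algorithm, not the modified version developed in this project. It assumes a one-dimensional filter (lens) function, a cover of the lens range by equal-length overlapping intervals, and single-linkage clustering at a fixed distance threshold. The function names (`interval_cover`, `single_linkage`, `mapper`), the parameters, and the toy points are all hypothetical and chosen only for illustration; no patient data are involved.

```python
from itertools import combinations
from math import dist


def interval_cover(lo, hi, n_intervals, overlap):
    """Cover [lo, hi] with n_intervals equal-length intervals.

    `overlap` is the fraction of each interval shared with its neighbor.
    """
    length = (hi - lo) / (n_intervals - (n_intervals - 1) * overlap)
    step = length * (1 - overlap)
    cover = [(lo + i * step, lo + i * step + length) for i in range(n_intervals)]
    # Clamp the last endpoint so rounding error cannot exclude the maximum.
    cover[-1] = (cover[-1][0], hi)
    return cover


def single_linkage(points, indices, eps):
    """Cluster `indices` as connected components of the eps-neighbor graph."""
    parent = {i: i for i in indices}

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(indices, 2):
        if dist(points[i], points[j]) <= eps:
            parent[find(i)] = find(j)

    clusters = {}
    for i in indices:
        clusters.setdefault(find(i), set()).add(i)
    return list(clusters.values())


def mapper(points, lens, n_intervals=4, overlap=0.25, eps=1.0):
    """Return (nodes, edges) of a Mapper graph.

    Each node is a set of point indices; an edge joins two nodes that
    share at least one point.
    """
    values = [lens(p) for p in points]
    nodes = []
    for a, b in interval_cover(min(values), max(values), n_intervals, overlap):
        # Pull back the interval through the lens, then cluster the preimage.
        members = [i for i, v in enumerate(values) if a <= v <= b]
        nodes.extend(single_linkage(points, members, eps))
    edges = [
        (u, v)
        for u, v in combinations(range(len(nodes)), 2)
        if nodes[u] & nodes[v]
    ]
    return nodes, edges


# Toy example: two well-separated groups of points on a line.
toy_points = [(0, 0), (1, 0), (2, 0), (3, 0), (10, 0), (11, 0), (12, 0)]
toy_nodes, toy_edges = mapper(toy_points, lens=lambda p: p[0], eps=1.5)
```

In this toy example the two groups end up in different connected components of the resulting graph, which is the kind of structure a population-level exploratory analysis would look for. In practice the lens, cover, and clustering method are all choices that need to be adapted to the data at hand.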



  • Skaf Y, Laubenbacher R. Topological data analysis in biomedicine: A review. J Biomed Inform. 2022 Jun;130:104082. doi: 10.1016/j.jbi.2022.104082. Epub 2022 May 1. PMID: 35508272.
  • Brunson JC, Skaf Y. Fixed and adaptive landmark sets for finite pseudometric spaces. 2022 Dec 21. arXiv:2212.09826.