Date: April 26, 2022
Speaker: Dr. Vincent Major
Affiliation: NYU Grossman School of Medicine
Title: Best practices to develop AI for health: some (very) applied tips
Abstract: The choices we make as researchers or data scientists when preparing data to develop a machine learning model hugely influence how useful that model will be in real-world settings. In an obvious case, a model built on data that is only finalized upon a patient’s discharge, e.g. primary diagnosis or length-of-stay, is guaranteed to give misleading estimates if it is used earlier in the stay, before those values are final. This kind of ‘off-label’ usage can slip into live systems when the development or validation doesn’t perfectly match how the system is likely to be used. Optimistic validation performance can sour into embarrassing results once live. Here I present some empirical results comparing experimental designs, plus a variety of anecdotes from my own and related work.