AI models pre-trained on internet data can understand text, code, audio, and video. However, as public data sources become exhausted, it is evident that enabling applications beyond consumer chatbots requires a thoughtful approach to data curation. Mistakes are costly in business applications, and intelligent agents must carefully collect and leverage proprietary data, such as customer feedback and user interactions.
As ML systems increasingly affect high-stakes decisions, it is critical that they maintain reliable performance in operation. However, traditional modeling assumptions rarely hold in practice due to noisy inputs, shifts in environment, omitted variables, and even adversarial attacks. The standard machine learning paradigm, which optimizes average performance, is brittle to even small distributional shifts and performs poorly on minority groups and tail inputs. Even heavily engineered state-of-the-art models degrade significantly on domains that differ slightly from their training data. This limited understanding of failure modes highlights the need for models that work reliably and for rigorous safety tests to evaluate them.
Division: Decision, Risk and Operations
Spring 2025
B9145 - 001
Format: Full Term
Day(s): Thursday
Date(s): 01/27/2025 - 05/02/2025
Start/End Time: 9:00AM - 12:15PM
Room: Kravis 430