AI Safety & Trustworthiness
We study when and how AI systems fail, and how to make their behavior more reliable in safety-critical settings such as clinical care.
This includes:
- Bias and fairness – detecting and characterizing underdiagnosis and demographic leakage in imaging AI and language models (a minimal audit is sketched after this list).
- Robustness to clinical variation – stress-testing models under real-world shifts in image acquisition, scanner hardware, and imaging protocols (see the second sketch below).
- Security & adversarial bias – understanding “hidden in plain sight” attacks and other subtle ways systems can be manipulated.
- Best practices for generative AI – guidelines for the safe use of large language models in radiology and clinical workflows.
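As a concrete illustration of the bias-and-fairness theme, the snippet below sketches a minimal underdiagnosis audit: comparing false-negative rates across demographic groups. It is a toy example with assumed column names (`label`, `pred`, `group`) and synthetic data, not a fragment of our evaluation pipelines.

```python
import pandas as pd

# Hypothetical per-case results: ground-truth label, thresholded model
# prediction, and a demographic attribute (all column names assumed).
df = pd.DataFrame({
    "label": [1, 1, 1, 1, 0, 1, 1, 0],
    "pred":  [1, 0, 1, 0, 0, 1, 0, 0],
    "group": ["A", "A", "A", "B", "B", "B", "B", "A"],
})

# Underdiagnosis signal: among truly positive cases, how often is each
# group's case missed (predicted negative)?
positives = df[df["label"] == 1]
fnr_by_group = positives["pred"].eq(0).groupby(positives["group"]).mean()
print(fnr_by_group)  # a persistent gap between groups flags potential bias
```

In the same spirit, stress-testing for robustness to clinical variation can start from simple acquisition-style perturbations. The sketch below uses a toy thresholding "model" on synthetic images; a real study would wrap a trained classifier and use perturbations matched to actual scanner and protocol shifts.

```python
import numpy as np

rng = np.random.default_rng(0)

def stress_test(model, images, labels):
    """Accuracy on clean images vs. simple acquisition-style
    perturbations (additive noise, global intensity change)."""
    perturbations = {
        "clean":      lambda x: x,
        "noise":      lambda x: x + rng.normal(0.0, 0.1, x.shape),
        "brightness": lambda x: np.clip(x * 1.3, 0.0, 1.0),
    }
    return {name: float((model(fn(images)) == labels).mean())
            for name, fn in perturbations.items()}

# Toy stand-in: the "model" thresholds the mean intensity of each image.
def model(x):
    return (x.mean(axis=(1, 2)) > 0.5).astype(int)

images = rng.random((32, 64, 64)) # 32 synthetic 64x64 "scans"
labels = model(images)            # labels that the clean model gets right
print(stress_test(model, images, labels))  # accuracy drop under shift
```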
The goal is to design evaluation frameworks and mitigation strategies that go beyond aggregate accuracy, placing safety and trust at the center of AI deployment.