Enhancing AI Reliability in Medical Diagnosis
Researchers at MIT have developed a novel method to improve the trustworthiness of AI models in high-stakes medical settings. The technique addresses the challenge of uncertainty in AI predictions, particularly in medical imaging where multiple conditions can present similarly.
In medical imaging, AI models can assist clinicians by identifying subtle details and making diagnosis more efficient. However, because images are inherently ambiguous, a model that reports every plausible condition often produces a large set of possible diagnoses, which makes it hard for clinicians to pinpoint the correct one. To address this, the MIT team improved a technique called conformal classification, which returns a set of probable diagnoses along with a guarantee that the correct diagnosis is somewhere in the set.
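To make the idea of a guaranteed prediction set concrete, here is a minimal sketch of one common variant, split conformal classification. It is an illustration under stated assumptions, not the researchers' code: the function names, the score (one minus the probability of the true class), and the `alpha` miss-rate parameter are all choices made for this example.

```python
import math

def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """Calibrate a score threshold on held-out labeled examples.

    cal_probs: one list of class probabilities per calibration example.
    cal_labels: the true class index for each calibration example.
    alpha: tolerated miss rate (assumed parameter; 0.1 targets about 90% coverage).
    """
    # Nonconformity score: 1 minus the probability given to the true class.
    scores = sorted(1.0 - p[y] for p, y in zip(cal_probs, cal_labels))
    n = len(scores)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for this alpha: include every class.
        return float("inf")
    return scores[rank - 1]

def prediction_set(probs, threshold):
    """Return every class whose score (1 - probability) is within the threshold."""
    return [k for k, p in enumerate(probs) if 1.0 - p <= threshold]
```

The more confident and accurate the underlying probabilities are, the lower the calibrated threshold and the fewer classes each set needs to contain, which is why improving the model's predictions can shrink the sets.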
The researchers applied test-time augmentation (TTA) to improve conformal classification. TTA creates multiple versions of a single image through operations such as cropping, flipping, and zooming, runs the model on each version, and aggregates the model’s predictions. This approach reduced the size of prediction sets by 10 to 30 percent while preserving the guarantee that the correct diagnosis is included.
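A rough sketch of test-time augmentation follows. It assumes images are plain 2D lists of pixel values and uses a simple unweighted average; the helper names and the `toy_model` stand-in are hypothetical, and the researchers' method learns how to aggregate rather than averaging uniformly.

```python
def hflip(image):
    """Mirror a 2D image (a list of rows) left to right."""
    return [row[::-1] for row in image]

def center_crop(image, margin=1):
    """Drop `margin` pixels from every border of a 2D image."""
    return [row[margin:-margin] for row in image[margin:-margin]]

def tta_predict(model, image, augmentations):
    """Average class probabilities over the original image and its augmented views."""
    views = [image] + [aug(image) for aug in augmentations]
    all_probs = [model(view) for view in views]
    n_classes = len(all_probs[0])
    return [sum(p[k] for p in all_probs) / len(all_probs) for k in range(n_classes)]

def toy_model(image):
    """Hypothetical stand-in classifier: compares left-edge and right-edge brightness."""
    left = sum(row[0] for row in image)
    right = sum(row[-1] for row in image)
    return [left / (left + right), right / (left + right)]

# Usage: aggregate predictions for an image and its mirrored copy.
image = [[3, 0, 1], [3, 0, 1], [3, 0, 1]]
averaged = tta_predict(toy_model, image, [hflip])
```

Any function that maps an image to a list of class probabilities can be passed as `model`, so the underlying network is used as-is.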
“With fewer classes to consider, the sets of predictions are naturally more informative,” explains Divya Shanmugam PhD ’24, lead researcher on the project. “You’re not sacrificing accuracy for something more informative.”
The method works by:
- Holding out labeled image data for the conformal classification process
- Learning to aggregate augmentations on these held-out data
- Automatically augmenting images to maximize model accuracy
- Running conformal classification on TTA-transformed predictions
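The aggregation-learning step in the list above can be sketched as follows. This is a simplified illustration, not the published procedure: it assumes the model's probabilities for each augmented view are already computed, and it picks one weight per view from a hypothetical list of candidate weight vectors by maximizing accuracy on the held-out labeled data.

```python
def weighted_average(view_probs, weights):
    """Combine per-view class probabilities using one weight per view.

    Assumes the weights are non-negative and do not all equal zero.
    """
    n_classes = len(view_probs[0])
    total = sum(weights)
    return [sum(w * p[k] for w, p in zip(weights, view_probs)) / total
            for k in range(n_classes)]

def accuracy(examples, labels, weights):
    """Fraction of examples whose top aggregated class matches the label."""
    hits = 0
    for view_probs, label in zip(examples, labels):
        agg = weighted_average(view_probs, weights)
        predicted = max(range(len(agg)), key=agg.__getitem__)
        hits += int(predicted == label)
    return hits / len(labels)

def learn_weights(examples, labels, candidates):
    """Return the candidate weight vector with the best held-out accuracy."""
    return max(candidates, key=lambda w: accuracy(examples, labels, w))
```

Here `examples` holds, for each held-out image, one probability list per view (the original plus each augmentation). The aggregated probabilities produced with the chosen weights would then be passed to the conformal step in place of the model's raw predictions.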
This approach is simple to implement, effective in practice, and doesn’t require model retraining. Although using some of the held-out data to learn the aggregation leaves less labeled data for conformal classification, the accuracy boost from TTA outweighs this cost.
The researchers plan to validate their approach on models that classify text rather than images and to explore ways to reduce the computation that TTA requires. Their work, funded in part by the Wistrom Corporation, will be presented at the Conference on Computer Vision and Pattern Recognition.

Smaller prediction sets that still carry the same guarantee could streamline diagnosis: clinicians would have fewer candidate conditions to rule out, which could support faster decision-making and better patient care.