
Rethinking Supplemental Screening: Deep Learning Surpasses Breast Density in Breast Cancer Risk Prediction
Key Takeaways
- Federal density notification and state coverage expansions rely on a binary BI-RADS density designation that classifies 40% to 50% of screened women as having dense breasts, limiting its precision as a trigger for supplemental screening.
- Mirai derived 5-year absolute risk from four standard views without clinical inputs, stratified by NCCN thresholds, and outperformed density (AUROC 0.71 vs 0.53).
Deep learning model beats breast density for 5-year breast cancer risk, informing supplemental screening decisions and personalized care.
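A deep learning (DL) model applied to standard screening mammograms substantially outperformed radiologist-assessed breast density in predicting which women will develop breast cancer within 5 years, according to a retrospective cohort study published in JAMA Network Open.1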
Breast Density as a Cancer Risk Factor
Since September 2024, the FDA has required imaging facilities to notify patients whether their breasts are dense or nondense, given that dense breast tissue is both a recognized cancer risk factor and a masking agent that can obscure tumors on mammography. More than 30 states have tied expanded insurance coverage for supplemental screening to this binary designation. But breast density is subjectively assessed by radiologists, with documented variability between readers, and it applies broadly to 40% to 50% of screened women, raising questions about whether it is precise enough to serve as a policy trigger.
The study tested whether a DL mammography-based risk model called Mirai could outperform breast density in predicting 2 outcomes: a breast cancer diagnosis within 5 years and a false-negative (FN) screening result, defined as a mammogram read as normal that is followed within 1 year by a cancer diagnosis.
Mirai generates a 5-year absolute risk score from the 4 standard mammographic views, without requiring patient questionnaires or clinical data. Its architecture, training process, and source code have been previously published and validated across multiple international centers.2
Cohort Selection and Risk Classification Approach
This was a retrospective cohort study of consecutive bilateral screening mammograms performed between 2009 and 2018 across 5 sites of a large academic health system, with follow-up through December 2023 to ensure complete 5-year cancer ascertainment.1 After excluding mammograms used in Mirai's original training and validation sets as well as those with technical or data issues, the final cohort included 123,091 mammograms from 67,019 women (median age, 58).
Breast density was classified as dense or nondense using BI-RADS criteria, consistent with the FDA binary standard. DL risk scores were stratified as low (<1.7%), intermediate (1.7%-3.0%), or high (>3.0%) in alignment with National Comprehensive Cancer Network thresholds currently used to guide clinical management. Discriminatory performance was assessed using the area under the receiver operating characteristic curve (AUROC).
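The three-tier stratification described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's code: the function name `risk_tier` and the example scores are hypothetical, and the handling of scores exactly at 1.7% and 3.0% (both treated as intermediate) is an assumption read from the stated ranges.

```python
from collections import Counter

# Cutoffs in percent 5-year absolute risk, as stated in the article.
LOW_CUTOFF = 1.7
HIGH_CUTOFF = 3.0


def risk_tier(five_year_risk_pct: float) -> str:
    """Map a 5-year absolute risk (in percent) to low / intermediate / high.

    Assumption: scores of exactly 1.7 and 3.0 fall in the intermediate tier.
    """
    if five_year_risk_pct < LOW_CUTOFF:
        return "low"
    if five_year_risk_pct <= HIGH_CUTOFF:
        return "intermediate"
    return "high"


# Hypothetical scores for illustration only; not data from the study.
example_scores = [0.8, 1.7, 2.4, 3.0, 3.1, 6.5]
tiers = [risk_tier(score) for score in example_scores]
tier_counts = Counter(tiers)

print(tiers)
print(dict(tier_counts))
```

Keeping the cutoffs as named constants makes it easy to test alternative thresholds without touching the classification logic.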
Deep Learning Model Showed Higher Discriminatory Accuracy
The DL model demonstrated significantly higher accuracy than breast density in predicting 5-year cancer risk (AUROC, 0.71 [95% CI, 0.70-0.72] vs 0.53 [95% CI, 0.52-0.54] for density alone; P < .001). Importantly, adding breast density to the DL model did not improve performance (AUROC, 0.70 [95% CI, 0.69-0.71] with density vs 0.71 [95% CI, 0.70-0.72] without; P = .08), suggesting that density-related imaging information is already captured within the model's predictions.
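For readers less familiar with AUROC, it can be read as the probability that a randomly chosen exam followed by cancer receives a higher score than a randomly chosen exam that is not, with ties counted as one half. The sketch below computes it by direct pairwise comparison on toy data; the function name `auroc` and all values are hypothetical and are not drawn from the study. It also illustrates a general property: for a binary marker such as dense vs nondense, AUROC equals the average of sensitivity and specificity, so a marker that separates the groups only slightly sits close to 0.5.

```python
def auroc(scores, labels):
    """Pairwise AUROC: P(score of a positive > score of a negative), ties = 0.5."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data for illustration only (1 = cancer within 5 years, 0 = no cancer).
labels = [1, 1, 1, 0, 0, 0, 0, 0]
continuous_risk = [4.2, 2.9, 1.1, 3.5, 1.6, 0.9, 0.7, 0.4]  # model-style score
binary_marker = [1, 0, 1, 1, 0, 0, 1, 0]                    # 1 = dense, 0 = nondense

auc_continuous = auroc(continuous_risk, labels)
auc_binary = auroc(binary_marker, labels)

# For a binary marker, AUROC reduces to (sensitivity + specificity) / 2.
sensitivity = sum(m for m, y in zip(binary_marker, labels) if y == 1) / labels.count(1)
specificity = sum(1 - m for m, y in zip(binary_marker, labels) if y == 0) / labels.count(0)

print(f"continuous score AUROC: {auc_continuous:.3f}")
print(f"binary marker AUROC:    {auc_binary:.3f}")
print(f"(sens + spec) / 2:      {(sensitivity + specificity) / 2:.3f}")
```

The pairwise loop is quadratic in the number of exams, which is fine for a teaching example; a cohort-scale analysis would normally use a rank-based implementation from a statistics library.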
Cancer incidence increased with DL-estimated risk tiers: 1.0% in the low-risk group, 2.7% in the intermediate group, and 6.2% in the high-risk group (P < .001 for all comparisons). The difference associated with breast density was far more modest: 3.2% in women with dense breasts vs 2.6% in those with nondense breasts.
Women with nondense breasts who were classified as high risk by the DL model had a cancer incidence of 6.0%, which is nearly identical to the 6.4% seen in women with both dense breasts and high DL risk. By contrast, women with dense breasts but low DL risk had a cancer incidence of only 1.2%.
As the authors put it, these findings suggest that "DL risk models could offer a more precise and equitable alternative to breast density as a policy criterion for determining access to supplemental breast imaging."
The DL model was also able to stratify patients by their risk of an FN exam. FN rates were 0.6 per 1000 exams in the low-risk group, 1.0 per 1000 in the intermediate group, and 2.1 per 1000 in the high-risk group. The highest FN rate was observed in women with both dense breasts and high DL risk (3.0 per 1000), whereas the lowest was in women with nondense breasts and low DL risk (0.2 per 1000). Invasive cancers also made up a larger share of FN cancers in high-risk patients (91.5%) than in low-risk patients (76.9%). For predicting FN results, however, the DL model did not show significantly greater discriminatory accuracy than breast density.
The Future of AI-Guided Supplemental Imaging Approaches
The randomized ScreenTrustMRI trial published in 2024 provided a prospective look at AI-guided supplemental MRI selection. In that trial, an AI mammography tool was nearly 4 times more efficient than density-based selection in identifying cancers per 1000 MRI examinations (64 vs 16.5 cancers), with most additional cancers detected being invasive.3 The present study extends that evidence by directly comparing DL risk with breast density in a large contemporary cohort, specifically in the context of current FDA density notification policy.
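The "nearly 4 times" figure follows directly from the two yields quoted above. The short calculation below reproduces it and converts each yield into the approximate number of MRI examinations per cancer detected; the variable names are illustrative, and the exams-per-cancer figures are derived here rather than reported in the article.

```python
# Cancers detected per 1000 supplemental MRI examinations, as quoted in the article.
AI_SELECTED_YIELD = 64.0
DENSITY_SELECTED_YIELD = 16.5

# Relative efficiency of AI-based selection over density-based selection.
yield_ratio = AI_SELECTED_YIELD / DENSITY_SELECTED_YIELD

# Derived for illustration: MRI examinations needed per cancer detected.
mri_per_cancer_ai = 1000 / AI_SELECTED_YIELD
mri_per_cancer_density = 1000 / DENSITY_SELECTED_YIELD

print(f"yield ratio: {yield_ratio:.2f}")
print(f"MRIs per cancer (AI-selected): {mri_per_cancer_ai:.1f}")
print(f"MRIs per cancer (density-selected): {mri_per_cancer_density:.1f}")
```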
The practical implications are substantial. Under the current framework, 41.4% of the cohort (50,974 exams) would be referred for supplemental imaging based on density alone. Under DL-based stratification, only 22.7% (27,906 exams) would be flagged as high risk, yet that smaller group would capture a higher proportion of cancers (49.1% vs 46.0%). This suggests that a DL-guided approach could reduce unnecessary supplemental imaging while maintaining or improving cancer detection.
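The referral arithmetic can be checked with a short calculation using only the counts and percentages quoted above. The "enrichment" figure (share of cancers captured divided by share of exams flagged) is a derived, illustrative metric that the study does not report, and the variable names are hypothetical.

```python
# Figures quoted in the article.
TOTAL_EXAMS = 123_091
DENSITY_FLAGGED = 50_974         # exams with dense breasts
DL_HIGH_RISK_FLAGGED = 27_906    # exams classified high risk by the DL model
DENSITY_CANCER_CAPTURE = 0.460   # share of cancers in the dense group
DL_CANCER_CAPTURE = 0.491        # share of cancers in the DL high-risk group

density_share = DENSITY_FLAGGED / TOTAL_EXAMS
dl_share = DL_HIGH_RISK_FLAGGED / TOTAL_EXAMS

# How many fewer exams would be flagged under the DL criterion.
exams_avoided = DENSITY_FLAGGED - DL_HIGH_RISK_FLAGGED
relative_reduction = exams_avoided / DENSITY_FLAGGED

# Illustrative metric (not reported by the study): values above 1 mean the
# flagged group holds more cancers than its size alone would predict.
density_enrichment = DENSITY_CANCER_CAPTURE / density_share
dl_enrichment = DL_CANCER_CAPTURE / dl_share

print(f"flagged by density: {density_share:.1%}")
print(f"flagged by DL high risk: {dl_share:.1%}")
print(f"fewer exams flagged: {exams_avoided} ({relative_reduction:.1%} reduction)")
print(f"enrichment, density: {density_enrichment:.2f}")
print(f"enrichment, DL high risk: {dl_enrichment:.2f}")
```

On these numbers, the DL high-risk group is a little over half the size of the dense-breast group while containing a slightly larger share of the cancers.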
Future work will need to address prospective implementation, cost-effectiveness analysis, clinical workflow integration, and reimbursement policy development. The authors also note that Mirai was developed on 2D full-field digital mammography and does not incorporate digital breast tomosynthesis, which may contain additional diagnostic information.
References
1. Lamb LR, Mercaldo SF, Carney A, Lehman CD. A deep learning breast cancer risk model for precise supplemental screening. JAMA Netw Open. 2026;9(5):e2610559. doi:10.1001/jamanetworkopen.2026.10559
2. Yala A, Mikhael PG, Strand F, et al. Toward robust mammography-based models for breast cancer risk. Sci Transl Med. 2021;13(578):eaba4373. doi:10.1126/scitranslmed.aba4373
3. Salim M, Liu Y, Sorkhei M, et al. AI-based selection of individuals for supplemental MRI in population-based breast cancer screening: the randomized ScreenTrustMRI trial. Nat Med. 2024;30(9):2623-2630. doi:10.1038/s41591-024-03093-5




