The American Journal of Managed Care
This study compared 6 algorithmic fairness–improving approaches for low-birth-weight predictive models and found that although they improved accuracy, most decreased sensitivity for Black populations.
ABSTRACT
Objective: Evaluating whether common algorithmic fairness–improving approaches can improve low-birth-weight predictive model performance can provide important implications for population health management and health equity. This study aimed to evaluate alternative approaches for improving algorithmic fairness for low-birth-weight predictive models.
Study Design: Retrospective, cross-sectional study of birth certificates linked with medical insurance claims.
Methods: Birth certificates (n = 191,943; 2014-2022) were linked with insurance claims (2013-2021) from the Arkansas All-Payer Claims Database to assess alternative approaches for algorithmic fairness in predictive models for low birth weight (< 2500 g). We fit an original model and compared 6 fairness-improving approaches using elastic net models trained and tested with 70/30 balanced random split samples and 10-fold cross validation.
Results: The original model had lower accuracy (percent predicted correctly) in predicting low birth weight among Black, Native Hawaiian/Other Pacific Islander, Asian, and unknown racial/ethnic populations relative to White individuals. For Black individuals, accuracy increased with all 6 fairness-improving approaches relative to the original model; however, sensitivity (true-positives correctly predicted as low birth weight) significantly declined, as much as 31% (from 0.824 to 0.565), in 5 of 6 approaches.
Conclusions: When developing and implementing decision-making algorithms, it is critical that model performance metrics align with management goals for the predictive tool. In our study, fairness-improving models improved accuracy and area under the curve scores for Black individuals but decreased sensitivity and negative predictive value, suggesting that the approaches did not meaningfully improve on the original, unfair model. Implementation of unfair models for allocating preventive services could perpetuate racial/ethnic inequities by failing to identify individuals most at risk for a low-birth-weight delivery.
Am J Manag Care. 2025;31(5):In Press
Takeaway Points
This study compared algorithmic fairness–improving approaches for low-birth-weight predictive models relative to an original model.
There has been substantial growth in attention to bias in clinical algorithms and health care decision-making tools and how these biases may perpetuate existing disparities in adverse clinical and health care outcomes.1,2 In 2022, HHS changed its interpretation of Section 1557 of the Affordable Care Act to ban discrimination in the use or application of clinical algorithms.3 This new interpretation prohibits discrimination in health care decision-making algorithms across the health care continuum—from clinical treatment decisions to resource allocation and care management by health plans.
Utilizing an algorithmically fair approach in predictive model development may be particularly critical for outcomes with different underlying rates for different groups of individuals. One of the most well-noted racial/ethnic health inequities is low birth weight, with non-Hispanic Black infants having a rate of low birth weight more than twice that of non-Hispanic White infants.4 In addition to the large differences in underlying rates of low birth weight among infants of different races and ethnicities, there are well-documented disparities in underlying rates of risk factors for low birth weight (eg, hypertension).5 Because non-Hispanic White individuals account for the majority of births in the US,4 predictive models for low birth weight can be biased or unfair for individuals of minority race/ethnicity, as the models may work best for the largest group within the sample (ie, White individuals) if the models are fit using classic predictive modeling approaches.6 The large differences in rates of low birth weight and underlying risk factors for low birth weight by race/ethnicity highlight the need for carefully assessing approaches to improve algorithmic fairness in predictive models for low birth weight; however, evaluations of low-birth-weight algorithmic fairness are lacking.3
Health plans and clinicians have the opportunity to provide valuable outreach to reduce the risk of low birth weight, such as access to community health workers or group-based prenatal care,7-9 but provision of such resources is often guided by predictive models that identify high-risk populations.2 As such, building accurate and fair predictive models for low birth weight is critical for population health management because poor algorithmic fairness may unfairly allocate resources for preventing adverse perinatal outcomes by failing to identify individuals who most need intervention.
Methods for improving algorithmic fairness range from relatively simple techniques, such as removing race/ethnicity from the predictive models, to more advanced techniques that adjust the algorithm’s objective function (ie, the metric that the model aims to optimize).3,10 The goal of this study was to assess algorithmic fairness approaches that may be easily utilized without an extensive advanced analytic background yet provide information that may be relevant in clinical settings or to population health management programs within health plans.
METHODS
Data
This study utilized data from the Arkansas All-Payer Claims Database (APCD) from 2013 to 2022. The APCD includes birth certificates for all births in the state, which can be linked with medical claims data for commercial, Marketplace, state employee, and Medicaid health plans in Arkansas. We additionally included area-level information from the Agency for Healthcare Research and Quality Social Determinants of Health Database, Area Health Resources Files, and County Health Rankings data.
Variables
The primary outcome in this study was low birth weight (< 2500 g) from the birth certificate, and the primary independent variable of interest was self-reported race/ethnicity from the birth certificate. We included a total of 117 covariates (161 individual features), including demographic, pregnancy, clinical, and area-level information (eAppendix Table [available at ajmc.com]). Of these, 7 variables were excluded due to multicollinearity (variance inflation factor ≥ 10).
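To make the multicollinearity screen concrete, the following is a minimal sketch in R (the language used for the study's analyses); the data frame `dat`, the outcome `lbw`, and the use of `car::vif()` are illustrative assumptions, with only the variance inflation factor cutoff of 10 taken from the text.

```r
# Minimal sketch of the multicollinearity screen; `dat` and all variable
# names are hypothetical. Assumes numeric predictors (and lbw coded 0/1)
# so that car::vif() returns a named vector.
library(car)

# Fit a linear model purely to obtain variance inflation factors (VIFs).
vif_fit <- lm(lbw ~ ., data = dat)
vifs <- vif(vif_fit)

# Exclude covariates whose VIF meets or exceeds the study's cutoff of 10.
high_vif <- names(vifs)[vifs >= 10]
dat <- dat[, !(names(dat) %in% high_vif)]
```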
Pregnancy-related information was limited to variables that could be captured prior to pregnancy (eg, number of previous births) or early during pregnancy (eg, plurality). Clinical variables included conditions associated with adverse perinatal outcomes11-15 using birth certificate information (eg, prepregnancy diabetes) and insurance claims data from the 9 months of pregnancy through the date of delivery. Finally, we included area-level variables such as sociodemographic information (eg, Gini Index) and health care access variables (eg, primary care physicians per 1000 population).
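As one illustration of the claims-based condition flags, the sketch below marks births with a hypertension-related diagnosis between the start of pregnancy and delivery; the claims layout, column names, and diagnosis codes are hypothetical assumptions, not the study's actual code lists.

```r
# Hypothetical sketch: flag a clinical condition from insurance claims
# during the pregnancy window. Assumes `claims` already carries each
# birth's pregnancy_start and delivery_date.
library(dplyr)

htn_flags <- claims %>%
  filter(service_date >= pregnancy_start,
         service_date <= delivery_date,
         grepl("^(I10|O10|O11|O13|O14)", dx_code)) %>%  # illustrative codes
  distinct(birth_id) %>%
  mutate(hypertension = 1L)

births <- births %>%
  left_join(htn_flags, by = "birth_id") %>%
  mutate(hypertension = coalesce(hypertension, 0L))  # 0 if no qualifying claim
```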
Sample
We identified 285,975 birth records from 2014 to 2022 and kept the first birth record for multiple births (n = 4527 excluded; 1.6%). We then excluded 19 duplicate records (0.007%) and subsequently excluded births to mothers who did not have at least 1 month with Medicaid or private insurance coverage in the APCD (n = 83,909; 29.8%) or did not have at least 1 insurance claim (n = 2857; 1.5%) during the pregnancy period. Next, we excluded births without an Arkansas zip code and county of residence (n = 1760; 0.9%) or with missing information on study variables (n = 960; 0.5%). For variables with missing information in at least 1.0% of records, a “missing” category was created rather than excluding the affected births. The final sample included birth certificate information and insurance claims (2013-2022; number of claims = 11,179,626) for 191,943 births.
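The exclusion cascade above can be expressed as a simple filtering pipeline; the sketch below is a hypothetical rendering in which the data frame and indicator columns stand in for fields of the linked APCD records.

```r
# Hypothetical sketch of the cohort exclusions; all column names are
# stand-ins for fields in the linked birth certificate/claims data.
library(dplyr)

cohort <- births %>%
  group_by(delivery_id) %>%                       # one record per delivery
  slice_min(birth_order, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  distinct() %>%                                  # drop duplicate records
  filter(months_enrolled >= 1,                    # >= 1 month of coverage
         n_claims_pregnancy >= 1,                 # >= 1 claim in pregnancy
         !is.na(ar_zip) & !is.na(ar_county))      # Arkansas residence
```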
Statistical Analysis
Predictive models for low birth weight were fitted using elastic net classification, which is a hybrid approach combining least absolute shrinkage and selection operator (LASSO) and ridge penalty terms for regularization. This modeling approach is well suited to control for model overfitting and may allow the model to better generalize to other data. For each algorithm, the training model included 3 repeats of 10-fold cross-validation with a grid search to tune parameters, implemented using the caret package in R 4.2.0 (R Foundation for Statistical Computing). We trained models using a 70/30 random split and balanced the training set using oversampling. Model features were preprocessed using centering and scaling. The optimal model was selected based on the highest area under the receiver operating characteristic curve (AUROC).
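This pipeline maps directly onto the caret interface. The sketch below shows one plausible implementation; the data objects, random seed, and tuning grid values are illustrative assumptions rather than the study's actual settings.

```r
# Sketch of the elastic net training pipeline with caret; `dat` and `lbw`
# are hypothetical. lbw is a factor whose levels are valid R names
# (eg, "yes"/"no"), as required when classProbs = TRUE.
library(caret)

set.seed(42)  # illustrative seed

# 70/30 random split, stratified on the outcome.
idx       <- createDataPartition(dat$lbw, p = 0.70, list = FALSE)
train_dat <- dat[idx, ]
test_dat  <- dat[-idx, ]

# Balance the training set by oversampling the minority class.
train_bal <- upSample(x = train_dat[, setdiff(names(train_dat), "lbw")],
                      y = train_dat$lbw, yname = "lbw")

# 3 repeats of 10-fold cross-validation, selecting on AUROC ("ROC").
ctrl <- trainControl(method = "repeatedcv", number = 10, repeats = 3,
                     classProbs = TRUE, summaryFunction = twoClassSummary)

fit <- train(lbw ~ ., data = train_bal,
             method     = "glmnet",                # elastic net (LASSO + ridge)
             metric     = "ROC",
             preProcess = c("center", "scale"),    # standardize features
             tuneGrid   = expand.grid(alpha  = seq(0, 1, by = 0.1),
                                      lambda = 10^seq(-4, 0, length.out = 20)),
             trControl  = ctrl)
```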
To test for fairness, we first assessed 3 measures of overall performance: AUROC scores (ie, ability to discriminate between individuals with and without low birth weight), F1 scores (ie, harmonic mean of the positive predictive value and sensitivity), and accuracy (ie, percent predicted correctly). Next, we assessed 2 metrics that most align with the theoretical goal of providing a resource to individuals at risk of low birth weight, which likely has a low cost of false-positives and higher cost of false-negatives: sensitivity (ie, percent of low-birth-weight infants correctly predicted as low birth weight) and negative predictive value (NPV) (ie, percent not low birth weight among those predicted not low birth weight). Comparisons were made based on 95% CIs for each performance metric between individuals of different races/ethnicities.
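For concreteness, the sketch below computes these metrics within each racial/ethnic group of the test sample; the text does not specify how the 95% CIs were derived, so the exact binomial interval shown for sensitivity is an illustrative choice, and the object names continue the earlier hypothetical sketch.

```r
# Predicted classes from the fitted model (hypothetical continuation).
test_dat$lbw_pred <- predict(fit, newdata = test_dat)

# Per-group performance metrics from a confusion matrix; `truth` and
# `pred` are factors with levels c("no", "yes"), "yes" = low birth weight.
metrics_by_group <- function(truth, pred) {
  tp <- sum(pred == "yes" & truth == "yes")
  fp <- sum(pred == "yes" & truth == "no")
  tn <- sum(pred == "no"  & truth == "no")
  fn <- sum(pred == "no"  & truth == "yes")

  sens <- tp / (tp + fn)                  # sensitivity
  ppv  <- tp / (tp + fp)                  # positive predictive value
  npv  <- tn / (tn + fn)                  # negative predictive value
  acc  <- (tp + tn) / length(truth)       # accuracy
  f1   <- 2 * ppv * sens / (ppv + sens)   # harmonic mean of PPV and sensitivity

  sens_ci <- binom.test(tp, tp + fn)$conf.int  # illustrative 95% CI
  c(sensitivity = sens, npv = npv, accuracy = acc, f1 = f1,
    sens_lo = sens_ci[1], sens_hi = sens_ci[2])
}

# Apply separately within each race/ethnicity in the test sample.
by(test_dat, test_dat$race_ethnicity,
   function(d) metrics_by_group(d$lbw, d$lbw_pred))
```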
To test methods to improve algorithmic performance, models were refit to include only races/ethnicities with at least 3080 births in the training sample prior to upsampling. This number was derived by multiplying a common rule of thumb for the number of records needed per predictive feature (20)16 and the number of individual features (154). In addition to fitting the initial model (original), we used 6 alternative approaches: (1) fitting separate models by race/ethnicity (race stratified), (2) removing all maternal and paternal race/ethnicity and origin information (racially blind), (3) upsampling such that each race/ethnicity had the same number of births (equal representation), and creating separate thresholds for positive prediction by race/ethnicity for the (4) original model (original-ST), (5) racially blind model (racially blind–ST), and (6) equal representation model (equal representation–ST). All thresholds for positive prediction were selected by maximizing Youden’s J statistic in the training sample. AUROC was not calculated separately for the approaches that calculated new risk thresholds separately by race/ethnicity because AUROC is not impacted by separate risk thresholds. Biserial Pearson correlations and φ correlations were calculated to ensure that no covariate inadvertently served as a proxy for binary indicators of each race/ethnicity (all ≤ 0.60).
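The race/ethnicity-specific thresholds can be illustrated with the pROC package, one common implementation of Youden's J threshold selection; the objects below (including the fitted caret model `fit` and all column names) are hypothetical continuations of the earlier sketch.

```r
# Hypothetical sketch: select a positive-prediction threshold per
# race/ethnicity by maximizing Youden's J in the training sample.
library(pROC)

train_dat$p_lbw <- predict(fit, newdata = train_dat, type = "prob")[, "yes"]

thresholds <- sapply(split(train_dat, train_dat$race_ethnicity), function(d) {
  r <- roc(response = d$lbw, predictor = d$p_lbw, levels = c("no", "yes"))
  # "best" with Youden's J maximizes sensitivity + specificity - 1.
  coords(r, "best", best.method = "youden", ret = "threshold")$threshold[1]
})

# Classify the test sample using each group's own threshold.
test_dat$p_lbw    <- predict(fit, newdata = test_dat, type = "prob")[, "yes"]
test_dat$lbw_pred <- ifelse(
  test_dat$p_lbw >= thresholds[as.character(test_dat$race_ethnicity)],
  "yes", "no")
```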
This study was determined exempt from review by the University of Arkansas for Medical Sciences Institutional Review Board.
RESULTS
We identified 122,372 non-Hispanic White (White) births, 44,272 non-Hispanic Black (Black) births, 16,121 Hispanic births, 3130 non-Hispanic Asian (Asian) births, 2825 non-Hispanic Native Hawaiian/Other Pacific Islander (NHOPI) births, 2153 births of unknown race/ethnicity, and 1070 non-Hispanic American Indian/Alaska Native (AIAN) births. The overall rate of low birth weight in the testing sample was 9.0%, with large differences (χ2 test P < .001) between Black (14.3%), unknown race/ethnicity (12.7%), AIAN (9.7%), Asian (9.3%), NHOPI (9.2%), White (7.3%), and Hispanic (7.1%) births.
In our original model that included all races/ethnicities (Table 1), we found lower model accuracy for Black (0.530), AIAN (0.680), NHOPI (0.720), Asian (0.708), and unknown race/ethnicity (0.636) individuals relative to White individuals (0.783) and higher model accuracy for Hispanic individuals (0.806), as indicated by nonoverlapping CIs. The AUROC and NPV were lower for Black individuals (0.733 and 0.940, respectively) vs White individuals (0.775 and 0.961); however, the F1 score and sensitivity were higher for Black individuals (0.332 and 0.816, respectively) than White individuals (0.280 and 0.582). Sensitivity was lower for Hispanic individuals (0.491) vs White individuals (0.582).
Table 2 provides overall measures of model performance and calibration for the original model and the 6 approaches aimed at improving fairness in model performance. For each model, White individuals had a higher AUROC relative to Black individuals and statistically the same AUROCs compared with Hispanic individuals; however, the AUROC scores for any given race/ethnicity did not statistically differ across the alternative approaches.
Accuracy (Table 2) was lower for Black individuals than White individuals in all but 2 models (original-ST and equal representation–ST). For Black individuals, accuracy significantly increased in all 6 fairness-improving approaches vs the original model (0.508), with improvements of more than 40% for the original-ST (0.735) and equal representation–ST (0.730) models and improvements of 38% and 34% for the racially blind–ST (0.703) and race-stratified (0.683) models, respectively. For Hispanic individuals, accuracy statistically significantly declined in 4 approaches (all but racially blind and equal representation).
With respect to F1 scores, all 6 of the fairness-improving approaches resulted in a larger relative increase in the F1 score for Black individuals compared with White individuals. When comparing the relative difference in F1 scores between Black and White individuals, the largest difference was in the original-ST model, where Black individuals had a 43% higher F1 score compared with White individuals (0.374 vs 0.262). Three approaches statistically significantly decreased F1 scores for Hispanic individuals relative to the original model (race stratified, racially blind–ST, and equal representation–ST).
Table 3 provides 2 common metrics for assessing model performance (sensitivity and NPV) when predicting an outcome with a high cost for false-negatives and a low cost for false-positives. Sensitivity statistically significantly declined for 5 of 6 fairness-improving approaches for Black individuals (all but equal representation) and statistically significantly increased for 1 approach for Hispanic individuals (racially blind–ST). The original model had a 35% higher sensitivity for Black individuals relative to White individuals (0.824 vs 0.610); however, in 3 of the 6 fairness-improving approaches, the respective models had a lower sensitivity for Black individuals than White individuals. The sensitivity among Black individuals declined by as much as 31% (original: 0.824 vs original-ST: 0.565).
With respect to NPV (Table 3), all models had a lower NPV for Black individuals relative to White individuals. In all but 2 of the approaches (racially blind and equal representation), NPV was statistically significantly lower for Black individuals relative to the original model. The NPV did not statistically change for Hispanic individuals in any of the fairness-improving approaches.
DISCUSSION
This study used insurance claims linked with birth certificates to evaluate whether 6 fairness-improving approaches increased fairness in low-birth-weight predictive model performance vs the original model. We found lower model accuracy for Black, AIAN, NHOPI, Asian, and unknown race/ethnicity individuals relative to White individuals and higher model accuracy for Hispanic individuals in the original model.
After applying 6 common fairness-improving approaches, we found that none of the approaches improved AUROC for any race/ethnicity. Of significance for health equity, 5 of the 6 fairness-improving approaches reduced sensitivity for Black individuals, and 4 of the 6 approaches reduced NPV for Black individuals. In this context, it is critical to ensure identification of all at-risk individuals (sensitivity) and to have confidence that a negative prediction (“not low birth weight”) accurately identifies those not at risk of having an infant with low birth weight (NPV). As such, our findings of reduced sensitivity and NPV for Black populations in the “fairness-improving approaches” suggest that these approaches are likely not fairness-improving in this context and that focusing on other metrics alone, such as accuracy (which increased for Black individuals in each of the fairness-improving approaches), may inadvertently increase health inequities.
In this study, we used 6 relatively simple approaches to test for improvements in model performance by race/ethnicity, including stratifying thresholds and models by race/ethnicity, removing race/ethnicity from the model, and upsampling on race/ethnicity to have equal sample sizes.3,17 These approaches are used to mitigate issues related to the potential for differential risk factors based on race/ethnicity, which allows underlying risk factors to contribute to risk scores rather than using race/ethnicity as a proxy. These approaches also ensure that the majority racial/ethnic group does not outweigh the minority groups during model development. Other more advanced approaches are available, such as modifying the objective function during model processing or adding a specific regularization term to make the sensitive feature (ie, race/ethnicity) less important in the model.3,10 We selected algorithmic fairness methods that may be more likely to be understood across audiences to facilitate transparency, following algorithmic fairness best practices.18
In terms of overall metrics of model performance, including AUROC and accuracy, we found consistently worse model performance for Black individuals compared with White individuals. Hispanic individuals did not have statistically different AUROCs relative to White individuals for any model and had higher accuracy in 3 models. There are multiple reasons for the lower model performance for Black individuals, including lower recognition and documentation of clinical conditions used as model features and inability to include key drivers of adverse infant outcomes among Black individuals, such as discrimination, stigma, and lived experience.2,19
When using only the overall measures of performance (F1 score and accuracy) to guide selection of a model to improve fairness, the approach that most significantly improved model performance for Black individuals was the original model with separate thresholds. With this approach, accuracy increased 45% for Black individuals. However, it is critical to recognize that sensitivity declined by more than 30%, indicating that the original algorithm was better able to correctly predict “yes” among Black infants who actually experienced low birth weight. The highest F1 score and accuracy for Hispanic individuals were seen with the equal representation model; however, this model did not have statistically different sensitivity relative to the original model.
In the models that do not use separate thresholds for each race/ethnicity (ie, original, racially blind, and equal representation), individuals with a given predicted risk of low birth weight were all predicted as “yes” or “no” for low birth weight regardless of their race/ethnicity, depending on whether the predicted risk score was above or below the defined threshold. In the approaches that identified different thresholds by race/ethnicity, Black individuals had higher thresholds than White or Hispanic individuals across each of the approaches. That is, a Black individual needed a higher predicted risk to be predicted as “yes” than a White or Hispanic individual. These higher thresholds reduced the percentage of truly positive Black individuals being predicted positive (ie, sensitivity) from 82% in the original model to 57% to 64% in the approaches with separate thresholds by race, ultimately reducing the number of Black individuals who are truly positive who would receive the given resource because they would be falsely predicted as “no.”
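A toy calculation makes this mechanism concrete: holding the distribution of predicted risk among truly positive individuals fixed, a higher positive-prediction threshold mechanically lowers sensitivity. All numbers below are invented for illustration.

```r
# Toy illustration: raising the threshold lowers sensitivity.
set.seed(1)
risk_pos <- rbeta(1000, 4, 6)   # invented risk scores for true positives

sens_at <- function(t) mean(risk_pos >= t)
sens_at(0.30)   # lower shared threshold: most true positives flagged
sens_at(0.45)   # higher group-specific threshold: sensitivity falls
```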
Low birth weight is more common among Black women than White women, which may ultimately drive our overall conclusions that incorporation of race/ethnicity when establishing thresholds will likely lead to reduced provision of resources to Black individuals. The suggested approach for a given model will depend on the population studied and the distribution of the outcome. Given that many individuals are fearful of providing racial/ethnic data and given concerns that race/ethnicity may be used to inflict harm (eg, discrimination in the clinical setting),20 collection of racial/ethnic information and use of these data in predictive models should be approached with considerable caution. All predictive model development that includes racial/ethnic information should be grounded in algorithmic fairness.
Limitations
Linking data from birth certificates with insurance claims provides relatively robust information that would likely be available to clinical and population health programs; however, these data lack important information that may be predictive of low birth weight, such as laboratory tests or imaging information. One previous evaluation of the performance of low-birth-weight predictive models among infants in Japan reported an AUROC of 0.95 among term infants (≥ 37 weeks’ gestation); however, that study included thousands of variables, including maternal and paternal genetic information, fetal ultrasonography data, and laboratory test data,21 highlighting the importance of linking claims data with electronic health records for population health management decision-making. To our knowledge, previous evaluations of algorithmic fairness in predictive models for infant outcomes are largely lacking.3 However, our predictive performance aligns well with previous evaluations (AUROCs of 0.72 and 0.81)22,23 that utilized data from prior to pregnancy or early in pregnancy that may be available to population health programs.
An additional potential limitation of our study is that individuals of minority race/ethnicity may have reduced health care service use and claims-based identification of clinical conditions,24,25 and some variables on the birth certificate may have low reliability and validity.26,27 We included insurance claims from the full pregnancy period to improve detection of clinical conditions, and we note that previous studies have found that low birth weight on the birth certificate has high reliability and validity.26,27 Finally, our data are from a single state, which may limit generalizability of the findings. Because racial/ethnic differences in low birth weight persist across regions and in nationally representative data, our findings will likely be relevant for policy at all levels of decision-making.
CONCLUSIONS
There has been considerable growth in literature evaluating fairness in predictive models across a wide range of clinical and health care outcomes. We add to this important literature by highlighting that many of the approaches typically used to improve model fairness may lead to fewer prenatal care resources for non-White populations, particularly Black individuals. In this study, use of sensitivity and NPV, which prioritize the goal of maximizing true-positives and minimizing false-negatives, resulted in substantially different conclusions compared with overall measures of model performance, such as accuracy. Thus, one of the most important conclusions of this study is that careful consideration of model performance metrics that align with the purpose of the predictive tool is not only critical for accurate predictions overall but also an important consideration for ensuring equitable models and appropriate resource allocation for individuals of different races/ethnicities.
Author Affiliations: University of Arkansas for Medical Sciences (CCB, HG-A, BCA, JMT, KB-M, MT), Little Rock, AR.
Source of Funding: Dr Brown is supported by the National Institute on Minority Health and Health Disparities of the National Institutes of Health (NIH; 1K01MD018072). Access to data for this project was provided by the Arkansas Insurance Department/Arkansas Biosciences Institute/Arkansas Center for Health Improvement All-Payer Claims Database Cooperative Agreement. Dr Tilford reports receiving support from the National Institute of Mental Health (R01MH133857) of the NIH, the National Center for Advancing Translational Sciences of the NIH (U54TR001629), and the National Institute of Diabetes and Digestive and Kidney Diseases of the NIH (1R01DK125641). The content of this paper is solely the responsibility of the authors and does not necessarily represent the official views of the funders. Funders had no role in the design, analysis, or writing of this article.
Author Disclosures: Dr Tilford reports receiving copyright income from TrestleTree Inc and personal fees from Merck. The remaining authors report no relationship or financial interest with any entity that would pose a conflict of interest with the subject matter of this article.
Authorship Information: Concept and design (CCB, HG-A, BCA, JMT, KB-M, MT); analysis and interpretation of data (CCB); drafting of the manuscript (CCB); critical revision of the manuscript for important intellectual content (HG-A, BCA, JMT, KB-M, MT); statistical analysis (CCB); administrative, technical, or logistic support (HG-A, JMT, MT); and supervision (HG-A, BCA, JMT, KB-M, MT).
Address Correspondence to: Clare C. Brown, PhD, MPH, University of Arkansas for Medical Sciences, 4301 W Markham St, Slot #820-12, Little Rock, AR 72205. Email: cbrown3@uams.edu.
REFERENCES
1. Gianfrancesco MA, Tamang S, Yazdany J, Schmajuk G. Potential biases in machine learning algorithms using electronic health record data. JAMA Intern Med. 2018;178(11):1544-1547. doi:10.1001/jamainternmed.2018.3763
2. Gervasi SS, Chen IY, Smith-McLallen A, et al. The potential for bias in machine learning and opportunities for health insurers to address it. Health Aff (Millwood). 2022;41(2):212-218. doi:10.1377/hlthaff.2021.01287
3. Cary MP Jr, Zink A, Wei S, et al. Mitigating racial and ethnic bias and advancing health equity in clinical algorithms: a scoping review. Health Aff (Millwood). 2023;42(10):1359-1368. doi:10.1377/hlthaff.2023.00553
4. Osterman MJK, Hamilton BE, Martin JA, Driscoll AK, Valenzuela CP. Births: final data for 2021. Natl Vital Stat Rep. 2023;72(1):1-53.
5. Robbins C, Boulet SL, Morgan I, et al. Disparities in preconception health indicators — Behavioral Risk Factor Surveillance System, 2013-2015, and Pregnancy Risk Assessment Monitoring System, 2013-2014. MMWR Surveill Summ. 2018;67(1):1-16. doi:10.15585/mmwr.ss6701a1
6. Paulus JK, Kent DM. Predictably unequal: understanding and addressing concerns that algorithmic clinical prediction may increase health disparities. NPJ Digit Med. 2020;3:99. doi:10.1038/s41746-020-0304-9
7. Redding S, Conrey E, Porter K, Paulson J, Hughes K, Redding M. Pathways community care coordination in low birth weight prevention. Matern Child Health J. 2015;19(3):643-650. doi:10.1007/s10995-014-1554-4
8. Abshire C, Mcdowell M, Crockett AH, Fleischer NL. The impact of CenteringPregnancy group prenatal care on birth outcomes in Medicaid eligible women. J Womens Health (Larchmt). 2019;28(7):919-928. doi:10.1089/jwh.2018.7469
9. Crockett AH, Chen L, Heberlein EC, et al. Group vs traditional prenatal care for improving racial equity in preterm birth and low birthweight: the Centering and Racial Disparities randomized clinical trial study. Am J Obstet Gynecol. 2022;227(6):893.e1-893.e15. doi:10.1016/j.ajog.2022.06.066
10. Kamishima T, Akaho S, Asoh H, Sakuma J. Fairness-aware classifier with prejudice remover regularizer. In: Flach PA, de Bie T, Cristianini N, eds. Machine Learning and Knowledge Discovery in Databases. Springer; 2012:35-50.
11. Elixhauser A, Steiner C, Harris DR, Coffey RM. Comorbidity measures for use with administrative data. Med Care. 1998;36(1):8-27. doi:10.1097/00005650-199801000-00004
12. Rezaeiahari M, Brown CC, Ali MM, Datta J, Tilford JM. Understanding racial disparities in severe maternal morbidity using Bayesian network analysis. PLoS One. 2021;16(10):e0259258. doi:10.1371/journal.pone.0259258
13. Mhyre JM, Bateman BT, Leffert LR. Influence of patient comorbidities on the risk of near-miss maternal morbidity or mortality. Anesthesiology. 2011;115(5):963-972. doi:10.1097/ALN.0b013e318233042d
14. Leonard SA, Kennedy CJ, Carmichael SL, Lyell DJ, Main EK. An expanded obstetric comorbidity scoring system for predicting severe maternal morbidity. Obstet Gynecol. 2020;136(3):440-449. doi:10.1097/AOG.0000000000004022
15. Reid LD, Creanga AA. Severe maternal morbidity and related hospital quality measures in Maryland. J Perinatol. 2018;38(8):997-1008. doi:10.1038/s41372-018-0096-9
16. Siddiqui K. Heuristics for sample size determination in multivariate statistical techniques. World Appl Sci J. 2013;27(2):285-287. Accessed August 1, 2024. https://www.idosi.org/wasj/wasj27(2)13/20.pdf
17. Huang J, Galal G, Etemadi M, Vaidyanathan M. Evaluation and mitigation of racial bias in clinical machine learning models: scoping review. JMIR Med Inform. 2022;10(5):e36388. doi:10.2196/36388
18. Chin MH, Afsar-Manesh N, Bierman AS, et al. Guiding principles to address the impact of algorithm bias on racial and ethnic disparities in health and health care. JAMA Netw Open. 2023;6(12):e2345050. doi:10.1001/jamanetworkopen.2023.45050
19. Alhusen JL, Bower KM, Epstein E, Sharps P. Racial discrimination and adverse birth outcomes: an integrative review. J Midwifery Womens Health. 2016;61(6):707-720. doi:10.1111/jmwh.12490
20. Baker DW, Hasnain-Wynia R, Kandula NR, Thompson JA, Brown ER. Attitudes toward health care providers, collecting information about patients’ race, ethnicity, and language. Med Care. 2007;45(11):1034-1042. doi:10.1097/MLR.0b013e318127148f
21. Mizuno S, Nagaie S, Tamiya G, et al. Establishment of the early prediction models of low-birth-weight reveals influential genetic and environmental factors: a prospective cohort study. BMC Pregnancy Childbirth. 2023;23(1):628. doi:10.1186/s12884-023-05919-5
22. Al Habashneh R, Khader YS, Jabali OA, Alchalabi H. Prediction of preterm and low birth weight delivery by maternal periodontal parameters: receiver operating characteristic (ROC) curve analysis. Matern Child Health J. 2013;17(2):299-306. doi:10.1007/s10995-012-0974-2
23. Patterson JK, Thorsten VR, Eggleston B, et al. Building a predictive model of low birth weight in low- and middle-income countries: a prospective cohort study. BMC Pregnancy Childbirth. 2023;23(1):600. doi:10.1186/s12884-023-05866-1
24. Obermeyer Z, Powers B, Vogeli C, Mullainathan S. Dissecting racial bias in an algorithm used to manage the health of populations. Science. 2019;366(6464):447-453. doi:10.1126/science.aax2342
25. Wallace J, Lollo A, Duchowny KA, Lavallee M, Ndumele CD. Disparities in health care spending and utilization among Black and White Medicaid enrollees. JAMA Health Forum. 2022;3(6):e221398. doi:10.1001/jamahealthforum.2022.1398
26. Roohan PJ, Josberger RE, Acar J, Dabir P, Feder HM, Gagliano PJ. Validation of birth certificate data in New York state. J Community Health. 2003;28(5):335-346. doi:10.1023/a:1025492512915
27. Northam S, Knapp TR. The reliability and validity of birth certificates. J Obstet Gynecol Neonatal Nurs. 2006;35(1):3-12. doi:10.1111/j.1552-6909.2006.00016.x