Using Machine Learning Models to Identify Factors Associated With 30-Day Readmissions After Posterior Cervical Fusions: A Longitudinal Cohort Study
Article information
Abstract
Objective
Readmission rates after posterior cervical fusion (PCF) significantly impact patients and healthcare systems, with complication rates of 15%–25% and 90-day readmission rates of up to 12%. In this study, we aim to test whether machine learning (ML) models that capture interfactorial interactions outperform traditional logistic regression (LR) in identifying readmission-associated factors.
Methods
The Optum Clinformatics Data Mart database was used to identify patients who underwent PCF between 2004–2017. To determine factors associated with 30-day readmissions, 5 ML models were generated and evaluated, including a multivariate LR (MLR) model. Then, the best-performing model, Gradient Boosting Machine (GBM), was compared to the LACE (Length patient stay in the hospital, Acuity of admission of patient in the hospital, Comorbidity, and Emergency visit) index regarding potential cost savings from algorithm implementation.
Results
This study included 4,130 patients, 874 of whom were readmitted within 30 days. After analysis and scaling, we found that patient discharge status, comorbidities, and number of procedure codes were the factors that most influenced the MLR model, while patient discharge status, billed admission charge, and length of stay most influenced the GBM model. The GBM model significantly outperformed MLR in predicting unplanned readmissions (mean area under the receiver operating characteristic curve, 0.846 vs. 0.829; p < 0.001), while also projecting average cost savings 50% greater than those of the LACE index.
Conclusion
Five models (GBM, XGBoost [extreme gradient boosting], RF [random forest], LASSO [least absolute shrinkage and selection operator], and MLR) were evaluated, among which, the GBM model exhibited superior predictive performance, robustness, and accuracy. Factors associated with readmissions impact LR and GBM models differently, suggesting that these models can be used complementarily. When analyzing PCF procedures, the GBM model resulted in greater predictive performance and was associated with higher theoretical cost savings for readmissions associated with PCF complications.
INTRODUCTION
Posterior cervical fusion (PCF) is a common surgical intervention used to treat a variety of cervical spinal pathologies, including spondylosis, spinal tumors, and spinal deformity [1]. However, postoperative complications following PCF are not uncommon; in fact, one literature review reported that patients undergoing PCF have an overall complication rate of 15%–25% [2]. Furthermore, analysis of a spine-specific database for cervical fusion surgeries showed that patients who underwent the posterior approach had unplanned 90-day readmission rates of up to 12% [3]. Such unplanned readmissions contribute to an estimated hospital cost of $10 billion nationally [4]. These costs are increasingly relevant as the rate of cervical fusion surgeries is expected to increase by 13.3% for anterior cervical fusions and 19.3% for PCF [5].
Given these considerations, there is an ongoing effort to generate predictive models that can successfully identify patients at high risk of readmission following spine surgery [6-10]. By identifying high-risk patients, model simulations could offer potential interventions that may reduce readmissions and associated healthcare costs. In particular, machine learning (ML) models have shown promise in identifying intervention strategies that can inform how healthcare and hospital systems allocate resources to reduce postoperative readmission rates [11]. These models leverage patient demographic information and perioperative data to determine relevant factors that predict patients’ risk of being readmitted following surgery. Here, we build on this literature by using ML classifiers, including logistic regression (LR) models, to identify risk factors associated with readmissions following PCF. We hypothesized that, in keeping with previously published work, ML models such as least absolute shrinkage and selection operator (LASSO), random forest (RF), stochastic gradient boosting machine (GBM), or extreme gradient boosting (XGBoost) can outperform traditional LR models, while also contributing to greater readmission-associated cost savings when implemented [12].
To test this 2-fold hypothesis, we (1) compared the predictive performance of 4 supervised ML algorithms to a traditionally implemented ML model, multivariate LR (MLR), and (2) estimated the potential cost savings of reducing readmissions by implementing the best-performing ML model in a clinical setting and comparing it to the LACE (Length patient stay in the hospital, Acuity of admission of patient in the hospital, Comorbidity, and Emergency visit) index. While this study presents an initial effort to use supervised classification and regression ML models to predict unplanned readmission rates and simulate readmission-associated cost savings, there are inherent limitations and biases of ML models that are unaccounted for and merit recognition. In addition, further clinical evaluation is needed to refine, fine-tune, and enhance the performance of the models presented in this study.
MATERIALS AND METHODS
1. Cohort
To analyze specific patient utilization, expenditure, and enrollment data between 1/1/2004 and 11/30/2017, the Optum Clinformatics Data Mart database (Optum, Inc., Eden Prairie, MN, USA) was used. Patients were identified using the Current Procedural Terminology code 22600 for posterior cervical decompression with fusion and 22840, 22842, 22843, or 22844 for posterior spinal instrumentation. Patients were subsequently filtered using our eligibility criteria (Fig. 1). To maintain the specificity of our study population to PCF and instrumentation, we excluded patients who had undergone anterior or lumbar procedures, as these represent distinct surgical categories with different risk profiles and outcomes. Since this database contains deidentified medical claims, patient consent is not applicable, and the study is exempt from Institutional Review Board approval.
2. Predictors
The predictors used were based on previous studies and included patient demographics, socioeconomic status, procedural service, healthcare utilization, complications, and comorbidities [13,14] (Table 1). These predictors were selected before analyzing the eligible patient data, and the algorithm implemented to analyze these predictors was in agreement with current medical literature [10-15].
3. Outcomes
The primary outcome assessed was the relative influence of each predictor on the risk for unplanned readmissions, which was normalized using the variable importance score for each model. For secondary outcomes, we measured the performance of each ML model in predicting 30-day readmissions by computing the area under the receiver operating characteristic curve (AUC), which serves as a measure of the model’s ability to estimate the probability of readmission, as described by Bamber [16]. To calculate the potential cost savings associated with implementing these models, we applied the readmission reduction rates projected by the GBM model to the inflation-adjusted, all-payer national estimates of surgical readmissions within 30 days of surgery, as reported in a previously published Healthcare Cost and Utilization Project Statistical Brief [17]. The brief provides a comprehensive analysis of 30-day postsurgical readmissions and associated costs across a spectrum of high-volume surgeries, including spinal procedures, in various income demographics. The predictive performance and associated cost savings were compared between the top-performing GBM model and the LACE index, a previously validated readmission model [18].
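Under the interpretation described by Bamber, the AUC equals the probability that a randomly chosen readmitted patient receives a higher predicted risk score than a randomly chosen non-readmitted patient (with ties counted as one-half). The study’s analyses were performed in R with the caret package; purely as an illustrative sketch with hypothetical scores, this rank-based definition can be computed directly:

```python
from itertools import product

def auc_rank(scores_pos, scores_neg):
    """AUC as P(score_pos > score_neg), counting ties as 1/2 (Bamber's
    rank-based interpretation of the area under the ROC curve)."""
    wins = 0.0
    for p, n in product(scores_pos, scores_neg):
        if p > n:
            wins += 1.0
        elif p == n:
            wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical predicted readmission probabilities
readmitted     = [0.81, 0.64, 0.55]          # readmitted within 30 days
not_readmitted = [0.72, 0.40, 0.31, 0.22]    # not readmitted

print(round(auc_rank(readmitted, not_readmitted), 3))  # -> 0.833
```

An AUC of 0.5 corresponds to a model no better than chance; 1.0 indicates perfect discrimination between readmitted and non-readmitted patients.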
4. Predictive Modeling
Five models were generated and evaluated: a multivariate LR, a penalized LR model chosen based on elastic net variants of the LASSO, RF, stochastic GBM, and XGBoost. The evaluation of these models was based on AUC, sensitivity, and specificity. The parameters relevant to the prediction task were identified from the variable importance scores, which were calculated for each model and used to improve its interpretability. For model generation and tuning details, refer to the Supplementary Text and Table 1.
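Sensitivity and specificity, the two threshold-dependent metrics used alongside the AUC, follow directly from a confusion matrix at a chosen probability cutoff. As a minimal sketch with hypothetical labels and predictions (the study’s actual metrics were computed in R via the caret package):

```python
def confusion_metrics(y_true, y_prob, threshold=0.5):
    """Sensitivity and specificity at a given probability threshold.
    y_true: 1 = readmitted within 30 days, 0 = not readmitted."""
    tp = sum(1 for t, p in zip(y_true, y_prob) if t == 1 and p >= threshold)
    fn = sum(1 for t, p in zip(y_true, y_prob) if t == 1 and p < threshold)
    tn = sum(1 for t, p in zip(y_true, y_prob) if t == 0 and p < threshold)
    fp = sum(1 for t, p in zip(y_true, y_prob) if t == 0 and p >= threshold)
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    return sensitivity, specificity

# Hypothetical labels and predicted probabilities
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.7, 0.3, 0.6, 0.4, 0.2, 0.1, 0.05]
sens, spec = confusion_metrics(y_true, y_prob)
print(round(sens, 2), round(spec, 2))  # -> 0.67 0.8
```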
5. Statistical Analysis
R ver. 3.5.1 (R Foundation for Statistical Computing, Vienna, Austria) was used to perform all statistical analyses. To determine the risk factors associated with readmissions (Table 1), univariate LR models were generated. To determine the strength of each variable on readmission outcomes, the McFadden Pseudo R2 was computed [19], and the model performance metrics (i.e., AUC, specificity, and sensitivity) were measured using the “caret” statistical package [20]. For all experiments, statistical significance was defined as p < 0.05.
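The McFadden pseudo R² compares the log-likelihood of a fitted model to that of an intercept-only (null) model: R² = 1 − ln L(model) / ln L(null), so values near 0 indicate little improvement over predicting the base rate alone. As an illustrative sketch (the fitted log-likelihood below is a hypothetical stand-in, not a value from the study):

```python
import math

def mcfadden_r2(loglik_model, loglik_null):
    """McFadden pseudo R^2 = 1 - lnL(model) / lnL(null)."""
    return 1.0 - loglik_model / loglik_null

def bernoulli_loglik(y_true, y_prob):
    """Log-likelihood of binary outcomes under predicted probabilities."""
    return sum(math.log(p) if t == 1 else math.log(1.0 - p)
               for t, p in zip(y_true, y_prob))

# 874 of 4,130 patients readmitted -> the null model predicts the base
# rate for every patient.
y = [1] * 874 + [0] * (4130 - 874)
base_rate = 874 / 4130
ll_null = bernoulli_loglik(y, [base_rate] * len(y))

# Hypothetical fitted model with a better log-likelihood than the null
ll_model = 0.85 * ll_null   # stand-in for an actual fitted value
print(round(mcfadden_r2(ll_model, ll_null), 2))  # -> 0.15
```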
RESULTS
Of the 4,130 patients analyzed in this study, 874 (21.2%) were readmitted within 30 days following surgery. Demographic, medical, socioeconomic, and surgical characteristics were analyzed (Table 1) for all patients who met the selection criteria (Fig. 1). Notably, a univariate LR model identified patient age, insurance plan type, and Medicare use as highly associated with readmissions (p < 0.001). Other variables associated with readmissions were household size and income, presence of an assistant surgeon, number of diagnoses, and length of stay (p < 0.001). Readmitted patients were, on average, older (69.1 years vs. 60.6 years), more likely to have Medicare (81.1% vs. 43.3%), and experienced greater lengths of stay (5.5 days vs. 4.8 days, p < 0.001). These patients also had a greater number of diagnoses (5.7 vs. 4.7), were significantly more likely to be transferred to a skilled nursing facility (SNF) (63.5% vs. 1.7%), and were more likely to experience postoperative urinary complications (3% vs. 1%, p < 0.001).
Based on the risk factors that the univariate LR model associated with readmissions, 5 models were generated and evaluated (GBM, XGBoost, RF, LASSO, and MLR). The performance metrics of these 5 models were measured using a 50% random sample of the data for training and the remaining 50% for testing (Table 2). GBM outperformed all other models, with a mean AUC over 15 independent runs of 0.844 ± 0.015 (mean ± standard deviation). Based on these findings, new versions of the GBM (top-performing) and MLR (bottom-performing) models were generated. This time, instead of using a 50% random sample of the data, all data from 2004–2016 were used to train the models, and all 2017 cohort data were used to test their performance metrics (Table 3). When comparing mean AUC values (0.846 vs. 0.829, p < 0.001), the GBM model significantly outperformed the MLR model. The specificity of the GBM model was also superior to that of the MLR model (0.986 vs. 0.966, p < 0.001). The 3 predictors with the greatest relative influence on the GBM model were the patient’s discharge status, the total billed charge of admission, and the length of stay (Fig. 2). Additional factors associated with increased readmissions included the patient’s age, number of prior outpatient visits, and the number of procedural and diagnosis codes for the initial admission (Fig. 2).
Table 3. Performance metrics of GBM and LR models generated using data from 2004–2016 for training and data from 2017 for testing.

Fig. 2. Relative weighting of risk factors for 30-day readmissions, as determined by the gradient boosting machine (GBM) model. Risk factors (y-axis) include both linear terms and polynomial transformations of variables such as “Year of surgery” and “Household income.” Relative weightings (x-axis) are scaled against the highest-weighted risk factor, “Transfer to skilled nursing facility” (not shown in this part of the chart); a longer bar signifies a stronger predictor. ED, emergency department.
To determine the outcomes of applying interventions to the top 25% of patients with the highest probability of readmission, as determined by the GBM model, we applied the model to the model-naive 2017 cohort and combined the results with inflation-adjusted, all-payer national estimates of 30-day surgical readmission costs (Table 4). In the 2017 data, we analyzed 490 patients admitted between January and November, of whom 133 (27.2%) had been readmitted within 30 days. Of these patients, the GBM model flagged 105 as constituting the top 25% with the highest readmission probability, 99 of whom were accurately identified by the model as having been readmitted, giving a true positive rate of 94% (99 truly readmitted/105 flagged high-risk patients). Out of the remaining 385 patients who were not in the top 25th percentile of readmission likelihood, 35 were nonetheless eventually readmitted, resulting in a missed patient rate of 9% (35 readmitted/385 unflagged patients). The GBM model predicted an estimated total cost savings of $803,633 over 11 months as a result of reduced readmissions. These cost savings were calculated assuming that interventions prevented 50% of readmissions.
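The reported rates follow directly from the cohort counts, and the savings logic multiplies the flagged true positives by the assumed prevention rate and an average cost per readmission. A sketch reproducing that arithmetic (the per-readmission cost below is a hypothetical stand-in; the study derived its dollar figures from HCUP national estimates):

```python
def rate(numerator, denominator):
    return numerator / denominator

# Counts reported for the 2017 cohort (n = 490; 133 readmitted)
flagged           = 105   # top 25% by predicted readmission probability
flagged_readmit   = 99    # flagged patients who were truly readmitted
unflagged         = 385   # patients below the top 25th percentile
unflagged_readmit = 35    # readmissions the model did not flag

tpr    = rate(flagged_readmit, flagged)       # true positive rate
missed = rate(unflagged_readmit, unflagged)   # missed patient rate
print(f"{tpr:.0%} {missed:.0%}")              # -> 94% 9%

# Cost-savings logic under the study's 50% prevention assumption;
# the average cost per readmission here is purely illustrative.
PREVENTION_RATE = 0.50
avg_cost_per_readmission = 16_000  # hypothetical
savings = flagged_readmit * PREVENTION_RATE * avg_cost_per_readmission
print(f"${savings:,.0f}")          # -> $792,000
```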
The LACE index was similarly analyzed: in this case, 62 of the 105 patients (top 25%) with the highest probability of readmission were correctly identified as having been readmitted (true positive rate of 59%). Out of the 385 patients who were not included in the top 25th percentile of readmission likelihood, 73 were eventually readmitted (missed patient rate of 19%). The estimated total cost savings associated with reduced readmissions was $535,755 over an 11-month span. Together, these data show that the GBM model outperformed the LACE index when comparing true positive rates (94% vs. 59%), missed patient rates (9% vs. 19%), and cost savings ($803,633 vs. $535,755). In fact, the GBM model projected readmission-associated cost savings 50% greater than those achieved by the LACE index model (Table 4).
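For context, the LACE index sums points for Length of stay, Acuity of admission, Comorbidity (Charlson index), and Emergency department visits in the prior 6 months into a 0–19 score, with higher scores indicating higher readmission risk. The sketch below reproduces the point mappings as we recall them from the published index; it is illustrative only, and the mappings should be verified against the original validation paper [18] before any use:

```python
def lace_score(length_of_stay_days, emergent_admission,
               charlson_index, ed_visits_6mo):
    """LACE readmission score (0-19). Point mappings reproduced from
    memory of the published index -- verify against the original
    validation study before any clinical use."""
    # L: length of stay in days
    if length_of_stay_days < 1:
        L = 0
    elif length_of_stay_days <= 3:
        L = length_of_stay_days          # 1, 2, or 3 points
    elif length_of_stay_days <= 6:
        L = 4
    elif length_of_stay_days <= 13:
        L = 5
    else:
        L = 7
    # A: acuity (emergent/urgent admission)
    A = 3 if emergent_admission else 0
    # C: Charlson comorbidity index (>= 4 capped at 5 points)
    C = charlson_index if charlson_index <= 3 else 5
    # E: ED visits in the prior 6 months (capped at 4 points)
    E = min(ed_visits_6mo, 4)
    return L + A + C + E

# Hypothetical patient: 5-day stay, emergent admission, Charlson 2, 1 ED visit
print(lace_score(5, True, 2, 1))  # -> 10
```

In contrast to the GBM model’s adaptive weighting of dozens of predictors, the LACE index compresses risk into these four fixed components, which may partly explain the performance gap observed here.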
DISCUSSION
In order to develop interventions that reduce readmission rates, it is critical to accurately identify patients who are at high risk of being readmitted. One strategy is to identify high-risk patients by analyzing the independent risk factors that are associated with readmission probabilities. In this study, we used epidemiological and supervised ML algorithms to analyze 4,130 patients undergoing PCF. We identified the demographic, socioeconomic, clinical, and procedural characteristics associated with patient readmissions within 30 days.
Univariate LR analysis found that patients’ age, their Medicare usage, their insurance plan type, the number of diagnoses, and length of hospital stay were all variables that influenced the readmission rates (Table 1). Interestingly, patients’ discharge to an SNF was strongly associated with readmissions in both multivariate and univariate LR models. Previous studies have identified similarly significant associations between transfer to an SNF and readmission rates; in fact, as many as 1 in 4 SNF patients experience rehospitalization within 30 days of their initial admission [7,21,22].
While LR models are commonly used to study and predict unplanned readmission rates, other ML models with the resolution to capture interactions between factors have become popular tools to predict patient outcomes and readmissions [11,13]. A growing body of medical literature has probed the potential of these models in supporting clinical decision-making and implementation [6-10]. Here, we used supervised classification and regression ML algorithms to predict readmissions and identify the risk factors that influence these rates. We found that while GBM identified patients’ discharge status, charge of admission, and length of stay as the most influential predictors of readmissions, MLR identified patients with comorbidities and number of procedure codes as the relevant variables. One explanation for this finding is that LR models assume linear, additive relationships between predictors and the log-odds of readmission, while algorithms such as GBM and RF capture interactions between risk factors and nonlinear relationships. The observed differences in predictor weighting between models show that, depending on which model is emphasized, it is possible to overestimate or underestimate the relevance of readmission predictors. Thus, by leveraging different ML models, it is possible to capture more realistic linear and nonlinear relationships.
Next, we tested the performance of the ML models, specifically, their ability to predict 30-day readmissions. Compared to previously reported ML models that predicted readmissions following cervical spine surgery with mean AUCs of 0.63–0.81, our GBM model achieved a mean AUC of 0.865, the highest predictive performance recorded to date [9,10,23-28]. One explanation for this improved performance is that the variables chosen for the analysis and, consequently, the model’s relative weighting of these variables are unique to this study. For instance, the most influential variables for the GBM model included the discharge destination (i.e., SNF), total charge billed for admission, length of stay, and patient age. To the best of our knowledge, the patient’s discharge destination has been analyzed as a model predictor for 30-day readmission by only one lumbar spine study, which similarly found that the discharge destination was the most influential variable accounted for by the model [7].
Next, we used the GBM model to determine the top 25% of patients with the greatest probability of having unplanned readmissions. We then simulated the clinical outcomes of implementing an intervention for these flagged patients. We chose the 25% threshold value to account for hospital systems’ varying capacities to apply interventions for high-risk patients. In practice, however, this threshold should be tuned to the individual capacities and resources of different hospital systems. Of all the patients flagged as having high readmission probabilities, 94% were accurately predicted by the GBM model. The remaining 75% of patients, who were not flagged as high risk, totaled 385. Of these, only 35 patients were eventually readmitted, accounting for a missed patient rate of 9%. When interventions were simulated for the top 25%, the GBM model presented in this study predicted an estimated cost savings of $803,633 over an 11-month period. It is important to note that this estimated cost savings was computed under the assumption that effective intervention(s) led to half of the high-risk patients not being readmitted.
The findings presented here demonstrate that ML models can identify patients with a high risk of readmission and inform targeted interventions that reduce these patients’ probabilities of being readmitted. While certain identified risk factors, such as age, are inherently non-modifiable, the proposed interventions are designed to mitigate the risks associated with modifiable factors of patients’ postoperative care. For instance, while we cannot alter a patient’s age, hospital programs can proactively target those discharged to skilled nursing facilities—a predictor the GBM model identified for readmissions—by increasing follow-up calls, home visits, and telemonitoring practices, all of which have been shown to reduce readmissions [29-31]. This ensures sustained care and strict adherence to postdischarge protocols, effectively mitigating the risk of readmission. Such measures would be particularly important for complex or invasive procedures, which are often associated with higher billed charges of admission, a greater number of procedural and diagnostic codes at the outset, and a higher number of outpatient visits—all of which are factors that our GBM model identified as predictors of greater patient readmissions. Similarly, patient education is also important in the context of complicated diagnoses and procedures. Empirical evidence suggests that communication interventions at the point of discharge, including medication counseling and disease-specific education, can significantly reduce the likelihood of 30-day readmissions [32]. By integrating structured educational programs that specifically target patients with the highest risks of readmission, as defined by the GBM model, we can equip patients with the knowledge to manage their conditions more effectively, recognize early signs of complications, and understand when to seek medical help.
Last, medication management can address the risks associated with polypharmacy and complex medication regimens often seen in patients with multiple codes for procedures and diagnoses. A pharmacist-led approach, encompassing medication reconciliation, a patient-specific medication care plan, discharge counseling, and follow-up contact, can substantially decrease the incidence of medication errors postdischarge, thereby lessening the chances of readmission or emergency department visits [33]. By using ML models to identify patients at high risk of readmission and targeting these patients specifically, resources may be optimally allocated while contributing to a reduction in the costs associated with readmitting these high-risk patients. Nonetheless, future research could benefit from a closer examination of the direct impact of these interventions on readmission rates.
Both our cost savings simulation and previously reported data provide evidence that reducing the number of readmissions, even slightly, can significantly lower the economic burden of unplanned readmissions [34]. With further fine-tuning and customization, these models could (1) aid clinicians in identifying patients with high readmission risks; (2) guide perioperative resource allocation to decrease readmission probabilities; and (3) decrease overall healthcare costs.
In addition to comparing the GBM model to univariate and multivariate LR models, we also analyzed how the performance of our ML algorithm compared to that of clinically employed predictive models of readmission (i.e., the LACE index model). When we analyzed the top 25% of patients with a high risk for unplanned readmissions, we found that, in comparison to the LACE index model, the GBM had a higher true positive rate of readmission, 50% higher cost savings from readmission prevention, and a lower missed patient rate. Several factors may explain GBM’s higher performance compared to the LACE index model. For instance, we provided the ML model with an expanded set of predictors, and GBM has an adaptive learning framework that can capture and leverage interactions between variables, including nonlinear relationships. Currently, there is no consensus on the predictive performance of the LACE index; while some reports strongly support the model’s ability to discriminate between variables [35-37], others cite the lack of strong predictors [38] as reason for the model’s moderate to poor discrimination [38-41].
Our study generated models that capture patients at high risk of readmission and simulate how hypothetical clinical interventions can reduce healthcare utilization costs. However, there are several limitations worth noting when interpreting these results. First, the insurance claims database used was not specific to spine surgery, which can impact accuracy [42,43]. Second, we must understand the limitations of using administrative claims data to guide and alter clinical practice [42,43]. For example, during the study period, insurers and hospitals attempted to decrease patients’ length of stay by discharging them home. This creates confounds in the associations captured between readmission probabilities and patients’ length of stay. Third, to simulate how intervening on high-risk patients would impact healthcare utilization, we assumed that 50% of the theoretical interventions were effective, which could vary by hospital system.
These limitations highlight substantial avenues for improvement. Although our study represents an initial effort to utilize ML models to predict readmission rates and mitigate healthcare costs, future prospective studies and clinical trials are needed to refine, validate, and enhance the ML algorithms presented. Moreover, exploring the utilization of alternative data sources beyond administrative claims (i.e., prospective datasets) and predicting 90-day in addition to 30-day readmission rates could further enhance the robustness and generalizability of our findings.
CONCLUSION
ML models, including MLR, have identified different risk factors associated with patients’ unplanned readmission probabilities. This is potentially a result of each model’s capacity to measure nonlinear relationships and interfactor interactions, and it suggests that models can complement each other when capturing risk predictors for readmissions following PCF. In addition, we found that when comparing 2 ML models, GBM outperformed the MLR model as measured by the mean AUC. These findings support the rationale for the continued generation, improvement, and eventual implementation of ML models in order to reduce readmissions and associated healthcare utilization costs.
Supplementary Material
Supplementary Text and Table 1 can be found via https://doi.org/10.14245/ns.2347340.670.
Data preprocessing, model generation, and model tuning
Notes
Conflict of Interest
The authors have nothing to disclose.
Funding/Support
This study received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
Author Contribution
Conceptualization: ADGS, PGR, DH, JKR, MR, DS, AMD; Data curation: PGR; Formal analysis: PGR; Methodology: PGR, DH, JKR, MR, DS, AMD; Project administration: ADGS, JKR, MR, DS, IJ, AMD; Visualization: ADGS; Writing – original draft: ADGS; Writing – review & editing: ADGS, PGR, SST, IJ, AMD.