INTRODUCTION
Infectious spondylodiscitis (IS) is a septic inflammation of the spine involving vertebral bodies and paraspinal structures [
1]. During the progression of the disease, the formation of abscesses or edema can destroy vertebrae or cause neurologic disorders [
1,
2]. The overall incidence of spinal infection in adults is approximately 2.2 per 100,000 per year, with a slowly increasing trend worldwide in recent years [
1,
3]. IS is potentially life-threatening, with a mortality rate of 3%–20% [
3,
4]. Common causes of IS include pyogenic spondylodiscitis (PyS) and tuberculous spondylodiscitis (TbS), which account for 40%–80% and 17%–40% of all IS cases, respectively [
3,
5]. Insufficient specific signs and symptoms might cause delayed diagnosis and treatment, leading to disastrous consequences [
6,
7].
It is critical to distinguish between TbS and PyS to provide appropriate treatment. However, the identification of these 2 entities is challenging because of their nonspecific signs and symptoms. Microbiological diagnosis is the gold standard for differentiating between TbS and PyS. However, identifying the microbes is difficult. Previous reports on patients with PyS showed a negative culture rate ranging from 10% to 30%. In contrast, obtaining a positive culture for TbS typically requires 3 weeks, with a success rate ranging between 50% and 70% [
8,
9]. When microbiological identification is impossible, clinical, laboratory, and magnetic resonance imaging (MRI) findings may aid in identifying a potential causative microorganism. Previous studies have distinguished radiological findings between TbS and PyS [
10-
17]. However, few studies have developed a scoring system that uses predictive factors to stratify the probability of TbS from PyS [
17,
18].
In the present study, we aimed to compare and analyze the differences in the clinical, laboratory, and MRI findings between TbS and PyS, and to develop and validate a simplified multiparameter MRI-based scoring system for differentiating between TbS and PyS.
MATERIALS AND METHODS
We retrospectively collected medical records of patients diagnosed with IS admitted to the Maharat Nakhon Ratchasima Hospital between January 2015 and December 2020. Cases with microbiologically and pathologically documented evidence were included in this study. PyS was diagnosed when the etiological organism was identified through percutaneous vertebral biopsy, surgical drainage, or blood culture (a minimum of 2 separate sets). TbS was diagnosed based on pathological samples, tissue cultures, and polymerase chain reaction (PCR) tests. Patients were excluded if they had spondylodiscitis caused by other pathogens (e.g., fungal or parasitic), unconfirmed spondylodiscitis (no pathogens isolated), no pretreatment MRI, no gadolinium administration during MRI, or no T1-weighted or fluid-sensitive sequences.
Clinical data included age, sex, predisposing factors and/or associated illnesses, onset of the symptoms, fever, Frankel grading, and causative organisms. Laboratory data comprised white blood cell (WBC) count, proportion of neutrophils, C-reactive protein (CRP) levels, erythrocyte sedimentation rate (ESR), and serum alkaline phosphatase (ALP) levels. All MRI examinations followed a standard protocol comprising axial and sagittal T1-weighted sequences, axial and sagittal fluid-sensitive sequences (T2-weighted with fat-saturation [T2w fat-sat] or short tau inversion recovery), and axial and sagittal T1-weighted sequences after gadolinium administration. MRI findings were evaluated by consensus between a spine surgeon with 5 years of experience and a musculoskeletal radiologist. The details of each finding are described in
Table 1. Infections in the thoracolumbar region were classified as thoracic or lumbar lesions according to the region with the more severe vertebral body destruction.
This study was performed in accordance with the Helsinki Declaration and approved by the Maharat Nakhon Ratchasima Hospital Institutional Review Board (MNRH IRB No. 089/2020). The patients were informed that the data concerning their cases would be submitted for publication and provided their consent.
Statistical analyses were performed using Stata Statistical Software (ver. 14; StataCorp LP., College Station, TX, USA). After a descriptive analysis of the variables, t-tests were used to compare continuous variables. Chi-square tests were used to compare predisposing factors and associated illnesses. All tests were 2-sided, and a p-value of < 0.05 was considered significant. All variables that were significant in the chi-square test were included in a multivariate logistic regression analysis using stepwise backward elimination to derive the independent variables.
The diagnostic accuracy of the reduced multivariate model was evaluated in terms of calibration and discrimination. Calibration was performed using Hosmer-Lemeshow goodness of fit statistics. A calibration plot comparing the agreement between the disease probabilities estimated using the model and the observed disease data is also presented. Discriminative power was evaluated using the area under the receiver operating characteristic (ROC) curve. Internal validation was performed using a bootstrapping procedure with 1,000 replicates. Bootstrap resampling is a statistical technique used for estimating the sampling distribution of a statistic by resampling with replacement from the observed data. This resampling is applicable in various situations, offering versatility in statistical problems like parameter estimation and hypothesis testing. It requires minimal assumptions and is easy to implement, making it a practical way to assess statistic variability without complex mathematical derivations. However, this resampling procedure has several drawbacks, including its reliance on the original sample, its inability to accurately represent population variability in small samples, and its assumption of stationary data distribution, which may not be suitable in dynamic environments.
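The resampling step described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the Stata procedure used in the study: it computes the area under the ROC curve with the rank (Mann-Whitney) formulation and a percentile bootstrap interval from 1,000 resamples. The function names `auc` and `bootstrap_auc` and the simulated scores are assumptions made for the example.

```python
import random


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive case ranked above negative case
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(pos) * len(neg))


def bootstrap_auc(scores, labels, n_boot=1000, seed=0):
    """Resample cases with replacement and return AUC with a percentile 95% interval."""
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:      # skip resamples containing a single class
            stats.append(auc([scores[i] for i in idx], ys))
    stats.sort()
    lower = stats[int(0.025 * n_boot)]
    upper = stats[int(0.975 * n_boot) - 1]
    return auc(scores, labels), (lower, upper)


# Simulated (hypothetical) data: 60 cases labelled 1 and 40 labelled 0.
_rng = random.Random(1)
demo_labels = [1] * 60 + [0] * 40
demo_scores = ([_rng.gauss(2.0, 1.0) for _ in range(60)]
               + [_rng.gauss(0.0, 1.0) for _ in range(40)])
demo_auc, demo_interval = bootstrap_auc(demo_scores, demo_labels)
print(f"AUC = {demo_auc:.3f}, 95% interval = {demo_interval}")
```

A fixed seed keeps the resampling reproducible. Resamples that happen to contain only one class are redrawn because the AUC is undefined for them.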
Subsequently, a simplified risk score transformation was generated. Each item was assigned a specific score based on the logistic regression coefficients of the multivariate model. To achieve this, the regression coefficient of each item was divided by the lowest coefficient in the model, and the result was rounded to the closest integer. The total scores were then categorized into 2 groups (TbS and PyS) for clinical applicability. Sensitivity and specificity were calculated separately for each group using a population-analog approach. Calibration and discrimination were assessed using a score-based multivariate logistic model.
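The coefficient-to-score step can be illustrated as follows. This is a sketch of the general technique as worded in the text (divide by the smallest coefficient, round to the closest integer); the coefficient values and item names are hypothetical placeholders, not the fitted values from this study.

```python
import math


def coefficients_to_points(coefs):
    """Divide each coefficient by the smallest one and round to the closest integer."""
    smallest = min(coefs.values())
    if smallest <= 0:
        raise ValueError("this sketch assumes all coefficients are positive")
    # floor(x + 0.5) rounds halves upward, avoiding Python's round-half-to-even.
    return {name: math.floor(beta / smallest + 0.5) for name, beta in coefs.items()}


# Hypothetical log-odds coefficients for three items (illustration only).
example_coefs = {"finding_a": 0.8, "finding_b": 1.7, "finding_c": 2.9}
print(coefficients_to_points(example_coefs))
```

With this transformation the weakest predictor always receives 1 point, and every other item is weighted relative to it.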
RESULTS
Among the 420 patients diagnosed with IS, 190 had a confirmed diagnosis, matched all inclusion criteria, and were retrospectively enrolled. The characteristics of the 190 patients are summarized in
Table 2. The mean age at diagnosis was 56.8 years (range, 18–84 years), with 106 male and 84 female patients. Data were collected from 67 patients with PyS and 123 patients with TbS. Among the 67 patients with PyS, the causative organism was confirmed by culture of percutaneous spinal biopsy and surgical drainage in 64.2% (n = 43), and blood culture in 35.8% (n = 24).
Staphylococcus aureus was the most common microorganism identified in 56.7% (n = 38), followed by
Streptococcus spp. (19.4%, n = 13),
Escherichia coli (13.4%, n = 9),
Bacillus spp. (2.9%, n = 2),
Klebsiella pneumoniae (2.9%, n = 2),
Brucella spp. (1.4%, n = 1),
Burkholderia pseudomallei (1.4%, n = 1), and
Pseudomonas aeruginosa (1.4%, n = 1). Among the 123 patients with TbS, the diagnosis was confirmed by percutaneous spinal biopsy and surgical drainage in 25.2% (n = 31) of patients. The remaining patients with TbS were confirmed by positive results for PCR of
Mycobacterium tuberculosis and pathology demonstrating caseous granulomatous inflammation.
Clinically, back pain was the most common symptom in both groups, occurring in 94.3% (n = 116) of TbS patients and 94% (n = 63) of PyS patients. Symptoms lasted > 4 weeks in 102 TbS patients (82.93%) and 26 PyS patients (39.39%) (p < 0.01). The number of patients with diabetes mellitus was 12 (9.76%) in the TbS group and 16 (23.88%) in the PyS group (p < 0.01). Laboratory findings of the 2 groups are shown in
Table 2. PyS was more frequently associated with the following parameters: WBC > 10,000/mm³, neutrophil proportion > 75%, and ALP > 120 IU/L (p < 0.01).
As shown in
Table 3, thoracic involvement was significantly more frequent in TbS than in PyS (61.78% vs. 22.39%, p < 0.001), while lumbar involvement was more common in PyS than in TbS (85.07% vs. 56.91%, p < 0.001). No significant differences were observed in cervical or sacral involvement. Moreover, no differences were found in the number of involved vertebrae, involvement of the posterior elements, or posterior wall retropulsion. On T1-weighted MRI, the vertebral body signal was typically hypointense in both groups. However, heterogeneous signal intensity was more frequent in the TbS group than in the PyS group (8.13% vs. 0%, p < 0.03). Destruction of vertebral endplates and vertebral destruction > 50% were more common in the TbS group than in the PyS group (26.83% vs. 8.96%, p < 0.001 and 60.16% vs. 19.4%, p < 0.001, respectively). No differences were found in the disc signal or extent of disc destruction between the groups. On gadolinium-enhanced T1-weighted MRI, the vertebral body was more frequently heterogeneously enhanced in the TbS group (89.43% vs. 38.81%, p < 0.001). Moreover, vertebral intraosseous abscesses were more frequent in TbS than in PyS (69.1% vs. 7.46%, p < 0.001). No significant differences were observed between the groups in intervertebral disc contrast enhancement or disc abscesses. In terms of paravertebral involvement, the TbS group exhibited a higher prevalence of well-defined paravertebral abscesses (82.93% vs. 44.78%, p < 0.001), abscesses with thin and regular walls (81.3% vs. 2.99%, p < 0.001), epidural abscesses (67.48% vs. 43.28%, p = 0.001), and anterior longitudinal subligamentous spreading (94.3% vs. 41.79%, p < 0.001). However, epidural phlegmon (4.88% vs. 80.6%, p < 0.001) and facet joint arthritis (46.34% vs. 80.6%, p < 0.001) were strongly associated with PyS. No significant difference was observed between the groups in the presence of spinal cord compression.
The duration of symptoms > 4 weeks (odds ratio [OR], 8.19; 95% confidence interval [CI], 4.10–16.33; p < 0.001), diabetes mellitus (OR, 0.49; 95% CI, 0.15–0.77; p = 0.01), WBC > 10,000/mm³ (OR, 0.21; 95% CI, 0.11–0.40; p < 0.01), neutrophil proportion > 75% (OR, 0.35; 95% CI, 0.19–0.65; p = 0.001), ALP > 120 IU/L (OR, 0.31; 95% CI, 0.17–0.58; p < 0.001), presence of thoracic lesions (OR, 5.67; 95% CI, 2.87–11.20; p < 0.001), severe vertebral destruction (OR, 6.35; 95% CI, 3.14–12.86; p < 0.001), heterogeneous contrast-enhanced vertebral body (OR, 19.21; 95% CI, 3.91–94.26; p < 0.001), presence of vertebral intraosseous abscess (OR, 28.06; 95% CI, 10.44–75.36; p < 0.001), well-defined paravertebral enhancement (OR, 84.75; 95% CI, 8.74–820.87; p < 0.001), presence of epidural abscess (OR, 2.75; 95% CI, 1.49–5.07; p = 0.001), absence of facet joint arthritis (OR, 4.88; 95% CI, 2.42–9.84; p < 0.001), and anterior longitudinal subligamentous spreading (OR, 23.28; 95% CI, 9.42–57.49; p < 0.001) were identified as possible risk factors for TbS in the univariate analysis (p < 0.2) in our study (
Table 4). No significant differences were found in temperature > 38°C (OR, 0.49; 95% CI, 0.24–1.02; p = 0.57), peak ESR > 40 mm/hr (OR, 0.96; 95% CI, 0.43–2.14; p = 0.93), peak CRP > 5 mg/dL (OR, 0.35; 95% CI, 0.07–1.65; p = 0.19), or ill-defined paravertebral enhancement (OR, 0.21; 95% CI, 0.03–1.49; p = 0.12). Multiple logistic regression analysis showed that thoracic lesion (OR, 819.81; 95% CI, 6.84–98,313.95; p = 0.006), absence of epidural phlegmon (OR, 900.86; 95% CI, 31.39–25,857.73; p < 0.001), anterior longitudinal subligamentous spreading (OR, 185.78; 95% CI, 7.92–4,360.64; p = 0.001), presence of vertebral intraosseous abscess (OR, 19.59; 95% CI, 1.75–219.70; p = 0.016), well-defined paravertebral enhancement (OR, 10.79; 95% CI, 1.28–90.80; p = 0.029), presence of epidural abscess (OR, 9.69; 95% CI, 0.78–121.06; p = 0.038), and absence of facet joint arthritis (OR, 7.25; 95% CI, 0.91–57.94; p = 0.042) were independent predictive factors for TbS (
Table 5).
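For a single binary finding, the odds ratio and its 95% CI reported by a univariate logistic model coincide with the cross-product ratio of the 2 × 2 table and its Wald interval on the log scale. The sketch below shows that calculation. The function name `odds_ratio_ci` and the cell counts are hypothetical and are not taken from the study tables.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald 95% CI from a 2 x 2 table.

    a: TbS with the finding      b: PyS with the finding
    c: TbS without the finding   d: PyS without the finding
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this formula")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only.
or_, lo, hi = odds_ratio_ci(a=30, b=10, c=20, d=40)
print(f"OR = {or_:.2f} (95% CI, {lo:.2f}-{hi:.2f})")
```

Small cell counts inflate the standard error, which is one reason very wide intervals can accompany large odds ratios.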
1. MRI Scoring Transformation
Each potential predictor of TbS in the multivariate model was assigned a specific score derived from the logistic regression coefficient: thoracic lesion, 7 points; no epidural phlegmon, 7 points; subligamentous spreading, 5 points; intraosseous abscess, 3 points; well-defined paravertebral abscess, 2.5 points; epidural abscess, 2.5 points; and no facet joint arthritis, 2 points (
Table 5). The scoring scheme, with a total score ranging from 0 to 29, was divided into 2 categories (TbS and PyS) for differentiation. The cutoff point between the categories was based on a calibration plot of sensitivity and specificity. For discriminative ability, the area under the parametric ROC curve for the score-based logistic regression model was 0.96 (95% CI, 125.40–3,257.95) (
Fig. 1). The calibration was illustrated using a calibration plot, with a p-value of < 0.001. The predicted probability of TbS increased as the score increased, with a high level of agreement between actual and predicted diseases (
Fig. 2). The total score was significantly different between the groups (> 14 points, p < 0.001), with a sensitivity of 97.58% and specificity of 92.54%. The application of this MRI scoring transformation is illustrated in
Figs. 3 and
4.
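Applying the score amounts to summing the points of the findings that are present and comparing the total with the cutoff. The sketch below uses the point values listed above. The dictionary keys and function names are hypothetical, and the cutoff is assumed to be a total of at least 14 points, as stated in the Discussion.

```python
# Point values as listed in the text; key names are illustrative.
POINTS = {
    "thoracic_lesion": 7.0,
    "no_epidural_phlegmon": 7.0,
    "subligamentous_spreading": 5.0,
    "intraosseous_abscess": 3.0,
    "well_defined_paravertebral_abscess": 2.5,
    "epidural_abscess": 2.5,
    "no_facet_joint_arthritis": 2.0,
}


def total_score(findings):
    """Sum the points for every finding marked True."""
    unknown = set(findings) - set(POINTS)
    if unknown:
        raise KeyError(f"unrecognized findings: {sorted(unknown)}")
    return sum(POINTS[name] for name, present in findings.items() if present)


def suggests_tbs(findings, cutoff=14.0):
    """True when the total reaches the cutoff (assumed here to be >= 14 points)."""
    return total_score(findings) >= cutoff


# Hypothetical case: thoracic lesion with subligamentous spreading and no
# facet joint arthritis, but epidural phlegmon present.
case = {
    "thoracic_lesion": True,
    "no_epidural_phlegmon": False,
    "subligamentous_spreading": True,
    "no_facet_joint_arthritis": True,
}
print(total_score(case), suggests_tbs(case))
```

Findings omitted from the input are treated as absent, so the maximum total of 29 is reached only when all 7 items are marked present.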
DISCUSSION
Despite the number of earlier studies [
10-
15] that have described the clinical, laboratory, and MRI features of PyS and TbS, we obtained significant distinguishing characteristics from our comparison of the 2 groups. However, no single finding can distinguish between these 2 conditions. Compared to other studies [
10-
16,
19], our study represents the largest series comparing microbiologically confirmed cases of PyS and TbS. A longer symptom duration (> 4 weeks) and the absence of fever (
Table 2) were more frequently associated with TbS than with PyS. According to Yoon et al. [
20], TbS risk factors included a median latency to spondylodiscitis diagnosis of > 7 days, and patients with TbS experienced fever less frequently than those with PyS. Inflammatory indicators, such as WBC count, neutrophil count, ESR, and CRP level, are highly sensitive for the diagnosis of spinal infection [
21]. Our results were in agreement with the findings of Kim et al. [
4], who reported that high levels of ALP (> 120 IU/L) and neutrophil predominance (> 75%) in leukocytosis (> 10,000/mm³) were more commonly predictive of PyS. Similarly, Lertudomphonwanit et al. [
17] observed that a neutrophil fraction < 78% and WBC count < 9,700/mm³ were highly suggestive diagnostic clues for differentiating patients with TbS from those with PyS. In a previous study, Kim et al. [
4] found that ESR > 40 mm/hr and CRP > 5 mg/dL were more frequently associated with PyS. In contrast, in our study, these biomarker cutoff values could not differentiate PyS from TbS. According to Lertudomphonwanit et al. [
17], ESR levels of < 92 mm/hr were highly suggestive indicators of TbS. Nevertheless, CRP level was not shown to be a predictive factor in their study. The demographics of our patient group may partially explain this outcome. As demonstrated in this study, both groups had delayed time to diagnosis; therefore, ESR and CRP levels may have been more significant at the time of diagnosis [
22].
MRI has substantially improved the diagnosis of spinal infections. Even in the early stages of spinal infections, the increased sensitivity of MRI allows the identification of pathogenic alterations in the spine. According to a previous study, contrast-enhanced MRI is a reliable method for differentiating TbS from PyS [
15]. Our study demonstrated the presence of thoracic lesions, intraosseous abscesses, anterior longitudinal subligamentous spreading, and well-defined paravertebral enhancement as predictive factors for TbS, which corresponded well with the review by Lee [
23]. Tuberculous spondylitis typically begins in the anterior cancellous bone of the vertebral body. This is followed by the destruction of the vertebral body, extending beneath the anterior longitudinal ligament, leading to the formation of an abscess near the vertebral body. The thoracic spine is the region most frequently affected by this process [
18,
24,
25]. This finding is supported by a recent study that indicated a well-defined paraspinal abscess as one of the hallmarks of TbS, whereas PyS typically exhibits more widespread, ill-defined areas of enhancement [
12]. According to a recent study by Kanna et al. [
26], large abscesses with a thin wall are one of the MRI findings that are strongly predictive of TbS.
Epidural soft-tissue thickening, also known as epidural phlegmon, manifests as a diffuse and homogeneous contrast-enhancing process. This presentation may indicate an inflammatory process before turning into an epidural abscess and is less amenable to surgical drainage [
27]. Our study revealed a higher prevalence of epidural phlegmon among PyS patients. According to Zhang et al. [
18], patients with PyS presented with phlegmon characterized by ill-defined boundaries and occasional small abscesses with thick and irregular walls (97% in PyS vs. 37% in TbS). Conversely, the epidural abscesses that were more common in the TbS group were larger and had well-defined borders. They are more likely to develop into a ring-shaped, thin, smooth-walled, polysoluble abscess [
17]. These results were also observed in our study. Patients with TbS more frequently have a slow-growing infection that finally results in an epidural abscess at a later stage.
Septic arthritis of the facet joint was diagnosed based on erosion, edema, and enlargement of the facet joint space (
Fig. 5). Associated inflammatory changes in the epidural space or adjacent paraspinal muscles can be seen with gadolinium-enhanced T1-weighted MRI [
28]. In our study, this finding was a reliable predictor of PyS. This may reflect the aggressiveness of pyogenic organisms and their propensity to spread outside the facet capsule, causing synovitis, perisynovial inflammation, and erosive changes to the articular surface; the synthesis of proteolytic enzymes is implicated in this inflammatory process [
12,
15,
24]. Similar to Harada’s results [
7], the enhancement of soft tissues around the facet joints was more frequent in PyS than in TbS.
To the best of our knowledge, there are few diagnostic prediction tools available to effectively distinguish TbS from PyS. Zhang et al. [
18] analyzed the MRI findings of spinal infections (32 cases of PyS, 38 cases of TbS), to identify key distinguishing features between PyS and TbS, and establish a systematic scoring method. Using the scoring system, the correct coincidence rate was 95.23%, with a sensitivity of 91.67%, and specificity of 100%. However, the predictive parameters for detecting PyS and TbS were separated using a prediction tool. In our study, we developed a simplified MRI scoring system for the diagnostic prediction of TbS, based primarily on predictive factors (
Table 5). Total scores ≥ 14 points may significantly predict the risk of TbS, with a sensitivity of 97.58% and specificity of 92.54%. The discriminative ability of the score-based logistic regression model was 0.96, as indicated by the ROC curve.
Figs. 3 and
4 provide demonstrations of the scheme used. As the average duration of symptoms was 3 months in patients with TbS and 4 weeks in patients with PyS, this scoring system is a valuable diagnostic tool that can help distinguish between TbS and PyS, particularly in the subacute to chronic stages of the disease.
The clinical predictive model of this study provided significant advantages. We included all diagnostically relevant variables in the model and transformed the regression equation into a scoring system for use in clinical settings. Nevertheless, this study has some limitations. First, this was a retrospective cohort study that included patients from a single hospital. In addition, a data imbalance occurred between the groups, which was addressed using multivariable logistic regression analysis. Adjusting the data ratio could potentially have reduced the study’s power. Second, the clinical or laboratory data regarding the onset of symptoms may have been biased because our center was a referral center. Larger population studies are required to assess the clinical relevance of these findings. Third, although internal validation was performed in our study, the reproducibility of the scoring remains unknown until a prospective external validation study is conducted in another setting or at a different time. Finally, because we did not include individuals with other low-virulence causative organisms, such as fungi, in our investigation, the generalizability of our findings for these patients may be limited.