Original research
Randomized controlled trial of vertebroplasty versus kyphoplasty in the treatment of vertebral compression fractures
  1. Avery J Evans (1),
  2. Kevin E Kip (2),
  3. Waleed Brinjikji (3),
  4. Kennith F Layton (4),
  5. Mary L Jensen (1),
  6. John R Gaughen (1),
  7. David F Kallmes (3)

  (1) University of Virginia, Charlottesville, Virginia, USA
  (2) University of South Florida, Tampa, Florida, USA
  (3) Mayo Clinic, Rochester, Minnesota, USA
  (4) Baylor University Medical Center, Dallas, Texas, USA

Correspondence to Avery J Evans, MD, University of Virginia Hospital and Medical Center, Department of Radiology, 1215 Lee St, Charlottesville, VA 22903, USA; aje5u{at}hscmail.mcc.virginia.edu

Abstract

Background We present the results of a randomized controlled trial evaluating the efficacy of vertebroplasty versus kyphoplasty in treating vertebral body compression fractures.

Methods Patients with vertebral body compression fractures were randomly assigned to treatment with kyphoplasty or vertebroplasty. Primary endpoints were pain (0–10 scale) and disability assessed using the Roland–Morris Disability Questionnaire (RMDQ). Outcomes were assessed at 3 days, 1 month, 6 months, and 1 year following the procedure.

Results 115 subjects were enrolled in the trial with 59 (51.3%) randomly assigned to kyphoplasty and 56 (48.7%) assigned to vertebroplasty. Mean (SD) pain scores at baseline, 3 days, 30 days, and 1 year for kyphoplasty versus vertebroplasty were 7.4 (1.9) vs 7.9 (2.0), 4.1 (2.8) vs 3.7 (3.0), 3.4 (2.5) vs 3.6 (2.9), and 3.0 (2.8) vs 2.3 (2.6), respectively (p>0.05 at all time points). Mean (SD) RMDQ scores at baseline, 3 days, 30 days, 180 days, and 1 year were 17.3 (6.6) vs 16.3 (7.4), 11.8 (7.9) vs 10.9 (8.2), 8.6 (7.2) vs 8.8 (8.5), 7.9 (7.4) vs 7.3 (7.7), 7.5 (7.2) vs 6.7 (8.0), respectively (p>0.05 at all time points). For baseline to 12-month assessment in average pain and RMDQ scores, the standardized effect size between kyphoplasty and vertebroplasty was small at −0.36 (95% CI −1.02 to 0.31) and −0.04 (95% CI −1.68 to 1.60), respectively.

Conclusions Our study indicates that vertebroplasty and kyphoplasty appear to be equally effective in substantially reducing pain and disability in patients with vertebral body compression fractures.

Trial registration number NCT00279877.

  • Spine
  • Balloon

Introduction

Osteoporotic vertebral body compression fractures are a significant cause of disability worldwide. Vertebral body compression fractures can cause significant disability secondary to pain, spinal deformity, reduced pulmonary function, impaired mobility, and depression.1 Both conservative and interventional therapies have been used to treat these fractures, and many clinical trials have demonstrated that both vertebroplasty and kyphoplasty are superior to conservative therapy in selected patients.2–6

While the overall utilization rate of kyphoplasty and vertebroplasty has decreased since the publication of the negative sham trials in 2009, these procedures are still commonly performed and represent a significant source of healthcare expenditure.7,8 Utilization of kyphoplasty is higher than that of vertebroplasty, largely due to the perception that kyphoplasty is safer and more effective. Studies comparing the efficacy of these two procedures are generally non-randomized or meta-analyses of non-randomized prospective studies.9 The optimal scientific approach to compare the safety and efficacy of these two procedures is a randomized controlled trial (RCT). Before the beginning of this trial, no RCT comparing the two procedures had been published.9 To test the null hypothesis that the procedures are similar in terms of efficacy, we performed an RCT of 115 patients randomized to vertebroplasty or kyphoplasty with reported pain and disability-related outcomes up to 12 months, now representing the third RCT on the topic.10,11

Methods

Patient characteristics and enrollment

Patients (study participants) were enrolled from nine centers in the USA. The centers selected were busy practices with experienced operators who performed both vertebroplasty and kyphoplasty, and who expressed clinical equipoise with regard to the risks and benefits of both techniques. All operators had at least 5 years of experience in spine augmentation procedures. All centers had clinical coordinators to facilitate conduct of the trial and to perform data entry and all required documentation. The trial protocol was evaluated by the Institutional Review Board (IRB) at each respective institution, and all patients gave signed written informed consent.

By protocol, all study participants were 50 years of age or older and had pain that had occurred in the previous 12 months attributable to one or more compression fractures of the vertebrae in the areas T4–L5 confirmed with a physical examination and imaging. All patients had fractures detected on plain radiography. MRI was routinely performed. Trial inclusion required that the participant reported pain from the compression fracture(s) of at least 5 on a numerical pain scale of 0–10. All participants were candidates for minimally invasive surgery, were able to successfully complete a battery of health questionnaires, and were available and willing to participate in follow-up. No patients had neurological deficits related to the compression fractures or other contraindications to vertebral augmentation. Participants had no history of surgery (within the last 60 days), no history of open back surgery, and no concomitant hip fracture, rib fracture, or sacral insufficiency fracture. In addition, participants had no malignant tumor deposit (multiple myeloma), tumor mass, or tumor extension into the epidural space at the level of the fracture to be treated.

Trial participants were recruited from patients who were scheduled for vertebral augmentation evaluations among nine academic and non-academic clinical sites nationwide. The research teams at each site determined participant eligibility, including describing the study to eligible participants and obtaining informed consent prior to randomization and conducting baseline evaluations.

Pain scales and functional assessments

Demographic information was collected at baseline and comorbidity details were assessed with the Charlson comorbidity index at baseline and at the year 1 follow-up assessment. A 0–10 numerical verbal scale was used to assess participants’ pain pretreatment, on day 3 after surgery and at 1 month, 6 months, and 12 months postoperatively. Patients were also asked to list their pain medications at all data collection time points. Outcome measures administered included the Roland–Morris Low Back Pain and Disability Questionnaire (RMDQ),12 the Short-Form Health Survey Instrument (SF-36),13 EuroQOL EQ-5 Health States Instrument,14 Study of Osteoporotic Fractures-Activities of Daily Living (SOF-ADL6) instrument,15 Modified Deyo Patrick Pain Frequency and Bothersomeness Scale,16 and the Osteoporosis Assessment Questionnaire (OPAQ) Body Image Scale.16 As with numerical pain assessment, symptom and health status questionnaires were completed prior to treatment, on day 3 after surgery and at 1 month, 6 months, and 12 months postoperatively. Similarly, follow-up was performed by telephone contact at 3 days, 1 month, and 6 months, and in person at 1 year.

Procedures

Both vertebral augmentation procedures were performed according to standard practice and each practitioner’s preference. The approach, device, and cement used for the procedure were at the operators’ discretion. Complications were to be reported on adverse event forms.

Randomization

Trial participants were randomly assigned to either the kyphoplasty intervention or the vertebroplasty intervention using a variable block randomization scheme to ensure balance of assignment to both groups over time. The block sizes were randomly varied and of sufficient size to minimize the ability of any investigator or coordinator to guess the next assigned treatment. The random treatment assignments were placed in sequentially numbered envelopes and distributed to the clinical sites. The site coordinators and other personnel who collected trial participant data and administered the study questionnaires were blinded to the treatment assigned to each participant.
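A variable block randomization scheme of the kind described above can be sketched as follows. This is a minimal illustration only: the block sizes (4, 6, 8), the seed, and the function name are hypothetical and are not taken from the trial protocol, which does not report them.

```python
import random

ARMS = ("kyphoplasty", "vertebroplasty")

def permuted_block_schedule(n_participants, block_sizes=(4, 6, 8), seed=0):
    """Build an allocation list from randomly sized, balanced blocks.

    block_sizes and seed are illustrative assumptions, not trial values.
    Each block holds equal numbers of both arms in shuffled order, so the
    two arms are exactly balanced at the end of every block.
    """
    rng = random.Random(seed)
    schedule = []
    while len(schedule) < n_participants:
        size = rng.choice(block_sizes)        # block size varies at random
        block = [ARMS[0]] * (size // 2) + [ARMS[1]] * (size // 2)
        rng.shuffle(block)                    # random order within the block
        schedule.extend(block)
    return schedule[:n_participants]

# Hypothetical usage: one allocation per sequentially numbered envelope.
schedule = permuted_block_schedule(120, seed=2024)
envelopes = {number: arm for number, arm in enumerate(schedule, start=1)}
```

Because every block is balanced, the running difference between arms can never exceed half the largest block size, while the varying block length makes the next assignment harder to guess.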

Statistical analysis

Baseline demographic and clinical characteristics of study participants were described as mean±SD for continuous variables and percentages for categorical variables. To assess subject comparability in random assignment, Student t tests were used to compare means for continuous variables; the Fisher exact test of proportions was used for categorical variables. The same approaches were used to compare trial completers with those lost to follow-up. To assess overall mental and physical health status of study participants prior to surgical intervention, mean subscale scores on the SF-36 were age- and gender-matched to the US general population with an adjusted score of 50 being interpreted as equal to the general population.

The intention to treat principle was used for all analyses. Analysis of covariance (ANCOVA) was used to compare treatment response (pain, disability, quality of life, mental and physical health) by random assignment at successive assessment intervals, adjusting for the initial value. To illustrate, acute treatment response at 3 days after the intervention was evaluated by random assignment adjusting for baseline value; sustained treatment response at 12 months was evaluated by random assignment adjusting for the 6 month value. To assess slope of treatment response by random assignment from baseline to 12 months, linear mixed models were fit specifying an autoregressive correlation structure. A two-sided p value of 0.05 was used to define statistical significance for all analyses, and no correction was made for multiple comparisons.
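The baseline-adjusted comparison described above (analysis of covariance with one covariate and two arms) can be illustrated with a small sketch. It assumes a common slope in both arms and uses made-up scores; the function name and data are hypothetical, and the trial's actual software and model options are not reported here.

```python
from statistics import mean

def ancova_adjusted_difference(x_a, y_a, x_b, y_b):
    """Difference in follow-up means (arm A minus arm B) adjusted for the
    initial value, using the pooled within-arm slope (equal-slopes model)."""
    def within(x, y):
        mx, my = mean(x), mean(y)
        sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
        sxx = sum((xi - mx) ** 2 for xi in x)
        return mx, my, sxy, sxx

    mx_a, my_a, sxy_a, sxx_a = within(x_a, y_a)
    mx_b, my_b, sxy_b, sxx_b = within(x_b, y_b)
    slope = (sxy_a + sxy_b) / (sxx_a + sxx_b)   # pooled within-arm slope
    adjusted = (my_a - my_b) - slope * (mx_a - mx_b)
    return adjusted, slope

# Hypothetical pain scores (0-10 scale), not trial data.
baseline_k, followup_k = [6, 7, 8, 9], [3.5, 4.0, 4.5, 5.0]
baseline_v, followup_v = [7, 8, 9, 10], [3.5, 4.0, 4.5, 5.0]
adjusted_diff, slope = ancova_adjusted_difference(
    baseline_k, followup_k, baseline_v, followup_v
)
```

In this toy data set the raw follow-up means are identical, but the arms start from different baselines, so the adjusted difference is non-zero. That is the reason for adjusting for the initial value rather than comparing follow-up means directly.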

In addition, to evaluate the magnitude of the treatment response by random assignment (ie, irrespective of formal statistical testing), standardized treatment effect sizes from baseline to different follow-up intervals were calculated as ((baseline kyphoplasty score − follow-up kyphoplasty score) − (baseline vertebroplasty score − follow-up vertebroplasty score))/SE of difference score. Effect size values of 0.2, 0.5, and 0.8 were interpreted as ‘small,’ ‘medium,’ and ‘large’, respectively.17
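The effect size calculation above can be sketched in a few lines. The text does not spell out how the SE of the difference score was obtained, so the unpooled two-sample standard error used here is an assumption, as are the function name and the example change scores.

```python
from math import sqrt
from statistics import mean, variance

def standardized_effect_size(change_kypho, change_vert):
    """(mean change with kyphoplasty - mean change with vertebroplasty)
    divided by the standard error of that difference.

    Each input is a list of per-participant (baseline - follow-up) scores.
    ASSUMPTION: SE = sqrt(s_k^2 / n_k + s_v^2 / n_v); the source does not
    state the exact formula it used for the denominator.
    """
    diff = mean(change_kypho) - mean(change_vert)
    se = sqrt(variance(change_kypho) / len(change_kypho)
              + variance(change_vert) / len(change_vert))
    return diff / se

# Hypothetical reductions in pain score, not trial data.
effect = standardized_effect_size([4, 5, 3, 6], [5, 6, 4, 7])
```

With this sign convention, a negative value means the mean reduction was larger in the vertebroplasty arm, which matches how negative pain effect sizes are read in the Results.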

Statistical power

The trial was not designed to be definitive, but rather to be powered to detect modest to large differences in treatment outcomes by procedure, should they exist. Detection of small effects of little clinical significance was not sought in the design of the trial, particularly given the significantly higher procedural costs associated with kyphoplasty, and the presumption that a favorable cost-benefit ratio for kyphoplasty would best be supported by observation of at least modest differences in clinical outcomes when compared with the use of vertebroplasty. In order to be powered to detect modest to large differences in treatment outcomes by procedure, no correction was made for testing of multiple outcomes at multiple follow-up intervals, recognizing the potential for increased probability of type I error. Given this approach, the trial was designed for a total of 120 subjects, assuming up to 20% loss to follow-up at 12 months. This planned net sample size of 96 subjects, if achieved, would provide 80% power to detect a ‘medium’ effect size of 0.58.
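The sample size arithmetic above can be checked with a short calculation. The sketch assumes equal allocation (48 per arm), a two-sided α of 0.05, and the usual normal approximation for a two-sample comparison of means; the source does not state which method or software was used, so these are assumptions.

```python
from math import sqrt
from statistics import NormalDist

def detectable_effect_size(n_per_arm, alpha=0.05, power=0.80):
    """Smallest standardized difference detectable with the given power,
    using the normal approximation for two equal-sized arms."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided critical value
    z_power = z.inv_cdf(power)
    return (z_alpha + z_power) * sqrt(2 / n_per_arm)

planned_total = 120
expected_loss = 0.20
net_total = round(planned_total * (1 - expected_loss))   # 96 subjects
min_effect = detectable_effect_size(net_total // 2)      # 48 per arm
```

Under these assumptions the detectable effect size comes out at approximately 0.57, close to the 0.58 quoted above; the small gap is what one would expect if the original calculation used the t distribution rather than the normal approximation.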

Results

Patient characteristics

A total of 115 subjects were enrolled in the trial with 59 (51.3%) randomly assigned to kyphoplasty and the remaining 56 (48.7%) assigned to vertebroplasty. The mean age of the full cohort was 75.6±10.0 years, 71% were women, 56% were married, 59% were never smokers, and 61% were on narcotic medication prior to surgery (table 1); 59% presented with comorbidities, the mean current fracture percentage was 24.1±19.0%, and 41% had osteoporosis. Mean duration of symptoms was 17.5±11.7 days in the vertebroplasty group and 18.0±10.3 days in the kyphoplasty group (p=0.81). Of note, baseline demographic and clinical characteristics were similar by random assignment (p>0.05).

Table 1

Demographic and clinical characteristics by random assignment

Pain and quality of life at study entry

On a 0–10 scale, the mean average pain at study entry was 7.7±2.0 overall (7.4±1.9 in the kyphoplasty group vs 7.9±2.0 in the vertebroplasty group, p=0.25; table 2). The mean number of days per month that pain kept study subjects from performing activities was high (17.7±11.0), as was the mean RMDQ score (16.8±7.0). As with demographic characteristics, measures of pain and quality of life were similar by random assignment. Of note, compared with the US general population that included age and gender adjustment, the mean score on the SF-36 aggregate physical health was much lower in the study cohort (26.4±7.2), whereas mean aggregate mental health was near the standardized value of 50 (43.9±13.5). The high physical disability of the study cohort was particularly evident for the SF-36 subscales of physical functioning, physical health problems, and pain, whereas mean scores on the SF-36 subscales of emotional well-being and emotional health problems actually exceeded general population norms (figure 1).

Table 2

Pain and quality of life indicators by random assignment (mean±SD)

Figure 1

Mean scores on the eight subscales of the SF-36 instrument among study participants prior to intervention with either kyphoplasty or vertebroplasty. The mean scores are age- and gender-adjusted to the US general population with a value of 50 (dashed horizontal line) depicting comparable status.

Loss to follow-up

Of 113 subjects with average pain scale ratings before surgery, 29 (25.7%) did not have 12-month follow-up data (ie, rate of attrition). This included 29.3% in the kyphoplasty group and 21.8% in the vertebroplasty group (p=0.40). Subjects with follow-up data did not differ statistically in age (p=0.36), gender (p=0.91), education (p=0.35), current fracture percentage (p=0.23), Charlson comorbidity score (p=0.25), average pain before surgery (p=0.10), or Roland disability score (p=0.15) from those without 12-month follow-up data. Thus, loss to follow-up was similar by treatment arm and demonstrated a profile characteristic of missing at random.

Treatment response

Figures 2–4 depict treatment response after kyphoplasty versus vertebroplasty for measures of pain, disability, and quality of life. As shown in figure 2, reductions in average pain, pain frequency, and functional limitations due to pain were substantial after surgery, while remarkably similar by treatment assignment. The similar treatment response by random assignment was evident for the RMDQ score, activities of daily living, and multiple measures of quality of life (figures 3 and 4). Thus, there was essentially no evidence of a differential response in clinical improvement between treatment with kyphoplasty and with vertebroplasty. Importantly, aggregate mental health (SF-36) for the study cohort improved after surgery and was consistent with or above general population norms, whereas aggregate physical health also improved over time but remained below general population norms (figure 4).

Figure 2

Plot of mean pain scores before intervention with either kyphoplasty or vertebroplasty and over 12-month follow-up by random assignment. (A) Average pain. (B) Pain frequency. (C) Pain impact measured as days in bed. (D) Pain impact measured as days kept from activities. See Methods for calculation of p values.

Figure 3

Plot of mean pain, disability, and quality of life scores before intervention with either kyphoplasty or vertebroplasty and over 12-month follow-up by random assignment. (A) Pain bothersome index. (B) Fracture-related impact on activities of daily living. (C) Disability scores. (D) Impairment in quality of life scores. See Methods for calculation of p values.

Figure 4

Plot of mean SF-36 aggregate health scores before intervention with either kyphoplasty or vertebroplasty and over 12-month follow-up by random assignment. (A) Aggregate physical health. (B) Aggregate mental health. See Methods for calculation of p values.

In terms of standardized effect sizes comparing reductions in average pain and RMDQ scores over follow-up by treatment assignment, seven of eight comparisons were in the ‘small’ effect range, meaning minimal difference in treatment response for kyphoplasty versus vertebroplasty (table 3). For average pain over follow-up, all four effect size estimates were negative, suggesting slightly better response with vertebroplasty, although all of the 95% CIs included the null value of zero. For RMDQ scores over follow-up, effect size estimates ranged from −0.04 to 0.18, again suggesting essentially no difference by treatment assignment.

Table 3

Effect sizes of kyphoplasty versus vertebroplasty for difference measures (from baseline) in mean pain and Roland–Morris disability scores over 12 months of follow-up

Subgroup analyses

In repeated measures analyses among selected subgroups, the slope of change in average pain score over follow-up was similar by random assignment (kyphoplasty vs vertebroplasty) among men (p=0.51) and women (p=0.27), age <75 years (p=0.09) versus ≥75 years (p=0.14), preoperative average pain score <7 (p=0.40) versus ≥7 (p=0.73), and preoperative RMDQ score <17 (p=0.70) versus ≥17 (p=0.69). Thus, consistent with the full cohort results, we did not observe any appreciable difference in the magnitude of reduction in pain following kyphoplasty versus vertebroplasty in selected subgroups examined.

Discussion

Our study data suggest that vertebroplasty and kyphoplasty are equally effective in substantially reducing pain and disability and improving both mental and physical health. Significant, clinically meaningful improvements in pain and disability were seen within 3 days of the procedure in both groups. No subgroups demonstrated any appreciable differences in pain outcomes between kyphoplasty and vertebroplasty. Similarly, standardized effect sizes of treatment response between the two procedures were small and near the null value of zero, consistent with the postulate of comparable treatment response. Of note, estimates of effect size are invariant to sample size, although precision of the estimates is a function of sample size.

Currently, nearly 75% of patients undergoing spine augmentation in the USA receive kyphoplasty.18 ,19 This is largely due to the perception that balloon kyphoplasty is more effective in reducing pain and disability than vertebroplasty. Thus, the present results, which suggest comparable clinical benefit with both procedures, could have a significant economic impact as they suggest that the less costly and less utilized procedure—vertebroplasty—is equally effective in all measures when compared with kyphoplasty.

To date, two RCTs comparing vertebroplasty with kyphoplasty have been published. In a trial of 50 patients randomized to vertebroplasty and 50 patients randomized to kyphoplasty with a mean fracture age of 16–17 weeks, Liu et al demonstrated significant improvements in vertebral body height restoration and wedge angle in the kyphoplasty group. However, there was no significant difference in pain outcomes up to 6 months between the two treatment groups.10 Our results of similar clinical outcome between kyphoplasty and vertebroplasty corroborate those of Liu et al, but with a much broader range of outcomes and follow-up to 12 months. Recently, the KAVIAR trial has been published.11 This trial reported no significant difference in rates of subsequent fractures and no significant differences in improvement of pain, disability, and safety between the two techniques. The only statistically significant difference was shorter procedure time for vertebroplasty (31.8 vs 40 min). Thus, our findings are consistent with other recent trial results.

Much of the previous data comparing the safety and efficacy of vertebroplasty and kyphoplasty come from non-RCTs, retrospective studies, and multiple meta-analyses.9 Major limitations of these types of studies include selection and publication bias as they are generally retrospective or non-randomized.9 ,20 The thrust of these studies is that vertebroplasty and kyphoplasty are equally safe and effective in reducing pain and disability. These non-RCTs describe some differences in incidence of cement leakage, degree of kyphosis reduction, and risk of adjacent level fracture, but with no reported differences in overall outcome, such distinctions are not necessarily clinically relevant.9 Our study confirms the findings of the previous two RCTs: vertebroplasty and kyphoplasty are equally effective.

The relative long-term benefits of vertebroplasty versus kyphoplasty have yet to be conclusively established. Our study did not follow patients long enough to sufficiently examine potential differences in long-term mortality or fracture-free survival between the two treatments. In a study using the Medicare database, Chen et al21 found that kyphoplasty was associated with significantly lower long-term mortality rates compared with vertebroplasty and non-operative management. While this analysis controlled for important patient factors such as age and comorbidities, these data were from a retrospective administrative database and were thus non-randomized.21 Long-term follow-up data from patients treated in RCTs are needed to determine the relative survival benefits of vertebroplasty versus kyphoplasty.

We did not perform any cost-effectiveness analysis in our study. However, a number of previously published studies have demonstrated lower inpatient and outpatient costs for vertebroplasty compared with kyphoplasty.18 ,19 Svedbom et al22 found that kyphoplasty was more cost effective than vertebroplasty, with much of the benefit related to reduced mortality in patients undergoing kyphoplasty. Ong et al23 found that, despite higher initial treatment costs of kyphoplasty, vertebroplasty was less cost effective due to increased utilization of medical resources in the postoperative period over the 2 years after surgery. Edidin et al24 found that kyphoplasty was cost effective—and perhaps even cost saving—compared with vertebroplasty using Medicare claims data. Again, these cost-effectiveness analysis studies are from non-randomized populations and are subject to significant selection bias. These findings highlight the importance of long-term cost and mortality data from RCTs.

Limitations of the study

Our study is limited by a modest sample size. A total of 115 patients were randomized to either vertebroplasty or kyphoplasty. Given this limitation, it is difficult to compare outcomes of rare complications such as new-onset fractures. In addition, we cannot definitively conclude clinical equivalence between the two procedures. Nonetheless, the remarkable similarity in results observed in measures of pain, functional status, quality of life, and mental and physical health status provides no suggestion that we failed to detect treatment-related differences simply due to low statistical power. This is further evidenced by the standardized effect sizes comparing treatment response for the two procedures, which suggest comparable clinical benefit and, in and of themselves, are invariant to sample size.

A quarter of the patients in our study did not complete follow-up, yet the pattern of loss to follow-up appeared to be missing at random. Whereas the study was not powered for subgroup analyses, again we found no indication of differential treatment response between the two procedures. Nonetheless, we cannot conclude whether or not certain subgroups of patients would benefit more from kyphoplasty or vertebroplasty. Another limitation of our study is the lack of standardization of the procedures. The approach, device, and cement used for the procedures were at the operators’ discretion. This could affect the reproducibility of the results of this study; however, this also more closely represents ‘real-world’ outcomes where approaches and devices are used at the discretion of the operator. Lastly, we did not include a placebo or sham arm in our study. Two prior studies by Kallmes et al25 and Buchbinder et al26 demonstrated that vertebroplasty resulted in similar reductions in pain when compared with a sham vertebroplasty arm. Inclusion of such a sham arm (ie, sham balloon kyphoplasty or sham spine augmentation) in our study would have provided additional insight into the benefits of these procedures in helping to determine the role of the placebo effect in outcomes of vertebroplasty and kyphoplasty.

Conclusions

The evidence from this trial is consistent with the findings of two previous RCTs that indicate that vertebroplasty and kyphoplasty are equally effective in reducing pain and disability related to vertebral compression fractures.

Acknowledgments

Harry Cramer, Costal Vascular & Interventional, Pensacola, Florida; John Curnes, Greensboro Radiology (NC), Greensboro, North Carolina; Patrick Noonan, Bronson Methodist Hospital, Kalamazoo, Michigan; Steven Dunnagan, Radiology Associates, Little Rock, Arkansas; Bassam Georgy, San Diego Interventional Pain Management, San Diego, California; Mark Myers, Nasseff Neuroscience Center, St Paul, Minnesota.

References

Footnotes

  • Contributors WB, KEK participated in data analysis and drafting and revision of the manuscript. AJE was the principal investigator, designed the trial, conducted vertebroplasties and kyphoplasties on participating patients, and drafted and revised the manuscript. DFK designed the trial, conducted vertebroplasties and kyphoplasties, drafted and revised the manuscript. KFL, MLJ, JRG helped with trial design, enrolled a large number of patients, conducted vertebroplasties and kyphoplasties and helped with manuscript revisions.

  • Funding This work was supported by Carefusion, Johnson and Johnson/DePuy Synthes Spine, Cardinal Health and Stryker.

  • Competing interests DFK: consultancy ev3, Medtronic, Codman; grants/grants pending: ev3, MicroVention, Sequent, Codman; payment for lectures (including service on speakers bureaus): MicroVention; royalties: UVA Patent Foundation; payment for development of educational presentations: ev3; travel/accommodations/meeting expenses unrelated to activities listed: MicroVention. WB, KEK, KFL, MLJ, JRG, AJE: none.

  • Ethics approval Ethics approval was obtained from the IRBs at all participating institutions.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data sharing statement Any requests for additional unpublished data should be made by email to the corresponding author.
