Forward- and Inverse-Planned Intensity-Modulated Radiotherapy in the CHHiP Trial: A Comparison of Dosimetry and Normal Tissue Toxicity

Aims: The CHHiP (Conventional or Hypofractionated High-dose Intensity Modulated Radiotherapy In Prostate Cancer; CRUK/06/016) trial investigated hypofractionated radiotherapy for localised prostate cancer. Forward-planned (FP) or inverse-planned (IP) intensity-modulated techniques were permitted. Dose–volume histogram and toxicity data were compared to explore the effects of planning method.
Materials and methods: In total, 337 participants with intermediate-risk disease and prospectively collected toxicity data were included. Patients were matched on prostate and rectum/bladder volumes and on radiotherapy dose for toxicity comparisons. The primary outcome was grade 2 or higher Radiation Therapy Oncology Group (RTOG) bowel or bladder toxicity at 2 years.
Results: IP patients had smaller volumes of rectum irradiated to 50–70 Gy (P < 0.001); FP patients had smaller volumes of bladder irradiated to 74 Gy (P = 0.001). Acute grade 2+ bowel toxicity was worse with FP (FP 27/53 [52%]; IP 11/53 [21%]; P = 0.0002), with no significant differences in acute urinary toxicity. At 2 years, RTOG grade 2+ bowel toxicity rates were FP 0/49 and IP 1/50, and RTOG grade 2+ bladder rates were FP 0/54 and IP 1/57.
Conclusions: Significant differences were found between dose–volume histograms from FP and IP methods. IP may result in small reductions in acute bowel toxicity, but both techniques were associated with low rates of late radiotherapy side-effects.


Introduction
Prostate cancer is the most common cancer among men in the developed world [1,2] and radiotherapy is a curative treatment option. The conventional radiotherapy dose is limited by both acute and late side-effects in organs at risk (OAR) located in close proximity to the target volume; conformal radiotherapy (3DCRT) gave the opportunity for dose escalation [3]. There is a clear relationship between increasing radiation dose and improved clinical outcome (biochemical progression-free survival) [4]. Intensity-modulated radiotherapy (IMRT) has proven to be a powerful technique in terms of its dosimetric benefits for complex treatment sites and has become widely adopted for the treatment of prostate cancer [5–8].
CHHiP (Conventional or Hypofractionated High-dose Intensity Modulated Radiotherapy In Prostate Cancer; CRUK/06/016) was a randomised phase III trial in men with localised prostate cancer. It showed that hypofractionated radiotherapy (60 Gy/20 fractions) is safe and non-inferior to conventionally fractionated (74 Gy/37 fractions) in terms of the time to biochemical/clinical failure [9]. Radiotherapy treatment within CHHiP used a complex target volume treated with IMRT. When the trial was initiated in 2002, IMRT was a relatively new technique in the UK and was unavailable or restricted in clinical application at many centres [7]. Hence, both forward-planned (FP) and inverse-planned (IP) IMRT were permitted. The FP technique used a multi-segment three-field plan with optimal beam angles which had been compared with the two-phase 3DCRT technique used in the previous Medical Research Council (MRC) RT01 trial [4,10]. All CHHiP FP plans produced lower mean irradiated rectal volumes at all measured dose levels compared with the RT01 plans and also gave lower mean irradiated bladder volumes at both 50 and 60 Gy.
The study reported here compares dose–volume histogram (DVH) and rectum and bladder toxicity data for patients planned and treated using FP and IP techniques in the CHHiP trial. The aim was to determine if there were any systematic differences resulting from the two planning techniques. The analyses were planned and conducted in two stages. The first stage analysed dosimetry data to determine the relative merits of FP and IP on rectal and bladder DVHs; the second stage investigated whether any differences in the DVH data translated into clinically observable benefits in terms of a reduction in side-effects.

Trial Design
Full details of the CHHiP trial design, eligibility and treatment have been previously reported [9]. Briefly, men with histologically confirmed T1b–T3a, N0, M0 prostate cancer [11] suitable for radiotherapy were eligible. Patients were randomised (in a 1:1:1 ratio) to conventional fractionation (74 Gy/37 fractions over 7.4 weeks) or one of two hypofractionated schedules (60 Gy/20 fractions/4.0 weeks or 57 Gy/19 fractions/3.8 weeks). Randomisation was stratified by National Comprehensive Cancer Network (NCCN) risk classification (low versus intermediate versus high) [12] and radiotherapy treatment centre. Treatment allocation was not masked.

Treatment Details
Target volumes and doses are summarised in Supplementary Figure S1, with the core high-dose region receiving the target dose of 74, 60 or 57 Gy in accordance with allocated treatment [13].
Patients were computed tomography scanned at 5 mm intervals with a comfortably full bladder and an empty rectum, using approved immobilisation methods.
Radiotherapy treatment used either a single-phase FP method (field-in-field or segmented-field arrangement with three beam angles) or five-field IP IMRT, with 'step and shoot' or 'dynamic leaf' delivery. (Rotational arc delivery was permitted but was not widely used at the time of the trial.) OARs included the bladder, rectum, bowel and femoral heads. The entire bladder was outlined. The outer wall of the rectum was outlined from the anus (at the level of the ischial tuberosities or 1 cm below the lower margin of the planning target volume [PTV], whichever was more inferior) to the recto-sigmoid junction. OAR dose constraints were applied for treatment plan optimisation, defined for the conventional fractionation arm and linearly scaled to the same percentage of prescribed dose for the hypofractionated schedules. The rectum dose constraints were V74Gy < 3%, V70Gy < 15%, V65Gy < 30%, V60Gy < 50% and V50Gy < 60%. The bladder dose constraints were V74Gy < 5%, V60Gy < 25% and V50Gy < 50%. A plan assessment form, which provided a synopsis of DVH data for PTVs and OARs, was completed by the treatment centre for each patient treatment plan.
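The linear scaling of the OAR constraints can be illustrated with a short sketch. It takes the rectum and bladder constraints listed above for the 74 Gy arm and rescales each dose level to the same percentage of a hypofractionated prescription. The function names, the rounding to 0.1 Gy and the DVH dictionary format are illustrative assumptions, not part of the trial protocol.

```python
# Constraints as listed in the text for the 74 Gy conventional arm:
# (dose level in Gy, upper limit on % of organ volume receiving that dose).
RECTUM_74 = [(74, 3), (70, 15), (65, 30), (60, 50), (50, 60)]
BLADDER_74 = [(74, 5), (60, 25), (50, 50)]


def scale_constraints(constraints, prescribed_gy, reference_gy=74.0):
    """Scale each dose level to the same percentage of the prescribed dose.

    Rounding to 0.1 Gy is an assumption made for readability.
    """
    factor = prescribed_gy / reference_gy
    return [(round(dose * factor, 1), max_vol) for dose, max_vol in constraints]


def violations(dvh, constraints):
    """Return the dose levels whose volume limit is not met.

    dvh maps a dose level (Gy) to the % of organ volume receiving at least
    that dose. Constraints are strict (V < limit), as written in the text.
    """
    return [dose for dose, max_vol in constraints if dvh[dose] >= max_vol]


# Rectum constraints rescaled for the 60 Gy/20 fraction schedule.
rectum_60 = scale_constraints(RECTUM_74, 60)
print(rectum_60)
```

Under these assumptions the 74 Gy level maps to 60 Gy and the 50 Gy level to about 40.5 Gy for the 60 Gy schedule, while the volume limits stay the same.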
The planning methods, treatment delivery and verification techniques used within each centre were identical for each fractionation regimen and were reviewed and approved in advance by the national Radiotherapy Trials Quality Assurance group. Within a centre, different planning techniques were permitted for low- and intermediate-/high-risk groups.

Patients
CHHiP was conducted in three stages; this report uses toxicity data from stages 1 and 2 (safety), which between 18 October 2002 and 12 August 2006 recruited 457 patients from 11 UK centres using six different treatment planning systems [13]. Radiotherapy planning data were available for 442/457 patients; 337 had intermediate-risk disease and 105 had low-risk disease. The 105 low-risk patients were excluded from all analyses due to the small numbers of IP patients (15/105) (Figure 1).

Volume-Matching Procedure
To reduce the potential for bias in the non-randomised comparisons of planning technique, analysis sets balanced for key variables that might affect the relationship between planning method and radiation dose to the bladder/rectum were defined. In particular, the centre treating the majority of IP patients was unique in using daily rectal microenemas, and there was significant variability between centres in the drinking volumes recommended by their bladder preparation procedures (200–750 ml). PTV differences resulting from the margin-growing algorithms of the various treatment planning systems have also been reported [14]. IP and FP patients were matched (1:1) using two volume parameters, each divided into six volume bands. For rectum DVH analyses, patients were matched according to PTV1 volume (including seminal vesicles) and rectal volume; for bladder DVH analyses, PTV2 and bladder volumes were used for matching.
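A minimal sketch of 1:1 matching within volume bands is given below. The band edges, the patient tuple format and the rule for choosing partners within a band (order of appearance) are hypothetical, since the text does not specify them; only the use of two volume parameters with six bands each is taken from the description above.

```python
from bisect import bisect_right
from collections import defaultdict

# Hypothetical band edges (cm^3): five cut points give six bands per parameter.
PTV_EDGES = [60, 80, 100, 120, 140]
ORGAN_EDGES = [40, 55, 70, 85, 100]


def band(value, edges):
    """Index (0 to 5) of the volume band that contains value."""
    return bisect_right(edges, value)


def match_pairs(fp_patients, ip_patients):
    """Pair FP and IP patients 1:1 within the same (PTV band, organ band) cell.

    Each patient is a tuple (patient_id, ptv_volume, organ_volume).
    Patients without a partner in their cell are left unmatched.
    """
    cells = defaultdict(lambda: ([], []))
    for group_index, group in enumerate((fp_patients, ip_patients)):
        for patient_id, ptv, organ in group:
            key = (band(ptv, PTV_EDGES), band(organ, ORGAN_EDGES))
            cells[key][group_index].append(patient_id)
    pairs = []
    for key in sorted(cells):
        fp_ids, ip_ids = cells[key]
        pairs.extend(zip(fp_ids, ip_ids))  # zip stops at the shorter list
    return pairs


# Hypothetical patients: (id, PTV volume, rectum volume).
fp = [("F1", 75, 50), ("F2", 78, 52), ("F3", 130, 90)]
ip = [("I1", 70, 45), ("I2", 150, 95)]
print(match_pairs(fp, ip))
```

In this toy example only F1 and I1 share a cell with a free partner, so a single pair is formed. This mirrors how matching shrinks the available dataset.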

Dose–Volume Histogram Comparison
Dose–volume data recorded on the plan assessment form for the rectum and bladder were compared. Although the three trial treatment groups differed in prescription dose and fractionation, the dose constraints, when scaled as a percentage of the prescribed dose, were identical, and each treatment group was planned and normalised in the same way within a treatment centre. DVH data could thus be compared directly in this planning study using relative dose without regard for treatment group. Descriptive statistics and boxplots were used to summarise the DVH data. The Mann–Whitney test was used to compare the distribution of data at each dose level between planning methods.
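As an illustration of the comparison at a single dose level, the sketch below implements a two-sided Mann–Whitney test with mid-ranks for ties and a normal approximation. It is not the Stata routine used for the trial analyses, and the sample values are hypothetical percentages of rectal volume, not trial data.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney test using the normal approximation.

    Returns (U statistic for x, two-sided p-value). Tied values receive
    mid-ranks and the variance is tie-corrected; no continuity correction.
    """
    combined = sorted(
        (value, group) for group, sample in enumerate((x, y)) for value in sample
    )
    n1, n2 = len(x), len(y)
    n = n1 + n2
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and combined[j][0] == combined[i][0]:
            j += 1
        mid_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        t = j - i
        tie_term += t ** 3 - t
        rank_sum_x += mid_rank * sum(1 for k in range(i, j) if combined[k][1] == 0)
        i = j
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0
    z = (u_x - mean_u) / math.sqrt(var_u)
    return u_x, math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical % of rectal volume at one relative dose level.
fp_volumes = [41.0, 38.5, 44.2, 36.9, 47.3, 40.1]
ip_volumes = [30.2, 33.8, 29.5, 35.0, 31.7, 37.4]
u, p = mann_whitney_u(fp_volumes, ip_volumes)
print(f"U = {u:.1f}, p = {p:.4f}")
```

With many dose levels tested, the resulting p-values would be judged against the stricter 1% threshold described in the statistical methods.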

Toxicity Comparison
The second stage was to investigate whether any observed differences in normal tissue dosimetry were associated with normal tissue toxicity. Dose and fractionation could potentially bias these results and so, for toxicity analysis, patients were additionally matched according to treatment dose schedule.
Acute side-effects were assessed using the Radiation Therapy Oncology Group (RTOG) scoring system for acute toxicity [15], completed weekly during treatment and at weeks 10, 12 and 18 from the radiotherapy start date. Late side-effects were assessed at 6, 12, 18 and 24 months using RTOG, the Late Effects on Normal Tissues: Subjective: Objective/Management (LENT/SOM) and Royal Marsden Hospital (RMH) scoring systems [16–18]. Patient-reported outcomes (PRO) were assessed before trial entry, before radiotherapy and at week 10, and 6, 12, 18 and 24 months after radiotherapy using the UCLA Prostate Cancer Index questionnaire [19]. The primary toxicity end point for this analysis was grade 2 or greater RTOG bladder or bowel toxicity experienced 2 years from the start of radiotherapy.
Baseline characteristics were summarised using descriptive statistics and, as the two groups were not generated by random allocation, statistical comparisons were made between the groups using chi-squared or Mann–Whitney tests as appropriate. Patients were only included in toxicity analyses if they received at least one fraction of radiotherapy.
Toxicity and PRO data are presented as grade distributions at each time point and compared using Mann–Whitney tests. The proportion of patients with grade 2+ RTOG bladder or bowel toxicity at 2 years is presented together with exact binomial confidence intervals. The time from the radiotherapy start date to the first occurrence of grade 1+ toxicity was analysed using Kaplan–Meier methods to estimate the cumulative proportion with an event at 2 years. All data reported were used; patients with no event were censored on the date of their last toxicity assessment. Cox proportional hazard models were used to estimate and test the effect of planning method (using the Wald test) with a hazard ratio of less than 1 favouring IP. The proportional hazards assumption was found to hold for all time-to-event analyses reported. Changes in PRO scores between the pre-radiotherapy assessment and after 2 years were calculated and are presented graphically.
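The Kaplan–Meier estimate of the cumulative proportion with an event can be sketched as follows. Patients without an event are censored at their last assessment, as described above. The function name and the follow-up data (in months) are hypothetical, and the sketch does not reproduce the Stata analysis used in the trial.

```python
def km_cumulative_incidence(times, events, horizon):
    """Kaplan-Meier estimate of the cumulative proportion with an event by horizon.

    times:  follow-up per patient (time of first event, or time of the last
            toxicity assessment if no event was recorded).
    events: True if the event was observed, False if censored.
    """
    survival = 1.0
    at_risk = len(times)
    for t in sorted(set(times)):
        if t > horizon:
            break
        observed = sum(1 for ti, ei in zip(times, events) if ti == t and ei)
        censored = sum(1 for ti, ei in zip(times, events) if ti == t and not ei)
        if observed:
            # Patients censored at time t still count as at risk at t.
            survival *= 1.0 - observed / at_risk
        at_risk -= observed + censored
    return 1.0 - survival


# Hypothetical follow-up in months for six patients.
times = [6, 12, 12, 18, 24, 24]
events = [True, False, True, True, False, False]
print(round(km_cumulative_incidence(times, events, horizon=24), 3))
```

Unlike a simple proportion, the estimate uses the follow-up of censored patients up to their last assessment, which is why all reported data could be included.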
All analyses were exploratory in nature. However, statistical analysis plans were written before conducting each of the pre-planned stages. A significance level of 1% was used to allow for multiple testing. Analyses were based on a database snapshot taken on 1 April 2010 and were conducted using Stata version 11.2.

Volume Matching
There was considerable imbalance between the rectum, bladder, PTV1 and PTV2 volumes in the FP and IP groups, with larger volumes for all four structures in the FP group. Following the volume-matching process there were no significant differences between FP and IP groups (P > 0.05 for rectum, bladder, PTV1 and PTV2) (Supplementary Figure S2). Seventy-eight FP–IP pairs of patients were matched on rectum and PTV1 volume (i.e. 156/337 available patient datasets) and 86 pairs matched on bladder and PTV2 volume (172/337). Following additional matching on trial treatment allocation, the number of pairs for toxicity analyses was reduced further to 53 for the rectum dataset (106/337) and 61 for the bladder (122/337) (Supplementary Table S1). There was reasonable balance in the clinical baseline characteristics of the matched datasets (Supplementary Table S2). Initial prostate-specific antigen levels were lower in the IP group for the bladder but not the rectum subset. More patients in the FP group required a modification to the posterior target volume margins in both rectum (FP 4 [8%]: IP 0 [0%]) and bladder (FP 6 [10%]: IP 1 [2%]) subsets. Derivation of the patient datasets for analysis is summarised in Figure 1.

Toxicity
There was a statistically significant difference in the worst acute bowel toxicity (P = 0.0002), with 52% (27/52) FP compared with 21% (11/53) IP experiencing grade 2+ toxicity during the first 18 weeks from the start of radiotherapy (Figure 3A). Late toxicity was low with both planning methods, with 0/49 (0%; 95% confidence interval 0–7.2%) and 1/50 (2.0%; 95% confidence interval 0.1–10.6%) RTOG bowel grade 2+ events in the FP and IP groups, respectively, at 2 years (Table 1). The RMH and LENT/SOM tools suggested benefits for IP at almost all time points from 6 to 24 months (Table 1), although the only statistically significant difference was for LENT/SOM assessment at 18 months (P = 0.008). The time to first post-radiotherapy grade 1+ RMH and LENT/SOM bowel toxicity was shorter for FP patients than for IP patients (RMH grade 1+: hazard ratio = 0.40; 95% confidence interval 0.21–0.73; P = 0.003; LENT/SOM grade 1+: hazard ratio = 0.48; 95% confidence interval 0.27–0.84; P = 0.01), although this was not seen with RTOG assessment (Figure 4 and Supplementary Table S3). There was no difference for grade 2+ or 3+ events using any scoring system (Supplementary Table S3), but the number of events was very small. PROs showed an approximate doubling of 'overall bowel problems', 'distress' and 'rectal urgency' at week 10, in keeping with the physician-based scores. However, no consistent differences in PROs between planning methods remained from months 6–24, when outcomes appeared very favourable in both groups (Table 2). Change scores from pre-radiotherapy to 24 months confirmed the generally favourable bowel outcomes and similarities between planning methods (Supplementary Figure S3).
For the bladder dataset, there was no evidence of a difference in the worst acute bladder toxicity (P = 0.709), with 45% (27/60) FP and 46% (28/61) IP patients experiencing grade 2+ toxicity during the first 18 weeks (Figure 3B). However, grade 1+ toxicity was higher at all time points from weeks 1–18 in the IP group. There was no evidence of a difference for grade 2+ late bladder toxicity (Table 1), which was low in both groups. At 2 years, RTOG grade 2+ bladder toxicity was reported in 0/54 (0%; 95% confidence interval 0–6.6%) and 1/57 (1.8%; 95% confidence interval 0.1–9.4%) patients in the FP and IP groups, respectively. Time-to-event analyses indicated no statistically significant differences in bladder toxicity, but there was a trend for higher grade 1+ toxicity in the IP group for RTOG and LENT/SOM scales (Figure 4 and Supplementary Table S3). Patient-reported urinary outcomes seemed to be slightly higher at baseline in the IP group. At 2 years, both groups had similarly favourable profiles (Table 2). Change scores indicated that at 24 months an improvement in overall urinary function was evident for some patients in both planning method groups (Supplementary Figure S3).
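The exact binomial confidence intervals quoted for the 2-year toxicity rates can be checked with a short sketch. It assumes the intervals are of the Clopper–Pearson type, which is the usual meaning of "exact binomial" but is not stated explicitly in the text. The limits are found by bisection on the binomial distribution function; the function names are illustrative.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


def solve_decreasing(f, target, lo=0.0, hi=1.0, iterations=100):
    """Bisection for p in [lo, hi] with f(p) = target, f decreasing in p."""
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def exact_binomial_ci(x, n, level=0.95):
    """Clopper-Pearson interval for x events in n patients."""
    alpha = 1.0 - level
    # Lower limit solves P(X >= x | p) = alpha/2, i.e. P(X <= x-1 | p) = 1 - alpha/2.
    lower = 0.0 if x == 0 else solve_decreasing(
        lambda p: binom_cdf(x - 1, n, p), 1.0 - alpha / 2.0)
    # Upper limit solves P(X <= x | p) = alpha/2.
    upper = 1.0 if x == n else solve_decreasing(
        lambda p: binom_cdf(x, n, p), alpha / 2.0)
    return lower, upper


# Event counts and denominators taken from the results above.
for label, x, n in [("bowel FP", 0, 49), ("bowel IP", 1, 50),
                    ("bladder FP", 0, 54), ("bladder IP", 1, 57)]:
    low, high = exact_binomial_ci(x, n)
    print(f"{label}: {x}/{n}, 95% CI {100 * low:.1f}% to {100 * high:.1f}%")
```

With zero or one event the upper limits still reach roughly 7–11%, which illustrates why the matched datasets have limited power to detect differences in grade 2+ late toxicity.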

Discussion
Careful matching of patients on rectum and bladder volumes was necessary because patients were not randomised to planning method and there were systematic differences in patient preparation techniques between the centre recruiting the majority of the IP patients and elsewhere (e.g. daily use of a rectal enema). The procedural differences resulted in a significant difference in rectum and bladder volumes, which were both smaller in the IP group, and these were successfully accounted for by the matching process.
Both FP and IP techniques were successful in achieving the rectal and bladder dose constraints. The use of IP IMRT enabled the dose to be conformed more optimally to the shape of the PTV, in particular to the concavity formed by the seminal vesicles wrapping around the rectum. This largely explains the differences seen in the IP and FP dose–volume data for the rectum, where IP reduced the volume of rectum irradiated to doses of 50 Gy and above. Both techniques were successful at limiting the rectal volume receiving the prescribed 74 Gy dose, where the PTV excluded seminal vesicles, so no difference was apparent at this dose. These results are similar to those reported in previous studies [10,20–23], where IMRT significantly reduced volumes of rectum exposed to doses >60 Gy, with no significant difference near the prescription dose.
The higher volume of bladder irradiated to 74 Gy with IP may be due to the five-field beam geometry used, which resulted in an anterior peak in the dose distribution above the PTV and up into the bladder from the overlapping of the two anterior-oblique beams. This did not occur with FP, as an orthogonal beam arrangement was used (anterior and two lateral beams) so, although the isodoses did not conform so well to the circular shape of the prostate PTVs when viewed on axial computed tomography images, the anterior shape of the isodoses was generally flat across the top of the PTV for FP. By contrast, past studies have reported a slight reduction in bladder volumes exposed to high doses with IMRT, with volumes exposed to intermediate and low doses often higher for IMRT. This may be due in part to the different beam configurations used in these studies for FP, with three to nine coplanar beams, and the use of multiphase plans instead of field-in-field techniques. It is well documented that the most favourable conformal radiotherapy (CFRT) dose distributions are obtained using three orthogonal fields, as used in the CHHiP trial [24].
Acute bowel toxicity was greater in the FP group, with an approximate doubling in the proportion of patients with RTOG grade 2+ events (FP 52% and IP 21%, respectively) mirrored by a similar increase in PRO moderate or worse symptoms of rectal urgency, distress and overall problems with bowels assessed at week 10. The main toxicity end point in the main CHHiP trial study was grade 2+ RTOG toxicity at 2 years [13]. However, the low level of grade 2+ toxicity observed across the whole trial, as well as in this analysis (only one case each of grade 2 bowel and bladder toxicity), makes it an insensitive tool for dissecting differences between FP and IP groups.
Although there was no consistent difference in late RTOG toxicity scores, both RMH and LENT/SOM tools showed benefits for IP, with less than half the recorded RMH grade 1+ toxicity (hazard ratio 0.40) and LENT/SOM documented symptoms (hazard ratio 0.48). It is well documented that there are different components to prostate radiotherapy side-effects and proctopathy [25]. The RTOG scale reflects proctitis and bleeding, whereas the RMH and LENT/SOM instruments include bowel frequency and looseness. Our previous studies on the impact of different dose levels on bowel symptoms suggest that higher doses in the 60–70 Gy range are associated with bleeding and 'proctitis', whereas a moderate dose 'bath' of 50–60 Gy is associated with frequency, looseness and sphincter control [26,27]. In the present study, IP produced both benefits, particularly in the 50–65 Gy dose range for the RMH and LENT/SOM assessments. The favourable PRO in both FP and IP groups underlines the low level of late toxicity seen with both techniques. There were no obvious differences in either acute or late grade 2+ bladder toxicity between FP and IP groups, although the IP group seemed to have a slight increase in grade 1 acute and late side-effects.

[Figure panels: A, RTOG bowel toxicity (rectum volume-matched dataset); B, RTOG bladder toxicity (bladder volume-matched dataset). Horizontal axis: weeks from start of radiotherapy.]
The lack of substantial differences in long-term effects between FP and IP methods is in keeping with recent findings from RTOG trial 0126, which showed similar dosimetric advantages of IMRT compared with carefully designed 3DCRT, but with no difference in patient-reported bowel or bladder function [28]. One implication of the improvement in contemporary radiotherapy treatment is the need for increasingly sensitive physician- and patient-reported outcome measures to dissect differences between alternative radiotherapy strategies.

Conclusions
Significant differences were found between the DVHs for FP and IP patients for rectum and bladder. There were some associations between DVH differences and normal tissue effects, which were statistically significant for acute bowel toxicity and for minor levels of toxicity using LENT/SOM and RMH late side-effect bowel subscales favouring IP techniques. Conversely, IP techniques were associated with a small excess of grade 1 bladder side-effects. Both FP and IP planning techniques were associated with low levels of late normal tissue toxicity.

Conflicts of Interest
D. Dearnaley reports grants from Cancer Research UK during the conduct of the study and personal fees from Takeda, Amgen, Janssen, Astellas, Sandoz and ICR. In addition, D. Dearnaley has a patent issued EP1933709B1. E. Hall reports grants from Cancer Research UK during the conduct of the study and grants from Accuray outside of this study. V. Khoo reports personal fees and non-financial support from Accuray, Astellas and Bayer and non-financial support from Janssen. V. Khoo also reports honoraria for Speakers Bureaus with Accuray, Astellas, Bayer, Ipsen, Janssen, Takeda and Tolmar.