Ensuring superior reporting of radiotherapy non-inferiority trials: A systematic review.

Open Access. Published: January 20, 2023.



      Although the frequency of non-inferiority trials is increasing, the consistency of reporting of these trials can vary. The aim of this systematic review was to assess the reporting quality of radiotherapy non-inferiority trials.


      PubMed, EMBASE and Cochrane databases were queried by an information scientist for randomized controlled radiotherapy trials with non-inferiority hypotheses published in English between January 2000 and July 2022. Descriptive statistics were used to summarize data.


      Of 423 records screened, 59 (14%) were included after full-text review. All were published after 2003 and open label. The most common primary cancer type was breast (n=15, 25%). Altered radiation fractionation (n=26, 45%) and radiation de-escalation (n=11, 19%) were the most common types of interventions. The most common primary endpoints were locoregional control (n=17, 29%) and progression-free survival (n=14, 24%). Fifty-three (90%) reported the non-inferiority margin, and only 6 (10%) provided statistical justification for the margin. The median absolute non-inferiority margin was 9% (interquartile range [IQR]: 5–10%), and the median relative margin was 1.51 (IQR: 1.33–2.04). Sample size calculations and confidence intervals were reported in 57 studies (97%). Both intention-to-treat and per-protocol analyses were reported in 27 studies (46%). In 31 trials (53%), non-inferiority of the primary endpoint was reached.


      There was variability in the reporting of key components of non-inferiority trials. We encourage consideration of additional statistical reasoning such as guidelines or previous trials in the selection of the non-inferiority margin, reporting both absolute and relative margins, and the avoidance of statistically vague or misleading language in the reporting of future non-inferiority trials.


      Non-inferiority trials aim to demonstrate that an experimental treatment is not worse than the standard treatment by a prespecified threshold called the non-inferiority margin. These studies are often conducted when the experimental treatment is more convenient for patients, less toxic, more readily available, less costly, and/or when it is unethical to perform a placebo-controlled trial (1).
      In a superiority trial, the null hypothesis asserts that two arms are the same. If the lower bound of the 95% confidence interval (CI) of the treatment difference is above zero, one can reject the null hypothesis (Figure 1A). In contrast, the null hypothesis in a non-inferiority trial states that the experimental arm is worse than the control arm by a specified margin (δ). There are 6 possible outcomes from a non-inferiority trial as shown in Figure 1B. If the lower bound of the 95% CI of the treatment difference is above the non-inferiority margin, one can conclude non-inferiority. Depending on whether the 95% CI lies wholly above or below 0, one can also conclude statistical superiority or inferiority, respectively.
      Figure 1. Conclusions from the 95% confidence intervals of treatment differences in superiority trials (A) and non-inferiority trials (B).
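      The decision logic described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reproduction of Figure 1B: the function name, the outcome labels, and the sign convention (difference taken as experimental minus control, with larger values favoring the experimental arm, so that the margin sits at -δ) are assumptions made for the example.

```python
def classify_trial(ci_lower: float, ci_upper: float, delta: float) -> str:
    """Interpret the CI of the treatment difference in a non-inferiority trial.

    Assumptions (for illustration only): the difference is experimental minus
    control, larger values favor the experimental arm, and delta > 0, so the
    non-inferiority margin sits at -delta.
    """
    if delta <= 0:
        raise ValueError("delta must be positive")
    if ci_lower > ci_upper:
        raise ValueError("ci_lower must not exceed ci_upper")

    margin = -delta
    if ci_lower > 0:
        # CI wholly above zero.
        return "non-inferior and superior"
    if ci_upper < margin:
        # CI wholly below the margin.
        return "inferior (CI wholly below the margin)"
    if ci_lower > margin:
        # Lower bound clears the margin, so non-inferiority is shown.
        if ci_upper < 0:
            return "non-inferior but statistically inferior"
        return "non-inferior"
    # Lower bound at or below the margin: non-inferiority is not shown.
    if ci_upper < 0:
        return "not non-inferior and statistically inferior"
    return "inconclusive"


if __name__ == "__main__":
    delta = 0.05  # hypothetical 5-percentage-point margin
    examples = [(0.01, 0.06), (-0.03, 0.02), (-0.04, -0.01),
                (-0.08, 0.02), (-0.08, -0.01), (-0.12, -0.06)]
    for lo, hi in examples:
        print(f"[{lo:+.2f}, {hi:+.2f}] -> {classify_trial(lo, hi, delta)}")
```

      The six hypothetical intervals in the usage example each fall into a different branch, which mirrors the idea that a single CI can support more than one statement (for example, non-inferior yet statistically inferior).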
      As with other types of trials, the methodologic quality of non-inferiority trials should be appraised before drawing conclusions. A 2006 review of non-inferiority trials published between 2003 and 2004 showed that only 20.3% of studies fulfilled reporting requirements to adequately allow readers to make conclusions (2). To improve the quality of reporting, the Consolidated Standards of Reporting Trials (CONSORT) group published a statement regarding reporting standards for non-inferiority and equivalence clinical trials (1). A summary of the recommendations from this report is listed in Table 1.
      Table 1. Summary of methodologic and statistical reporting recommendations from the Consolidated Standards of Reporting Trials (CONSORT) 2010 extension for non-inferiority and equivalence trials.
      Section/topic | Checklist item
      • Identification of the study as a non-inferiority or equivalence trial
      • Rationale for a non-inferiority study
      • Specification of a non-inferiority margin with the rationale for its choice
      • Description of trial design
      • Eligibility criteria
      • Description of interventions and whether the reference treatment is identical to that in any trial that established efficacy
      • Specify primary and secondary outcomes and whether hypotheses for each are non-inferiority or superiority
      • Sample size calculation using a non-inferiority criterion
      • Method used for randomization
      • Blinding details
      • Statistical methods, including whether a 1- or 2-sided confidence interval approach was used
      • For the primary non-inferiority outcome, report results in relation to the non-inferiority margin with measures of precision (e.g., confidence intervals)
      • For outcomes for which non-inferiority was hypothesized, a figure showing confidence intervals and the margin may be useful
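      As an illustration of the checklist item on sample size, the sketch below uses a common normal-approximation formula for a non-inferiority comparison of two proportions. It is not the method of any trial in this review; the function name, the default one-sided α of 0.025, the 80% power, and the 90% control rate in the usage example are all assumptions chosen for the example.

```python
from math import ceil
from statistics import NormalDist


def ni_sample_size_per_arm(p_control: float, p_experimental: float, delta: float,
                           alpha: float = 0.025, power: float = 0.80) -> int:
    """Approximate patients per arm for a non-inferiority test of two proportions.

    Assumptions (illustrative): proportions are success rates (higher is
    better), delta > 0 is the absolute margin, alpha is one-sided, and the
    unpooled normal approximation is adequate.
    """
    if not (0 < p_control < 1 and 0 < p_experimental < 1):
        raise ValueError("proportions must lie strictly between 0 and 1")
    if delta <= 0:
        raise ValueError("delta must be positive")
    # Distance between the assumed true difference and the margin (-delta).
    effect = p_experimental - p_control + delta
    if effect <= 0:
        raise ValueError("assumed experimental rate is already beyond the margin")

    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_experimental * (1 - p_experimental)
    return ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


if __name__ == "__main__":
    # Hypothetical 90% control rate in both arms, two candidate margins.
    for margin in (0.10, 0.05):
        n = ni_sample_size_per_arm(0.90, 0.90, margin)
        print(f"margin {margin:.0%}: about {n} patients per arm")
```

      Under these assumptions, halving the margin roughly quadruples the required sample size, which is why the choice of margin and its justification matter for how a trial should be read.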
      Previous reviews of non-inferiority trials in cancer have mainly focused on pharmacologic trials (3). To our knowledge, none have examined those involving radiotherapy. Non-inferiority trials are important in radiation oncology as many trials test different schedules to make treatments more convenient or less toxic. This review aims to evaluate the reporting quality of non-inferiority clinical trials involving radiotherapy by analyzing the reported data and to describe the characteristics of these studies.


      This systematic review was performed and reported according to the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) statement (4). The prespecified protocol was registered with the International Prospective Register of Systematic Reviews (PROSPERO), CRDXXX.


      A literature search of PubMed, EMBASE and Cochrane databases of randomized controlled radiotherapy trials with non-inferiority hypotheses published in English between January 1, 2000 and July 18, 2022 was performed by an information scientist (XXX) on July 18, 2022. The exact search strategy is detailed in the Appendix.

      Study selection

      Population: We included publications of randomized controlled trials of pediatric and adult patients. We did not include abstracts, study protocols, follow-up, or interim analyses.
      Intervention: Trials must have described a non-inferiority hypothesis. Although the initial protocol stated that we intended to review both non-inferiority and equivalence trials, this was amended to only include non-inferiority trials. The hypothesis must be relevant to radiotherapy; studies in which the same dose/volume of radiotherapy was provided to all patients were excluded (e.g., studies examining difference in concurrent systemic treatments). All forms of radiotherapy were included (e.g., external beam radiotherapy, stereotactic radiotherapy, brachytherapy), except for radionuclide therapy.
      Outcomes: Trials must have reported a clinical outcome (e.g., survival, toxicity or response to treatment). We excluded planning studies in which the primary outcome was a dosimetric quantity.
      Two reviewers (XXX, XXX) independently reviewed titles and abstracts to determine eligibility for independent full-text review. A third reviewer (XXX) was available to resolve discrepancies.

      Data collection and analysis

      One researcher (XXX) performed data collection and analysis. Data pertaining to study size, primary cancer type, type of comparison, endpoints, and statistical measures were collected. Descriptive statistics were used to summarize data. Risk of bias assessment was not performed as study biases would not affect the outcomes of our review. A meta-analysis was not performed in line with the objective of this review. Covidence software was used for data management (Veritas Health Innovation, Melbourne, Australia).


      Of 423 records screened, 59 trials (14%) were included after full-text review. A diagram summarizing the screening and selection process is shown in Figure 2.
      Figure 2. Summary of the screening and selection process.
      Study characteristics are summarized in Table 2. The median number of participants was 486 (range: 40–4823). All studies were open label and were published after 2003. One trial (2%) was funded by industry exclusively, while three (5%) had joint funding from public and private sources. Four studies (7%) did not provide a rationale for a non-inferiority design. The most common primary cancer type was breast (n=15, 25%). The majority of studies (n=53, 90%) had 2 treatment arms. Altered radiation fractionation (n=26, 45%) and radiation de-escalation (n=11, 19%) were the most common types of interventions. Nine studies (16%) compared radiation to another treatment modality (e.g., surgery, radiofrequency ablation, etc.), and eight studies (14%) examined the omission of radiation.
      Table 2. Summary of study characteristics.
      Characteristic | Value | N (total = 59) | %
      Date of publication | 2000–2004 | 1 | 2
      Country/region in which the study was performed | Australia | 2 | 3
       | United Kingdom | 6 | 10
       | United States | 3 | 5
      Study funding | Government or academic | 51 | 86
       | None/not specified | 4 | 7
      Primary cancer type | Breast | 15 | 25
       | Head and neck | 4 | 7
      Purpose of conducting a non-inferiority trial | Fewer adverse events | 28 | 47
       | More convenient | 19 | 32
       | Not specified | 4 | 7
      Number of treatment arms | 2 | 53 | 90
      Type of interventions | Radiation de-escalation | 11 | 19
       | Altered fractionation | 26 | 45
       | Alternate modality | 9 | 16
       | Omission of radiation | 8 | 14
      Systemic therapy | Concurrent | 4 | 7
       | Not allowed | 33 | 56
       | Study examined concurrent vs. sequential systemic therapy | 1 | 2
       | Systemic therapy alone was a comparator | 4 | 7
      Radiation modality | Photon | 52 | 88
       | Multiple/study compared modalities | 6 | 10
      Radiation technique | Field-based | 14 | 24
       | Not specified | 8 | 14
       | Multiple/study compared techniques | 22 | 37
       | Study compared fractionation schemes | 32 | 54
      Abbreviations: 3D-CRT—3-dimensional conformal radiotherapy; IMRT—intensity modulated radiotherapy; VMAT—volumetric modulated arc therapy; SRS—stereotactic radiosurgery; SABR—stereotactic ablative body radiotherapy.
      * Other countries/regions: Korea, India, Iran.
      ** Other primary cancer types: bladder, seminoma.
      *** Other types of interventions: delay of surgery after radiation, timing of radiation, difference in systemic therapy, difference in radiation volumes.
      Endpoints and statistical data are summarized in Table 3. The most common primary endpoints were locoregional control (n=17, 29%) and progression-free survival (n=14, 24%). Fifty-three (90%) reported the non-inferiority margin, and only 6 (10%) provided statistical justification for the margin based on previous clinical trials or published data. The median absolute non-inferiority margin was 9% (interquartile range [IQR]: 5–10%), and the median relative margin was 1.51 (IQR: 1.33–2.04). Sample size calculations and CIs were reported in 57 studies (97%). Both intention-to-treat (ITT) and per-protocol (PP) analyses were reported in 27 studies (46%).
      Table 3. Summary of endpoints and statistical reporting.
      Characteristic | Value | N | %
      Primary endpoint | Progression-free survival | 14 | 24
       | Locoregional survival | 17 | 29
       | Disease-free survival | 4 | 7
       | Overall survival | 8 | 14
       | Response (e.g., pain response) | 6 | 10
      Were adverse events reported? | Yes | 59 | 100
      Was a non-inferiority margin specified? | Yes | 53 | 90
      Was statistical justification of the non-inferiority margin specified? | Yes | 9 | 17
      Was a sample size calculation performed and rationalized? | Yes | 57 | 97
      Were confidence intervals reported? | Yes | 54 | 92
      Confidence interval type | 2-sided | 18 | 33
       | Not specified | 24 | 44
      Confidence interval size | 97.5% | 1 | 2
       | Other: 91% | 1 | 2
      Was a p-value reported? | Yes | |
      Type of analysis reported | ITT | 22 | 37
       | Modified ITT | 2 | 3
       | Both ITT and PP | 27 | 46
      Abbreviations: ITT—intention-to-treat; PP—per-protocol.
      In 31 trials (53%), non-inferiority of the primary endpoint was reached. Authors concluded non-inferiority in 34 trials (58%), and there was a discrepancy between the conclusion of non-inferiority and statistical results in 3 studies (5%).


      In this systematic review of radiation non-inferiority clinical trials, we found that the reporting of key methodologic components was inconsistent. Non-inferiority margins, CIs, and p-values were not always reported, making it impossible to interpret results of these trials. Despite lacking the statistical rationale, a conclusion of non-inferiority was claimed on the basis of inappropriate metrics in 3 studies. In light of these findings, we stress the importance of trialists reviewing CONSORT guidelines prior to the design of a non-inferiority trial and reporting their data.
      Selection of the non-inferiority margin is the most important aspect in the design of a non-inferiority trial as it is used to confirm or reject the hypothesis. A previous systematic review of non-inferiority clinical trials of oncologic drugs showed that the median non-inferiority margin was large at 12.5% (3). This is similar to the median non-inferiority margin in our study of 9%. A larger non-inferiority margin makes it easier to conclude non-inferiority, and can therefore be problematic if not appropriate. In contrast, a smaller margin would require a larger sample size to conclude non-inferiority. Although reporting guidelines recommend that authors report the method used to set the margin (1), only a minority of studies (10%) in our review reported statistical justification for the non-inferiority margin. The EMA and FDA provide guidance on deciding the margin for trials involving drugs (6, 7). The margin is statistically defined as the lower bound of the 95% CI of the standard treatment effect compared to placebo based on historic clinical trials. A more conservative margin can also be considered to account for differences between historic trial conditions and the current trial; the FDA suggests setting the non-inferiority margin at 50% of the lower bound of the 95% CI of the historic standard treatment effect. These guidelines are difficult to apply to trials involving treatments that are historically not compared to placebo, such as in radiation oncology. Without statistical justification for the non-inferiority margin, many authors relied on expert opinion and stakeholder analyses alone to derive their margins. This was in keeping with trials of medical devices, which rely on expert opinion to select a non-inferiority margin (8).
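      The margin derivation described above amounts to simple arithmetic, sketched below. The function and field names are hypothetical, the historic effect is assumed to be expressed as an absolute benefit over placebo, and the 12% lower bound in the usage example is invented for illustration.

```python
from typing import NamedTuple


class Margins(NamedTuple):
    full_effect_margin: float    # lower bound of the historic 95% CI
    conservative_margin: float   # a fraction of that lower bound


def fixed_margin(historic_ci_lower: float, fraction: float = 0.5) -> Margins:
    """Derive candidate non-inferiority margins from a historic treatment effect.

    Assumptions (illustrative): historic_ci_lower is the lower bound of the
    95% CI of the standard treatment's benefit over placebo, expressed as a
    positive absolute difference; fraction=0.5 reflects the 50% rule
    mentioned in the text.
    """
    if historic_ci_lower <= 0:
        raise ValueError("historic CI must exclude zero to establish an effect")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return Margins(historic_ci_lower, fraction * historic_ci_lower)


if __name__ == "__main__":
    # Hypothetical historic benefit over placebo: 95% CI lower bound of 12%.
    margins = fixed_margin(0.12)
    print(f"Full-effect margin:  {margins.full_effect_margin:.1%}")
    print(f"Conservative margin: {margins.conservative_margin:.1%}")
```

      The sketch also makes the practical difficulty visible: without a placebo-controlled historic estimate, as is common in radiation oncology, there is no input value to supply.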
      Further, margins can be expressed as absolute (e.g., 2% decrease) or relative values (e.g., HR 1.3). Many studies (n=27, 51%) in our review reported only absolute margins. Absolute margins can bias towards non-inferiority when event rates are lower than expected, whereas relative margins correspond to the same relative risk independent of event rates (9). A recent systematic review and meta-analysis of coronary stent non-inferiority trials showed that the majority of trials only reported absolute margins (55 of 58, 94.8%), and the majority of those (n=43) overestimated the control event rate, making the non-inferiority margin more permissive (10). When the authors performed a re-analysis of the trials with adjusted margins, they found that 17 of the 50 trials (34%) that met non-inferiority using the absolute margin did not meet criteria using the relative margin. Absolute margins can be more practical because they increase power, but this is contingent on accurate estimation of the control event rate.
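      A short worked example may help show why a fixed absolute margin becomes more permissive when the control event rate is overestimated. The sketch below converts an absolute margin into the relative risk it implies; the function name and the event rates are hypothetical and are not taken from any trial in this review.

```python
def implied_relative_margin(control_event_rate: float, absolute_margin: float) -> float:
    """Relative risk implied by an absolute margin at a given control event rate.

    Assumptions (illustrative): events are unfavorable (e.g., recurrence), and
    the absolute margin is the largest acceptable increase in event rate.
    """
    if not 0 < control_event_rate < 1:
        raise ValueError("control_event_rate must lie strictly between 0 and 1")
    if absolute_margin <= 0:
        raise ValueError("absolute_margin must be positive")
    return (control_event_rate + absolute_margin) / control_event_rate


if __name__ == "__main__":
    margin = 0.05  # hypothetical absolute margin of 5 percentage points
    planned = implied_relative_margin(0.10, margin)   # rate assumed at design
    observed = implied_relative_margin(0.05, margin)  # rate actually observed
    print(f"Planned control rate 10%: implied relative risk {planned:.2f}")
    print(f"Observed control rate 5%: implied relative risk {observed:.2f}")
```

      In this example the same 5-point absolute margin corresponds to a relative risk of about 1.5 at the planned event rate but about 2.0 at the lower observed rate, so the experimental arm is allowed to be proportionally much worse than intended.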
      Previous reviews of non-inferiority clinical trials in other settings have also found variability in reporting. A review of all non-inferiority and equivalence trials published between 2003 and 2004 found that only 20.4% of studies provided justification for the non-inferiority margin, and only 42.6% of studies reported both ITT and PP analyses (2). Most studies (n=156, 96%) reported a prespecified non-inferiority or equivalence margin. However, the authors were only able to adequately assess non-inferiority and equivalence in 33 (20%) studies. Even among this small subgroup of studies, 4 reports (12%) misleadingly concluded non-inferiority or equivalence. In a 2013 review of non-inferiority trials involving oncologic drugs, the authors found that 62 of 75 studies (83%) reported a prespecified non-inferiority margin (3). The authors found that the number of studies that did not report a non-inferiority margin did not change after the publication of the CONSORT guidelines.
      We found that 3 studies concluded non-inferiority despite not reporting CIs of the primary endpoint. In addition, some authors used statistically vague terminology such as “comparable” and “as effective” in concluding statements of trials in which non-inferiority was not reached. This misleading reporting in clinical trials has been termed “spin” (11). A recent systematic review of oncologic non-inferiority clinical trials that did not meet statistical significance for non-inferiority showed that 75% had spin (12). Compared to a previous review of spin, the authors reported that the prevalence of spin was higher in non-inferiority clinical trials than in superiority clinical trials. Spin strategies included emphasizing trends for primary endpoints, conclusions based on secondary endpoints, or conclusions based on subgroup analyses. Spin was more likely associated with trials without for-profit funding, without data managers, and with novel treatments. The authors posited that trials with external funding were held to stricter standards, and hence were less likely to have spin. They also suggested that trials with novel treatments had higher spin because a negative trial could result in the treatment not becoming standard of care, or the report not being published. Authors should be cautious when making conclusions based on analyses outside of the primary endpoint as these could be easily misconstrued.
      With the increasing frequency of non-inferiority trials, clinicians should also be wary of bio-creep, a phenomenon in which an ineffective or even harmful treatment may be deemed effective (13). This can happen when there is a series of non-inferiority trials in which each new drug is slightly worse than its comparator; this cycle may eventually lead to a drug that is ineffective or harmful when compared to the original standard. For example, a new treatment B is found to be non-inferior to treatment A and becomes the new standard of care. A subsequent trial uses treatment B as the active control against a new treatment C, which is found to be non-inferior to treatment B. It would be wrong to conclude that treatment C is also non-inferior to the original treatment A. Although this phenomenon has mostly been discussed theoretically, simulations have suggested that it is possible, but can be avoided by choosing an active control that has been compared to placebo, choosing an appropriate non-inferiority margin, and accurately estimating the control event rate (14).
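      The A, B, C example can be made concrete with a worst-case calculation. The sketch below is a simple bound and not the simulation approach of reference 14: it assumes each successive treatment is worse than its comparator by the full margin, and the 20% original benefit and 8% margin are invented numbers.

```python
from typing import List


def worst_case_effects(original_effect: float, margin: float, generations: int) -> List[float]:
    """Worst-case benefit over placebo after successive non-inferiority trials.

    Assumptions (illustrative): original_effect is the benefit of the first
    standard over placebo, each later treatment loses the full margin relative
    to its comparator, and losses add on the absolute scale.
    """
    if margin < 0:
        raise ValueError("margin must be non-negative")
    if generations < 0:
        raise ValueError("generations must be non-negative")
    # Index 0 is the original standard; index k is the k-th successor.
    return [original_effect - k * margin for k in range(generations + 1)]


if __name__ == "__main__":
    effects = worst_case_effects(original_effect=0.20, margin=0.08, generations=3)
    for label, effect in zip("ABCD", effects):
        status = "still better than placebo" if effect > 0 else "no better than placebo"
        print(f"Treatment {label}: worst-case benefit {effect:+.2f} ({status})")
```

      Under these assumptions the third successor could be no better than placebo even though every individual trial met its margin, which is the concern that bio-creep describes.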
      To our knowledge, this is the first systematic review to examine the reporting quality of non-inferiority clinical trials involving radiotherapy. Given the focused nature of this review, we were also able to describe radiation-specific details of the studies. Limitations include that our review focused on only English language articles, and that we did not assess the statistical rigor of the reported data as this was outside of the scope of this review.


      There was variability in the reporting of key components of non-inferiority trials, including the non-inferiority margin. Adherence to standards of data reporting and statistical methodology is important to ensure proper interpretation of trial results.

      Author responsible for statistical analysis

      Andrew Arifin



      Data availability statement

      Research data are available upon reasonable request.


      • 1. Piaggio G, Elbourne DR, Pocock SJ, Evans SJ, Altman DG; CONSORT Group. Reporting of noninferiority and equivalence randomized trials: extension of the CONSORT 2010 statement. JAMA. 2012;308(24):2594-604.
      • 2. Le Henanff A, Giraudeau B, Baron G, Ravaud P. Quality of reporting of noninferiority and equivalence randomized trials. JAMA. 2006;295(10):1147-51.
      • 3. Riechelmann RP, Alex A, Cruz L, Bariani GM, Hoff PM. Non-inferiority cancer clinical trials: scope and purposes underlying their design. Ann Oncol. 2013;24(7):1942-7.
      • 4. Page MJ, Moher D, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, et al. PRISMA 2020 explanation and elaboration: updated guidance and exemplars for reporting systematic reviews. BMJ. 2021;372:n160.
      • 5. Nevens D, Duprez F, Daisne JF, Dok R, Belmans A, Voordeckers M, et al. Reduction of the dose of radiotherapy to the elective neck in head and neck squamous cell carcinoma; a randomized clinical trial. Effect on late toxicity and tumor control. Radiother Oncol. 2017;122(2):171-7.
      • 6. European Medicines Agency. Guideline on the Choice of the Non-inferiority Margin. 2005 [Available from:
      • 7. Food and Drug Administration. Non-Inferiority Clinical Trials to Establish Effectiveness. 2016 [Available from:
      • 8. Lin CJ, Saver JL. Noninferiority Margins in Trials of Thrombectomy Devices for Acute Ischemic Stroke: Is the Bar Being Set Too Low? Stroke. 2019;50(12):3519-26.
      • 9. Kaul S, Diamond GA. Good enough: a primer on the analysis and interpretation of noninferiority trials. Ann Intern Med. 2006;145(1):62-9.
      • 10. Simonato M, Ben-Yehuda O, Vincent F, Zhang Z, Redfors B. Consequences of Inaccurate Assumptions in Coronary Stent Noninferiority Trials: A Systematic Review and Meta-analysis. JAMA Cardiol. 2022;7(3):320-7.
      • 11. Boutron I, Dutton S, Ravaud P, Altman DG. Reporting and interpretation of randomized controlled trials with statistically nonsignificant results for primary outcomes. JAMA. 2010;303(20):2058-64.
      • 12. Ito C, Hashimoto A, Uemura K, Oba K. Misleading Reporting (Spin) in Noninferiority Randomized Clinical Trials in Oncology With Statistically Not Significant Results: A Systematic Review. JAMA Netw Open. 2021;4(12):e2135765.
      • 13. Everson-Stewart S, Emerson SS. Bio-creep in non-inferiority clinical trials. Stat Med. 2010;29(27):2769-80.
      • 14. Odem-Davis K, Fleming TR. A simulation study evaluating bio-creep risk in serial non-inferiority clinical trials for preservation of effect. Stat Biopharm Res. 2015;7(1):12-24.

      Conflicts of interest

      AJA is a board member for the Canadian Association of Radiation Oncology. DAP reports research funding from the Ontario Institute for Cancer Research, and a consultant relationship with Need Inc., unrelated to the current manuscript. AVL has received honoraria from AstraZeneca for advisory board participation and speaker's fees.

      Appendix. Supplementary materials