RT-Dose-Response

RT Dose Response - A Critique of "HYTEC" Methodology

The recently published “HYTEC” project on dose-response modeling for radiotherapy is a welcome contribution, and this reader thanks the authors for their work.

One example is the work [1] that searched, compiled, and analyzed data for small brain metastases (≤ 2.0 cm). The authors estimate 1-year local control of 85% and 95% for 18 and 24 Gy, respectively, and a 50% tumor control dose (TCD50) of 11.21 Gy single fraction equivalent dose (SFED) using alpha/beta=20, with a 95% confidence interval of 10.43-11.90.

However, several issues undercut the authors’ conclusions, and these issues likely generalize to the wider HYTEC work. First, the authors describe a logistic model applied to SFED, with local control (LC) as the outcome.

A fundamental assumption built into the authors’ model specification is a y-intercept of 0. This implies zero local control from background therapies, including whole brain radiotherapy and systemic therapies, and it ignores competing risks such as death from extracranial disease. These are not valid assumptions.

Maximum likelihood estimates depend on the distributional assumptions made for the dose-response model [2].

For binomial data, the likelihood function [3] takes the form:
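In the usual grouped-binomial notation (the symbols are an assumption here, following the conventions of [2] and [3]: r_i local controls among n_i lesions at dose x_i, with model probability f(x_i, β) and N dose groups), the likelihood can be written as:

```latex
L(\beta) = \prod_{i=1}^{N} \binom{n_i}{r_i}\,
           f(x_i,\beta)^{\,r_i}\,
           \bigl[1 - f(x_i,\beta)\bigr]^{\,n_i - r_i}
```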

Taking the natural log of both sides gives the log-likelihood function:
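Assuming grouped-binomial notation (r_i local controls among n_i lesions at dose x_i, model probability f(x_i, β), N dose groups; the symbols are illustrative), the log-likelihood becomes a sum, and the binomial coefficient term does not depend on β:

```latex
\ln L(\beta) = \sum_{i=1}^{N} \left[
    \ln\binom{n_i}{r_i}
    + r_i \ln f(x_i,\beta)
    + (n_i - r_i)\,\ln\bigl(1 - f(x_i,\beta)\bigr)
\right]
```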

The negative log-likelihood is then minimized; for continuous (normally distributed) data this is equivalent to minimizing a weighted nonlinear least squares criterion, for response y_i as a function of dose x_i and with weights w_i:
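One common way to write this criterion is sketched below. It assumes the weights enter linearly; some software parameterizes the weights differently (for example, squared), so the exact form should be checked against the documentation of the package in use.

```latex
\hat{\beta} = \arg\min_{\beta} \sum_{i=1}^{N} w_i\,\bigl[\,y_i - f(x_i,\beta)\,\bigr]^{2}
```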

where β denotes the model parameters. The variance-covariance matrix of the estimates can then be obtained numerically from the Hessian matrix of second-order partial derivatives [2].

The manuscript does not specify how the actuarial local control data were treated; only a separate “primer” article (https://doi.org/10.1016/j.ijrobp.2020.11.020) notes general use of the log-likelihood function for binomial data, so one must assume the same treatment here. I implemented the authors’ tumor control probability equation as a function and fitted it to the 1-year LC outcome for small metastases using the R package drc [2]. As an example, treating 1-year LC rates as a continuous variable instead produces a TCD50 of 15.6.
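To illustrate how the assumed data type changes the fit, here is a minimal base-R sketch. It is not the drc code behind the numbers in this post: the data are simulated, the log-logistic form TCP = 1/(1 + (TCD50/D)^(4·gamma50)) is assumed as the usual “logistic” TCP model and may differ from the paper's exact equation, and the names tcp, nll_bin and wls are hypothetical.

```r
# Simulated study-level data for illustration only; NOT the HYTEC dataset.
set.seed(1)
dose <- c(10, 12, 14, 16, 18, 20, 22, 24)      # SFED in Gy (made up)
n    <- c(40, 55, 60, 80, 120, 150, 90, 70)    # lesions per study (made up)

# Assumed log-logistic TCP form; the source paper's exact equation may differ.
tcp <- function(d, tcd50, gamma50) 1 / (1 + (tcd50 / d)^(4 * gamma50))

r <- rbinom(length(dose), n, tcp(dose, 11, 0.9))  # simulated local controls

# Parameters are optimized on the log scale so TCD50 and gamma50 stay positive.
# Binomial treatment: negative log-likelihood (constant term dropped).
nll_bin <- function(lp) {
  p <- tcp(dose, exp(lp[1]), exp(lp[2]))
  p <- pmin(pmax(p, 1e-9), 1 - 1e-9)           # guard against log(0)
  -sum(r * log(p) + (n - r) * log(1 - p))
}

# Continuous treatment: weighted least squares on the observed proportions.
wls <- function(lp) sum(n * (r / n - tcp(dose, exp(lp[1]), exp(lp[2])))^2)

start    <- log(c(12, 1))
fit_bin  <- optim(start, nll_bin)
fit_cont <- optim(start, wls)

# Back-transformed estimates (TCD50, gamma50) under each treatment
est <- rbind(binomial = exp(fit_bin$par), continuous = exp(fit_cont$par))
colnames(est) <- c("TCD50", "gamma50")
print(est)
```

On well-behaved simulated data the two treatments give similar estimates; on real study-level data with heterogeneous variances they can diverge, which is the point made above.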

As an aside, the authors’ Table EA1 totals N=12,197 for ≤ 2.0 cm brain metastases, whereas Table EA4 beneath it reports N=10,106 for ≤ 2.0 cm metastases. Is this an error? At minimum, it requires clarification.

Fit statistics from drc for the two treatments of the response:

type="binomial", AIC=1375, log likelihood=-686
type="continuous", AIC=449, log likelihood=-221

The authors provide profile likelihood estimates without specifying the methodology. Their intervals appear much narrower than those from a nonparametric bootstrap [4] of the HYTEC “logistic” function fitted to 1-year local control with package drc, type="binomial". As an example, bootstrapping this HYTEC “logistic” model to estimate the parameters TCD50 and Gamma50, for comparison with the authors’ reported values:

Number of bootstrap replications R = 1000
  original   bootBias  bootSE   bootMed
1 11.02912  -1.827515  4.6973  11.22436
2  0.88248  -0.073918  0.4194   0.87013

boot.ci(results, index=1)
BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 1000 bootstrap replicates

CALL :
boot.ci(boot.out = results, index = 1)

Intervals :
Level      Normal              Basic
95%   ( 3.65, 22.06 )   ( 7.77, 22.05 )

Level     Percentile            BCa
95%   ( 0.01, 14.28 )   ( 0.01, 14.39 )
Calculations and Intervals on Original Scale

Notice the width of the empirical CI for the TCD50 parameter under nonparametric bootstrapping: the basic bootstrap 95% confidence interval runs from ~7.8 to 22, suggesting model parameter instability. Something is badly wrong with force-fitting the authors’ “logistic” model to these data!
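The general shape of such a study-level nonparametric bootstrap can be sketched with the boot package. This is an illustration under stated assumptions, not the code that produced the output above: the data are simulated, the model is refitted with optim rather than drc, the log-logistic TCP form is assumed, and fit_tcp and studies are hypothetical names.

```r
library(boot)  # recommended package shipped with standard R installations

# Simulated study-level data for illustration only; NOT the HYTEC dataset.
set.seed(2)
tcp <- function(d, tcd50, gamma50) 1 / (1 + (tcd50 / d)^(4 * gamma50))
studies <- data.frame(
  dose = c(10, 12, 13, 14, 15, 16, 18, 18, 20, 21, 22, 24),   # SFED, Gy
  n    = c(40, 55, 30, 60, 45, 80, 120, 65, 150, 50, 90, 70)  # lesions
)
studies$r <- rbinom(nrow(studies), studies$n, tcp(studies$dose, 11, 0.9))

# Statistic: refit the model on the resampled studies, return (TCD50, gamma50).
fit_tcp <- function(d, idx) {
  d <- d[idx, ]
  nll <- function(lp) {
    p <- tcp(d$dose, exp(lp[1]), exp(lp[2]))
    p <- pmin(pmax(p, 1e-9), 1 - 1e-9)         # guard against log(0)
    -sum(d$r * log(p) + (d$n - d$r) * log(1 - p))
  }
  exp(optim(log(c(12, 1)), nll)$par)
}

# Resample whole studies with replacement; the post used R = 1000.
results <- boot(studies, fit_tcp, R = 200)

# Basic and percentile intervals for the first statistic (TCD50)
ci <- boot.ci(results, index = 1, type = c("basic", "perc"))
print(ci)
```

Resampling whole studies, rather than individual lesions, keeps each study's dose and outcome together, so the interval width reflects between-study variability.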

“Fisher exact test, median splits” p-values are provided, but it is unclear what hypothesis is being tested.

At this point, let’s check the distribution of the outcome 1-year LC data, making use of fitdistrplus:

So 1-year LC, a continuous proportion bounded by 0 and 1, is consistent (not surprisingly) with a beta distribution. Did the authors not consider or examine the data type and distribution?
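The post used fitdistrplus for this check. As a dependency-free sketch of the same idea, the base-R code below fits a beta distribution by maximum likelihood and compares it with a normal fit. The proportions are simulated, not the HYTEC data, and nll_beta is a hypothetical helper name.

```r
# Simulated 1-year LC proportions for illustration only; NOT the HYTEC data.
set.seed(3)
lc <- rbeta(44, shape1 = 8, shape2 = 2)

# Negative log-likelihood of a beta distribution; shapes on the log scale
# so that both shape parameters stay positive during optimization.
nll_beta <- function(lp) -sum(dbeta(lc, exp(lp[1]), exp(lp[2]), log = TRUE))
fit_beta <- optim(c(0, 0), nll_beta)
shapes   <- exp(fit_beta$par)

# AIC for the beta fit (two parameters) and for a normal fit (mean and sd)
aic_beta <- 2 * fit_beta$value + 2 * 2
aic_norm <- AIC(lm(lc ~ 1))

# Kolmogorov-Smirnov check of the fitted beta distribution
ks <- ks.test(lc, "pbeta", shapes[1], shapes[2])

print(c(shape1 = shapes[1], shape2 = shapes[2]))
print(c(AIC_beta = aic_beta, AIC_normal = aic_norm))
print(ks$p.value)
```

A lower AIC for the beta fit than for the normal fit would support treating the outcome as beta-distributed; with real data, exact 0 or 1 values need separate handling because the beta density is defined on the open interval.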

This matters for maximum likelihood estimation, because it appears that the likelihood function for the wrong data distribution was used. The likelihood function for the beta distribution, as per Owen [5]:
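In the standard shape parameterization (the notation is assumed here: a, b > 0 are the two shape parameters, written a and b to avoid confusion with the dose-response parameters β, and x_i in (0, 1) are the n observed proportions), the beta likelihood reads:

```latex
L(a, b \mid x_1,\dots,x_n) = \prod_{i=1}^{n}
    \frac{\Gamma(a + b)}{\Gamma(a)\,\Gamma(b)}\;
    x_i^{\,a-1}\,(1 - x_i)^{\,b-1}
```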

Goodness of fit was then compared against a model suited to the data type: a generalized additive model (GAM) with a penalized cubic regression spline, k=5, and beta regression family, fitted with package mgcv.
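A model of this kind can be specified in mgcv roughly as follows. This is a sketch on simulated data, not the fit reported in this post: the data frame df, its made-up mean curve, and the choice of REML smoothing are assumptions, and no study weights are applied.

```r
library(mgcv)  # recommended package shipped with standard R installations

# Simulated study-level data for illustration only; NOT the HYTEC dataset.
set.seed(4)
df <- data.frame(SFED = runif(44, 10, 24))
mu <- plogis(0.5 + 0.08 * df$SFED)             # made-up mean LC curve
df$LC1Yr <- rbeta(44, mu * 20, (1 - mu) * 20)  # beta-distributed proportions

# Penalized cubic regression spline (bs = "cr"), k = 5, beta regression family
gam_k5 <- gam(LC1Yr ~ s(SFED, bs = "cr", k = 5),
              family = betar(link = "logit"),
              data = df, method = "REML")

print(summary(gam_k5))
print(AIC(gam_k5))
print(logLik(gam_k5))
```

Because the beta family models a density on (0, 1), its log-likelihood can be positive and its AIC negative, which is worth keeping in mind when reading the comparison that follows.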

The Akaike information criterion (AIC) [6] was estimated at 1375.4 for the authors’ model vs -33535 for the GAM, evidence of poor fit of the authors’ chosen model. Similarly, log likelihood was estimated at -686 vs 16773, respectively, further evidence of poor fit. The authors’ fitted model also showed an estimated 36% higher bias than the GAM fitted by maximum likelihood, as computed below. Unfortunately, the authors report no estimates of goodness of fit or model performance, and no comparison with alternative models.

library(Metrics)
# Ratio of bias (mean of actual minus predicted): authors' drc fit vs. the GAM
bias(df$LC1Yr, fitted(drm.bin)) / bias(df$LC1Yr, fitted(gam_k5))
[1] 1.355909

GAM, k=5, beta regression family fit and confidence intervals:

Let’s now examine the fit, with confidence intervals, of zero and one inflated beta regression via brms:

Family: zero_one_inflated_beta
Links: mu = logit; phi = identity; zoi = identity; coi = identity
Formula: LC1Yr | weights(N) ~ SFED
Data: df (Number of observations: 44)
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup samples = 4000

Population-Level Effects:
          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept     0.54      0.06     0.42     0.67 1.00     5303     3020
SFED          0.08      0.00     0.07     0.08 1.00     5616     3193

Family Specific Parameters:
    Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
phi     9.41      0.13     9.17     9.68 1.00     3687     2582
zoi     0.00      0.00     0.00     0.00 1.00     3830     2645
coi     0.98      0.02     0.92     1.00 1.00     3403     1953

And importantly, the predictive intervals:

Notably absent from the authors’ work are predictive intervals, which demonstrate what can be predicted from this data in terms of dose response.

Next, the median 1-year overall survival was estimated as 32%, with a range of 18-71% and multiple missing values. Such a high competing risk of death warrants consideration: it suggests substantial study-level variance in 1-year local control due to censoring alone. The variances of the individual study-level outcomes being modeled are essential data, and their absence confounds meaningful interpretation of this work.

There is no mention of an assessment of publication bias in the included studies, as is standard for meta-analysis and meta-regression. In keeping with meta-analysis methods, inverse variances should be used as weights, rather than sample sizes [7].
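The difference between the two weighting schemes can be seen in a small sketch. The numbers are made up, and the variance used is the simple binomial p(1-p)/n, which is an assumption: it ignores censoring and any between-study variance component, so it corresponds to fixed-effect weights only.

```r
# Made-up study-level proportions and sample sizes, for illustration only.
p <- c(0.70, 0.85, 0.90, 0.95)   # 1-year LC per study
n <- c(30, 120, 60, 200)         # lesions per study

# Simple binomial variance of each proportion and inverse-variance weights
v     <- p * (1 - p) / n
w_inv <- 1 / v

# Relative weight of each study under the two schemes
rel_n   <- n / sum(n)
rel_inv <- w_inv / sum(w_inv)
print(round(cbind(p, n, rel_n, rel_inv), 3))
```

Under inverse-variance weighting, studies reporting proportions near 0 or 1 gain weight relative to sample-size weighting, because their binomial variance is smaller for the same n.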

Once again, the authors’ effort is appreciated. However, the reported dose-response for small metastases is spurious, as the wide predictive intervals above indicate, and this example likely generalizes to the larger HYTEC work. Furthermore, would such a research question not be much better served by proper methodology, i.e. dose-response meta-regression, which estimates a dose-response curve from multiple summarized dose-response data while accounting for correlation among observations and heterogeneity across studies, with expert statistical support? Jackson et al. [8] provide an example of this for prostate cancer.

Basic statistical considerations, such as the type and distribution of the data, appear not to have been examined in the authors’ work. Rather than assume the data fit a model, would it not be better to select a model that best fits the data? It is discomforting to see a guest editor also be an author or co-author on the same work. The need for the best possible information to apply clinically argues for better methodology here.

1) Redmond KJ, et al. Tumor Control Probability of Radiosurgery and Fractionated Stereotactic Radiosurgery for Brain Metastases. Int J Radiat Oncol Biol Phys. 2020 Dec 31:S0360-3016(20)34451-5. doi: 10.1016/j.ijrobp.2020.10.034. Epub ahead of print. PMID: 33390244.
2) Ritz C, Baty F, Streibig JC, Gerhard D. Dose-Response Analysis Using R. PLoS One. 2015 Dec 30;10(12):e0146021. doi: 10.1371/journal.pone.0146021. PMID: 26717316; PMCID: PMC4696819.
3) http://courses.atlas.illinois.edu/spring2016/STAT/STAT200/RProgramming/Maximum_Likelihood.html
4) Davison AC, Hinkley DV (1997). Bootstrap Methods and Their Applications. Cambridge University Press, Cambridge. ISBN 0-521-57391-2, http://statwww.epfl.ch/davison/BMA/.
5) Owen, Claire Elayne Bangerter, “Parameter Estimation for Beta Distribution” (2008). Theses and Dissertations. 1614.
6) Sakamoto Y, Ishiguro M, Kitagawa G. (1986). Akaike Information Criterion Statistics. D. Reidel Publishing Company.
7) Borenstein M, Hedges L, Rothstein H. Introduction to Meta-Analysis. 2007. https://www.meta-analysis.com/downloads/Meta%20Analysis%20Fixed%20vs%20Random%20effects.pdf
8) Jackson WC, Silva J, Hartman HE, Dess RT, Kishan AU, Beeler WH, Gharzai LA, Jaworski EM, Mehra R, Hearn JWD, Morgan TM, Salami SS, Cooperberg MR, Mahal BA, Soni PD, Kaffenberger S, Nguyen PL, Desai N, Feng FY, Zumsteg ZS, Spratt DE. Stereotactic Body Radiation Therapy for Localized Prostate Cancer: A Systematic Review and Meta-Analysis of Over 6,000 Patients Treated On Prospective Studies. Int J Radiat Oncol Biol Phys. 2019 Jul 15;104(4):778-789. doi: 10.1016/j.ijrobp.2019.03.051. Epub 2019 Apr 6. PMID: 30959121; PMCID: PMC6770993.
