As noted earlier, static borrowing methods do not incorporate information about the current cycle into the prior specification. In contrast, dynamic borrowing methods do incorporate the current cycle into the prior specification of the model parameters. In this section, we extend methods of dynamic borrowing to growth curve models, concentrating on Bayesian dynamic borrowing and commensurate priors.
Extensions of Bayesian dynamic borrowing to growth curve models
We adapt the cross-sectional multilevel modeling notation of Kaplan et al. (2022) to the case of growth curve models. We begin by borrowing from historical cycles to estimate the growth parameters. This requires defining a joint distribution of the growth parameters over the historical cycles (denoted as cycles 1 to H) and the current cycle (denoted as cycle 0), which is assumed to be a multivariate Guassian distribution with the \(Q(H+1) \times 1\) mean vector \({{\varvec{\mu }}}_{\eta _{ig}}\) and \(Q(H+1) \times Q(H+1)\) block-diagonal covariance matrix \({{\textbf {T}}}_{\eta }\).
$$\begin{aligned} \text{vec}\left( {\varvec{\eta }}_{ig}^0,{{\varvec{\eta }}}_{ig}^1{},\ldots ,{{\varvec{\eta }}}_{ig}^{H-1}{}, {{\varvec{\eta }}}_{ig}^H{}\right) \sim N({{\varvec{\mu }}}_{\eta _{ig}}, {{{\textbf {T}}}}_\eta ), \end{aligned}$$
(11)
where, following Eq. (7b), \({{\varvec{\mu }}}_{\eta _{ig}} = \text{vec} \Bigl ({\varvec{\Gamma }}^0_g{\varvec{x}}^0_{ig}, {\varvec{\Gamma }}^1_g{\varvec{x}}^1_{ig}, \ldots , {\varvec{\Gamma }}^H_g{\varvec{x}}^H_{ig}\Bigr )\), \(\varvec{\Gamma }_{g}^{h}\) \((h = 1,2,\ldots ,H)\) represents a \(Q \times P\) matrix of the time-invariant regression coefficients for the \(h^{th}\) historical cycle, and \(\varvec{\Gamma }_{g}^0\) represents a \(Q \times P\) matrix of the current time-invariant regression coefficients.
The covariance matrix of the random growth parameters can be written as
$$\begin{aligned} {{\textbf {T}}}_{\eta }= \begin{bmatrix} {\varvec{\Sigma }}_{\eta }^0 & & & & \\ & {\varvec{\Sigma }}_{\eta }^1 & & & \\ & & \ddots & & \\ & & & {\varvec{\Sigma }}_{\eta }^{H-1} & \\ & & & & {\varvec{\Sigma }}_{\eta }^H \end{bmatrix}, \end{aligned}$$
(12)
where each diagonal block of Eq. (12) is a symmetric covariance matrix as in Eq. (4).
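To make the dimensions in Eqs. (11) and (12) concrete, the following Python sketch (with arbitrary illustrative values; not the authors' implementation) assembles the stacked mean vector and the block-diagonal covariance matrix and draws one realization of the growth parameters.

```python
# Illustrative sketch of Eqs. (11)-(12): stack per-cycle growth-parameter
# means Gamma^h x^h and form the block-diagonal covariance T_eta.
# All dimensions and numeric values are made up for illustration.
import numpy as np
from scipy.linalg import block_diag

rng = np.random.default_rng(0)

Q, P, H = 2, 3, 2  # growth parameters, predictors, historical cycles

# One Q x P coefficient matrix and one P-vector of predictors per cycle 0..H
Gammas = [rng.normal(size=(Q, P)) for _ in range(H + 1)]
xs = [rng.normal(size=P) for _ in range(H + 1)]

# Mean vector: vec(Gamma^0 x^0, ..., Gamma^H x^H), length Q(H+1)
mu_eta = np.concatenate([G @ x for G, x in zip(Gammas, xs)])

# Per-cycle Q x Q covariance blocks Sigma_eta^h (random SPD matrices here)
blocks = []
for _ in range(H + 1):
    A = rng.normal(size=(Q, Q))
    blocks.append(A @ A.T + Q * np.eye(Q))  # symmetric positive definite
T_eta = block_diag(*blocks)                 # Q(H+1) x Q(H+1), as in Eq. (12)

# One draw of the stacked growth parameters, as in Eq. (11)
eta = rng.multivariate_normal(mu_eta, T_eta)
assert eta.shape == (Q * (H + 1),)
```

The block-diagonal structure means the off-diagonal blocks of `T_eta` are zero matrices, so growth parameters are a priori uncorrelated across cycles.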
Next, the joint distribution of \(\varvec{\Gamma }_{g}^{0}, \varvec{\Gamma }_{g}^{1}, \dots , \varvec{\Gamma }_{g}^{H}\) is assumed to be multivariate Gaussian with the \(QP(H+1) \times 1\) mean vector \({{\varvec{\mu }}}_{\Gamma _{g}}\) and \(QP(H+1) \times QP(H+1)\) block-diagonal covariance matrix \({{{\textbf {T}}}}_\Gamma\) —viz.
$$\begin{aligned} \text{vec} ({\varvec{\Gamma }}_{g}^0, {{\varvec{\Gamma }}}_{g}^1{},\ldots , {{\varvec{\Gamma }}}_{g}^H{}) \sim N({{{\varvec{\mu }}}}_{\Gamma _{g}}, {{{\textbf {T}}}}_{\Gamma }), \end{aligned}$$
(13)
where, following Eq. (7c), \({\varvec{\mu }}_{{\Gamma }_{g}} = \text{vec} \Bigl ({\varvec{\Pi }}^0{\varvec{w}}_g^0, {\varvec{\Pi }}^1{\varvec{w}}_g^1, \dots , {\varvec{\Pi }}^H{\varvec{w}}_g^H\Bigr )\), \(\varvec{\Pi }^h\) \((h=1,\ldots ,H)\) represents a \(QP \times M\) matrix of the school-level regression coefficients for the \(h^{th}\) historical cycle, and \(\varvec{\Pi }^0\) represents a \(QP \times M\) matrix of the school-level regression coefficients for the current cycle.
The covariance matrix of the time-invariant regression coefficients over the current and historical cycles, \({{{\textbf {T}}}}_{\Gamma }\), is specified as being block diagonal,
$$\begin{aligned} {{{\textbf {T}}}}_{\Gamma } = \begin{bmatrix} {\varvec{\Sigma }}_{\Gamma }^0 & & & & \\ & {\varvec{\Sigma }}_{\Gamma }^1 & & & \\ & & \ddots & & \\ & & & {\varvec{\Sigma }}_{\Gamma }^{H-1} & \\ & & & & {\varvec{\Sigma }}_{\Gamma }^H \end{bmatrix}, \end{aligned}$$
(14)
where the diagonal blocks of \({{{\textbf {T}}}}_{\Gamma }\) contain the variances and covariances of the regression coefficients within each historical or current data set. We assume that the elements outside the block diagonal of \({{{\textbf {T}}}}_{\Gamma }\) are null matrices.
Finally, \(\varvec{\Pi }^{0}, \varvec{\Pi }^{1}, \dots , \varvec{\Pi }^{H}\) are also assumed to be multivariate Gaussian, sharing a common \(QPM \times 1\) mean vector \(\varvec{\mu }_\Pi\) and \(QPM \times QPM\) covariance matrix \({{{\textbf {T}}}}_{\Pi }\) —namely
$$\begin{aligned} \text{vec}\left( {{\varvec{\Pi }}}^0{}, {{\varvec{\Pi }}}^1{},\ldots ,{{\varvec{\Pi }}}^H{}\right) \sim N({{\varvec{\mu }}_{\Pi }}, {{{\textbf {T}}}}_{\Pi }). \end{aligned}$$
(15)
The covariance matrix \({{{\textbf {T}}}}_{\Pi }\) can be specified as diagonal with common element \(\tau ^2\), which controls the degree of borrowing across cycles. Note that \(\varvec{\Pi }^{0}, \varvec{\Pi }^{1}, \dots , \varvec{\Pi }^{H}\) share the same mean vector \({{\varvec{\mu }}_{\Pi }}\) and covariance matrix \({{{\textbf {T}}}}_{\Pi }\), as shown in Eq. (15); thus the elements of \({{\varvec{\mu }}_{\Pi }}\) and \({{{\textbf {T}}}}_{\Pi }\) are not cycle specific, indicating that borrowing across cycles takes place at the top level of the hierarchy.
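As a numerical illustration of how \(\tau ^2\) governs borrowing at the top level, the sketch below (illustrative values only, not the authors' code) draws each cycle's coefficients around a common mean with diagonal covariance \(\tau ^2 {{\textbf {I}}}\); a smaller \(\tau\) pulls the cycle-specific draws closer together.

```python
# Sketch of Eq. (15): every cycle's Pi^h is drawn around the same mean mu_Pi
# with diagonal covariance tau^2 * I. A smaller tau^2 forces the cycle-specific
# coefficients closer together (more borrowing). Values are illustrative.
import numpy as np

rng = np.random.default_rng(1)
Q, P, M, H = 2, 3, 2, 3

mu_Pi = rng.normal(size=Q * P * M)  # common QPM x 1 mean, not cycle specific

def draw_cycle_coefs(tau, n_cycles, rng):
    """Draw vec(Pi^h) ~ N(mu_Pi, tau^2 I) independently for each cycle."""
    return np.stack([mu_Pi + tau * rng.standard_normal(Q * P * M)
                     for _ in range(n_cycles)])

tight = draw_cycle_coefs(tau=0.01, n_cycles=H + 1, rng=rng)
loose = draw_cycle_coefs(tau=1.0, n_cycles=H + 1, rng=rng)

# With small tau, the cycles' coefficients are nearly identical
spread_tight = tight.std(axis=0).mean()
spread_loose = loose.std(axis=0).mean()
assert spread_tight < spread_loose
```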
Commensurate priors
Hobbs et al. (2011) proposed dynamic versions of power priors, referred to as commensurate power priors, where the coefficient used to downweight the historical data is viewed as random and estimated from a measure of the agreement between the current and historical data. Hobbs et al. (2011) also proposed general commensurate priors, where the prior mean for the current parameters of interest is conditioned on the historical population mean and the prior precision \(\tau\), referred to as the commensurability parameter, reflects the commensurability between the current and historical parameters. Hobbs et al. (2011) evaluated commensurate priors in a scenario of borrowing one historical trial to analyze a single-arm trial, that is, assuming there is only one historical study and one parameter of interest, \(\beta\). The location parameter or mean for \(\beta\) is \(\mu ^0_{\beta }\) for the current data and \(\mu ^H_{\beta }\) for the historical data. The commensurate prior for \(\mu ^0_{\beta }\) can then be specified as \(\mu ^0_{\beta } \sim N(\mu ^H_{\beta }, 1/\tau )\).
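A minimal sketch of this basic commensurate prior, with made-up numbers, shows how a larger commensurability parameter \(\tau\) tightens the prior for the current mean around the historical mean.

```python
# Sketch of the basic commensurate prior: mu_beta^0 ~ N(mu_beta^H, 1/tau).
# A larger commensurability parameter tau gives a tighter prior around the
# historical mean (stronger borrowing). Numbers are illustrative only.
import numpy as np

rng = np.random.default_rng(2)
mu_beta_hist = 1.5  # historical location parameter

def commensurate_draws(tau, n, rng):
    """Draw from N(mu_beta_hist, variance = 1/tau)."""
    return rng.normal(loc=mu_beta_hist, scale=np.sqrt(1.0 / tau), size=n)

strong = commensurate_draws(tau=100.0, n=10_000, rng=rng)  # high commensurability
weak = commensurate_draws(tau=0.25, n=10_000, rng=rng)     # low commensurability

assert strong.std() < weak.std()  # tighter concentration around the historical mean
```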
As discussed in Hobbs et al. (2012), the commensurate prior of Hobbs et al. (2011) suffers from two problems: diffuse priors can become undesirably informative, and the historical likelihood is treated as a component of the prior rather than as data. Hobbs et al. (2012) therefore proposed a modified commensurate prior that incorporates the historical data into the likelihood for the current parameter estimation and employs empirical and fully Bayesian approaches for estimating the commensurability parameter \(\tau\) (e.g., as illustrated in Eq. 1 of their paper). They also extended the method to general and generalized linear mixed regression models in the context of two successive clinical trials.
The modified commensurate prior approach of Hobbs et al. (2012) was compared to several meta-analytic models in which the priors for the historical and current parameters were jointly modeled but the historical data were not incorporated into the likelihood for the current parameter estimation; those priors were therefore neither commensurate nor dynamic. Commensurate priors were shown to provide greater bias reduction than the meta-analytic approaches they evaluated, and the reduction was larger when there was only one historical study than when there were two or three.
Although Kaplan et al. (2022) extended Bayesian dynamic borrowing to cross-sectional single-level and multilevel models with covariates, they did not examine commensurate priors. For this paper, we consider the modified commensurate prior in Hobbs et al. (2012) and implement it in the multilevel setting with multiple historical studies in the following way. For regression coefficients, we assume:
$$\begin{aligned} \varvec{\beta }^1 = \dots =\varvec{\beta }^H ={\varvec{{\beta }}}^{Hist}, \end{aligned}$$
(16)
where \(\varvec{\beta }^1, \dots ,\varvec{\beta }^H\) are the regression coefficients of interest for each historical cycle. Although the full set of regression coefficients may differ across historical cycles, the common regression coefficients (e.g., intercepts) are assumed to be equal across historical cycles and are denoted as \({{\varvec{{\beta }}}^{Hist}}\), which can be given a vague Gaussian prior.
The parameters for the current cycle (denoted as cycle 0) follow a prior distribution with the historical regression coefficients as the prior mean as follows:
$$\begin{aligned} {{\varvec{\beta }}}^0 \sim N({{\varvec{{\beta }}}^{Hist}}, {{\varvec{\Sigma _{\beta }}}}), \end{aligned}$$
(17)
where \({\varvec{\Sigma _{\beta }}}\) can be specified as a diagonal matrix, \(\text{diag}(\sigma _1^2,\ldots ,\sigma _P^2)\), for the \(P\) regression coefficients, and each diagonal element can be given its own prior distribution, such as an inverse-gamma (IG), half-Cauchy (see, e.g., Gelman, 2006), or spike-and-slab (Mitchell & Beauchamp, 1988) prior.
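The prior structure of Eqs. (16) and (17) can be sketched as follows, drawing each diagonal variance from a half-Cauchy prior (one of the options mentioned above); all numeric settings are illustrative and not taken from the paper.

```python
# Sketch of Eqs. (16)-(17): the common historical coefficients beta^Hist serve
# as the prior mean for the current cycle, with a diagonal Sigma_beta whose
# standard deviations each get their own half-Cauchy prior. Values are
# illustrative only.
import numpy as np

rng = np.random.default_rng(3)
P = 4
beta_hist = rng.normal(size=P)  # stands in for the pooled historical coefficients

# One half-Cauchy(0, 1) draw per coefficient scale: |Cauchy(0, 1)|
sigma = np.abs(rng.standard_cauchy(size=P))

# Current-cycle coefficients centered on the historical ones, as in Eq. (17)
beta0 = rng.normal(loc=beta_hist, scale=sigma)
assert beta0.shape == (P,)
```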
Considering a two-level setting such as students nested in schools, the priors for the school-level covariance matrices for historical cycles can be specified as follows:
$$\begin{aligned}&{\varvec{\Sigma }}^1 =\cdots {}={\varvec{\Sigma }}^H={\varvec{\Sigma }}^{Hist} \end{aligned}$$
(18a)
$$\begin{aligned}&{\varvec{\Sigma }}^{Hist}={\varvec{\sigma }}{\varvec{\Omega }}{\varvec{\sigma }}\end{aligned}$$
(18b)
$$\begin{aligned}&{\varvec{\sigma }}\sim \text {half-Cauchy}(0,1) \end{aligned}$$
(18c)
$$\begin{aligned}&{\varvec{\Omega }}\sim \text {LKJCorr}(1), \end{aligned}$$
(18d)
where \({\varvec{\Sigma }}^1, \dots , {\varvec{\Sigma }}^H\) are the covariance matrices for each historical cycle. The common elements of the covariance matrices (e.g., variances) are assumed to be equal across historical cycles, with the common matrix denoted as \({\varvec{\Sigma }}^{Hist}\), and \({\varvec{\Omega }}\) is a correlation matrix following the LKJ prior (Lewandowski et al., 2009).
For the current cycle (denoted as cycle 0), the inverse of the covariance matrix follows a prior distribution conditional on the historical covariance matrix:
$$\begin{aligned} ({\varvec{\Sigma }}^0)^{-1}&\sim \text {Wishart}(\nu ,\nu ({\varvec{\Sigma }}^{Hist})^{-1}). \end{aligned}$$
(19)
For multilevel settings with three or more levels, the higher-level covariance matrices may follow prior specifications similar to those above.
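The following sketch illustrates Eqs. (18) and (19): the historical covariance matrix is built as \({\varvec{\sigma }}{\varvec{\Omega }}{\varvec{\sigma }}\) from fixed illustrative values (standing in for the half-Cauchy and LKJ draws), and the current-cycle precision is drawn from a Wishart. Wishart parameterizations vary across software, so the scale here is chosen under scipy's (df, scale) convention so that the prior mean of \(({\varvec{\Sigma }}^0)^{-1}\) equals \(({\varvec{\Sigma }}^{Hist})^{-1}\); this is one reading of Eq. (19), not necessarily the exact parameterization intended there.

```python
# Sketch of Eqs. (18)-(19): build Sigma^Hist as sigma * Omega * sigma, then
# draw the current-cycle precision matrix from a Wishart centered (in prior
# mean) on the historical precision. Under scipy's (df, scale) convention,
# E[W] = df * scale, so scale = (Sigma^Hist)^{-1} / nu gives prior mean
# (Sigma^Hist)^{-1}. All numeric values are illustrative.
import numpy as np
from scipy.stats import wishart

rng = np.random.default_rng(4)

sigma = np.diag([1.0, 0.5])        # scale parameters (half-Cauchy draws in the paper)
Omega = np.array([[1.0, 0.3],
                  [0.3, 1.0]])     # correlation matrix (LKJ prior in the paper)
Sigma_hist = sigma @ Omega @ sigma  # Eq. (18b)

nu = 10                             # degrees of freedom
hist_prec = np.linalg.inv(Sigma_hist)
prec0 = wishart.rvs(df=nu, scale=hist_prec / nu, random_state=rng)

# The draw is a symmetric positive-definite precision matrix for cycle 0
assert np.allclose(prec0, prec0.T)
assert np.all(np.linalg.eigvalsh(prec0) > 0)
```

Larger \(\nu\) concentrates the Wishart draws more tightly around the historical precision, so \(\nu\) plays a role analogous to a commensurability parameter for the covariance structure.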
Note that Bayesian dynamic borrowing differs from commensurate priors insofar as the joint prior distribution for the former includes the current cycle, whereas commensurate priors first place a prior on the common historical parameters of interest and then give the current parameters a prior distribution with the historical regression coefficients as the mean, as shown in Eq. (17).
Extensions of commensurate priors to growth curve models
Our extension of commensurate priors to growth curve models closely follows the notation for commensurate priors in multilevel settings. We consider a multilevel growth curve model with multiple time points within each individual and individuals nested in groups, such as students nested in schools. To simplify the notation, we stack the regression coefficients of the growth curve model (the growth parameters and the regression coefficients of the individual-level time-invariant predictors and the group-level predictors) into a single vector \({\varvec{\beta }}\). We let \({\varvec{\Sigma }}_I\) and \({\varvec{\Sigma }}_G\) denote the corresponding individual-level and group-level covariance matrices.
The prior specification for historical regression coefficients \(\varvec{\beta }^1, \dots ,\varvec{\beta }^H\) can follow those in Eq. (16) and the prior specification for current regression coefficients \({\varvec{\beta }}^0\) can follow those in Eq. (17). Similarly, the prior specification for historical covariance matrices \({\varvec{\Sigma }}_I^1,\dots {},{\varvec{\Sigma }}_I^H\) and \({\varvec{\Sigma }}_G^1,\dots {},{\varvec{\Sigma }}_G^H\) can follow those in Eq. (18) and the prior specification for current covariance matrices \({\varvec{\Sigma }}_I^0\) and \({\varvec{\Sigma }}_G^0\) can follow those in Eq. (19).
We introduce a modification to the estimation of the commensurate prior for this study. Instead of the spike-and-slab prior (Mitchell & Beauchamp, 1988) used by Hobbs et al. (2012) for the commensurability parameter, we utilize, for computational simplicity and numerical stability, an extension of the horseshoe prior (Carvalho et al., 2010) developed by Piironen and Vehtari (2017) to account for commensurability. The horseshoe prior is a global-local shrinkage prior that combines two priors: a global prior shared by all of the coefficients in the current cycle, which shrinks every coefficient toward its historical counterpart, and a local prior for each predictor in the current cycle, which relaxes the global shrinkage for coefficients that lie far from their historical counterparts.
Following the notation in Betancourt (2018), the horseshoe prior for the \(p^{th}\) element of \({\varvec{\beta }}^0\) can be specified as follows:
$$\begin{aligned} \beta ^0_{p}&\sim N(0, \tau \lambda _{p}) \end{aligned}$$
(20a)
$$\begin{aligned} \lambda _{p}&\sim \text{ half-Cauchy }(0,1) \end{aligned}$$
(20b)
$$\begin{aligned} \tau&\sim \text{ half-Cauchy }(0, \tau _0), \end{aligned}$$
(20c)
where \(\tau _0\) is a hyperparameter that controls the scale of the prior on the global shrinkage parameter \(\tau\) (Carvalho et al., 2010).
A limitation of the conventional horseshoe prior concerns the regularization of large coefficients. Specifically, large coefficients can still escape the global scale set by \(\tau _0\), with the result that the posteriors of these coefficients can become quite diffuse, particularly for weakly identified coefficients (Betancourt, 2018). To remedy this issue, Piironen and Vehtari (2017) proposed a regularized version of the horseshoe prior (also known as the Finnish horseshoe) of the following form:
$$\begin{aligned} \beta _{p}^0&\sim N(0, \tau {{\tilde{\lambda }}}_{p}) \end{aligned}$$
(21a)
$$\begin{aligned} {{\tilde{\lambda }}}_{p}&= \frac{c\lambda _p}{\sqrt{c^2 + \tau ^2\lambda ^2_p}} \end{aligned}$$
(21b)
$$\begin{aligned} \lambda _{p}&\sim \text{ half-Cauchy }(0,1) \end{aligned}$$
(21c)
$$\begin{aligned} c^2&\sim \text{ inv-gamma }\left( \frac{\nu }{2},\frac{\nu }{2}s^2\right) \end{aligned}$$
(21d)
$$\begin{aligned} \tau&\sim \text{ half-Cauchy }(0, \tau _0), \end{aligned}$$
(21e)
where \(s^2\) is the variance for each of the \(P\) predictor variables, assumed to be constant across predictors, and \(c\) is the slab width. The hyperparameters of the inverse-gamma distribution in Eq. (21d) induce a Student-\(t_{\nu }(0, s^2)\) distribution for the slab (see Piironen & Vehtari, 2017, for more detail). For our paper, two changes were made to the regularized horseshoe. First, we set the mean of \(\beta _p^0\) to \(\beta _p^{hist}\) rather than zero so that shrinkage is toward the historical mean, i.e., \(\beta _{p}^0 \sim N(\beta _p^{hist}, \tau {{\tilde{\lambda }}}_{p})\). Second, we set \(s^2=1\) because the data are standardized.
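The regularized horseshoe of Eqs. (21a)-(21e), with the two modifications just described (historical means as the center and \(s^2 = 1\)), can be sketched as follows; half-Cauchy draws are taken as absolute values of Cauchy draws, and all numeric settings are illustrative rather than taken from the paper.

```python
# Sketch of Eqs. (21a)-(21e) with the paper's two modifications: shrinkage is
# toward the historical coefficients beta^hist and s^2 = 1. Half-Cauchy draws
# are implemented as |Cauchy|; numeric settings are illustrative.
import numpy as np

rng = np.random.default_rng(5)
P = 5
beta_hist = rng.normal(size=P)  # stands in for the historical coefficients

tau0 = 0.1       # hyperparameter for the global scale, Eq. (21e)
nu, s2 = 4.0, 1.0  # slab degrees of freedom and width (s^2 = 1 after standardization)

tau = np.abs(tau0 * rng.standard_cauchy())       # Eq. (21e), global scale
lam = np.abs(rng.standard_cauchy(size=P))        # Eq. (21c), local scales
# Eq. (21d): inverse-gamma(nu/2, nu*s2/2) draw via the reciprocal of a gamma
c2 = 1.0 / rng.gamma(nu / 2.0, 2.0 / (nu * s2))
c = np.sqrt(c2)

# Eq. (21b): regularized local scales, bounded above by c / tau
lam_tilde = c * lam / np.sqrt(c**2 + tau**2 * lam**2)

# Modified Eq. (21a): center on the historical coefficients
beta0 = rng.normal(loc=beta_hist, scale=tau * lam_tilde)

assert np.all(lam_tilde <= c / tau + 1e-12)
assert beta0.shape == (P,)
```

The bound \(\tilde{\lambda }_p \le c/\tau\) is what distinguishes the regularized horseshoe from the conventional one: even very large local scales \(\lambda _p\) cannot push the effective prior scale beyond the slab width.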