Yi Zhao
Postdoc at Johns Hopkins Biostat,
To-be TT Assistant Prof at Indiana Univ
fMRI Experiments
Task fMRI: performs tasks under brain scanning
Randomized stop/go task:
press button if "go";
withhold pressing if "stop"
Not resting-state: "do nothing" during scanning
Goal: infer brain activation and connectivity
fMRI data: blood-oxygen-level dependent (BOLD) signals from each
cube/voxel (~millimeters),
$10^5$ ~ $10^6$ voxels in total.
Multilevel fMRI Studies
Sub 1, Sess 1
Time 1
2
…
~200
⋮
Sub i, Sess j
…
⋮
Sub ~100, Sess ~4
…
Large, multilevel (subject, session, voxel) data
e.g. $1000 \times 4 \times 300 \times 10^6 \approx 1 $ trillion data points
Raw Data: Motor Region
Time (seconds)
Black: fMRI BOLD activity
Blue: stop onset times;
Red: go onset times
$Z_t$: Stimulus onsets convoluted with Canonical HRF
$M_t$, $R_t$: fMRI time series from two brain regions
Review: Granger Causality/VAR
Given two (or more) time series $x_t$ and $y_t$
$$\begin{align*} x_t &= \sum_{j=1}^p \psi_{1j} x_{t-j} + \sum_{j=1}^p \phi_{1j} y_{t-j} + \epsilon_{1t} \\
y_t &= \sum_{j=1}^p \psi_{2j} y_{t-j} + \sum_{j=1}^p \phi_{2j} x_{t-j} + \epsilon_{2t} \end{align*}$$
Also called vector autoregressive models
$y$ Granger causes $x$ if $\phi_{1j} \ne 0$ Granger, 69
Models pair-wise connections not pathways
Granger Causality/VAR
Granger Causality (VAR) popular for fMRI
Over 10,000 google scholar results on "granger causality neuroimaging", as of May 29, 2019
Models multiple stationary time series
AR($p$) (small $p$) fits fMRI well Lingdquist, 08
Not for non-stationary/task fMRI
Cannot model stimulus effects
Conceptual Brain Model with Stimulus
Goal: quantify effects
stimuli →
preSMA → PMC regions Duann, Ide, Luo, Li (2009). J of Neurosci
Model: Mediation Analysis and SEM
$$\begin{align*}M &= Z a + \overbrace{U + \epsilon_1}^{E_1}\\ R &= Z c + M b + \underbrace{U g + \epsilon_2}_{E_2}, \quad \epsilon_1 \bot \epsilon_2\end{align*}$$
Nonzero $\phi$'s and $\psi$'s denote the temporal influence from stimulus to mediator/outcome and etc
$A$, $B$, $C$ are causal following a similar proof in Sobel, Lindquist, 04
Causal Conditions
The treatment randomization regime is the same across time and participants.
Models are correctly specified, and no treatment-mediator interaction.
At each time point $t$, the observed outcome is one realization of the potential outcome with observed treatment assignment $\mathbf{\bar S}_{t}$, where $\mathbf{ \bar S}_{t}=( \mathbf{S}_{1},\dots,\mathbf{S}_{t})$.
The treatment assignment is random across time.
The causal effects are time-invariant.
The time-invariant covariance matrix is not affected by the treatment assignments.
Estimation: Conditional Likelihood
The full likelihood for our model is too complex
Given the initial $p$ time points, the conditional likelihood is
$$ \begin{align*} & \tiny \ell\left(\boldsymbol{\Theta},\delta~|~\mathbf{Z},\mathcal{I}_{p}\right) = \sum_{t=p+1}^{T}\log f\left((M_{t},R_{t})~|~\mathbf{X}_{t}\right) \\
& \tiny = -\frac{T-p}{2}\log\sigma_{1}^{2}\sigma_{2}^{2}(1-\delta^{2})-\frac{1}{2\sigma_{1}^{2}}\|\mathbf{M}-\mathbf{X}\boldsymbol{\theta}_{1}\|_{2}^{2} \\
& \tiny -\frac{1}{2\sigma_{2}^{2}(1-\delta^{2})}\|(\mathbf{R}-\mathbf{M}B-\mathbf{X}\boldsymbol{\theta}_{2})-\kappa(\mathbf{M}-\mathbf{X}\boldsymbol{\theta}_{1})\|_{2}^{2} \end{align*} $$
Multilevel Data: Two-level Likelihood
Second level model, for each subject $i$
$$(A_i,B_i,C_i) = (A,B,C) + (\eta^A_i, \eta^B_i, \eta^C_i)$$
where errors $\eta$ are normally distributed
The two level likelihood is conditional convex
Two-stage fitting: plug-in estimates from the first level
Block coordinate fitting: jointly optimize first level likelihood + second level likelihood
Theorem: Assume assumptions (A1)-(A6) are satisfied.
Assume $\mathbb{E}(Z_{i_{t}}^{2})=q\lt \infty$, for $i=1,\dots,N$. Let $T=\min_{i}T_{i}$. 1. If $\boldsymbol{\Lambda}$ is known, then the two-stage estimator $\hat{\delta}$ maximizes the profile likelihood of model asymptotically, and $\hat{\delta}$ is $\sqrt{NT}$-consistent. 2. If $\boldsymbol{\Lambda}$ is unknown, then the profile likelihood of model has a unique maximizer $\hat{\delta}$ asymptotically, and $\hat{\delta}$ is $\sqrt{NT}$-consistent, provided that $1/\varpi=\bar{\kappa}^{2}/\varrho^{2}=\mathcal{O}_{p}(1/\sqrt{NT})$, $\kappa_{i}=\sigma_{i_{2}}/\sigma_{i_{1}}$, $\bar{\kappa}=(1/N)\sum\kappa_{i}$, and $\varrho^{2}=(1/N)\sum(\kappa_{i}-\bar{\kappa})^{2}$.
Using the two-stage estimator $\hat{\delta}$, the CMLE of our model is consistent, as well as the estimator for $\mathbf{b}=(A,B,C)$.
Theory: Summary
Under regularity conditions, $N$ subs, $T$ time points
Our $\hat \delta$ is $\sqrt{NT}$-consistent
This relaxes the unmeasured confounding assumption in mediation analysis
Our $(\hat{A},\hat{B}, \hat{C})$ is also consistent
Simulations & Real Data
Comparison
Our methods: GMA-h and GMA-ts
Previous methods: BK Baron & Kenny, MACC Zhao and Luo, KKB Kenny et al
Other methods do not model the temporal correlations or time series like ours
Simulations
Low bias for $AB$
Low bias for temporal cor
Gray dash lines are the truth
GMA performs the best, and recovers the temporal correlations
Real Data Experiment
Public data: OpenFMRI ds30
Stop-go experiment: withhold (STOP) from pressing buttons
Expect "STOP" stimuli to deactivate brain region M1
Goal: quantify the role of region preSMA
Result
Result
STOP deactivates M1 directly ($C$) and indirectly ($AB$)
preSMA mediates a good portion of the total effect
Help resolve the debates among neuroscientists
Other methods under-estiamte the effects
Novel feedback findings: M1 → preSMA after lag 1 and 2 (not shown)