Brown University Department of Biostatistics
Center for Statistical Sciences
Computation in Brain and Mind
Brown Institute for Brain Science
ABCD Research Group
The Big Data and Medical Imaging Workshop Sanya, CHINA December 27, 2016
Yi Zhao
(4rd Yr PhD Student, on job market this year)
Receipient of multiple student awards, including the 2015 ENAR distinguished paper
Brown University
Math Problem: Infer Causality from Obs. Data Only
Many unmodeled factors: vascular, respriatory, mind wandering, ...
Stop/Go Task fMRI
Task fMRI: performs tasks under brain scanning
Randomized stop/go task:
press button if "go";
withhold pressing if "stop"
Not resting-state: "rest" in scanner
Goal: causal mechanisms from stimuli to motor cortex
fMRI data: blood-oxygen-level dependent (BOLD) signals from each
cube/voxel (~millimeters),
$10^5$ ~ $10^6$ voxels in total.
Raw Data: Motor Region
Time (seconds)
Black: fMRI BOLD activity
Blue: stop onset times;
Red: go onset times
Question: can "big and complex" fMRI data be helpful?
Brain Networks
Functional/Effective Connectivity
nodes/connections removed to enhance visualization
Network Model with Stimulus
Goal: quantify effects
stimuli →
preSMA → PMC regions Duann, Ide, Luo, Li (2009). J of Neurosci
Model
Mediation Analysis and SEM
$$\begin{align*}M &= Z a + \overbrace{U + \epsilon_1}^{E_1}\\ R &= Z c + M b + \underbrace{U g + \epsilon_2}_{E_2}, \quad \epsilon_1 \bot \epsilon_2\end{align*}$$
Theorem:
Given $\delta$, unique maximizer of likelihood, expressed in closed form
Theorem:
Given $\delta$, our estimator is root-n consistent and efficient
Bias (and variance) depends on $\delta$
Maximum Likelihood: “Tragedy may lurk around a corner”
[Stigler 2007]
"Tragedy" of ML
Theorem: The maximum profile likelihood value is the same for every $\delta \in (-1,+1)$.
Likelihood provides
zero info about $\delta$
Cannot simply apply priors on $\delta$
Two different models generate same single-trial BOLD activations if only observing $Z$, $M$, and $R$
without measuring $U$
Our Approach: Step 2
Cannot identify $\delta$ from single sub and single sess (see our theorem)
Intuition: leverage complex data structure to infer $\delta$
Step 2: Some Details
Step 1 model for each sub and each sess
$$\begin{pmatrix}{M}_{ik} & {R}_{ik}\end{pmatrix}=\begin{pmatrix}{Z}_{ik} & {M}_{ik}\end{pmatrix}\begin{pmatrix}{a}_{ik} & {b}_{ik}\\ 0 & {c}_{ik} \end{pmatrix}+\begin{pmatrix}{E}_{1_{ik}} & {E}_{2_{ik}}\end{pmatrix}$$
Limited variability in $\delta$ across sub/sess
Random effect model cf AFNI, FSL, SPM, and etc $$\begin{pmatrix}{A}_{ik}\\ {B}_{ik}\\ {C}_{ik} \end{pmatrix}=\begin{pmatrix}{A}\\ {B}\\ {C} \end{pmatrix}+\begin{pmatrix}\alpha_{i}\\ \beta_{i}\\ \gamma_{i} \end{pmatrix}+\begin{pmatrix}\epsilon_{ik}^{{A}}\\ \epsilon_{ik}^{{B}}\\ \epsilon_{ik}^{{C}} \end{pmatrix}=b+u_{i}+\eta_{ik}$$
Option 2: Integrated Modeling
Optimize all parameters in joint likelihood $$\begin{align*} &\sum_{i=1}^{N}\sum_{k=1}^{K}\log\Pr\left(R_{ik},M_{ik}|Z_{ik},\delta,b_{ik},\sigma_{1_{ik}},\sigma_{2_{ik}}\right)\quad \mbox{Data}\\ & + \sum_{i=1}^{N}\sum_{k=1}^{K}\log\Pr\left(b_{ik}|u_{i},b,\boldsymbol{\Lambda}\right)\quad \mbox{Subject variation}\\ & +\sum_{i=1}^{N}\log\Pr\left(u_{i}|\boldsymbol{{\Psi}}\right) \quad \mbox{Prior}\end{align*}$$
Large computation: $5NK + 3N + 11 > 2000$ paras
Algorithm
Theorem: The joint likelihood is conditional convex for groups of parameters.
Algorithm: block coordinate descent with projections.
Leverage conditional convexity to reduce computation
details in our paper
Theory for SEM and Confounding
Theorem: Asymptotically, the joint multilevel likelihood has a unique maximum and the maximizer is consistent for $\delta$.
Contributions: data-driven estimation of confounding and consistency in SEM