Design Papers

Useful Papers - Experimental Design

Also check out JitterPapers, ScanningPapers, FmriPhysicsPapers and PhysiologyPapers...


Mechelli et al. (2003), "Comparing event-related and epoch analysis in blocked design fMRI," NeuroImage 18, 806-810 PDF

Summary: Traditionally, blocked design experiments (i.e., many trials in a row of the same condition) have been analyzed by convolving an HRF with a 'boxcar' regressor - an "epoch-related" model. Mechelli et al. show that even in block experiments, an "event-related" model - i.e., treating each trial as a separate event, convolving each impulse regressor with an HRF, and summing the resulting regressors - can be better at detecting activations for at least some experiments.

Bottom line: Whether you choose blocked or randomized stimulus presentation, using an event-related model for your analysis is worth a try.
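To make the two models concrete, here is a minimal sketch that builds both regressors for the same blocked paradigm. The single-gamma HRF, the block timing, and the trial spacing are illustrative assumptions, not the settings used by Mechelli et al.

```python
import numpy as np

def gamma_hrf(t):
    """Single-gamma HRF peaking near 5 s (an illustrative assumption)."""
    h = t ** 5 * np.exp(-t)
    return h / h.sum()

dt = 0.5                                   # seconds per sample
n = 240                                    # 240 samples = one 120 s run
t = np.arange(n) * dt
kernel = gamma_hrf(np.arange(60) * dt)     # 30 s of HRF

# Hypothetical paradigm: 20 s on / 20 s off, one trial every 4 s while "on".
boxcar = ((t % 40) < 20).astype(float)
onsets = np.array([b + k for b in (0, 40, 80) for k in range(0, 20, 4)])
impulses = np.zeros(n)
impulses[(onsets / dt).astype(int)] = 1.0

# Epoch model: convolve the boxcar with the HRF.
# Event model: convolve the impulse train, which equals the sum of the
# per-trial convolved regressors because convolution is linear.
epoch_reg = np.convolve(boxcar, kernel)[:n]
event_reg = np.convolve(impulses, kernel)[:n]

corr = np.corrcoef(epoch_reg, event_reg)[0, 1]
print(f"correlation between the two regressors: {corr:.3f}")
```

The two regressors are similar but not identical (the event model tracks the onset and offset of each block slightly differently), which is why the choice of model can change what gets detected.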

Dale (1999), "Optimal experimental design for event-related fMRI," Human Brain Mapping 8, 109-114 PDF

Summary: Within event-related designs, there is a debate about how best to organize stimuli - in a long periodic train, with randomized inter-stimulus intervals (ISIs), or somewhere in between. Dale shows that in terms of efficiency (the accuracy of estimating the shape of the HRF), event-related designs with randomized ISIs outperform those with fixed ISIs, and the advantage grows sharply as the mean ISI drops.

Bottom line: For event-related designs, if efficiency is a concern, having a randomized inter-trial interval and a short mean inter-trial interval is a large advantage over either a fixed ISI or a long mean ISI.
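The efficiency comparison can be sketched numerically. The code below assumes a finite impulse response (FIR) model and measures efficiency as 1 / trace((X'X)^-1); the run length, number of lags, and the way jitter is generated are illustrative assumptions, and nuisance terms are left out for simplicity.

```python
import numpy as np

def fir_design(stim, n_lags):
    """FIR design matrix: column k is the stimulus train delayed by k samples."""
    n = stim.size
    X = np.zeros((n, n_lags))
    for k in range(n_lags):
        X[k:, k] = stim[: n - k]
    return X

def estimation_efficiency(stim, n_lags=12):
    """1 / trace((X'X)^-1), ignoring nuisance terms for simplicity."""
    X = fir_design(stim, n_lags)
    return 1.0 / np.trace(np.linalg.inv(X.T @ X))

rng = np.random.default_rng(0)
n_samples = 400                      # hypothetical run length, 1 s per sample

efficiency = {}
for mean_isi in (16, 8, 4, 2):
    fixed = np.zeros(n_samples)
    fixed[::mean_isi] = 1.0          # one event every mean_isi seconds
    # Jittered design: each sample holds an event with probability 1/mean_isi,
    # which gives the same mean ISI with geometrically distributed intervals.
    jittered = (rng.random(n_samples) < 1.0 / mean_isi).astype(float)
    efficiency[mean_isi] = (estimation_efficiency(fixed),
                            estimation_efficiency(jittered))
    print(mean_isi, efficiency[mean_isi])
```

With a fixed ISI shorter than the FIR window, the delayed columns become nearly collinear, so the efficiency collapses; with jitter the columns stay distinguishable even at short mean ISIs.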


Liu et al. (2001), "Detection power, estimation efficiency, and predictability in event-related fMRI," NeuroImage 13, 759-773 PDF

Summary: Introduces the distinction between efficiency (accuracy at estimating the shape of the HRF) and power (ability to detect any activation at all), and shows that any experimental design makes a fundamental tradeoff between the two. Describes mixed designs - essentially event-related designs with blocks of relatively higher concentrations of a particular stimulus type - as being a good tradeoff between efficiency and power, at the expense of longer experiments.

Bottom line: Mathematically shows block designs are great for power but poor for efficiency, and event-related designs make the opposite tradeoff. Confirms the advantage of randomized ISIs for event-related designs.

  • See also JitterPapers for Liu's more recent paper on the subject.
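The tradeoff between power and efficiency can be illustrated with a small simulation. This is a sketch under simplifying assumptions (white noise, a single-gamma HRF, one stimulus type, hypothetical run length and block size), not Liu et al.'s derivation: detection power is taken as the inverse variance of the amplitude estimate when the HRF shape is assumed known, and estimation efficiency as 1 / trace((X'X)^-1) for an FIR model.

```python
import numpy as np

def gamma_hrf(n_lags):
    """Single-gamma HRF sampled at 1 s (an illustrative assumption)."""
    t = np.arange(n_lags, dtype=float)
    h = t ** 5 * np.exp(-t)
    return h / h.sum()

def centered_fir(stim, n_lags):
    """FIR design matrix with the mean (intercept) removed from each column."""
    n = stim.size
    X = np.column_stack([np.concatenate([np.zeros(k), stim[: n - k]])
                         for k in range(n_lags)])
    return X - X.mean(axis=0)

def detection_power(stim, n_lags=16):
    """Inverse variance of the amplitude estimate for an assumed HRF shape."""
    x = centered_fir(stim, n_lags) @ gamma_hrf(n_lags)
    return float(x @ x)

def estimation_efficiency(stim, n_lags=16):
    """1 / trace((X'X)^-1) for the FIR model, which makes no shape assumption."""
    X = centered_fir(stim, n_lags)
    return 1.0 / np.trace(np.linalg.inv(X.T @ X))

n = 400                                                   # 1 s per sample
block = ((np.arange(n) % 40) < 20).astype(float)          # 20 s on / 20 s off
rng = np.random.default_rng(0)
randomized = (rng.random(n) < 0.5).astype(float)          # same event density

for name, stim in (("block", block), ("randomized", randomized)):
    print(name, detection_power(stim), estimation_efficiency(stim))
```

Under these assumptions the block design should come out ahead on detection power and the randomized design ahead on estimation efficiency, which is the tradeoff the paper formalizes.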

Desmond & Glover (2002), "Estimating sample size in functional MRI (fMRI) neuroimaging studies: statistical power analyses," Journal of Neuroscience Methods 118, 115-128 PDF

Summary: Used real fMRI data to get good estimates of average within-subject and between-subject variability in fMRI, and brought those into a mathematical model to determine how many subjects are necessary to detect an "average"-sized activation under several variability assumptions. Also worked on how many trials per condition are necessary to achieve a certain power level.

Bottom line: At reasonable p-thresholds, with an average activation, around 20-24 subjects were needed to maintain a reasonable power level. This is especially crucial with low effect sizes, which increase the number of subjects needed faster than linearly. The benefits of increasing the number of trials/condition don't level off until around 100. Spatial smoothing is a good strategy to decrease within-subject variability.
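The kind of calculation involved can be sketched as follows. This is a simplified normal-approximation power analysis for a one-sided, one-sample group test, not the authors' exact procedure, and every numeric value below is a hypothetical placeholder rather than an estimate from the paper. The normal approximation is somewhat optimistic for small groups compared with a t-test.

```python
from math import sqrt
from statistics import NormalDist

def group_power(n_subjects, effect, sd_between, sd_within, n_trials, alpha):
    """Approximate power of a one-sided, one-sample group test."""
    # Spread of one subject's estimated effect: true between-subject
    # variability plus within-subject noise averaged over trials.
    sd_subject = sqrt(sd_between ** 2 + sd_within ** 2 / n_trials)
    z_crit = NormalDist().inv_cdf(1.0 - alpha)
    shift = effect / sd_subject * sqrt(n_subjects)
    return 1.0 - NormalDist().cdf(z_crit - shift)

def subjects_needed(target_power=0.8, max_subjects=1000, **kwargs):
    """Smallest group size reaching target_power, or None if none does."""
    for n in range(2, max_subjects + 1):
        if group_power(n, **kwargs) >= target_power:
            return n
    return None

# Hypothetical values, in percent signal change.
params = dict(sd_between=0.5, sd_within=1.0, n_trials=100, alpha=0.002)
n_full = subjects_needed(effect=0.5, **params)
n_half = subjects_needed(effect=0.25, **params)
print("subjects needed:", n_full, "for the full effect,", n_half, "for half of it")
```

Because the required group size scales with the inverse square of the effect size in this approximation, halving the effect roughly quadruples the number of subjects, which is the faster-than-linear growth noted above.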

Mechelli et al. (2003), "Estimating efficiency a priori: a comparison of blocked and randomized designs," NeuroImage 18, 798-805 PDF

Summary: Mechelli et al. show that even when activation values are identical in a block and an event-related experiment, the blocked design may show higher t- or Z-values, due to differences in the standard error across experiments induced by the experimental design.

Bottom line: Comparing t- or Z-statistics directly across experiments is tricky, because a good part of the variability in those numbers depends on the actual experimental design - and that part of the variance is difficult to establish before the experiment.
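The role of the design is visible in the standard general linear model t-statistic. The formula below is the textbook ordinary-least-squares form (assuming white noise), written in generic notation rather than the paper's: X is the design matrix, c the contrast vector, \hat{\beta} the parameter estimates, and \hat{\sigma} the residual standard deviation.

```latex
t = \frac{c^{\top}\hat{\beta}}
         {\hat{\sigma}\,\sqrt{c^{\top}\left(X^{\top}X\right)^{-1}c}}
```

Two experiments with the same effect c^{\top}\hat{\beta} and the same noise level \hat{\sigma} can still yield different t-values, because the term c^{\top}(X^{\top}X)^{-1}c depends only on the design.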


Skudlarski et al. (1999), "ROC analysis of statistical methods used in functional MRI: individual subjects," NeuroImage 9, 311-329 PDF

Summary: A signal detection measure (the receiver operating characteristic (ROC) curve) is calculated for real fMRI data into which fake activations of known extent and timecourse have been introduced, in order to find out what effect various preprocessing and experimental design manipulations have on detecting those known activations.

Bottom line: Lots of interesting results, but of prime interest for this week: with behavioral and hemodynamic response times on the (typical) order of a few seconds, blocks of around 18 seconds are best for block design experiments. Splitting the experiment into several runs within one scanning session is a good way to improve signal detection. Other bottom lines of interest: global intensity normalization and low-pass filtering don't help; high-pass filtering (linear and quadratic) and smoothing the raw images do.
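The ROC approach itself is easy to sketch on simulated data. The example below is an illustration only: it uses a synthetic statistic map (unit-variance noise plus a known effect in "active" voxels) rather than real fMRI data, and it computes the area under the full ROC curve, which may differ from the exact summary measure used in the paper.

```python
import numpy as np

def roc_curve(stat, truth):
    """False- and true-positive rates as the threshold sweeps from high to low."""
    truth = np.asarray(truth, dtype=bool)[np.argsort(-np.asarray(stat))]
    tpr = np.concatenate([[0.0], np.cumsum(truth) / truth.sum()])
    fpr = np.concatenate([[0.0], np.cumsum(~truth) / (~truth).sum()])
    return fpr, tpr

def area_under_curve(fpr, tpr):
    """Trapezoidal area under the ROC curve (0.5 = chance, 1.0 = perfect)."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

rng = np.random.default_rng(0)
n_voxels, n_active = 1000, 100               # hypothetical sizes
truth = np.zeros(n_voxels, dtype=bool)
truth[:n_active] = True                      # voxels with inserted activation

auc = {}
for label, effect in (("weak", 1.0), ("strong", 2.0)):
    # Statistic map: unit-variance noise plus a known effect in active voxels.
    stat = rng.standard_normal(n_voxels) + effect * truth
    auc[label] = area_under_curve(*roc_curve(stat, truth))
    print(label, round(auc[label], 3))
```

Any manipulation that raises the effective signal-to-noise ratio of the inserted activations (more runs, better filtering) should move the curve toward the upper-left corner and increase the area.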

Huettel & McCarthy (2001), "The effects of single-trial averaging upon the spatial extent of fMRI activation," NeuroReport 12, 1-6 PDF

Summary: The authors used real event-related fMRI data, extracted varying sizes of subsets of trials from their runs, and treated those subsets as independent experiments, to determine what effect the number of trials per condition had on two factors: the spatial extent of their activation (power), and the accuracy of their estimated hemodynamic response (efficiency).

Bottom line: The extent of activation had a clean exponential relationship to number of trials, and the asymptote (point of diminishing returns) of that relationship wasn't reached until around 150 trials per condition; in other words, the benefits of increasing number of trials per condition continued to go up until around 150. However, variability of the estimated hemodynamic response was pretty stable after getting to about 25 trials per condition - so that might be a good benchmark for experiments concerned with estimating HRFs.
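A saturating exponential of this kind can be written as extent(N) = V_max * (1 - exp(-k * N)). The sketch below uses that generic form to ask how many trials are needed to reach a given fraction of the asymptote; the rate constant is a hypothetical placeholder, not a value taken from the paper.

```python
from math import exp, log

def extent_fraction(n_trials, rate):
    """Fraction of the asymptotic activation extent reached after n_trials."""
    return 1.0 - exp(-rate * n_trials)

def trials_to_reach(fraction, rate):
    """Trials needed to reach a given fraction (0 < fraction < 1) of the asymptote."""
    return -log(1.0 - fraction) / rate

rate = 0.02   # hypothetical rate constant per trial, NOT taken from the paper
for fraction in (0.5, 0.9, 0.95):
    print(fraction, round(trials_to_reach(fraction, rate)))
```

The shape of the curve explains the diminishing returns: each additional trial recovers a fixed proportion of the extent still missing, so the last few percent cost far more trials than the first half.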