Imaging Knowledge Base

This book is a wholesale port of the original Stanford Gablab Wiki. It is in the process of being ported from a static HTML dump of the site, and as such contains non-functional Wiki links and references to documents that existed on the old site. Bugs will be fixed and content updated after the original content has been transplanted here.

3D Deconvolve


        3dDeconvolve \
          -input fxnl_dataset+orig \
          -polort 1 \
          -mask roi_mask_1.25+orig \
          -censor initials_censor.1D \
          -num_stimts 1 \
          -stim_file 1 ones_zeros.1D -stim_label "ones_zeros" \
          -stim_maxlag 1 3 \
          -tout -rout -fout -bucket analyzed_data

3D Despike

  • Purpose: Removes "spikes" from your 3D dataset and replaces them with something more reasonable. I haven't used SPM for a while, but I think this is similar to the Global Variate option in the ROI toolbox. There is a good explanation of why you should use the 3dDespike command on the Afni website.

  • Usage: Below is an example for 3dDespike.
           3dDespike -ignore 0 -nomask -prefix despiked_dataset slicetimed_realigned_dataset+orig
  • The "-ignore" option allows you to ignore the first "X" number of timepoints within a dataset. I have never had a reason to ignore any of them, so I perform the command on all of my timepoints.
  • The "-nomask" option tells 3dDespike to perform its magic on all of your voxels. Likewise, there is also a command if you just want to perform 3dDespike on a specific ROI or mask, but I've never used it before, so I'll refer you to the linux prompt 3dDespike help menu.
  • As always, the "-prefix" option allows you to determine the name of the output dataset.
  • Remember to put in the input dataset as your last subcommand in the 3dDespike string. Also, the makers of Afni highly recommend that you use 3dDespike only after you've realigned your data.

3D Fourier

  • Purpose: This command allows the user to apply highpass and lowpass filters to a functional dataset.
  • Usage: 3dFourier can be used by putting the command below (or one similar to it) within an executable text file. The text file should be placed within your path or within the directory you'll be working in. Values for the high and low pass filters are in Hz, and are typically calculated using the following formulas: 1) highpass cutoffs are typically 1/(2 x trial length); 2) lowpass cutoffs are typically 1/(something less than 6 seconds).

           3dFourier \
            -prefix my_filtered_fxnls \
            -lowpass .2 \
            -highpass .008 \
            mydataset+orig
  • The subcommand "-prefix" is used to specify the output file after filtering is complete.
  • The subcommand "-lowpass .2" tells 3dFourier that you want to eliminate anything within your data that has a frequency of .2 Hz or higher.
  • "-highpass .008" tells 3dFourier to filter anything within your data that has a frequency of .008 Hz or lower.
  • "mydataset+orig" is the input dataset you want filtered.
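As a quick sanity check on those cutoff formulas, using hypothetical values: a 60-second trial length gives a highpass cutoff of about .008 Hz, and a 5-second lowpass window gives .2 Hz, matching the values in the example call above.

```python
# Worked example of the 3dFourier cutoff formulas (values are illustrative):
trial_length = 60.0                   # seconds per trial (hypothetical)
highpass_hz = 1.0 / (2.0 * trial_length)   # 1/(2 x trial length) ~ .008 Hz
lowpass_hz = 1.0 / 5.0                # "1/(something less than 6)", here 5 s
print(round(highpass_hz, 4), lowpass_hz)
```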

3D Tshift

  • Purpose: 3dTshift allows users to perform slice-timing correction so that each slice that is acquired is aligned to the same temporal origin.

  • Usage: Below is an example of this command along with a definition of each subcommand. If you are interested in changing any of these parameters, you can type "3dTshift -help" into the command line of any linux operating system that has afni installed. That will print a readout of all of the available options and a brief description of each one. To run the script below on a dataset, just enter it into an executable text file that is within your path, or in the directory you'll be running it from. Afni creator Bob Cox recommends performing slice timing before other preprocessing steps, but the choice is up to you.

        3dTshift \
          -slice 5 \
          -prefix mydataset_slicetimed \
          mydataset+orig
  • the "-slice 5" option tells 3dTshift which slice number you want your slices to be aligned with temporally. Remember that afni starts counting from 0, so if you have 12 slices and want slice 6 to be used as the origin, you have to input it as slice 5.
  • the "-prefix mydataset_slicetimed" option allows you to specify the output name.
  • the last line with "mydataset+orig" is the dataset you want to use as your input.

NOTE: There are several other options you can use for this command, so I encourage you to check out the 3dTshift help pages in linux.

3D Volreg

  • Purpose: 3dvolreg stands for 3D volume registration and is used for correcting your data for motion. As with other commands in afni, you can learn more about all of the subcommands and parameters by typing "3dvolreg -help" into the command line of a linux operating system that has afni installed.

  • Usage: Below are two examples of a 3dvolreg script. The first example illustrates how to realign your functional data to any timepoint (or subbrick, as it's called by afni users) within that same scan. The second example illustrates how to realign your data to any timepoint (subbrick) within another scan.

Example 1: Realigning to same scan

           3dvolreg \
             -Fourier \
             -prefix My_Realigned_Dataset \
             -base 1 \
             -1Dfile motion.1D \
             mydataset+orig                (mydataset should be slice timed)

Example 2: Realigning to separate scan

           3dvolreg \
             -Fourier \
             -prefix My_Realigned_Dataset \
             -base 'firstscan+orig[1]' \
             -1Dfile motion.1D \
             mydataset+orig
  • The subcommand "-Fourier" tells 3dvolreg to use the Fourier interpolation method for motion correction.
  • The option "-prefix" tells Afni the name of the output file after it realigns your data.
  • The option "-base #" tells 3dvolreg which timepoint you want your dataset aligned to. Philippe Goldin suggests you determine this by looking at your raw dataset across time (very easy to do in AFNI) and see where there are a series of continuous timepoints that are relatively stable, and use one of those as your realignment timepoint.
  • The option "-base 'scanname[ ]'" is the command for realigning your dataset to a timepoint within a separate scan. I don't use this much because with the spinal cord, we can't have any movement, so we take anatomicals in between each functional run.
  • The subcommand "-1Dfile motion.1D" tells 3dvolreg to output the 6 corrected motion parameters to a text file called motion.1D.
  • "mydataset+orig" is the input dataset you would like to have realigned.

AFNI Notes


Notes regarding AFNI and its operation...

A tremendous presentation by Doug Ward about proper thresholding for your functional results, based on a function space method that allows you to infer causality, is linked to here:

Unix stuff:

Some causes of spatial correlation in the FMRI "noise":

  • non-square k-space trajectories (i.e., spiral)
  • image reconstruction onto a finer matrix than the image resolution (e.g., reconstructing 64x64 EPI data onto a 128x128 grid)
  • "noise" that is induced by respiration and/or cardiac effects, or by micro movements of the head
  • "noise" in the scanner operation from shot to shot (e.g., a wire in the shim coils moving slightly)

That's all I can think of offhand, but I'm sure there are other sources. -- Bob Cox

Helpful websites:

Home Site:

AFNI Educational Material

AFNI seminar notes

Hillary Schaefer has created a good introduction:

Analysis of fMRI data with AFNI

AFNI documentation on how to get started

info on k-space


Math explanations

AFNI publication reference:

Comput Biomed Res. 1996 Jun;29(3):162-73. AFNI: software for analysis and visualization of functional magnetic resonance neuroimages.

Cox RW.

Biophysics Research Institute, Medical College of Wisconsin, Milwaukee 53226-0509, USA.

A package of computer programs for analysis and visualization of three-dimensional human brain functional magnetic resonance imaging (FMRI) results is described. The software can color overlay neural activation maps onto higher resolution anatomical scans. Slices in each cardinal plane can be viewed simultaneously. Manual placement of markers on anatomical landmarks allows transformation of anatomical and functional scans into stereotaxic (Talairach-Tournoux) coordinates. The techniques for automatically generating transformed functional data sets from manually labeled anatomical data sets are described. Facilities are provided for several types of statistical analyses of multiple 3D functional data sets. The programs are written in ANSI C and Motif 1.2 to run on Unix workstations.

False Detection Rate stuff: How to use FDR manual

3dDeconvolve handout: a pretty clear explanation of 3dDeconvolve, using the general linear model as a starting point...

Artifact Detection

PLEASE NOTE: This web page has been taken verbatim from the original Stanford Gabrieli Lab wiki, and contains outdated information about the Artifact Detection tool by Sue Gabrieli. For current information about the tool, please see the documentation contained in the current version of ART. We regret any confusion this may cause.

Artifact Detection

Artifacts are anything that disturbs a nice clean view of activation regions. They can be due to subject movement, drift, scanner fluctuations, hemodynamics, breathing, and other physiological noise. Normally, the effects of subject movement are removed during preprocessing, and drift and other slow scan variations are removed as regressors. Whatever is left after this data cleaning is a potential artifact. Given that these effects are normally in the data, the real issue is whether the artifacts are large enough to cause data fluctuations too severe for sensitive statistical detections to be made from the data.

Artifacts can be detected automatically, or by having a user look at good displays of the data. In addition, the artdetect program has provisions for repairing bad data. The list below describes some of the artifact detection programs.

artdetect Program

Artdetect5 is the latest version of a script intended to allow visual inspection of global properties of fMRI data and automated or by-hand removal of 'outlier' scans, whose intensities are radically different from the mean of the timeseries.

"Radically different" means a user-set number of standard deviations from the mean, set interactively in the graphical viewing window. A default minimum threshold for outliers is set at 1.8% of the mean value. This limit corresponds to the 3.5-sigma limit for expected physiological noise with an RMS value of 0.5% on a 3T scanner.
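The default threshold follows directly from those numbers; a quick arithmetic check:

```python
# Checking the quoted default: a 3.5-sigma limit on physiological noise with
# an RMS of 0.5% of the mean gives roughly the 1.8% outlier threshold.
sigma = 0.005              # 0.5% RMS noise, as a fraction of the mean
threshold = 3.5 * sigma    # 3.5-sigma limit -> 0.0175, i.e. ~1.8% of the mean
print(threshold)
```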

The script offers several choices of removal methods - insertion of the global mean, interpolation from surrounding scans or simple removal from the timeseries. The repair function currently works only in SPM99 (to be fixed in the future).

New features of this version include an option to automatically generate a custom mask for the image, and an option to run cases without a movement parameter file, i.e. before realignment is done. Note that for some cases the automatically generated mask or a user-defined mask is necessary for accurate results. The automatic mask worked well in all our test cases. It writes an image file called ArtifactMask if you want to review the mask.

The program is called from the Global Variate button in the ROI-Toolbox (“roimod1”), and can also be called as “artdetect5” from a Matlab command line. It runs in either SPM2 or SPM99.

spm_movie, biac_movie

These programs show movies of all the collected data, so that a user may spot unusual features in the data by watching the movie.

spm_movie is well-described in the SPM documentation. A user chooses a slice plane, and the movie will cycle through all the scans on that slice plane.

biac_movie displays bulk data for quick visual review by a user. Typical time to process and display all the data in 100 scans of size (79,95,69) is a few minutes. Every voxel of scan data can be seen. Note that all these programs often run much faster if the data is local to your machine!

Each scan is made into a montage of slices with orientation chosen by input. The montage may be all slices or a larger view of 20 consecutive slices. The slice displays may either be image data or contrast data. Contrast data shows the difference between each image and a reference image, amplified so that small data variations are more visible. In movie mode, every scan montage is shown, followed by a time history of mean intensity and position of each scan. In slider mode, every scan montage is available by selecting the slider position. The display is a default Matlab window where the zoom button can be used to examine individual pixels, where each pixel corresponds to an individual voxel in the 3D image.


This program checks for artifacts at the slice level. It was written after the movie programs showed that some artifacts affected only particular slices in a 3D volume. Possible causes for these types of artifacts include sharp subject movement for a short duration of just a few slices within a volume, or dropouts in the data during processing. Note in the latter case that if the raw scanned data does not have a slice artifact, then a slice artifact in later preprocessing stages can be "repaired" simply by redoing the preprocessing.

The program has preset limits for global variations and slice variations, and when the limits are exceeded an artifact is declared. The program writes an artifact log and an artifact time history in the directory of the source images. It runs automatically; the user need only review the output.

Time Series Explorer

This program is accessed from the "roimod1" toolbox. It lets a user examine time series on particular voxels, in order to see if there are unusual spikes in the data. The program includes many other functions.

Basic Statistical Modeling FAQ

Frequently Asked Questions - Basic Statistical Modeling

1. What is "estimating a model?" How do the various programs do it?

Once you've performed all the spatial preprocessing you like on your functional data, you're ready to test your hypotheses about the data. Most standard analysis pathways in fMRI proceed in a hypothesis-driven fashion based on the general linear model (GLM), in which the researcher sets up a model of what she believes the brain may be doing in response to the variations in some parameter of the experiment, and then tests the truth of that hypothesis. (This contrasts with non-model-driven approaches like principal components analysis (PCA), which we'll talk about later). Estimating your model is the core statistical step of this process: the researcher describes some model of brain activity, and the program calculates a giant multiple regression of some kind to find out the extent to which that model correctly accounts for the real data, at every voxel in the brain.

Different programs have different methods of setting up a design matrix, but they all share certain elements: the user describes a set of different experimental conditions (or effects), and describes, for each of them, start times and end times (onset times and offset times, or durations). Experiments can have massively varying designs, from the simplest on-off block design to multi-condition randomly-timed event-related designs - check out the sections on experimental design for more on this. The basic hypothesis is that some voxels in the brain had their intensity values covary, to a statistically significant degree, with some combination of the experimental conditions and parameters. The design matrix has a row for each timepoint in the experiment (each functional image) and a column for each modeled experimental effect.

Usually, the user will then modify the design matrix to make it a more accurate model of what brain activity might be. Oftentimes, a constant term is added to the matrix, to account for the mean value of the session; sometimes linear or polynomial drifts are added to the matrix as well. Sometimes the columns of the matrix are convolved with some model of the hemodynamic response function, to reflect the blurring in signal the HRF applies to neural activity. (Another option is to simply separate the various timepoints for the response to a given condition into different columns, estimating each separately - a finite impulse response (FIR) model that effectively deconvolves the contribution of the HRF.) (See HrfFaq for more info.)
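As a rough illustration of those elements - and not any package's actual code - here is a minimal NumPy sketch of a design matrix with hypothetical onsets, a crude gamma-shaped HRF (not SPM's canonical one), and a constant column for the session mean:

```python
import numpy as np

# One column per condition, one row per timepoint, plus a constant column;
# each condition column is a boxcar convolved with a simple HRF model.
n_scans, tr = 100, 2.0                            # 100 images, TR = 2 s (made up)
onsets = {"A": [10, 40, 70], "B": [25, 55, 85]}   # onset scans (hypothetical)
duration = 5                                      # block length in scans

t = np.arange(0, 30, tr)
hrf = (t / 6.0) ** 2 * np.exp(-t / 6.0)           # crude gamma shape, illustration only
hrf /= hrf.sum()

X = np.zeros((n_scans, 3))
for j, cond in enumerate(["A", "B"]):
    boxcar = np.zeros(n_scans)
    for on in onsets[cond]:
        boxcar[on:on + duration] = 1.0            # ON during each block
    X[:, j] = np.convolve(boxcar, hrf)[:n_scans]  # blur the boxcar by the HRF
X[:, 2] = 1.0                                     # constant term (session mean)
print(X.shape)
```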

Once the design matrix is set up, the program uses the methods of GLM theory - essentially multiple regression - to calculate how accurately the model described by the design matrix accounts for the real data. The standard GLM equation is Y = XB + E, where Y is the time-varying intensities from one voxel, X is the design matrix, E is an error term, and B is the "parameters" or "beta weights" - a vector of values, one for each experimental condition, that tells the researcher how big the effect of the corresponding condition was in explaining the values at that voxel. If condition A's beta weight is significantly greater than condition B's beta weight at a given voxel, the hypothesis that A had a greater effect than B at that voxel is confirmed. Generally, programs create some voxel-by-voxel image of the beta weights - a beta image or parameter image.

Once the parameters are estimated, the program has both a measure of effect size and of error in the model for each voxel. Generally, the program then normalizes each effect size by the error to calculate some measure of statistical significance for effects - a contrast image (see ContrastsFaq for more info). The estimation, depending on the program used, the complexity of the model, and the number of images, can take a few seconds or several hours. Every major program, though, uses essentially the same methods of regression to estimate the betas, usually based on taking a pseudoinverse of the design matrix (see the Holmes et al. paper below for more details).
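The pseudoinverse estimation step can be sketched in a few lines of NumPy on synthetic single-voxel data (an illustration, not any package's implementation):

```python
import numpy as np

# Estimating betas by pseudoinverse: B = pinv(X) @ Y, with known effect
# sizes planted in the synthetic data so we can see them recovered.
rng = np.random.default_rng(0)
n_scans = 100
X = np.column_stack([rng.random(n_scans),        # one modeled effect
                     np.ones(n_scans)])          # constant term
true_b = np.array([2.0, 5.0])
Y = X @ true_b + 0.01 * rng.standard_normal(n_scans)  # small white noise

B = np.linalg.pinv(X) @ Y     # beta weights (parameter estimates)
E = Y - X @ B                 # residual error term
print(np.round(B, 2))
```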

2. When should global mean scaling be used? What does it do?

Nutshell answer - global mean scaling should be used for PET, but not for fMRI.

Longer answer: One problem in neuroimaging experiments is that you're generally trying to pick out some signal from a noisy timeseries at every voxel. One form that noise can take is a global shift in intensities across the whole brain, which can be caused by scanner thermal noise, subject movement, physiological effects, etc. One way to get rid of a whole bunch of those noise sources at once, then, would be to look for timepoints where every voxel in the brain shows the same sudden shift and infer that that's a change in global response, not a regional change, and therefore not of interest to you. A simple way of doing that is just by finding the global mean of every timepoint - a global mean timeseries - and dividing every voxel's timeseries by the global mean timeseries.
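On toy data, that scaling operation looks like this (a NumPy sketch for illustration only):

```python
import numpy as np

# Global mean scaling: divide each voxel's timeseries by the global mean
# timeseries (one value per scan), removing whole-brain intensity shifts.
rng = np.random.default_rng(1)
data = 100.0 + rng.standard_normal((50, 1000))   # scans x voxels, toy values
global_mean = data.mean(axis=1)                  # global mean timeseries
scaled = data / global_mean[:, None]             # remove global shifts
print(np.allclose(scaled.mean(axis=1), 1.0))     # every scan now has mean 1
```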

The obvious problem with this is that removing the effect of the global mean from your model means you also remove signal that covaries with the global mean. In PET, this wasn't a big deal - the changes in the global mean could be on a different order of magnitude from task-related changes, and so regional activations weren't likely to bias the global mean particularly. In fMRI, though, it's a problem. Global mean intensity shifts are often about the same size, or at least comparable, to the size of task-induced activations. So large activations can seriously bias the global mean calculation, such that the global mean will go significantly up with large activations. Removing the effect of the global mean will then remove those large activations as well.

Using global scaling has been tested fairly extensively now in fMRI, and it almost always seems to negatively affect the sensitivity of the analysis. Generally, it's a bad idea.

3. What is autocorrelation correction? Should I do it?

The GLM approach suggested above and used for many types of experiments (not just neuroimaging) has a problem when applied to fMRI. It assumes that each observation is independent of the next, and any noise present at one timepoint is uncorrelated from the noise at the next timepoint. On that assumption, calculating the degrees of freedom in the data is easy - it's just the number of rows in the design matrix (number of TRs) minus the number of columns (number of effects), which makes calculating statistical significance for any beta value easy as well.

The trouble with this assumption is that it's wrong for fMRI. The large bulk of the noise present in the fMRI signal is low-frequency noise, which is highly correlated from one timepoint to the next. From a spectral analysis point of view, the power spectrum of the noise isn't flat by a long shot - it's highly skewed to the low frequency. In other words, there is a high degree of autocorrelation in fMRI data - the value at each time point is significantly explained by the value at the timepoints before or after. This is a problem for estimating statistical significance, because it means that our naive calculation of degrees of freedom is wrong - there are fewer degrees of freedom in real life than if every timepoint were independent, because of the high level of correlation between time points. Timepoints don't vary completely freely - they are explained by the previous timepoints. So our effective degrees of freedom is smaller than our earlier guess - but in order to calculate how significant any beta value is, we need to know how much smaller. How can we do that?
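A small simulation makes the point: the lag-1 correlation of AR(1) noise (coefficient 0.5 here, chosen arbitrarily) is far from zero, so successive timepoints are not independent and the naive degrees-of-freedom count is too generous.

```python
import numpy as np

# Generate AR(1) noise and measure its lag-1 autocorrelation.
rng = np.random.default_rng(2)
n, phi = 2000, 0.5                  # AR(1) coefficient (illustrative)
noise = np.zeros(n)
for t in range(1, n):
    noise[t] = phi * noise[t - 1] + rng.standard_normal()

lag1 = np.corrcoef(noise[:-1], noise[1:])[0, 1]  # sample lag-1 autocorrelation
print(round(lag1, 2))               # close to phi, not to zero
```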

Friston and Worsley made an early attempt at this in the papers below. They argued that one way to account for the unknown autocorrelation was to essentially wash it out by applying their own, known, autocorrelation - temporally smoothing (or low-pass filtering) the data. The papers below extend the GLM framework to incorporate a known autocorrelation function and correctly calculate effective degrees of freedom for temporally smoothed data. This approach is sometimes called "coloring" the data - since uncorrelated noise is called "white" noise, this smoothing essentially "colors" the noise by rendering it less white. The idea is that after coloring, you know what color you've imposed, and so you can figure out exactly how to account for the color.

SPM99 (and earlier) offer two forms of accounting for the autocorrelation - low-pass filtering and autocorrelation estimation (AR(1) model). The autocorrelation estimation corresponds more with pre-whitening (see below), although it's implemented badly in SPM99 and probably shouldn't be used. In practice, however, low-pass filtering seems to be a failure. Tests of real data have repeatedly shown that temporal smoothing of the data seems to hurt analysis sensitivity more than it helps, and harm false-positive rates more than it helps. The bias in fMRI noise is simply so significant that it can't be swamped without accounting for it. In real life, the proper theoretical approach seems to be pre-whitening, and low-pass filtering has been removed from SPM2 and continues to not be available in other major packages. (See TemporalFilteringFaq for more info.)

4. What is pre-whitening? How does it help?

The other approach to dealing with autocorrelation in the fMRI noise power spectrum (see above), instead of 'coloring' the noise, is to 'whiten' it. If the GLM assumes white noise, the argument runs, let's make the noise we really have into white noise. This is generally how correlated noise is dealt with in the GLM literature, and it can be shown that whitening the noise gives the least-biased parameter estimates possible. The way to do this is simply by running a regression on your data to find the extent of the autocorrelation. If you can figure out how much each timepoint's value is biased by the one before it, you can remove the effect of that previous timepoint, and that way only leave the 'white' part of the noise.

In theory, this can be very tricky, because one doesn't actually know how many previous timepoints are influencing the current timepoint's value. Essentially, one is trying to model the noise, without having precise estimates of where the noise is coming from. In practice, however, enough work has been done on figuring out the sources of fMRI noise to have a fairly good model of what it looks like, and an AR(1) + w model, where each noise timepoint is some white noise plus a scaling of the noise timepoint before it, seems to be a good fit (it's also described as a 1/f model). This pre-whitening is available in SPM2 and BrainVoyager natively and can be applied to AFNI (I think). This procedure essentially estimates the level of autocorrelation (or 'color') in the noise, and removes it from the timeseries ('whitening' the noise).
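A minimal sketch of AR(1) pre-whitening on simulated noise (illustrative only; real packages estimate the noise model much more carefully):

```python
import numpy as np

# Estimate how much each timepoint is predicted by the previous one, then
# subtract that contribution so the remaining noise is approximately white.
rng = np.random.default_rng(3)
n, phi = 2000, 0.6
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi * y[t - 1] + rng.standard_normal()   # correlated "noise"

phi_hat = np.corrcoef(y[:-1], y[1:])[0, 1]          # estimate AR(1) coefficient
white = y[1:] - phi_hat * y[:-1]                    # whitened residuals
lag1_after = np.corrcoef(white[:-1], white[1:])[0, 1]
print(round(lag1_after, 2))                         # near zero after whitening
```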

Theoretically, it should work well, but as its adoption is relatively new to the field, few rigorous tests of the effectiveness of pre-whitening have been done. We'll keep you posted as more info arrives...

5. How does parametric modulation work? When would I use it?

As described above, there are all kinds of modifications the researcher can make to her design matrix once she's described the basics of when her conditions are happening. One important one is parametric modulation, which can be used in a case where an experimental condition is not just ON or OFF, but can happen at a variety of levels during the experiment. An example might be an n-back memory task, where on each trial the subject is asked to remember what letter happened n trials before, where n is varied from trial to trial. One hypothesis the researcher might have is that activity in the brain varies as a function of n - remembering back 3 trials is harder than remembering 1, so you might expect activity on a 3-back trial to be higher than on a 1-back. In this case, a parametric modulation of the design matrix would be perfect.

Generally, a parametric modulation is useful if you have some numerical value for each trial that you'd like to model. This contrasts with having a numerical value to model at each timepoint, which would be a time for a user-specified regressor (see below). In the parametric case, the user specifies onset times for the condition, and then specifies a parameter value for each trial in the condition - if there are 10 n-back trials, the user specifies 10 parameter values. The design matrix then modulates the activity in that column for each trial by some function of the parameter - linear, exponential, polynomial, etc. - set by the user. If the hypothesis is correct, that modulated column will fit the activity significantly better than an unmodulated effect. In SPM (and possibly others), the program splits the effect into two columns - an unmodulated effect and a parametric column - so that the researcher can separately estimate the effect of the condition itself and of the parametric modulation of that effect. This would be a way of separating out, in the example, the effect of doing any kind of retrieval from the load effect of varying the parameter.
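A sketch of how those two columns might be built, with hypothetical onsets and n-back loads, stick functions at trial onsets, and a linear modulation (mean-centered so the two columns separate cleanly):

```python
import numpy as np

# One unmodulated column (a stick per trial) and one parametric column
# (sticks scaled by the mean-centered load n for each trial).
n_scans = 60
onsets = [5, 15, 25, 35, 45, 55]        # trial onsets in scans (made up)
loads = np.array([1, 3, 2, 3, 1, 2])    # the n in n-back for each trial

unmodulated = np.zeros(n_scans)
parametric = np.zeros(n_scans)
centered = loads - loads.mean()          # mean-center the parameter values
for on, p in zip(onsets, centered):
    unmodulated[on] = 1.0
    parametric[on] = p
print(unmodulated.sum(), round(parametric.sum(), 6))
```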

6. What's the best way to include reaction times in my model?

If you have events for which participants' response times vary widely (or even a little), your model will be improved by accounting for this variation (rather than assuming all events take identical time, as in the normal model). A common way of including reaction times is to use a parametric modulator, with the reaction time for each trial included as the parameter. In the most common way of doing this, the height of the HRF will thus be modulated by the reaction time. Grinband et al. (HBM06) showed this method actually doesn't work as well as a different kind of parametric regression - in which each event is modeled as an epoch (i.e., a boxcar) of variable duration, convolved with a standard HRF.

In other words, rather than assuming that neural events all take the same time while the HRF they're convolved with varies in height with reaction time (not very plausible, or, it turns out, efficient), the best way is to assume the underlying neural events vary in duration with reaction time, and convolve those boxcars (rather than "stick functions") with the same HRF.

In either case, as with most parametric modulation, the regressor including reaction time effects can be separate from the "trial regressor" that models the reaction-time-invariant effect of the trial. This corresponds to having one column in the design matrix for the condition itself (which doesn't have any reaction time effects) and a second, parametrically modulated one, which includes reaction times. If your goal is merely to get the best model possible, these don't need to be separated (only the second of the two, which includes RTs, could go in the model), but this will not allow you to separate the effect of "just being in the trial" from neural activations that vary with reaction time. To separate those effects, you need separate design matrix columns to model them. That choice depends on how interested you are in the reaction-time effect itself.
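The variable-epoch approach can be sketched like this, with hypothetical onsets and RTs and a crude gamma-shaped HRF standing in for a package's canonical one:

```python
import numpy as np

# Each trial is a boxcar whose duration equals that trial's reaction time;
# all boxcars are convolved with the same HRF, then sampled back to the TR grid.
tr, n_scans = 1.0, 80
onsets = np.array([10.0, 30.0, 50.0])   # trial onsets in seconds (made up)
rts = np.array([0.8, 2.4, 1.5])         # reaction times in seconds (made up)

grid = 0.1                               # fine time grid, since RTs are sub-TR
fine = np.zeros(round(n_scans * tr / grid))
for on, rt in zip(onsets, rts):
    i, j = round(on / grid), round((on + rt) / grid)
    fine[i:j] = 1.0                      # boxcar of duration = RT

t = np.arange(0, 30, grid)
hrf = (t / 6.0) ** 2 * np.exp(-t / 6.0)  # crude gamma shape, illustration only
hrf /= hrf.sum()
step = round(tr / grid)
regressor = np.convolve(fine, hrf)[:fine.size][::step]  # back to TR resolution
print(regressor.shape)
```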

7. What kinds of user-specified regressors might I use? How do I include them?

Another modification you can make to the design matrix is simply to add columns or effects that don't correspond to some condition you want convolved with an HRF. A user-specified regressor is just some vector of numbers, one for each timepoint/functional image, that you'd like to include in the model because you believe it has some effect. If you have a numerical value for each timepoint (TR/functional image) that you'd like to model, a user-specified regressor is the way to go. This contrasts with the case of having a numerical value for each trial you'd like to model, in which case you'd use a parametric modulation (see above).

An example of a user-specified regressor might be if you have continuous self-reports of positive affect from each subject, and you'd like to see where there are voxels in the brain whose activity co-varied with that affect. You could include the positive affect regressor in your model and have a beta value estimated separately for it. Depending on what your hypothesis is about that effect, you may want to lag its values to account for the hemodynamic delay.

The user-specified regressor is a powerful tool for many types of modifications to the design matrix, but note that in many obvious cases in which you might want to separate out the contribution of a given effect of no interest - things like movement parameters, physiological variation, low-frequency confounds, etc. - programs may already have ways to deal with those things built in, in a more efficient fashion. At the very least, in any case when you include a user-specified regressor that you plan to simply ignore, you should try to ensure it doesn't covary significantly with your task and hence remove task-induced signal.
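That last check can be as simple as a correlation between the nuisance regressor and the task regressor (toy example):

```python
import numpy as np

# Before adding a nuisance regressor, confirm it does not correlate strongly
# with the task regressor, or it will soak up task-induced signal.
rng = np.random.default_rng(4)
task = np.tile([1.0] * 10 + [0.0] * 10, 25)   # on/off block task, 500 scans
nuisance = rng.standard_normal(task.size)     # e.g. a physiological trace

r = np.corrcoef(task, nuisance)[0, 1]
print(abs(r) < 0.3)                           # flag worrying collinearity
```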


Basic Statistical Modeling Papers

Useful Papers - Basic Statistical Modeling


Friston et al. (1995), "Analysis of fMRI time-series revisited," NeuroImage 2, 45-53 PDF

Worsley & Friston (1995), "Analysis of fMRI time-series revisited - again," NeuroImage 2, 173-181 PDF

Summary: Friston et al. is the theoretical work extending the GLM to account for a known autocorrelation function, enabling the 'coloring' approach to noise autocorrelation in fMRI. The authors argue that swamping unknown autocorrelation by temporally smoothing the data with a known kernel can produce less-biased parameter estimates than no correction. Worsley & Friston is essentially a correction to the Friston et al. paper, fixing up some math issues.

Bottom line: Seemed like a good idea at the time, but the temporal smoothing approach to autocorrelation correction has been pretty discredited at this point for most fMRI work. This is useful historical background, though. Check out TemporalFilteringFaq for more details.


Holmes et al. (1997), "Characterizing brain images with the general linear model," in: Frackowiak et al. (Eds.), Human Brain Function, San Diego: Academic Press, 59-84 PDF (or see Jeff for paper copies)

Summary: A ground-up description of the general linear model and how it's applied to neuroimaging data. Describes how the GLM works in good (but not too dense) mathematical detail, and how it's modified in the case of PET (and fMRI, to a shorter degree).

Bottom line: Actually an extremely helpful and quite intelligible description of the GLM as it's applied to neuroimaging. Very useful background on statistical analysis.

Bandettini et al. (1993), "Processing strategies for time-course data sets in functional MRI of the human brain," Magnetic Resonance in Medicine 30, 161-173 PDF (or see Jeff for paper copies)

Summary: Describes the physical and theoretical background for fMRI and a variety of different analysis pathways for dealing with data. One of the earliest papers to describe statistical analysis of fMRI data in thorough detail. Strategies described include voxel-by-voxel analysis, GLM methods, frequency domain methods, cross-correlation methods, and a number of others.

Bottom line: Historical background, more than anything. A good number of the strategies outlined here are pretty obsolete.


Common SPM Errors

Common SPM Error Messages

This page collects the most common error messages found in SPM, and suggests some typical ways to troubleshoot or fix the error. Feel free to add your own!

Error Message: "Error reading information on <image file name>. Please check that it is the correct format."
Error Message: "Can't get volume information for <image file name>."
Situation: These two errors often occur together, and sometimes even pop up a warning box with the first message. The errors come from spm_vol, the function that maps .img files to Matlab's workspace. spm_vol is called in a variety of situations - beginning to estimate a model, beginning any spatial preprocessing step, or beginning any of the ArtifactDetection tools in the GablabToolbox. Any time you've got images being read, basically, you could get this error.
Cause: This is a generic error message that crops up if there's any trouble finding or reading an image file name. It can occur because the named file doesn't exist or isn't in the specified location. It can occur if the .img file is present, but the .hdr file is not, or vice versa. Or if there is some corruption in the .img file, or it's not a standard Analyze-format image, you'll get this as well.
Common Fixes: 90% of the time this error message comes up, it's because the image file it's looking for isn't in the right place. The very, very first thing to do is look in the specified directory and make sure both the .hdr and the .img files you've specified are in there. But what if you don't know where SPM is looking for them? The easiest way to find out is to turn on the Matlab debugger with dbstop if error and then run the exact steps that caused the error again. (See MatlabDebugging for more info on the debugger.) When the program stops again, your Matlab prompt will look like this: K>>. At that prompt, type P(i,:). This asks Matlab to print the filename that spm_vol had the error with (stored in a matrix called P). When Matlab prints the filename, inspect it and make sure it lines up with where you think the images should be. You should use cd to try to go to that named directory to make sure it exists - oftentimes there will be a typo in the directory name. If there is no directory name listed, only a filename, Matlab is looking for them in the present working directory, so check there. If the directory exists, use dir to list the files in there and make sure both the .hdr and .img files are in there.

If it turns out the image files aren't in the right place, you need to adjust where SPM looks for them; this might mean fixing a typo in a script file, or something more complicated. A common source of this problem is moving or renaming or compressing the image files after you've specified the design, and then trying to estimate the model or run a script that calls the design file. In this case, you have two options: 1) move the files back to where they're supposed to be (or uncompress or rename them), 2) use the modifySPM script or a variant to change the design file to point in the right place. See GablabScripts for more info on those functions.

If you're really sure both the .hdr and .img files are in the exact directory they're supposed to be, then try looking at the file in question with SPM's "Display" button - see if it will load. If not, the file may be corrupted and you may have to re-create it or replace it with a backed-up version.

Error Message: "Cant open image file."
Situation: A variant of the error above, except this is called by spm_sample_vol, the function SPM uses to get values from individual voxels in an image, rather than read the whole image. Can be called in all the same situations - estimating a model, using a GablabToolbox function, preprocessing, etc.
Cause: Same reasons as above - some problem in opening a filename.
Common Fixes: Again, 90% or more of the time, this error is due to an image file not being in the right place. With spm_sample_vol, though, it can be trickier to figure out where the problem filename points, because just running the debugger won't put you into spm_sample_vol directly. (It's a compiled C function, so it can't be accessed by the debugger.) Run the debugger, and then when you have the K>> prompt, take a look at the error message. It should print the line the error came from: something like "On line 252 ==> Y(j) = spm_sample_vol(VY(j), X, Y, Z, 1)". The first argument to spm_sample_vol is a structure that has the filename of the desired image in it. So if you take the name of that variable - in this case VY(j) - and add .fname to it, you'll get Matlab to print the filename. In this case, at the K>> prompt, you'd type VY(j).fname to get the filename that's giving you trouble. Then follow the steps above - use cd and dir to try and find the image file, and do what you need to do to get it in the right place or change the design file that points to it.

Error Message: "Error while evaluating uicontrol Callback."
Situation: Almost anytime, anywhere - anytime you're using a graphical interface.
Cause: This is a totally generic error message in Matlab that crops up whenever anything goes wrong in a function you've called from a graphical interface. By itself, it doesn't mean anything and doesn't give you any information about what went wrong. Any element in a graphical interface you can push is called a uicontrol in Matlab, and any function called from a uicontrol is called a callback - so all this says is you have some error in a function that was just called from the interface.
Common Fixes: Almost always, you can force Matlab to give you more information. If you find the button that you clicked on to call the function that crashed, and close the window that button is in (i.e., the SPM results interface, the SPM control window, the GablabToolbox, etc.), you can often get a more detailed error message out of Matlab. In the non-graphical Matlab interface, sometimes simply hitting return a few times will do the trick. Finally, if you turn on the Matlab debugger with dbstop if error and re-run the steps that caused the error, the error message during the debugger should be much more comprehensive. See MatlabDebugging for more on reading error messages and the debugger.

Error Message: "Index exceeds matrix dimensions."
Situation: Almost any situation during SPM, but most common during a batch script or other user-programmed interaction.
Cause: This is a pretty generic error in Matlab that simply indicates you tried to reference an element of a vector that doesn't exist. If A is a vector with five elements, and you try to say B = A(6), for example, you'll get this error.
Common Fixes: Fixing this error generally involves using the Matlab debugger, so see MatlabDebugging for a little primer if you're unfamiliar with it. Launch the graphical version of Matlab, start the debugger with dbstop if error and re-run the steps that caused the error. When it happens, read the line of the function that caused the error (the text window that pops up should have it highlighted with a green arrow). See if you can figure out what matrix is being referenced and what the index is. Use whos to figure out how big the local matrices are. Oftentimes the help text at the top of the function or the SPM Programmer's Cheat Sheet (see GablabScripts) can be helpful in understanding what the various named variables refer to. Find out the value of the index - often named i or j - and see if it's too big, then look back a few lines in the code and see if you can understand what i or j might be set to. Common reasons you might get this include: your batch script has nsess equal to one number, but the number of images you specified with nscans or selecting your images doesn't line up; your number of subjects doesn't line up with your number of images in a large batch script; your number of conditions isn't the same in every session, but you haven't accounted for that in your batch script; etc. See MatlabProgramming for some more tips on reading m-files and so forth.

Connectivity FAQ

Frequently Asked Questions - Connectivity

1. What is functional connectivity? What is effective connectivity?

The concept of "brain connectivity" is, as Horwitz points out on ConnectivityPapers, rather a tricky one to define. Ideally, you'd like to be able to measure the spatial (and temporal) path that information follows, from one point to another, millisecond by millisecond, and neuron to neuron (or at least region to region), in a directed fashion, such that you could say, "Ah, yes, activation starts in the visual cortex, moves to V2, gets shuttled from there to these other three visual areas and parietal cortex, and from parietal there to this other bit." Then you'd know something about what was being calculated and what calculations were being done where (and when). But, of course, you can't do that (yet). In fact, in general, most neuroscience recording methods, be they single-cell recording or fMRI or anything else, deal with isolated units of analysis - single cells or single voxels. You can't, in general, measure one neuron's connection to another very well in a living, behaving animal, much less noninvasively in a person.

What you can do is sample several sites at once and try and see how the patterns of activity you get are connected to each other. An obvious pattern to look for would be if two sites/voxels have intensities that are highly correlated. If the timeseries from one voxel looks exactly like the timeseries from another voxel, it might be a good bet they're doing similar things. If they're right next to each other, you call it a cluster; if they're far away from each other - say, in visual cortex and PFC - you might guess they're connected to each other somehow.
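In its simplest form, that kind of functional connectivity is just a correlation of timeseries. Here's a toy numpy sketch with synthetic data - the scan count, the number of voxels, and the injected shared signal are all invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n_scans, n_voxels = 120, 5
shared = rng.normal(size=n_scans)                # a common underlying signal
data = rng.normal(size=(n_scans, n_voxels))      # independent noise per voxel
data[:, 0] += shared                             # voxel 0: the "seed"
data[:, 3] += shared                             # voxel 3: remote but coupled

seed = data[:, 0]
# Functional connectivity "map": correlate every voxel with the seed
r = np.array([np.corrcoef(seed, data[:, v])[0, 1] for v in range(n_voxels)])
# Voxel 3 shows a much higher correlation with the seed than the uncoupled voxels
```

Real analyses run this over tens of thousands of voxels and worry about confounds like motion and low-frequency drift, but the core computation is no fancier than this.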

Trouble is, of course, you run into the old adage that correlation doesn't imply causation. High correlation between remote sampling sites might imply some direct connection, or it might imply some third site driving their joint activation, or it might imply them jointly driving some third site. And even if they are connected, it's difficult to tell the direction of the connection, even if there is a "direction" to it.

Hence the two different terms used to describe connectivity in neuroimaging, a split introduced by Friston in 1993. Functional connectivity is the correlation concept - it's a descriptive concept, simply defined as the temporal correlation between remote timeseries or samples or what have you. Finding functional connectivity essentially reduces, as Lee et al. point out, to finding whether activity in two regions share any mutual information or not. Effective connectivity, by contrast, is the causation concept. It's defined as "the influence one neural system exerts over another either directly or indirectly." It doesn't imply a direct physical connection - simply a causative influence. It's a concept meant to support explanation and inference, more than just description, and it requires some account of causative direction or why there isn't any. It's also a lot trickier to figure out, generally, than functional connectivity. You'll hear both terms tossed around a fair amount, but remember: functional is simply correlation, whereas effective requires some causation somewhere.

2. How do I measure connectivity in the brain?

Good question. Almost every method for functional neuroimaging has ways to measure connectivity, and almost all of them boil down to the same concept: measuring the connection between timeseries at different points in the brain. In other words, almost every method out there measures functional connectivity, rather than directly measuring effective connectivity. Whether you're doing EEG and correlating timeseries from different electrodes, or using the fanciest dynamic causal modeling mathematical strategy with fast-TR fMRI, you're restricted generally to the data you can measure, which are samples from voxels that are treated independently. You can rule out some possible directions of influence by rules like temporal precedence (if a spike in one area precedes one from another area, the latter area can't have caused the earlier spike), but in general, most connectivity measures work on this simple foundation: Sample timeseries of activity from many different areas (voxels, electrodes, etc.) and then mathematically derive some measure of the mutual information between selected timeseries.

There are a couple obvious exceptions to this foundation. Measuring anatomical connectivity is a different type of procedure, and it's not clear how much influence anatomical connectivity (as we can measure it) and functional connectivity have with each other - or should have with each other in analysis. Lee et al. (ConnectivityPapers) have an intriguing discussion on this point (and many others). At some level, of course, if we're interested in finding out whether information is flowing from one neuron to another, it's useful to know if they're directly connected or not. But we're a long way off from those sorts of measures on a large scale, and it's not clear that coarser measures are all that useful in learning about functional connectivity. Anatomical connectivity on its own can be incredibly interesting, though, which is why diffusion tensor imaging (DTI) is becoming increasingly popular as an imaging modality. The idea of DTI is that it can extract a measure of directionality of the white-matter tracts in a given voxel, giving you a picture of where white matter is pointing in the brain. This can be used to infer which areas are strongly connected to each other and which less so. And, of course, many older techniques for measuring connectivity in animals - staining, tracing, etc. - are still widely used.

The other big exception to the rule of measuring correlation is techniques that can directly measure causality by disrupting some part of the system. If you can knock out part of the system and cause a part hypothesized to depend on it to fail, while knocking out the latter part doesn't affect the former, you can start to make some inferences about directionality of influence. In living humans, the latest way to do this is with transcranial magnetic stimulation (TMS), which seems to offer some ways to disrupt selected cortical areas temporarily, reversibly and on command. Although the technique is relatively new, it holds high promise as an additional tool in the connectivity toolbox. Other methods for disruption - cortical cooling, induced lesions in animals, even lesion case studies in humans - can provide valuable information on this front as well.

3. What are the different methods to analyze connectivity in fMRI? How do they differ from each other?

In a field burdened with a heavy load of meaningless acronyms and technical jargon, connectivity analyses stand out as a particular offender. There is what seems like a dizzying array of ways to model connectivity in fMRI, each with its own acronyms, fancy-sounding concepts, and a great number of equations underlying it. The important thing to remember in all of them is that the data input is essentially the same: it's just timeseries data. And the underlying computations are all essentially doing the same thing - looking for patterns in the data that are similar between regions. Some methods literally attempt to do the same thing, but for the most part, the different methods proliferate because they examine slightly different aspects of connectivity. So the important things to think about when faced with interpreting or performing any connectivity analysis are its goals: what is the point of this analysis? What does its output measure? What are the alternate possible explanations that this analysis has ruled out, or failed to rule out?

There's a broad distinction you can make in connectivity analyses between model-driven and non-model-driven analyses. Non-model-driven analyses are those which don't "know" anything about the details of your experiment - some of them are called "blind" algorithms because they're searching for patterns in your data without knowing what the structure of your experiment was. These types of analyses can be used to get at activation in general, but they're probably more widely used in connectivity analyses. Principal components analysis (PCA) and independent component analysis (ICA) are non-model-driven types of analyses. I won't talk much about them yet here, because I don't know much about them right now. Anyone else out there, feel free to contribute some info...
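Just to give a flavor of what a non-model-driven analysis does, here's a toy PCA via SVD on a synthetic time-by-voxel matrix; the decomposition recovers a shared temporal component without ever seeing a design matrix. All the numbers below - scan and voxel counts, which voxels carry the signal, the noise level - are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n_scans, n_voxels = 120, 50
timecourse = np.sin(np.linspace(0, 8 * np.pi, n_scans))  # shared temporal pattern
loading = np.zeros(n_voxels)
loading[10:20] = 1.0                                     # the voxels carrying it
data = np.outer(timecourse, loading) + 0.3 * rng.normal(size=(n_scans, n_voxels))

# PCA via SVD on the mean-centered (time x voxel) matrix - "blind" to the design
centered = data - data.mean(axis=0)
U, s, Vt = np.linalg.svd(centered, full_matrices=False)
first_component = U[:, 0]                  # dominant temporal component
first_map = Vt[0]                          # its spatial loadings across voxels
var_explained = s[0] ** 2 / np.sum(s ** 2) # fraction of variance it accounts for
```

The recovered component correlates strongly with the hidden timecourse (up to an arbitrary sign flip, which is inherent to PCA), and its spatial map picks out the voxels that carried it.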

The popular forms of model-driven analysis are those embedded into the popular neuroimaging programs, and SPM2 has recently added a couple to the field which are getting wide use. Psychophysiological interactions (PPIs) have been used in SPM and other programs for a while, but SPM2 has automated this analysis to make it a lot easier to perform. They start with a seed ROI and look for other regions that have high changes in connection strength to the seed as the experiment proceeds. Dynamic causal modeling (DCM) is Karl Friston's latest addition to the modeling tradition, and it's a much more all-encompassing form of connectivity analysis; he claims it subsumes all earlier forms of analysis as well as the standard general linear model activation analysis. DCM starts with a set of ROIs and attempts to determine the influence of each on the other and the experiment's influence on the connection strengths. BrainVoyager is soon to release a connectivity package based on the concept of autoregressive modeling and Granger causality - a way of ruling out some directions of causality. This isn't out yet, due to a patent dispute, so I don't know much about it. Structural equation modeling (SEM) is used when you have a set of ROIs you'd like to investigate, but aren't sure what the links between them may be; it's a way of searching among the possible graphs that connect your ROIs and ruling out some connections while including others. Which of these you decide to use will depend on your experimental goals and what you want this analysis to show, exactly.

4. What is a psychophysiological interaction (PPI) analysis? How do I do it? Why would I want to?

A PPI analysis starts with an ROI and a design matrix. It's a way of searching among all other voxels in the brain (outside the seed ROI) for regions that are highly connected to that seed. One of the most straightforward ways of doing connectivity analyses would be to start with one ROI and simply measure the correlation of all other voxels in the brain to that voxel's timeseries, looking for high correlation values. As Friston and others pointed out a while ago, though, it's not quite as interesting if the correlation between two regions is totally static across the experiment - or if it's driven by the fact that they're both totally non-active during rest conditions, say. What might be more interesting is if the connection strength between a voxel and your seed ROI varied with the experiment - i.e., there was a much tighter connection during condition A between these regions than there was during condition B. That may tell you something about how connectivity influences your actual task (and vice versa).

PPIs are relatively simple to perform; you extract the timeseries from a seed voxel or ROI and multiply it, element by element, by a vector representing a contrast in your design matrix (say, A vs. B). You then put this new PPI regressor into a general linear model analysis, along with the timeseries itself and the vector representing your contrast; you'll use those to soak up the variance from the main effects, which you'll ignore in favor of the PPI interaction term. When you estimate the parameters of this new GLM, the voxels where the PPI regressor has a very high parameter are those that showed a significant change in connectivity with your experimental manipulation.
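Here's a toy numpy sketch of that recipe. Note that it forms the interaction as a simple element-by-element product at the BOLD level - skipping the HRF deconvolution that SPM2 adds - and all the numbers (block lengths, coupling strengths, noise levels) are invented:

```python
import numpy as np

rng = np.random.default_rng(0)
n_scans = 160
# A vs. B block contrast, coded +1 / -1 (hypothetical block design)
psych = np.tile(np.repeat([1.0, -1.0], 20), 4)
seed = rng.normal(size=n_scans)                  # seed ROI timeseries
# Simulated target voxel: coupled to the seed only during condition A
coupling = np.where(psych > 0, 0.8, 0.0)
voxel = coupling * seed + rng.normal(scale=0.5, size=n_scans)

# PPI regressor: the interaction (element-by-element product) of seed and contrast
ppi = seed * psych
# GLM with both main effects plus the interaction; the main effects soak up
# variance that we then ignore in favor of the PPI term
X = np.column_stack([np.ones(n_scans), seed, psych, ppi])
beta, *_ = np.linalg.lstsq(X, voxel, rcond=None)
# beta[3] is the PPI effect: how much seed-voxel coupling differs between A and B
```

Because the simulated coupling is 0.8 during A and 0 during B, the estimated PPI parameter comes out near 0.4 (half the difference), with the seed main effect carrying the other half.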

This is doable in SPM99, or indeed any program; SPM2 makes it more automated, and adds some mathematical wrinkles, like deconvolving the HRF from your PPI regressor so as to look for interactions at the deconvolved (and hopefully neural) level, rather than at the HRF level.

PPIs are good to do if you have one ROI of interest and want to see what's connected with it. They're tricky to interpret, and they can take a really long time to re-estimate if you have several ROIs to explore and many subjects.

5. What is structural equation modeling (SEM)? How do I do it? Why would I want to?

Structural equation modeling analyses begin with a set of ROIs and nothing else. The idea in SEM is to estimate the connection strengths between those ROIs that make up the best possible model of the connections between them. The connection strengths are correlational (not directional), representing the straightforward degree of correlation between the timeseries of those regions. This strategy (and variants of it) also falls under the title "path analysis," although that's a broader term that can describe analyses of non-timeseries data. SEM procedures vary, but they're all kind of like the GLM: they search through the space of possible connection strengths until they find the set that best fits the data.
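For a simple recursive path model with no latent variables, those connection strengths reduce to a series of ordinary least-squares fits. Here's a toy numpy sketch with a known A -> B -> C chain built into the synthetic data - this isn't full SEM (no model search, no fit indices), just the flavor of recovering path strengths from timeseries, and all the regions, strengths, and noise levels are invented:

```python
import numpy as np

rng = np.random.default_rng(0)
n_scans = 200
# Synthetic ROI timeseries with a built-in chain: A -> B -> C
a = rng.normal(size=n_scans)
b = 0.7 * a + rng.normal(scale=0.5, size=n_scans)
c = 0.5 * b + rng.normal(scale=0.5, size=n_scans)

def path_coef(x, y):
    """Least-squares path coefficient for y ~ x."""
    return float(np.dot(x, y) / np.dot(x, x))

p_ab = path_coef(a, b)   # should recover something near 0.7
p_bc = path_coef(b, c)   # should recover something near 0.5
```

A real SEM package would instead search over candidate connection graphs and score each by how well its implied covariance matrix matches the observed one, but the quantities being estimated are of this kind.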

The measure of "best fit" is an important choice in SEM, and there's not wide agreement on the measure you should use, except a common suggestion that you use more than one and combine their results. Other bells and whistles on SEM analyses can include bootstrapping the data (see PthresholdFaq for information on permutation tests and bootstrapping) to get a confidence interval on how good the model could possibly be (Bullmore et al. (2000), NeuroImage 11, describe this strategy).

SEM isn't built in to any of the major neuroimaging programs that I know of, but several statistics programs support it (as it's widely used in the social sciences).

SEM is good to do when you have a set of ROIs - either functional or anatomical - and you're interested in knowing how strong the connections are between them (or whether connections between a particular pair exist at all) across the whole experiment (or part of it). It's a pretty straightforward style of analysis, but because of that, it doesn't take into account a lot of details of fMRI - temporal variations in the connection strengths, for example.

6. What is Granger causality? How does it relate to brain connectivity?

Granger causality is a concept imported from economics, where it was developed to do timeseries modeling of economic data (weird that that's the kind of data economists would want to look at - economic data, you know. Strange guys, those economists). It's an attempt to impose some directionality on connections between timeseries, or at least rule out some directions, by leveraging the rule of temporal precedence. The core of the Granger causality idea is that events can't cause events that already happened - so if a particular pattern happens in one timeseries, and then happens later in another timeseries, the latter one can't have caused the former one. Granger causation is a very limited form of causality, because it doesn't rule out the possibility that some third factor has induced the change in both of the timeseries, or any of the other problems common to ascribing causality to correlation data, but it's a start in the direction of blocking off certain directions of arrows.
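The core idea can be sketched with a pair of autoregressive fits: if adding x's past to a model of y's past substantially improves the prediction of y, then x "Granger-causes" y. This toy numpy version builds two coupled AR(1) series where x drives y with a one-step lag (all coefficients and lengths are invented), then compares restricted and full models in each direction:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
x = np.zeros(n)
y = np.zeros(n)
for t in range(1, n):
    x[t] = 0.5 * x[t - 1] + rng.normal()
    y[t] = 0.5 * y[t - 1] + 0.6 * x[t - 1] + rng.normal()  # x's past drives y

def rss(target, predictors):
    """Residual sum of squares from a least-squares fit."""
    beta, *_ = np.linalg.lstsq(predictors, target, rcond=None)
    resid = target - predictors @ beta
    return float(resid @ resid)

ones = np.ones(n - 1)
# Does x's past improve prediction of y beyond y's own past?
restricted = rss(y[1:], np.column_stack([ones, y[:-1]]))
full = rss(y[1:], np.column_stack([ones, y[:-1], x[:-1]]))
F_x_to_y = (restricted - full) / (full / (n - 1 - 3))

# And the reverse direction, which should show little improvement
restricted_r = rss(x[1:], np.column_stack([ones, x[:-1]]))
full_r = rss(x[1:], np.column_stack([ones, x[:-1], y[:-1]]))
F_y_to_x = (restricted_r - full_r) / (full_r / (n - 1 - 3))
```

The F statistic for the x-to-y direction comes out large, while the reverse direction stays near chance - exactly the asymmetry Granger causality is designed to detect. Real packages extend this to more lags and many regions at once (vector autoregression).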

The most explicit use of Granger causality has been in the connectivity package being developed for BrainVoyager, detailed below in Goebel et al. (ConnectivityPapers). The package is also based on the use of vector autoregressive modeling, which I couldn't begin to explain in detail but I gather is kind of like dynamic causal modeling or something like that. Unfortunately, the package has been held up in patent disputes, so it's not clear when we'll get to evaluate it up front. I'm not aware of other packages or programs currently using those methods to evaluate connectivity.

7. What is Dynamic Causal Modeling (DCM)? How do I do it? Why would I want to?

Well, this is another one that I'm undoubtedly going to botch the explanation for. But I'll take a very limited stab at it. If you're interested in this analysis, I highly recommend reading Friston et al.'s paper on it at ConnectivityPapers.

DCM analyses are highly model-driven. You start with a set of ROIs and a guess at how they're connected with each other. That guess can be "fully connected," with every ROI attached to every other, or you can eliminate some connections off the bat. DCM then takes as its input your design matrix and the timeseries from those regions, and attempts a sort of hyper-advanced general linear model estimation. Instead of a general linear model, though, DCM explicitly considers some nonlinear aspects to the experiment: specifically, the connections between your ROIs and how they might change with the experimental manipulation. It goes through a huge set of Bayesian estimations and deconvolutions and every other fancy thing you can think of, and what you get on the way out is a big set of parameters. That set will include: HRFs for each of your regions, "resting" connection strengths between each of your regions, beta weights describing how the experiment affected each of your regions (just like regular beta weights), and "connection beta weights," indicating how the experimental manipulation affected your connection strengths. It'll also spit out some estimation of the statistical significance of each of these.

Friston et al. are hyped on this analysis; they believe that all the other analyses out there (SEM, PPI, etc.) are all special cases of DCM. Even the standard general linear model analysis of activation is a special case, they say, where you're assuming there are no connections between ROIs, and your ROIs are your voxels. A few papers have been put out thus far - Mechelli et al. (below) is one - using DCM in big analyses, with fairly promising results.

DCM is built into SPM2, and requires you to have SPM2 results to use it. It's not available yet for any other neuroimaging program.

DCM is great if you've got a set of ROIs, a hypothesis about how they might work, and you're particularly interested in how some areas or conditions might influence the connections between some other areas. Mechelli et al. (ConnectivityPapers) is a good example of this - they looked at whether differences in visual activations due to categories of stimuli were mediated from the bottom up or from the top down. It's also kind of insanely complicated right now, and clearly in a sort of feeling-out phase in the community. Results may be difficult to interpret. But it's definitely the cutting edge of fMRI connectivity research for model-driven analyses right now.

8. How do I measure connectivity across a group?

Almost all of these methods measure correlations between timeseries, and so they're only appropriate to do at the individual level. The best way to run a group analysis is in the standard hierarchical fashion - take the output of the individual analysis and toss it into a group analysis. The output from all of them won't be the same - for PPIs, for example, you'll get an activation image, which works in SPM for a standard group-level analysis, whereas for SEM you'll get a set of connection weights, which you can then run a standard statistical test on in SPSS - but the hierarchical approach should work fine in general.
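For the SEM-style case, the group-level step is as simple as a one-sample t-test on the per-subject connection weights. A minimal numpy sketch, where the subject count, the true mean weight, and the between-subject spread are all invented values:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical per-subject connection weights for one connection,
# taken from a first-level (single-subject) analysis
n_subjects = 12
weights = 0.3 + 0.2 * rng.normal(size=n_subjects)  # simulated true mean of 0.3

# Random-effects test: is the mean connection weight nonzero across subjects?
mean_w = weights.mean()
se_w = weights.std(ddof=1) / np.sqrt(n_subjects)
t_stat = mean_w / se_w  # compare against a t distribution with n_subjects - 1 df
```

The same logic applies to comparing two groups (a two-sample t-test on weights), which is essentially what the between-group SEM studies cited below do.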


Connectivity Papers

Useful Papers - Connectivity


Lee et al. (2003), "A report of the functional connectivity workshop, Dusseldorf 2002," NeuroImage 19, 457-465 PDF

Summary: An excellent overview of the state of connectivity analyses today, with a survey of current methods, pitfalls, and open questions. Many terms are clearly defined. The overview looks not only at human neuroimaging techniques, but at a variety of anatomical techniques from various species.

Bottom line: Invaluable primer on the issues in connectivity.

Horwitz (2003), "The elusive concept of brain connectivity," NeuroImage 19, 466-470 PDF

Summary: A sort of minority opinion, to go with Lee et al. (above). Essentially a deeper primer on the pitfalls of studying connectivity; Horwitz raises questions about just how clear any of the terms used in the field are at their core. In particular, he highlights the point that different neuroimaging techniques probably study very different types of connectivity, and their results may not be directly comparable.

Bottom line: Comparing connectivity across experimental techniques is necessary, yet done at your own risk...


Friston et al. (2003), "Dynamic causal modeling," NeuroImage 19, 1273-1302 PDF

Summary: The first introduction to the DCM concept, and necessary reading for the current state of connectivity research. The DCM framework essentially expands on the classic general linear model, adding in bilinear terms that attempt to model resting connectivity between regions and how experimental manipulations change that connectivity. Friston demonstrates how DCM encompasses a variety of other techniques (SEM, etc.) and explicitly includes the temporal dimension in assessing connectivity.

Bottom line: Well, it's big. And dense. But if you're at all interested in DCM, you should probably at least flip through it.

Mechelli et al. (2003), "A dynamic causal modeling study on category effects: bottom-up or top-down mediation," Journal of Cognitive Neuroscience 15, 925-934 PDF

Summary: A really interesting example of DCM use in a real-world setting. Authors used the dataset from another study, provided by the fMRI Data Center, and applied DCM to examine the connection between secondary visual areas and parietal regions. They discovered that category effects - differences in responses between chairs, faces, and houses - in occipital and temporal cortex are mediated by very early visual effects - not just top-down effects.

Bottom line: An excellent road map for a classic DCM analysis, and a useful handbook for those interested in trying it out.

Kondo et al. (2004), "Functional roles of the cingulo-frontal network in performance on working memory," NeuroImage 21, 2-14 PDF

Summary: A recent example of the SEM approach. Kondo et al. used SEM to model the connection between ACC and PFC in high-reading-span and low-reading-span groups, discovering a closer connection in the former than the latter. Some discussion of how SEM is carried out, as a guide for those wishing to try the method.

Bottom line: Nice example of how SEM works in a fairly complicated between-group study.

Goebel et al. (2003), "Investigating directed cortical interactions in time-resolved fMRI data using vector autoregressive modeling and Granger causality mapping," Magnetic Resonance Imaging 21, 1251-1261 PDF

Summary: Yet another approach to measuring connectivity, this time from the BrainVoyager group and attempting to impose some concept of directionality on the connectivity analyses. The concept of Granger causality - imported from economics - is described from a neuroimaging point of view. The VAR framework (a multivariate technique) is described and used to analyze a visuomotor mapping study.

Bottom line: Nice look at another type of connectivity work and the pitfalls of trying to impose directionality on it.


Contrasts FAQ

Frequently Asked Questions - Contrasts

1. What's the difference between a T- and an F-contrast? When should I use each one?

Simply put, a T-contrast tests a single linear constraint on your model - something like "The effect size (parameter weight) for condition A is greater than that for condition B." T-contrasts can involve more than two parameters, but they can only ever test a single sort of proposition. So a T-contrast can test "The sum of parameters A and B is greater than that for parameters C and D," but not any sort of AND-ing or OR-ing of propositions.

An F-contrast, by contrast (ha!), is used to test whether any of several linear constraints is true. An F-contrast can be thought of as an OR statement containing several T-contrasts, such that if any of the T-contrasts that make it up are true, the F-contrast is true. So you could specify an F-contrast like "parameter A is different than B; parameter C is different than D; parameter E is different than F," and if any of those linear contrasts were significant, the F-contrast would be significant. The utility of the F-contrast is highest when you're just trying to detect areas with any sort of activation, and you don't have a clear idea as to the shape of the response. They were designed to be used with something like a Fourier basis set model, where you want to know if any combination of your cosine basis functions is significantly correlated with the brain activation. Testing that set with a T-contrast wouldn't be correct; it would tell you whether the sum of those basis functions' parameters was significant, which isn't what you'd want. Testing individually whether any of those parameters is significant, though, tells you something.
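The distinction can be made concrete with a toy GLM. This is just an illustrative numpy sketch (the design matrix, effect sizes, and noise are all invented), not any package's actual code: the T-contrast tests one constraint ("A > B"), while the F-contrast ORs together one constraint per condition.

```python
# Toy GLM at one voxel: 3 condition regressors + an intercept column.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 100
X = np.column_stack([rng.standard_normal((n, 3)), np.ones(n)])
beta_true = np.array([1.0, 0.2, 0.0, 5.0])        # invented effect sizes
y = X @ beta_true + rng.standard_normal(n)

beta = np.linalg.pinv(X) @ y                       # least-squares fit
resid = y - X @ beta
df = n - np.linalg.matrix_rank(X)
sigma2 = resid @ resid / df                        # residual variance estimate
XtX_inv = np.linalg.pinv(X.T @ X)

# T-contrast: a single linear constraint, "condition A > condition B"
c = np.array([1.0, -1.0, 0.0, 0.0])
t = (c @ beta) / np.sqrt(sigma2 * (c @ XtX_inv @ c))
p_t = 2 * stats.t.sf(abs(t), df)

# F-contrast: several constraints at once -- "is ANY condition effect nonzero?"
C = np.eye(4)[:3]                                  # rows: A != 0, B != 0, C != 0
cb = C @ beta
F = cb @ np.linalg.pinv(C @ XtX_inv @ C.T) @ cb / (len(C) * sigma2)
p_F = stats.f.sf(F, len(C), df)
```

Note that the F-test is significant if any row of C picks out a real effect, while the T-test speaks only to its one constraint and its direction.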

The disadvantage of the F-test is that it doesn't tell you anything about which parameters are driving the effect - that is, which of the linear constraints might be individually significant. It also doesn't tell you the direction of the effect; parameter A might be different from parameter B, but you don't know which one is greater. This isn't a problem if you're using a basis set where different parameters don't have much individual physiological meaning (such as a Fourier set), but oftentimes F-tests are followed up with t-tests to further isolate which parameters are driving the effect and what direction the effect is in.

The Ward, Veltman & Hutton, and Friston papers on ContrastsPapers both describe the F-test and how it's used in pretty clear fashion, with specific examples.

2. What's a conjunction analysis? How do I do one?

An F-test allows you to OR together several linear constraints, but what if you want to AND them together? That is, what if you want to test whether all of a set of several linear constraints are satisfied? For that, you need a conjunction analysis. There are several ways to perform them - see the Price & Friston paper on ContrastsPapers and those below it - but SPM provides a built-in way that is a good example. (Details of how to use SPM to do one are in the Veltman & Hutton paper there.) The idea is to find the intersection of all the sets of voxels that satisfy a given linear constraint in the set, a simple mathematical operation in itself. The tricky part is to figure out what threshold level to use on each individual linear constraint to give the conjunction (or intersection) an appropriate p-threshold. SPM makes the choice that the p-thresholds on each individual constraint simply multiply together, so a conjunction of two constraints that you wanted to threshold at 0.001 would mean thresholding each individual constraint at the square root of 0.001. The resulting field of t-statistics is called a "minimum T-field" - effectively you're thresholding the smallest T-statistic among the linear constraints at each voxel - and SPM allows corrected p-thresholds to be applied as well as uncorrected. These analyses are also available for F-contrasts, to AND together several OR statements.
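The threshold arithmetic and the minimum T-field described above can be sketched in a few lines. This is a toy example - the degrees of freedom and T-maps are made up, and SPM's actual implementation differs in detail:

```python
# Minimum-T conjunction sketch: per-constraint p-thresholds multiply together.
import numpy as np
from scipy import stats

df = 50
k = 2                                   # number of linear constraints conjoined
p_conj = 0.001                          # desired threshold for the conjunction
p_individual = p_conj ** (1.0 / k)      # sqrt(0.001) for two constraints
t_cut = stats.t.isf(p_individual, df)   # per-constraint T cutoff

# Two T-maps, one per constraint (random toy data standing in for real maps)
rng = np.random.default_rng(1)
t_map_a = rng.standard_normal((4, 4, 4)) + 2
t_map_b = rng.standard_normal((4, 4, 4)) + 2

min_t = np.minimum(t_map_a, t_map_b)    # the "minimum T-field"
conjunction_mask = min_t > t_cut        # voxels where BOTH constraints pass
```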

One problem that some critics of this approach have highlighted is that at a voxel called "active" in the conjunction, any individual constraint may be only marginally significant. If you want to see the conjunction of contrasts A and B, you'd prefer not to see 'common activations' that have p-values far above a reasonable threshold when looked at in each individual contrast. Price & Friston have argued that the individual constraints don't matter much in conjunctions, but some people still prefer not to use the minimum T-field approach for this reason. In this case, you can conjoin constraints simply by intersecting their thresholded statistic maps (with some care taken to make sure the contrasts are orthogonalized - see below), which can be done algebraically.

3. What does 'orthogonalizing' my contrast mean?

If you're testing a conjunction, one worry you might have is that the contrasts that make it up don't have independent distributions - that they are testing, to some degree, the same effect - and thus the calculation of how significant the conjunction is will be biased. If you use SPM to make a conjunction analysis through the contrast manager, it will attempt to avoid this problem by orthogonalizing your contrasts - essentially, rendering them independent of one another. The computation involved is complicated - not just simply checking whether the contrast vectors are linearly independent, although it's derived from that - but it can be thought of as follows:

Starting with the second contrast, check it against the first for independence; if the two are not orthogonal, remove all the effects of the first one from the second, creating a new, fully orthogonal contrast. Then check the third one against the second and the first, the fourth against the first three, and so on. SPM thus successively orthogonalizes the contrasts such that the conjunction is tested correctly. See the help docs for spm_getSPM.m for more details.
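A rough sketch of that successive orthogonalization, as a plain Gram-Schmidt pass (spm_getSPM.m's actual computation is more involved, so treat this only as the intuition):

```python
# Successively remove from each contrast the components of all earlier ones.
import numpy as np

def orthogonalize_contrasts(contrasts):
    """Gram-Schmidt-style successive orthogonalization of contrast vectors."""
    ortho = []
    for c in contrasts:
        c = np.asarray(c, dtype=float)
        for prev in ortho:
            c = c - (c @ prev) / (prev @ prev) * prev   # project out 'prev'
        ortho.append(c)
    return ortho

c1 = np.array([1.0, 0.0, 0.0])
c2 = np.array([1.0, 1.0, 0.0])      # not orthogonal to c1
o1, o2 = orthogonalize_contrasts([c1, c2])
# o2 keeps only the part of c2 that is independent of c1
```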

4. How do I do a multisubject conjunction analysis?

Friston et al. (ContrastsPapers) is a good paper to check out for this. They describe some ways of thinking about the SPM style of conjunction analysis, which is normally a fixed-effects and hence only single-subject analysis, that allow its extension to a population-level inference. It's not clear that all the assumptions in that paper are true, and so it's on somewhat shaky ground.

However, it's certainly possible at an algebraic level to intersect thresholded t-maps from several subjects, just as easily as it is from several constraints. So it may make sense to try the simple intersection method, using somewhat loosened thresholds on the individual level. I'm not super sure on all the math behind this, so you might want to talk to Sue Gabrieli about this sort of thing...

5. What does the 'effects of interest' contrast image in SPM tell you?

Not an awful lot of interest, as it turns out. It's an image automatically created as the first contrast in an SPM analysis, and it consists of a giant F-contrast that tests whether any parameter corresponding to any condition is different from zero. In other words, if any of the columns of your design matrix (that aren't the block-effect columns) differ significantly from zero, either positively or negatively, at any voxel, that voxel will show up as significant in this F-image. Needless to say, it's not a very interpretable image for anyone who isn't using a very simple implicit-baseline design matrix. So generally, don't worry about it.

6. How is the intercept in the GLM represented in the analysis?

Every neuroimaging program accounts for the "whole-brain mean" somehow in its statistics, by which I mean whatever part of the signal does not vary at all with time. That time-invariant part can be represented in the design matrix explicitly as a column of all ones, and SPM automatically includes a column like that for each session in a given design matrix. (AFNI and BrainVoyager don't explicitly show this column in the design matrix, but they include it in their model in the same fashion.) During the model estimation, a parameter is fit at each voxel to this whole-experiment mean, like any other column of the design matrix, and its value represents the mean signal value around which the signal oscillates. This is the 'intercept' of the analysis - the starting value from which experimental manipulations cause deviations. This number is automatically saved at each voxel in SPM (in the beta images corresponding to the block-effect columns) and can be saved in AFNI or BrainVoyager if desired.
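A minimal sketch of the intercept column at work (the boxcar regressor, baseline level, and effect size are all invented toy numbers):

```python
# One voxel's signal oscillating around a mean of ~100, fit with an
# explicit column of ones alongside the task regressor.
import numpy as np

rng = np.random.default_rng(2)
n = 120
task = (np.arange(n) % 20 < 10).astype(float)    # toy boxcar regressor
X = np.column_stack([task, np.ones(n)])          # condition + intercept column
y = 100.0 + 3.0 * task + rng.standard_normal(n)  # baseline 100, task effect 3

beta = np.linalg.pinv(X) @ y
# beta[1] is the fitted 'intercept': the mean the signal oscillates around;
# beta[0] is the deviation the experimental manipulation causes.
```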

7. How do I make contrasts for a deconvolution analysis? What sort of contrasts should I report?

Generally, deconvolution analyses of the sort implemented by AFNI's 3dDeconvolve work on a finite impulse response (FIR) model, in which each peristimulus timepoint for each condition out to a threshold timepoint is represented by a separate column in the design matrix. In this case, a given 'condition' (or trial type) is represented in the matrix not by one column but by several. The readout of the parameter values across those peristimulus timepoints then gives you a nice peristimulus timecourse, but how do you evaluate that timecourse within the GLM statistical framework? There are a couple of ways; in general, the Ward manual (ContrastsPapers) is the best reference describing them.

A couple of obvious ones, though. First, an F-contrast containing a single constraint for each column of a given condition will test the 'omnibus' hypothesis for that condition - the hypothesis that there's some parameter significantly different from zero somewhere in the peristimulus timecourse, or, more simply, the hypothesis that there was some brain signal correlated with the task at some point following the task onset. This test won't tell you what sort of activity it was, but it will point out areas that had some sort of activity of some kind going on.

Secondly, a variety of different T-contrasts could be used to test various hypotheses about the timecourse. You might be interested in testing between two conditions at the same timepoint that you think might be the peak of the HRF. You might be interested in whether a single condition's HRF rose more sharply or fell more sharply (in which case a T-contrast within that timecourse could be used). You might use some sort of a summing T-contrast to compare the 'area below the curve' in two different conditions.
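A toy sketch of the FIR design matrix these contrasts operate on, with the omnibus F-contrast and a summing T-contrast expressed over its columns (the scan counts, onsets, and lags are illustrative, not 3dDeconvolve's internals):

```python
# One condition, 6 peristimulus timepoints => 6 columns; column 'lag'
# carries a 1 at scan onset+lag for every trial onset.
import numpy as np

n_scans, n_lags = 40, 6
onsets = [3, 15, 27]                     # scan indices of trial onsets (toy)

X_fir = np.zeros((n_scans, n_lags))
for onset in onsets:
    for lag in range(n_lags):
        if onset + lag < n_scans:
            X_fir[onset + lag, lag] = 1.0

# The fitted betas for these columns read out directly as a peristimulus
# timecourse. Contrasts over them:
omnibus_F = np.eye(n_lags)               # F: any timepoint's beta nonzero?
area_T = np.ones(n_lags)                 # summing T: 'area below the curve'
```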

There's not wide consensus in the literature about exactly what sorts of statistics count as 'significant' activation at this point - the difference between an HRF that rises sharply, spikes high, then falls back down to baseline quickly and an HRF that rises slowly, peaks only a little above baseline, but stays above baseline for a long time isn't really clear. No one is sure what such a difference represents exactly. This means, though, that there are a wealth of differences between timecourses that one could potentially explore. Almost any hypothesis can be made interesting with the right explanation, and fortunately almost any hypothesis can be tested in the GLM with the tools of T-tests, F-tests and conjunctions of constraints.


Contrasts HOWTO

How do I...

* Compare the activation in one contrast to the activation in another contrast? (i.e., find differences/similarities in activation between two contrasts)

One quick way to visually compare activations between 2 contrasts is by displaying the results of 2 contrasts and their overlapping regions. The following steps allow you to show the thresholded results of 2 different contrasts (A and B) in two different colors; overlapping regions between A and B are displayed in a third color.

1. bring up your results from one contrast (contrast A)

2. hit the "write filtered" button to create an image of the activations

3. bring up your results from the other contrast (contrast B)

4. hit the "write filtered" button to create an image of the activations

5. To create the "overlap" image of A and B, hit "imcalc", select the 2 images that were created using "write filtered", then enter "i1.*i2" in the calculation field and save the image.

6. To display, hit "Display ROIs" in the "ROI Toolbox" panel (i.e. display this panel by typing "roimod1" at matlab prompt).

a. Select the background anatomical image first (could be "single_subject_T1" in the "canonical" directory)

b. For number of images to display, hit "3"

c. Select the 3 images you created above (A, B, and overlap)

d. Choose the color you want each image to display in (I recommend yellow, blue, and green, in that order)

e. The image will be displayed in SPM's Graphics window and you can use the usual "print" function to create an exportable image.

NOTE: The numbers contained in the overlap image shouldn't be used for quantitative analyses; this image is just used for display purposes. If you want to do a legitimate conjunction analysis, you should take the minimum T-value across image A and B - I have a program to do this which I can give more info about if needed. -Dara
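For the curious, step 5's "i1.*i2" overlap and the note's minimum-T alternative look like this on plain arrays (toy values standing in for the "write filtered" images; this is a sketch, not a replacement for a real conjunction program):

```python
# Thresholded T-maps A and B, with 0 marking sub-threshold voxels.
import numpy as np

t_map_a = np.array([[0.0, 3.2], [2.9, 4.1]])
t_map_b = np.array([[3.5, 0.0], [3.0, 3.8]])

overlap = t_map_a * t_map_b            # imcalc's i1.*i2: nonzero only where
                                       # BOTH maps survive - display only
min_t = np.minimum(t_map_a, t_map_b)   # the value to keep if you want a
                                       # legitimate minimum-T conjunction
```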


Contrasts Papers

Useful Papers - Contrasts


Ward, AFNI 3dDeconvolve manual (in particular, pages 5-16 and 43-47) PDF

Summary: An excellent overview of the basic statistical model and the difference between F- and t-tests. Some good examples (p.43-47) of how to design contrasts to test particular questions about the differences between two or more impulse functions. Also contains a good overview of the deconvolution model and its derivation from the basic statistical model.

Bottom line: F-tests simultaneously test several linear contrasts; t-tests only test one at a time.

Veltman & Hutton, SPM99 Manual (in particular, pages 65-80 - some about PET and a lot of junk in there, but some good clean summaries of how to make F-contrasts or conjunction tests and what they mean (e.g., p. 73)) PDF

Summary: Some nice walk-throughs of how to design conjunction contrasts, F-contrasts, and other somewhat advanced techniques in SPM. Some particularly good points are made about F-contrast construction - how to make them for several basis functions at once, for example, their relation to t-contrasts, and other points.

Bottom line: Describes F-contrasts as an OR-ing of linear constraints and conjunctions as a conjunction of 'em.

Friston, "Statistical Parametric Mapping", chapter to appear in Human Brain Function II (pages 19-23, and 24-26 to a lesser degree) PDF

Summary: Further, quite concise, overview of the GLM and what contrasts mean. A nice breakdown of how the various ways of testing for activity (correlation coefficient, ANOVA, etc.) all break down to the GLM, and explanation of why the T-contrast (and hence F-contrast) supersede those ways generally. Some description of the 'basis function' model of HRF that transitions between the canonical and the FIR model.

Bottom line: The T-contrast serves as a more versatile version of the correlation coefficient, and the F-contrast tests a combination of T-contrasts.

Friston et al., "Event-related fMRI: characterizing differential responses," NeuroImage 7, 30-40 PDF

Summary: Introduction of the basis-function approach to event-related analysis, where the design matrix is convolved with a non-Fourier basis function set, designed to span the space of reasonable responses in a compact way. The set introduced here separately models the magnitude and latency of a canonical HRF, and experimental evidence shows different areas differ on those two different metrics.

Bottom line: The basis-function approach can be an effective way to analyze event-related data and can shed light on otherwise difficult-to-interpret questions about the impulse response function

Price & Friston, "Cognitive conjunction: a new approach to brain activation experiments," NeuroImage 5, 261-270 PDF

Summary: The original paper describing conjunction designs as an alternative to subtraction designs. The difference from subtractions is outlined, as are the important ways conjunction analyses fit in well with factorial designs. A primitive way of doing a conjunction analysis (since discarded) is laid forth, and some PET data analyzed with these methods is examined.

Bottom line: Conjunction experiments can test for conjunctions of linear constraints - an important set of questions that hadn't been examined up to this point.

Price et al., "Subtractions, conjunctions and interactions in experimental design of activation studies," Human Brain Mapping 5, 264-272 PDF

Summary: A follow-up to the Price & Friston paper above, this paper more explicitly describes the difference between conjunction and subtraction analyses in terms of how the former takes account of the interaction term (should there be one). Factorial designs used with conjunction analyses allow a much more explicit treatment of those interactions. Some PET data is examined.

Bottom line: Follow-up to above; more detailed in its treatment of interaction terms.

Friston et al., "Multisubject fMRI studies and conjunction analyses," NeuroImage 10, 385-396 PDF

Summary: Attempts to get past one big hurdle of the standard conjunction analysis, which is that it's only available in fixed-effect fashion, preventing that style of analysis from being used in a true random-effects group model. The authors suggest a formula that allows one to use a fixed-effect conjunction group model and calculate a reasonable confidence interval for the fraction of the population that might share such an effect.

Bottom line: Extension (slightly shaky) of the conjunction-analysis model to the group level.


Coregistration FAQ

Frequently Asked Questions - Coregistration

1. What is coregistration?

Remember realignment? It's just like that. It's a way of correcting for motion between images. But coregistration focuses on correcting for motion between your anatomical scans and your functional scans. The slightly trickier thing about that is that your anatomical scans might be T1-weighted, while your functionals are T2*-weighted. Or maybe you have a Spoiled Grass anatomical and PET functional images. The intensity-based motion correction algorithms kind of choke on those. So coregistration aims for the same result as realignment - lining up two neuroimages - but uses different strategies to get there.

2. What are the different ways to coregister images?

These days, there are a few, but one de facto standard, which is coregistration by mutual information (MI). In some ways, coregistration is always the same problem - it's just like realignment, but can be between different modalities (PET, MRI, CAT, etc.), and speed usually matters less. The problem boils down to finding some function that measures the difference between your two images and then minimizing (or maximizing) it. Minimization/maximization is pretty standard these days; the question is what sort of cost function you use.

In realignment, we just used the sum of the squared difference in intensity between the images, measured voxel-by-voxel. The trouble with this scheme in coregistration is that your images might be different modalities, and hence a tissue type that's very dark in one (say, ventricle in PET) might be very bright in another (say, proton-density-weighted MRI). In that case, trying to minimize the intensity differences between the images will give you a horrible registration.

So there are a couple strategies. SPM99 and older used templates within each modality that were already coregistered by hand with each other; that way, you could just realign your images to their modality-specific templates and automatically put them in register. These days, though, almost all automated coregistration schemes (including SPM2) use MI or some derivative of it.

3. What is mutual information, exactly?

Maes et al., below, give a pretty good summary of it, but here's the nutshell version. If two variables A and B are completely independent, their joint probability Pab (the probability that A comes up a at the same time that B comes up b) is just the product of their respective separate probabilities Pa and Pb; so Pab=Pa*Pb, if A and B are independent. On the other hand, if A and B are completely dependent - that is, knowing what value A takes tells you exactly what value B will have - then their joint probability is exactly the same as their respective separate probabilities; so Pab = Pa = Pb, if A and B are completely dependent. If A and B are dependent a little bit, but not entirely, their joint probability will be somewhere in between there - knowing what value A has tells you a little bit about B, so you can make a good informed guess at what B will be, but not know it exactly.

Mutual information is a way of measuring to what extent A and B are dependent on each other. Essentially, if you can estimate the true joint probability Pab for all a and b, and you know the individual probability distributions Pa and Pb, you can measure how far the probability distribution Pab is from Pa*Pb, with a Kullback-Leibler statistic that measures the distance between distributions. If Pab is much different from Pa*Pb, then you know A and B are dependent to some degree.

Alternatively, you can frame MI in terms of uncertainty; MI is the reduction in uncertainty about B you get by looking at A. If you're much more certain about B after looking at A, then A and B have high MI; they're quite dependent. If you don't know anything more about B after looking at A, then they have low MI and are pretty independent.
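A small illustration of the histogram-based MI estimate described above (a sketch of the Kullback-Leibler sum on binned intensities, not any package's registration code; the toy "images" are invented):

```python
# MI from the joint intensity histogram of two images.
import numpy as np

def mutual_information(img_a, img_b, bins=32):
    joint, _, _ = np.histogram2d(np.ravel(img_a), np.ravel(img_b), bins=bins)
    p_ab = joint / joint.sum()          # estimated joint probability Pab
    p_a = p_ab.sum(axis=1)              # marginal Pa
    p_b = p_ab.sum(axis=0)              # marginal Pb
    nz = p_ab > 0
    outer = np.outer(p_a, p_b)          # Pa*Pb, the independence prediction
    return float(np.sum(p_ab[nz] * np.log(p_ab[nz] / outer[nz])))

rng = np.random.default_rng(3)
a = rng.standard_normal(10000)                        # "image" A intensities
b_dependent = -a + 0.1 * rng.standard_normal(10000)   # inverted contrast, still dependent
b_independent = rng.standard_normal(10000)            # unrelated intensities

mi_dep = mutual_information(a, b_dependent)
mi_ind = mutual_information(a, b_independent)
# mi_dep is much larger even though a and b_dependent are anti-correlated -
# exactly why MI handles cross-modality registration where contrast inverts.
```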

4. So how does mutual information help coregistration?

MI coregistration methods work by considering the intensity in one image to be A and the intensity in the other image to be B. The algorithm computes the MI between those two variables - finds the mutual information between the intensity in one image and the intensity in the other - and then attempts to maximize it.

The idea is that, instead of using squared-intensity-difference methods, which assume that a bright voxel in one image must be bright in another, you let the images themselves tell you how they're related. If looking at a bright voxel in one image tells you almost infallibly that the corresponding voxel in the other image is dark, then the images have very high MI, and they're probably close to registered. You can leave unspecified the relationship between intensities in the two modalities, and let the algorithm figure out how they're related - it automatically maximizes whatever relationship they have. This makes MI ideal for coregistering a wide variety of medical images.

5. How are coregistration and segmentation related?

Fischl et al. (SegmentationPapers) make the point that the two processes are two sides of the same coin - each one can help solve the other. With a perfect coregistration algorithm, you could be maximally confident that you could line up a huge number of brains and create a perfect probability atlas - allowing you the best possible prior probabilities with which to do your segmentation. In order to do a good segmentation, then, you need a good coregistration. But if you had a perfect segmentation, you could vastly improve your coregistration algorithm, because you could coregister each tissue type separately and greatly improve the sharpness of the edges of your image, which increases mutual information.

Fortunately, MI thus far appears to do a pretty good job with coregistration even in unsegmented images, breaking us out of a chicken-and-egg loop. But future research on each of these processes will probably include, to a greater and greater extent, the other process as well.

Check out SegmentationFaq for more info on segmentation...


Coregistration HOWTO

How-Tos - Coregistration

How do I...

Do coregistration in SPM?

Number of subjects: 1

Which option? Options: coregister only, reslice only, coregister & reslice.

Select coregister only. "Reslice only" will take a single image with a .mat file as input and create a new rV .img file that incorporates the transformations specified in the .mat file directly into the image, so the new rV.img doesn't have a .mat file but has had its actual voxels moved around instead. Reslicing isn't perfect, though - it introduces slight interpolation errors - so don't do it more often than necessary. You shouldn't need to re-slice your anatomical file if you're only using it to get good normalization estimates, or as a background image for displaying functional activation. "Coregister and reslice" will do both steps at once - coregister two images and reslice the source image to make a new, lined-up rV.img file.

(The following questions about modality appear in SPM99 only - SPM2 coregistration is by mutual information (see CoregistrationFaq), so the modality isn't important and it doesn't ask.)

Modality of first target image?

Select EPI. The target image is the one to which you coregister. EPI is the generic option for functional MRI images - it can be selected for both echo-planar and spiral images.

Modality of first object image?

Select T1 MRI, probably. The object image is the one which is being coregistered. Select object - T1 MRI if the structural image looks dark where gray matter should be and bright where white matter should be. If the structural image has the opposite contrast (bright where gray matter should be and dark where white matter should be), it is probably object - T2 MRI, so select appropriately.

Select target image for subject 1: Select the mean image created by realignment (see RealignmentHowTos): session1/mean*V001.img, DONE

Select source image (in SPM2) / object image (in SPM99) for subject 1: Select the inplane anatomical image: anatomy/inplane/V001.img, DONE

Select other images for subject 1: DONE (do not select any images). If any images are selected as "other", the transformation parameters estimated to coregister the object to the target image will also be applied to the "other" images.

The coregistration will now run, which should only take a few minutes, maybe less. This step will find the transformation that maps the anatomical inplane image into the space of the functional images (as defined by the mean image). A new anatomy/inplane/V001.mat file will be created (although it is not yet applied - no resliced image is created).

Check my coregistration in SPM?

Use the "Check Reg" button in the main SPM interface. You can use this to display up to 15 images at once and compare all their joint registration; we'll describe here using it for two images, say, immediately after you've coregistered your inplane anatomy and your mean functional image.

Select two images: session1/mean*V001.img, anatomy/inplane/V001.img, DONE.

This will display the two images, with the mean image in the top half and the inplane at the bottom. The better the images' outlines coincide (check that by moving the cursor along the outlines), the better coregistered they are.

Figure out which is my "source" and which is my "target" image in SPM2?

Your "target" image is the one which holds still. It's the one that you want to coregister to. In the standard Gablab SPM preprocessing protocol, this is your mean functional image after realignment. The "source" image is what was called the "object" image in SPM99 - it's the image that is moved to line up with the target image. In the picturesque words of John Ashburner, the source/object is the one that's "jiggled about" to fit the target image. Another way of remembering this is that the "target" image is what you're trying to get to, while the source/object image is the one that has transformations applied to it.

Fix a bad coregistration?

First, a note for SPM99ers - if you have a bad coregistration with the standard defaults, try running it again by using mutual information coregistration. You can do this by first deleting any previous .mat files for your source/object image, then starting SPM, hitting "Defaults," choosing "Coregistration," and then choosing "Use Mutual Information Coregistration." Now run coregistration again.

For the rest of us...

If the automated methods fail, you may have to tweak a bad coregistration by hand. Often, if the automated coregistration fails badly, a good way to go is by getting the two images "pretty close" to in line by hand, and then re-running the coregistration program. This would fix the case where the program has bad starting estimates - they may start far apart from each other or just have something funny about how they started, and by lining them up differently before you start the automated program, you can improve the final estimates from the program. Other times, there may just be something strange about the brains in question, and the automated coregistration can fail no matter how close they start out. In that case, your best bet is to adjust by hand the whole way.

The adjusting by hand is handled by different programs in different packages - 3dNudge in AFNI, the "Display" button in SPM. The idea is the same, though, so you can use the SPM directions for reference.

Start SPM and hit the "Display" button. Choose your source image to display - the one you want to move around to fit the other one. It's often helpful to have the target image simultaneously displayed in another copy of SPM, if you can. The Display interface has a panel in the lower left where you can make translations, rotations, and zooms in all three dimensions. By hand, starting with small movements, try and adjust the numbers in there to best line up the source image with your target image. When you've made a movement you want to save, hit the "Reorient images..." button in the lower-left. Choose the source image.

This will save the movements you've made in the .mat file associated with that image. Any time you save a movement, it adds onto the current .mat file - the file isn't overwritten, even though the numbers in the movement panel may reset. When you think you have the image close, save your current movements with "Reorient images" and use SPM's "Check Reg" button to compare the source and target images side-by-side. If they're not as close as you like, go back to the "Display" function and keep tweaking.

When the images are aligned as well as you'd like, you don't have to do anything more - the movements are saved already in the .mat file. If you want to apply that .mat file to the actual .img file, you can choose "Coregister" from SPM and choose the "reslice only" option - this will create a new rV .img file that doesn't have a .mat file but will be properly aligned.


Coregistration Papers

Useful Papers - Coregistration


Maes et al. (1997), "Multimodality image registration by maximization of mutual information," IEEE Transactions on Medical Imaging 16(2), 187-198 PDF

Summary: One of the fundamental mutual information (MI) coregistration papers, this lays out the concepts of MI in a reasonably easy-to-follow manner and explains how to use MI as a cost function in coregistration. The advantages relative to other cost functions (least-squares, for example) are laid out, and a set of experiments is performed to demonstrate the algorithm's subvoxel accuracy. A couple of different interpolation schemes are also discussed.

Bottom line: An excellent paper summarizing the advantages of MI as a coregistration cost function.


Zhu & Cochoff (2002), "Influence of implementation parameters on registration of MR and SPECT brain images by maximization of mutual information," Journal of Nuclear Medicine 43(2), 160-166 PDF

Summary: The authors compare different settings of a coregistration algorithm for their effectiveness and accuracy; parameters varied included type of interpolation, number of bins to create the probability density function, and a couple different sampling optimization techniques.

Bottom line: Trilinear interpolation works best, as does adaptively changing bin numbers. Simplex optimization or multiresolution optimization were also effective ways to improve the success rate and speed the coregistration.

Ashburner & Friston (1997), "Multimodal image coregistration and partitioning - a unified framework," NeuroImage 6, 209-217 PDF

Summary: The original paper defining the old SPM (pre-SPM2) way of doing coregistration. The authors suggest defining within-modality templates that are already coregistered and using least-squares methods to coregister the experimental images to those templates. Using segmentation during the coregistration can help improve the success and accuracy of that registration.

Bottom line: A bit obsolete these days; SPM2 has moved to MI coregistration, which is simpler and shows better success rates.

Nestares & Heeger (2000), "Robust multiresolution alignment of MRI brain volumes," Magnetic Resonance in Medicine 43(5), 705-715 PDF

Summary: Discusses the algorithm former Stanford prof David Heeger used in his lab to do realignment for visual cortex; they enhance the robustness of their algorithm by automatically masking out voxels whose intensity difference between images exceeds some threshold. Results from the algorithm are presented.


Design FAQ

Frequently Asked Questions - Experimental Design

The major tradeoff in planning your experimental design, from a statistical standpoint, is the fundamental one between efficiency and power. In the context of fMRI, power is the conventional statistical concept of how likely it is that your experiment will correctly reject the null hypothesis of no activation; it might be thought of as how good your experiment is at detecting any sort of activation at all. Efficiency, by contrast, is the ability to accurately estimate the shape of the hemodynamic response to stimuli - how little variability your estimate of the hemodynamic response has. This is clearly important if you're interested in looking at response timecourse or HRF shape, but it also matters for finding activations at all - if variability in your modeled response is high, it's more difficult to distinguish one condition from another.

The tradeoff between power and efficiency is, unfortunately, inescapable (shown by Liu et al. in DesignPapers) - you can't be optimal at both. Things that increase power include block design experiments, very high numbers of trials per condition, and increased numbers of subjects. Things that increase efficiency include designs with randomized inter-trial intervals (also called inter-stimulus intervals or ISIs) and analyzing your design with an event-related model (whether the design was blocked or not). Semi-random designs can give you a good dollop of both power and efficiency, at the cost of increased experimental length. Where you fall in designing your experiment will depend on what measures you're interested in looking at - but within the given constraints of a particular number of subjects, a reasonable experimental length, and a guess at how big an effect you'll have, there are good steps you can take to optimize your design.
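As a toy illustration of the efficiency half of this tradeoff, here's a hedged numpy sketch that scores two hypothetical designs with the inverse-variance efficiency metric in the spirit of Dale (1999), under a simple FIR model. The scan counts and ISI ranges are made up for illustration, and this is not a reproduction of any published result:

```python
import numpy as np

rng = np.random.default_rng(0)

def fir_design(onsets, n_scans, fir_len=8):
    """FIR design matrix: column k is 1 at k scans after each onset."""
    X = np.zeros((n_scans, fir_len))
    for t in onsets:
        for k in range(fir_len):
            if t + k < n_scans:
                X[t + k, k] += 1.0
    return X

def efficiency(X):
    """Efficiency in the Dale (1999) sense: the inverse of the summed
    variance of the FIR parameter estimates."""
    return 1.0 / np.trace(np.linalg.pinv(X.T @ X))

n_scans = 240
fixed = fir_design(np.arange(0, n_scans, 12), n_scans)   # fixed 12-scan ISI
jit = np.cumsum(rng.integers(4, 13, size=24))            # randomized 4-12 ISI
jittered = fir_design(jit[jit < n_scans - 8], n_scans)

print(round(efficiency(fixed), 2), round(efficiency(jittered), 2))
```

The point of the metric, not the particular numbers, is what matters: designs whose regressors are less correlated and sample more of the response space estimate the HRF shape with less variance.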

Experimental design is heavily mixed in with setting your scanning parameters, and jittering your trial sequence, so be sure to check out the other design-related pages.

1. What are some pros and cons of block designs?

Pros: High power, lower number of trials and subjects needed. Cons: Low efficiency, high predictability (which may be a problem for certain tasks from a psychological perspective).

2. What are some pros and cons of event-related designs?

Pros: High efficiency even at lower trial numbers, can have randomized stimulus types and ISIs. Cons: Low power, more trials/subjects needed, more difficult to design - efficiency advantages require jitter (see JitterFaq) or randomized ISIs.

3. What's the difference between long and rapid event-related designs? What's good and bad about each?

Long event-related designs have long ISIs - usually long enough to allow the theoretical HRF to return to baseline (i.e., 20 or 30 sec). Rapid event-related designs have short ISIs, on the order of a few seconds. Long event-related designs have generally fallen out of favor in the last few years, as proper randomization of ISI allows rapid designs to have much greater efficiency and power than long designs. Until the very late 1990s, it wasn't entirely clear that rapid event-related designs would work from a physiological perspective - that the HRFs for different trials would add roughly linearly. Since those assumptions have been (more or less) vetted, the only advantage offered by long event-related designs is that they're much more straightforward to analyze, and that rarely outweighs the tremendous advantages in efficiency and power offered by the increased trial numbers of rapid designs.
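That linearity assumption can be demonstrated in a few lines: if HRFs add linearly, then convolving a train of trial onsets with a single HRF gives exactly the same response as summing one shifted HRF per trial. A sketch using a made-up gamma-shaped HRF (not any package's canonical one) and hypothetical rapid onsets:

```python
import numpy as np

def hrf(t):
    """Toy gamma-shaped HRF, peaking around t = 5 (arbitrary units)."""
    t = np.asarray(t, dtype=float)
    h = (t ** 5) * np.exp(-t)
    return h / h.max()

kernel = hrf(np.arange(0, 30, 1.0))

n = 100
stim = np.zeros(n)
stim[[10, 14, 19, 25]] = 1.0          # rapid, jittered trial onsets

# Linear-systems view: response to the train = convolution of the
# onset impulses with one HRF...
convolved = np.convolve(stim, kernel)[:n]

# ...which equals the sum of one HRF shifted to each onset.
summed = np.zeros(n)
for onset in (10, 14, 19, 25):
    seg = kernel[: n - onset]
    summed[onset:onset + len(seg)] += seg

print(np.allclose(convolved, summed))   # True
```

Real BOLD responses are only approximately linear (very short ISIs show some saturation), which is the "more or less" in the vetting above.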

4. What purpose would a mixed (block and event-related) design serve? Under what circumstances would I want to use one? How do I best design it?

Mixed designs, which can include both block and event-related periods, or semi-random designs which have blocks of relatively higher and lower odds of getting a particular trial type, can give you good power and efficiency, but at the cost of longer experiments (i.e., more trials). They're more predictable than fully randomized experiments, which may be a problem for certain tasks. AFNI, SPM and Tom Liu's toolbox all have good utilities to design semi-random stimulus trains - see DesignHowTos.

5. How long should a block be?

From a purely theoretical standpoint, as described by Liu and others, blocks should be as long as possible in order to maximize power. The power advantage of a block comes from summing the HRFs into as large a response as possible, and so the highest-power experiment would be a one-block design - all the trials of one condition in a row, followed by all the trials of the next condition. The noise profile of fMRI, however, means that such designs are terribly impractical - at least one and probably two alternations are needed to effectively differentiate noise like low-frequency drifts from the signal of your response. So from a theoretical standpoint, Liu recommends a two- or three-block design (with two conditions, two blocks: on/off/on/off; with three conditions, two blocks: A/B/C/A/B/C; etc.). With few conditions, this can mean blocks can be quite long.

In practice, real fMRI noise means that two- or three-block designs may have blocks that are too long to be optimal. Skudlarski et al. (see DesignPapers), using real fMRI noise and simulated signal, recommend about 18 seconds for complex cognitive tasks where the response time (and time of initial hemodynamic response onset) is somewhat uncertain (on the order of a couple seconds). For simple sensory or motor tasks with less uncertainty in that response, shorter blocks (12 seconds or so) may be appropriate. Of course, you should always take into account the psychological load of your blocks; with especially long blocks, the qualitative experience may change due to fatigue or other factors, which would influence your results.

Bledowski et al. (2006, HrfPapers), using empirically derived estimates of the HRF, mathematically derive a 7-sec-on, 7-sec-off block pattern as being optimal for maximizing BOLD response, suggesting it's a bit like a "swing" - pushing for the first half, then letting go, maximizes your amplitude.

6. How many trials should one block have?

As many as you can fit into that time. The more trials the better.

7. How many trials per condition are enough?

In terms of power, you can't have too many (probably). The power benefits of increasing number of trials per condition continue increasing until at least 100 or 150 trials per condition (see Desmond & Glover and Huettel & McCarthy in DesignPapers). In terms of efficiency, 25 or more is probably enough to get a good estimate of your HRF shape.

8. How can I estimate the power of my study before I run it?

Several of the papers below have detailed mathematical models for trying to figure that sort of thing out; if you can make an educated guess at how large (in % signal change) your effect might be from the literature, Desmond & Glover (in DesignPapers) can give you a decent range of estimation.

9. What's the deal with jitter? What does it mean? Should I be doing it?

Jitter probably deserves its own FAQ, so check out JitterFaq for more info about it...

10. Do I have to have the same number of trials in all my conditions?

This question comes up especially for subsequent memory analyses, or things like it, where subjects might have only remembered a fraction of the trials they've looked at, but have forgotten a whole lot. If you're trying to compare remembered against forgotten in that case, is that okay? It depends on the exact ratio. First and foremost, if a given condition has too few trials in general, you'll lose a lot of ability to detect activation in it - as above, if you don't have at least 25 trials in a condition in an event-related study (over the whole experiment), you're probably starting to get on thin ice in terms of drawing inferences between conditions. But the ratio of trial numbers between conditions can also have an influence. Generally, neuroimaging programs assume that the different columns of the design matrix you're comparing have equal variance, and a vast difference in numbers between them will invalidate that assumption. In practice, this probably isn't a huge concern until you're dealing with ratios of 5 or 10 to 1. If you have 35 trials in one condition and 100 in another - it's not ideal, but you probably won't be too fouled up. If you have 30 in one and 300 in another... it's probably cause for some concern.

11. How many subjects should I run? How many do I need?

Short answer: 20-25 subjects is a good rule of thumb. Long answer: Obviously this is affected to some degree by situations like funding, etc. But from a statistical perspective, this question boils down to what the levels of noise in fMRI are, or a power analysis: how many subjects should you have in order to detect a reasonably-sized effect a reasonable amount of the time? Using moderate estimates of effect sizes (0.5%) and estimating within- and between-subject noise from real data, Desmond & Glover (2002; DesignPapers) calculated that 20-25 subjects were needed for 80% power (i.e., chance of detecting a real effect) with about the loosest reasonable whole-brain p-threshold. Smaller effect sizes might require more subjects for the same power, and looser p-thresholds (i.e., for an a priori anatomical hypothesis) might require fewer subjects. But in general, the 20-25 subject barrier is a pretty good rule of thumb. You aren't ever hurt by more subjects than that (although very large sample sizes can start tongues wagging about how small your effect size is, and you don't want to get into a fight about size - we're adults, after all). But unless you're very sure your effect size is much bigger than average, having fewer than 20-25 subjects means you're likely to be missing real effects. Check out Desmond & Glover (DesignPapers) for detailed analysis.
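For a rough feel for where numbers like 20-25 come from, here's a hedged back-of-the-envelope power calculation using a plain one-sample normal approximation. This is far simpler than Desmond & Glover's full model (which also folds in within-subject noise and trial counts), and the effect size and SD values passed in are purely hypothetical:

```python
import math
from statistics import NormalDist

def subjects_needed(effect, sd, alpha=0.001, power=0.80):
    """Smallest n such that a group-mean effect of `effect` (e.g. % signal
    change) with between-subject SD `sd` survives a two-sided threshold
    `alpha` with the requested power, under a z-approximation."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    return math.ceil(((z_a + z_b) * sd / effect) ** 2)

# Hypothetical numbers: a 0.5% signal-change effect, 0.5% between-subject
# SD, at a fairly strict threshold.
print(subjects_needed(0.5, 0.5))
```

Even this crude version reproduces the qualitative behavior in the answer above: halving the effect size roughly quadruples the subjects needed, which is why small effects blow up sample-size requirements faster than you'd intuit.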


Design Papers

Useful Papers - Experimental Design

Also check out JitterPapers, ScanningPapers, FmriPhysicsPapers and PhysiologyPapers...


Mechelli et al. (2003), "Comparing event-related and epoch analysis in blocked design fMRI," NeuroImage 18, 806-810 PDF

Summary: Traditionally, blocked design experiments (i.e., many trials in a row of the same condition) have been analyzed by convolving an HRF with a 'boxcar' regressor - an "epoch-related" model. Mechelli et al. show that even in block experiments, using an "event-related" model - i.e., treating each trial as a separate event, convolving each impulse regressor with an HRF, and summing the resulting regressors - can be better at detecting activations for at least some experiments.

Bottom line: Whether you choose blocked or randomized stimulus presentation, using an event-related model for your analysis is worth a try.
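The difference between the two models can be sketched directly: the epoch model convolves one boxcar per block with an HRF, while the event model convolves one impulse per trial and sums. A hedged toy version, with a made-up gamma HRF and made-up block timings:

```python
import numpy as np

def hrf(t):
    """Toy gamma-shaped HRF sketch, not SPM's canonical HRF."""
    t = np.asarray(t, dtype=float)
    h = (t ** 5) * np.exp(-t)
    return h / h.max()

kernel = hrf(np.arange(0, 30, 1.0))
n = 120

# Epoch-style model: convolve a boxcar spanning the whole block.
boxcar = np.zeros(n)
boxcar[10:28] = 1.0
epoch_reg = np.convolve(boxcar, kernel)[:n]

# Event-style model: one impulse per trial inside the block (every
# other scan here), each convolved with the HRF and summed.
events = np.zeros(n)
events[10:28:2] = 1.0
event_reg = np.convolve(events, kernel)[:n]

# The regressors are highly similar, but the event-related version can
# absorb within-block structure the boxcar cannot.
r = np.corrcoef(epoch_reg, event_reg)[0, 1]
print(round(r, 2))
```

The near-identical shapes are why switching models costs little on a block design, while the small differences are where the detection gains Mechelli et al. report can come from.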

Dale (1999), "Optimal experimental design for event-related fMRI," Human Brain Mapping 8, 109-114 PDF

Summary: Within event-related designs, there is a debate about how best to organize stimuli - in a long periodic train, with randomized inter-stimulus intervals (ISIs), or somewhere in between. Dale shows that in terms of efficiency (the accuracy of estimating the shape of the HRF), event-related designs with randomized ISIs outperform those with fixed ISIs, and as the mean ISI drops, that outperformance becomes incredibly large.

Bottom line: For event-related designs, if efficiency is a concern, having a randomized inter-trial interval and a short mean inter-trial interval is a large advantage over either a fixed ISI or a long mean ISI.


Liu et al. (2001), "Detection power, estimation efficiency, and predictability in event-related fMRI", NeuroImage 13, 759-773 PDF

Summary: Introduces the distinction between efficiency (accuracy at estimating the shape of the HRF) and power (ability to detect any activation at all), and shows that any experimental design makes a fundamental tradeoff between the two. Describes mixed designs - essentially event-related designs with blocks of relatively higher concentrations of a particular stimulus type - as being a good tradeoff between efficiency and power, at the expense of longer experiments.

Bottom line: Mathematically shows block designs are great for power but poor for efficiency, and event-related designs make the opposite tradeoff. Confirms the goodness of randomized ISIs for event-related designs.

Desmond & Glover (2002), "Estimating sample size in functional MRI (fMRI) neuroimaging studies: statistical power analyses," Journal of Neuroscience Methods 118, 115-128 PDF

Summary: Used real fMRI data to get good estimates of average within-subject and between-subject variability in fMRI, and brought those into a mathematical model to determine how many subjects are necessary to detect an "average"-sized activation under several variability assumptions. Also worked on how many trials per condition are necessary to achieve a certain power level.

Bottom line: At reasonable p-thresholds, with an average activation, around 20-24 subjects were needed to maintain a reasonable power level. This is especially crucial with low effect sizes, which increase the number of subjects needed faster than linearly. The benefits of increasing the number of trials/condition don't level off until around 100. Spatial smoothing is a good strategy to decrease within-subject variability.

Mechelli et al. (2003), "Estimating efficiency a priori: a comparison of blocked and randomized designs," NeuroImage 18, 798-805 PDF

Summary: Mechelli et al. show that even when activation values are identical in a block and an event-related experiment, the block-related design may show higher t- or Z-values, due to differences in the standard error across experiments induced by the experimental design.

Bottom line: Comparing t- or Z-statistics directly across experiments is tricky, because a good part of the variability in those numbers depends on the actual experimental design - and that part of the variance is difficult to establish before the experiment.


Skudlarski et al. (1999), "ROC analysis of statistical methods used in functional MRI: individual subjects," NeuroImage 9, 311-329 PDF

Summary: A signal detection statistic (the receiver operator characteristic (ROC)) is calculated for real fMRI data into which fake activations of known extent and timecourse have been introduced, in order to find out what effect various preprocessing and experimental design manipulations have on detecting those known activations.

Bottom line: Lots of interesting results, but of prime interest here: with behavioral and hemodynamic response times on the (typical) order of a few seconds, blocks of around 18 seconds are best for block design experiments. Having several sessions (experimental runs in one scanning session) is a good way to improve signal detection. Other bottom lines of interest: Global intensity normalization and low-pass filtering don't help. High-pass filtering (linear and quadratic) and smoothing raw images does.

Huettel & McCarthy (2001), "The effects of single-trial averaging upon the spatial extent of fMRI activation," NeuroReport 12, 1-6 PDF

Summary: The authors used real event-related fMRI data, extracted varying sizes of subsets of trials from their runs, and treated those subsets as independent experiments, to determine what effect the number of trials per condition had on two factors: the spatial extent of their activation (power), and the accuracy of their estimated hemodynamic response (efficiency).

Bottom line: The extent of activation had a clean exponential relationship to number of trials, and the asymptote (point of diminishing returns) of that relationship wasn't reached until around 150 trials per condition; in other words, the benefits of increasing number of trials per condition continued to go up until around 150. However, variability of the estimated hemodynamic response was pretty stable after getting to about 25 trials per condition - so that might be a good benchmark for experiments concerned with estimating HRFs.


GabLab Scripts

In the Gabrieli Lab, the current development team and a number of current and former students and postdocs have written a lot of custom code over the years to do new analyses or simplify doing old analyses.

One big package of those scripts, which relates mostly to region-of-interest (ROI) analyses and artifact detection, is collected in the RoiToolbox, but there are others as well - some of which were written for particular people, some of which are intended for labwide use.

If you have any code out there you've written and found useful or think other people might find useful, let us know, even if it's designed for your own directory structure. We'd be happy to make it generic and usable by everyone, so the lab can share all the tools people are out there writing.

This is a quick overview of some of the scripts that I know are available; this page will probably be further subdivided as the list grows. For now, this'll do, though.

GlmMask and glm_specmask

These scripts do essentially the same thing; GlmMask is a stripped-down version, where glm_specmask allows a couple other options. They're both intended to avoid a problem with SPM where voxels that have low intensity are dropped from further analysis. These scripts allow the user to explicitly specify a mask image before model estimation - one that includes all the voxels in the brain - so that every voxel in the brain is included in the model. The difference between the two is where that mask image comes from. GlmMask assumes the user already has an appropriate mask image and asks for a preexisting image file; glm_specmask allows the user to specify a preexisting mask, but if he/she doesn't have one, it allows the user to create one based on a particular subject's anatomy.

contrast_creator2, contrast_copier

I'm actually not super familiar with contrast_creator2, but I think these both do essentially the same thing, which is copy individual subjects' contrast images into a central directory and rename them based on the subject they came from, to make doing a lot of group analyses on those images a lot faster - you can pick all the images in a single directory instead of having to navigate to every subject's results directory for every analysis.


A great program written by Kalina Christoff that, given an SPMcfg.mat file, accumulates information about each condition in the experiment (like where it's located in the design matrix, what its onset times are, etc.) and gets it into a matlab structure array. Very useful for writing your own scripts.


The predecessor of our % signal change code now in the RoiToolbox, this is also a Kalina program that extracts percent signal change from the Y.mad file created by SPM99's model estimation. Semi-obsolete now, as information in the Y.mad is debatably reliable, and the filtering in this program has been replaced in later versions.

Gablab Toolbox


In the Gabrieli Lab, one major package of custom programs we've written is the Gablab Toolbox, which is available to anyone working on a Gablab machine (and someday on this here website).

This page has a quick description of what all the scripts in the toolbox do, and links to the full documentation for each script. A couple of the more complicated scripts additionally have their own FAQ pages, which are linked below under their names.

You can always also access the documentation for any script from the ROI Toolbox itself, in the upper right-hand corner. Make sure you're running the graphical version of Matlab, or else you won't be able to get to them. They're also available in the Gablab file structure at /usr/fmri_progs/matlab/spm99/devel/help (and mirrored at /usr/fmri_progs/matlab/spm2/devel/help).

You can run the Toolbox by running spm2-devel (or spm99-6-devel) from a terminal prompt at a Gablab machine (which will bring up Matlab), and then typing "roimod1" at the Matlab prompt. The Toolbox is compatible with Matlab 6.5 (R13) and 6.0 (R12).

Keep in mind that when using the Toolbox with SPM2, the programs that create image files may be subject to SideFlipping problems. Be careful...

Step-by-step directions for using many of these scripts can also be found at RoisHowTos. RoisFaq can also be an invaluable reference for understanding why you want to use many of these scripts, and to what purpose.

Toolbox Overview

Global Variate (artdetect5.m) (full docs: global_variate_readme.txt)

An interactive tool to identify and repair motion-related outlier images in your experiment. The tool displays a plot of global intensity values for each scan, z-scores for each of those intensity values, and plots the realignment movement parameters for each scan, so you can identify scans whose intensity values are way outside the mean and which occurred at the same moment as a large head movement. The tool then allows you to repair the timecourse by replacing outlier scans with a mean functional image or with an interpolated image created from the outlier’s neighboring scans.
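The core of this kind of artifact detection is simple to sketch: z-score the global means, flag scans beyond a threshold, and repair the flagged scans from their neighbors. A hedged toy version on a simulated 1-D global-intensity trace - the real tool operates on whole images and also consults the movement parameters, and the threshold here is arbitrary:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical per-scan global means with two motion-related spikes.
globals_ = rng.normal(1000.0, 2.0, size=100)
globals_[40] += 30.0
globals_[71] -= 25.0

# Z-score the global means and flag extreme scans.
z = (globals_ - globals_.mean()) / globals_.std()
outliers = np.flatnonzero(np.abs(z) > 3.0)

# Repair in the spirit of the tool: replace each outlier with the
# average of its neighbors (interpolation on the trace; the real script
# interpolates whole images, or substitutes a mean functional image).
repaired = globals_.copy()
for i in outliers:
    repaired[i] = 0.5 * (globals_[i - 1] + globals_[i + 1])

print(sorted(outliers.tolist()))
```

The interactive part of the real tool is exactly the judgment this sketch skips: deciding whether a flagged scan coincides with a head movement before repairing it.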

Movement Parameters (plot_move.m) (full docs: movement_parameters_readme.txt)

A simple display tool to look at the realignment movement parameters for a given scan session. Parameters are plotted on two sets of axes; the first displays x,y,z motion for the head in mm, while the second plots pitch, roll, and yaw motion for the head in radians.

Movie of Images (spm_movie.m) (full docs: movie_of_images_readme.txt)

Runs through every image in a given timecourse as a movie, which allows quick viewing of all the scans. Useful to detect bizarre outlier scans that automated methods might miss.

ROI stats (roi_stats.m) (full docs: roi_stats_readme.txt)

Given an ROI .img file and a set of data or beta images to extract from, this function extracts the number of non-masked voxels in the ROI in each image, the average intensity value of all voxels in the ROI from each image, the variance of intensities across all voxels in the ROI from each image, and the min and max intensities in the ROI from each image, and returns a data structure containing vectors of all those values.
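The extraction itself amounts to indexing a volume with a boolean ROI mask. A minimal sketch with made-up data - the actual script reads .img files rather than in-memory arrays:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 4x4x4 image volume and a 2x2x2 ROI mask.
volume = rng.normal(100.0, 10.0, size=(4, 4, 4))
roi = np.zeros((4, 4, 4), dtype=bool)
roi[1:3, 1:3, 1:3] = True

vals = volume[roi]
stats = {
    "n_voxels": int(vals.size),
    "mean": float(vals.mean()),
    "variance": float(vals.var(ddof=1)),
    "min": float(vals.min()),
    "max": float(vals.max()),
}
print(stats["n_voxels"])   # 8 voxels in the 2x2x2 ROI
```

Run over a series of data or beta images, collecting one such record per image gives exactly the vectors of values the function returns.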

ROI Extract (roi_extract.m) (full docs: roi_extract_readme.txt)

Just like ROI stats, but this function only extracts the mean. Given an ROI .tal and a set of data images to extract from, this function extracts the average intensity value across all voxels in the ROI from each image. It optionally writes those values to a text file.

% Signal Change (roi_percent.m) (full docs: percent_signal_change_readme.txt)

This function takes ROI .tal files and a set of data files and extracts the % value that the mean intensity of all the voxels in the ROI differs at each scan from the mean ROI intensity across the whole data set (or for a particular condition). It optionally applies a number of temporal preprocessing options to the data set before the values are extracted, to clean up the values. It can also be set to an individual-voxel mode, in which % signal change is extracted from a single voxel. It writes the values for the whole timecourse out to a text file, as well as average signal change values for each condition of the experiment. The end result is similar to that of RoiDeconvolve (see below), but this program is suitable for block and long event-related designs and less so for rapid event-related designs.
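The basic computation - expressing each scan's mean ROI intensity as a percent deviation from a baseline mean - can be sketched in a few lines, with toy numbers and none of the script's temporal preprocessing:

```python
import numpy as np

# Hypothetical mean-ROI intensity at each scan.
timecourse = np.array([100.0, 101.0, 99.0, 103.0, 100.0, 102.0])

# Baseline here is the whole-timecourse mean; the script can instead use
# the mean for a particular condition.
baseline = timecourse.mean()
pct_change = 100.0 * (timecourse - baseline) / baseline
print(np.round(pct_change, 2).tolist())
```

By construction, percent change relative to the whole-timecourse mean sums to zero across scans - condition-wise averages of these values are what get written out per condition.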

roi_deconvolve (roi_deconvolve.m) (full docs: roi_deconvolve_readme.txt)

This function takes ROI .tal files and a set of data files, and extracts the % value that the mean intensity of all the voxels in the ROI differs at each scan from some selected baseline. It optionally applies a number of temporal preprocessing options to the data set before the values are extracted to clean up the values. It runs that % signal timecourse through a finite impulse response model (see PercentSignalChangeFaq) to re-create peristimulus timecourses for each of your conditions, independent of the contributions of other conditions. Essentially, given the data and a model of the neuronal response (your design matrix), roi_deconvolve re-creates the hemodynamic response for each condition. These timecourses are all written out to text files, and an interactive viewer is created to explore them. The end result is similar to that of RoiPercent (see above), but this program is suitable for rapid event-related designs and less so for block and long event-related designs.
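The finite impulse response step can be sketched as ordinary least squares on a design matrix with one column per post-stimulus time bin, which is what lets overlapping responses from rapid trials be disentangled. A hedged toy version with noiseless simulated data, a made-up "true" response shape, and made-up timings:

```python
import numpy as np

rng = np.random.default_rng(2)

n, fir_len = 200, 10
true_hrf = np.array([0.0, 0.2, 0.6, 1.0, 0.8, 0.5, 0.25, 0.1, 0.05, 0.0])

# Rapid event-related onsets with randomized ISIs of 3-7 scans.
onsets = np.cumsum(rng.integers(3, 8, size=30))
onsets = onsets[onsets < n - fir_len]

# FIR design matrix: column k models the response k scans after each onset.
X = np.zeros((n, fir_len))
for t in onsets:
    for k in range(fir_len):
        X[t + k, k] += 1.0

data = X @ true_hrf                      # noiseless toy timecourse

# Least-squares deconvolution recovers a per-bin peristimulus estimate.
est, *_ = np.linalg.lstsq(X, data, rcond=None)
print(np.allclose(X @ est, data))        # True
```

With multiple conditions, the real model simply concatenates one such bank of FIR columns per condition, which is how each condition's timecourse comes out independent of the others' contributions.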

Display ROIs (display_rois.m) (full docs: display_rois_readme.txt)

This button pops up the standard SPM interactive display screen, with three orthogonal views, and then allows the user to superimpose up to three ROI images on top of the background image in different colors.

Display Slices (display_slices.m) (full docs: display_slices_readme.txt)

This button asks for a set of background and ROI images to display, then asks the user to select what sort of image (structural or blob) each is, as well as the desired orientation for displayed slices and a range to pick slices from; it then pops up a non-interactive multiple-slice viewing window with any ROI images displayed as blobs, suitable for printing.

Render (spm_xbrain.m) (full docs: SpmHowTos or SPM manual)

Activates the SPM render facility to create a rendered image of the brain on which an ROI image may be superimposed; the results are displayed in a non-interactive window suitable for printing.

img2txt, txt2img (roi_list.m, mm2img.m) (full docs: img2txt_readme.txt, txt2img_readme.txt)

These functions are used to change .img files into .tal files and back again. Changing an image into a text file is pretty self-explanatory, but txt2img, used to convert a .tal into a .img, is a little trickier; since the coordinates in .tal files are listed in millimeters, changing that coordinate list into a voxel-based .img file requires knowing what the voxel size and origin coordinates should be. So txt2img requires a “template image” – a .img which defines the space in which the new .img will be made.
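The mm-to-voxel conversion the template supplies can be sketched as follows; the voxel size and origin here are hypothetical stand-ins for what would be read from the template's header, and a real template may also encode rotations, which this ignores:

```python
import numpy as np

# Hypothetical template geometry: 2 mm isotropic voxels, with the mm-space
# origin sitting at voxel index (30, 40, 25).
voxel_size = np.array([2.0, 2.0, 2.0])
origin = np.array([30.0, 40.0, 25.0])

def mm_to_voxel(mm):
    """Convert mm coordinates (as listed in a .tal file) to voxel indices."""
    return np.asarray(mm, dtype=float) / voxel_size + origin

print(mm_to_voxel([0.0, 0.0, 0.0]).tolist())   # the origin itself
```

This is why the template matters: the same mm coordinate lands on a different voxel in any image with a different voxel size or origin.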

mni2tal, tal2mni (mni2talgui.m, tal2mnigui.m) (full docs: mni2tal_readme.txt, tal2mni_readme.txt)

These functions are used to change ROIs, in the form of .tal files, from MNI space to Talairach space and back again. These create new .tal files in the desired output space. The mni2tal function appends _tal to the filename in its translation process; the tal2mni function appends _mni. See Matt Brett's page on MNI space for info on why these are necessary:

Generate Tal ROIs (tal_roi.m) (full docs: generate_tal_rois_readme.txt)

Uses the Talairach Daemon database to generate ROI .img files based on various anatomical landmarks. ROIs can be generated by intersecting or connecting any gyri, Brodmann areas, hemispheres, tissue types, etc. desired – typical results would be “left amygdala” or “intersection of right BA 10 and inferior frontal gyrus.” These .img files are in Talairach space, so before they are directly applied to SPM results, they should be converted into MNI space with tal2mni.

XYZ_rois (roi_xyz.m) (full docs: xyz_rois_readme.txt)

Allows the user to generate cubical ROIs based on specified x, y, and z limits in millimeters. The initial output is a .tal file called “roi.tal,” but the program automatically enters into the txt2img facility to allow creation of a .img file from the roi.tal file.

roi_process (roi_process.m) (full docs: roi_process_readme.txt)

Combines a sequence of steps intended to be applied to .imgs that have come right out of the Talairach Daemon or “Generate Tal ROIs” button – the idea would be to run that to specify anatomical regions of interest, then immediately run roi_process on these raw .imgs to prepare them for SPM. Roi_process takes a set of ROI .img files and converts them into MNI space (by running them through a .img-to-.tal conversion, a Talairach-to-MNI conversion and a .tal-to-.img conversion back to images), then smooths the ROIs with a specified Gaussian kernel and finally truncates them, converting them into black-and-white images suitable for SPM statistical use.

Smooth (spm_smooth_ui.m) (full docs: smooth_readme.txt)

Activates the SPM smoothing facility to allow spatial smoothing of ROI .img files. See SmoothingFaq for more about smoothing.

Truncate (roi_truncate.m) (full docs: truncate_readme.txt)

After an ROI is smoothed, its image intensities likely no longer consist of only ones and zeros; it may also have been enlarged by the smoothing process. Truncation allows the user to select an intensity threshold, then sets all voxels whose values are below the threshold equal to zero and all voxels above the threshold to one. This creates a black-and-white image suitable for use as a mask.
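Truncation is just elementwise thresholding. A minimal sketch on a hypothetical smoothed ROI patch:

```python
import numpy as np

# Hypothetical slice of a smoothed ROI: values now spread between 0 and 1.
smoothed = np.array([[0.05, 0.40, 0.80],
                     [0.20, 0.95, 0.60],
                     [0.00, 0.30, 0.75]])

# Voxels at or above the threshold become 1, the rest 0, restoring a
# black-and-white mask suitable for SPM.
threshold = 0.5
mask = (smoothed >= threshold).astype(np.uint8)
print(int(mask.sum()))   # 4 voxels survive
```

Note the threshold also controls the ROI's final size: a low threshold keeps much of the smoothing-induced enlargement, a high one shrinks the ROI back toward its core.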

Reverse Norm (reverse_norm.m) (full docs: reverse_norm_readme.txt)

Given a .tal file containing ROI coordinates and an “_sn3d.mat” file of the sort output by SPM’s normalization process, this function inverts the normalization parameters used to normalize a particular image and applies the inverted parameters to the .tal file. This can be used to take an ROI in standard, MNI space and convert it to one which is precisely fitted to a particular subject’s anatomy.

Timeseries Explorer (full docs: timeseries_explorer_readme.txt)

An interactive tool which allows the display of intensity and % signal change timecourses from particular ROIs and/or particular voxels; these timecourses can be for a complete experiment, or a given condition or number of conditions, and can be updated interactively to examine the effects of temporal preprocessing on the data. Sort of like % signal change above, but interactive.

MARSBAR (marsbar_wrap.m)

An outside ROI package developed by Matthew Brett and others which contains some of the functionality of this Toolbox as well as a number of other facilities. See the MarsBaR main page for more details.

Generate func ROIs (spm_results.m) (full docs: RoisHowTos or SPM manual)

This button calls the SPM results facility (just as if you’d hit “Results” in SPM), in order to call up activations for a particular contrast and threshold level. The “S.V.C.” button in the results control panel can be used to isolate a particular cluster and save it out as a .tal file. Check out RoisHowTos for instructions on how to define a functional ROI.

SPM Tal Stats (glassbrain.m) (full docs: spm_tal_stats_readme.txt)

This button can only be used if an SPM results window is currently up. If SPM results are being displayed and this button is selected, a file (possibly more than one) is generated which contains the complete list of the coordinate positions of all the activated voxels in the current results, converted into Talairach space – effectively a way to “dump” voxel locations from SPM into a text file.

Tal stats summary (sum_coord.m)

This function summarizes an output file generated by the Talairach Daemon program, giving you a file which tells you how many voxels were in each location.

Display Tal Space (TSU_wrap.m) (full docs:

This function displays selected functional clusters on an illustrated Talairach atlas, allowing them to be lined up precisely with Talairach-space anatomy, and allows them to be rendered into 3-D on the Talairach brain. It was developed by the PET lab at the Institute of the Human Brain in St. Petersburg, Russia.

tbx_roi (tbx_roi_wrap.m) (full docs:

This button calls up Russ Poldrack’s ROI Toolbox, an outside package developed to work with SPM’s functional ROI capabilities. Provides a number of ways to generate multiple functional ROIs and ROIs of different shapes based around functional clusters. See the TBX main page for more details.


If you don't find what you're looking for here, try searching this site with the search box atop the page. Of course, Wikipedia and Google are always great resources. Terms in bold in the definitions are defined elsewhere in this glossary.

A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | Numbers


AC-PC, AC-PC line: Stands for "anterior commissure-posterior commissure." Used to describe the hypothetical line between the anterior commissure (a frontal white matter tract used as the origin of the Talairach coordinate system) and the posterior commissure (another white matter tract in the midbrain). In a brain properly aligned to Talairach space, the line between the AC and the PC is exactly horizontal. Also see The Commissures (Cambridge Imagers).

affine: in fMRI, a term for certain types of spatial transformations which add together linearly. From Mathworld: "An affine transformation is any transformation that preserves collinearity (i.e., all points lying on a line initially still lie on a line after transformation) and ratios of distances (e.g., the midpoint of a line segment remains the midpoint after transformation)." This includes all translations, rotations, zooming, or shearing (think 'squeezing' one end of a square such that it becomes a trapezoid). Importantly, affine transformations affect the whole image; no affine transformation can tweak one local part of an image and leave the rest exactly the same. The first step in SPM's normalization process is affine, generally followed by nonlinear normalization.
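
The midpoint-preserving property from the Mathworld definition is easy to check numerically. A minimal sketch (the particular shear/zoom matrix and translation are arbitrary examples):

```python
import numpy as np

# A 2-D affine transform: a linear part (rotation/zoom/shear)
# plus a translation.
A = np.array([[1.0, 0.3],   # shear in x
              [0.0, 1.2]])  # zoom in y
t = np.array([5.0, -2.0])   # translation

def affine(p):
    return A @ p + t

p1 = np.array([0.0, 0.0])
p2 = np.array([4.0, 2.0])
midpoint = (p1 + p2) / 2

# Affine transforms preserve ratios of distances: the image of the
# midpoint is the midpoint of the images.
assert np.allclose(affine(midpoint), (affine(p1) + affine(p2)) / 2)
```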

AFNI: One of the most widely-used software packages for analyzing fMRI data. Stands for "Analysis of Functional NeuroImages." AFNI is free, written in C, and runs on UNIX-based systems. Originally developed by ?BobCox at the Medical College of Wisconsin, it's still maintained by him and others at the National Institutes of Health. One of the fastest software packages out there, as well as one of the most efficient for data storage. It's also well-supported. Check out our AfniNotes page for more.

AIR: A software package for registering together anatomical and functional brain images of a variety of modalities. Stands for "Automated Image Registration." Used primarily to solve the problem of coregistration, but also has motion correction / realignment capacities. Does not include any tools for statistical analysis of images. Originally developed by Roger Woods and others as one of the first registration tools, and heavily updated since then.

ANALYZE: A 3-D medical image format developed by the Mayo Clinic. In it, each image volume is saved on disk as a pair of files: a .img file which contains the image data, and a .hdr file which contains header information about voxel size, image origin, etc. Both the .img and .hdr file must be present in order for the image to be read. ANALYZE images are the main image system supported by SPM, and also can be read by other packages like AFNI and BrainVoyager.
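
The .hdr/.img pairing can be illustrated with a minimal round-trip in python. This is a sketch following the ANALYZE 7.5 header layout (348-byte header, int16 dimension array starting at byte 40), not a full-featured reader; the filenames are arbitrary:

```python
import struct
import numpy as np

# Write a minimal ANALYZE-style .hdr/.img pair.
dims = (3, 4, 5, 2, 1)                 # ndims, then x, y, z, t extents
data = np.arange(4 * 5 * 2, dtype=np.int16)

hdr = bytearray(348)
struct.pack_into('<i', hdr, 0, 348)               # sizeof_hdr
struct.pack_into('<8h', hdr, 40, *dims, 0, 0, 0)  # dim[8]
struct.pack_into('<h', hdr, 70, 4)                # datatype: 4 = int16
struct.pack_into('<h', hdr, 72, 16)               # bitpix

open('demo.hdr', 'wb').write(bytes(hdr))
data.tofile('demo.img')                # raw voxel data, no header

# Both files must be present to reconstruct the image: the .hdr gives
# the shape, the .img gives the voxels.
hdr_in = open('demo.hdr', 'rb').read()
dim = struct.unpack_from('<8h', hdr_in, 40)
vol = np.fromfile('demo.img', dtype=np.int16).reshape(dim[3], dim[2], dim[1])
```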

anatomical ROI: A region of interest (ROI) in the brain that is constructed from anatomical data (as opposed to functional activation data). Any ROI that is an anatomical structure in the brain - the inferior frontal gyrus, the amygdala, the posterior half of BA 32 - is an anatomical ROI. You can make them with the GablabToolbox and other tools.

anesthetized carp: Subjects in this awesome study - check out PhysiologyPapers for the summary. They just drugged these fish and stuck them straight in an MRI scanner. In tanks of water, of course.

anisotropic, anisotropy: The opposite of isotropic. In other words, not the same size in all directions. Anisotropy (the degree of anisotropic-ness) is one measure used in DTI to determine the direction of white-matter fibers. Smoothing kernels or voxels can also be anisotropic.

ANOVA: Stands for ANalysis Of VAriance. A standard statistical tool used to find differences between the distributions of several groups of numbers. Differs from simpler tests like the t-test in that it can test for differences among many groups, not just two groups. The standard ANOVA model is used in neuroimaging primarily at the group level, to test for differences between several groups of subjects. However, the ANOVA is essentially the same thing as an F-test (test of the F-statistic), which is often used at the individual level as well to test several linear constraints on a model simultaneously. Check out RandomAndFixedEffectsFaq for more info on group testing, and ContrastsFaq for more info on F-tests.
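
A quick sketch of a one-way ANOVA on three made-up groups, using scipy (the group values are arbitrary and chosen to have clearly different means):

```python
import numpy as np
from scipy.stats import f_oneway

# Three groups of scores, e.g. one per subject group.
group_a = np.array([1.0, 2.0, 3.0, 2.0, 1.0])
group_b = np.array([5.0, 6.0, 7.0, 6.0, 5.0])
group_c = np.array([9.0, 10.0, 11.0, 10.0, 9.0])

# One-way ANOVA: tests whether the group means differ, for any
# number of groups (unlike a two-sample t-test).
f_stat, p_value = f_oneway(group_a, group_b, group_c)
```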

AR(1) or AR(1) + w (or AR(2), AR(3), etc.): Terms used to describe different models of autocorrelation in your fMRI data. See autocorrelation below for more info. AR stands for autoregression. AR models are used to estimate to what extent the noise at each time point in your data is influenced by the noise in the time point (or points) before it. The amount of autocorrelation of noise is estimated as a model parameter, just like a beta weight. The difference between AR(1), AR(2), AR(1) + w, etc., is in which parameters are estimated. An AR(1) model describes the autocorrelation function in your data by looking only at one time point before each moment. In other words, only the correlation of each time point to the first previous time point is considered. In an AR(2) model, the correlation of each time point to the first previous time point and the second previous time point is considered; in an AR(3) model, the three time points before each time point are considered as parameters, etc. The "w" in AR(1) + w stands for "white noise." An AR(1) + w model assumes the value of noise isn't solely a function of the previous noise; it also includes a random white noise parameter in the model. AR(1) + w models, which are used in SPM2 and other packages, seem to do a pretty good job describing the "actual" fMRI noise function. A good model can be used to remove the effects of noise correlation in your data, thus validating the assumptions of the general linear model. See TemporalFilteringFaq, and the Purdon & Weiskoff paper in TemporalFilteringPapers, for more info.
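
The AR(1) idea can be made concrete in a few lines of numpy: simulate noise where each point is a fraction of the previous point plus fresh randomness, then estimate that fraction the way an AR(1) model does (this is an illustrative sketch, not any package's estimator):

```python
import numpy as np

rng = np.random.default_rng(0)
n, phi = 5000, 0.8

# Simulate AR(1) noise: each point is phi times the previous point
# plus fresh white noise.
noise = np.zeros(n)
for t in range(1, n):
    noise[t] = phi * noise[t - 1] + rng.standard_normal()

# Estimate the AR(1) parameter by regressing each point on its
# predecessor -- estimated just like a beta weight.
phi_hat = (noise[1:] @ noise[:-1]) / (noise[:-1] @ noise[:-1])
```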

artdetect (artdetect4, artdetect5, etc.): The name for a tool in the GablabToolbox used for ArtifactDetection. It's the function called by the "Global Variate" button in the Toolbox. Used to look for changes in the global fMRI timeseries that seem way "out of whack" and/or correlated with major head movements in your subject, and hence may represent artifacts in your data. Also provides some crude methods for repairing your data by interpolating around or replacing the artifact-laden images.
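
The global-variate idea behind artdetect can be sketched very simply: compute the mean intensity of each volume and flag volumes that are way out of line. This is not the toolbox's actual code, just the core idea with made-up data and an arbitrary z-threshold:

```python
import numpy as np

rng = np.random.default_rng(0)

# Fake 4-D data: 100 volumes of 8x8x8 voxels, with one corrupted volume.
data = 500 + 0.1 * rng.standard_normal((100, 8, 8, 8))
data[42] += 10  # a large, transient artifact

# Global timeseries: mean intensity of each volume.
global_ts = data.reshape(100, -1).mean(axis=1)

# Flag volumes whose global value is far "out of whack".
z = (global_ts - global_ts.mean()) / global_ts.std()
flagged = np.where(np.abs(z) > 3)[0]
```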

artifact: Any transient and/or localized chunk of noise in your fMRI data. Often introduced by problems with the scanner or head movement, but the sources of fMRI artifacts are myriad (metal in the subject, sudden coughing, etc.). St. Paul's Hospital has an interesting list of MRI artifacts in general, with pictures, here. Essentially any noise in the fMRI signal that's localized in either space or time is generally referred to as an artifact. Common artifacts are caused by head motion, physiological motion (cardiac, respiration, etc.), or problems in the scanner itself. Check out ArtifactDetection for info on how to find them, and RealignmentFaq and PhysiologyFaq for info on what to do about motion artifacts.

autocorrelation (function, correction, etc.): One major problem in the statistical analysis of fMRI data is the shape of fMRI noise. Analysis with the general linear model assumes each timepoint is an independent observation, implying the noise at each timepoint is independent of the noise at the next timepoint. But several empirical studies have shown that in fMRI, that assumption's simply not true. Instead, the amount of noise at each timepoint is heavily correlated with the amount of noise at the timepoints before and after. fMRI noise is heavily "autocorrelated," i.e., correlated with itself. This means that each timepoint isn't an independent observation - the temporal data is essentially heavily smoothed, which means any statistical analysis that assumes temporal independence will give biased results.

The way to deal with this problem is pretty well-established in other scientific domains. If you can estimate what the autocorrelation function is - in other words, what, exactly, is the degree of correlation of the noise from one timepoint to the next - then you can remove the amount of noise that is correlated from the signal, and hence render your noise "white," or random (rather than correlated). This strategy is called pre-whitening, and is referred to in some fMRI packages as autocorrelation correction. The models used to do this in fMRI are mostly AR(1) + w models, but sometimes more complicated ones are used. See BasicStatisticalModelingFaq for more info on autocorrelation correction.
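
A toy demonstration of pre-whitening, assuming the AR(1) parameter is already known (in practice it has to be estimated first; this sketch is not any package's implementation):

```python
import numpy as np

rng = np.random.default_rng(1)
n, phi = 5000, 0.8

# AR(1)-correlated ("colored") noise.
noise = np.zeros(n)
for t in range(1, n):
    noise[t] = phi * noise[t - 1] + rng.standard_normal()

def lag1_corr(x):
    x = x - x.mean()
    return (x[1:] @ x[:-1]) / (x @ x)

# Pre-whitening: subtract out the predictable part of each point,
# leaving (approximately) white residuals.
white = noise[1:] - phi * noise[:-1]

before, after = lag1_corr(noise), lag1_corr(white)
```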


B-spline, B-spline interpolation: A type of spline which is the generalization of the Bezier curve. Don't know what I'm talking about? Neither do I. The nice folks at MathWorld have this to say about them: B-Spline. Essentially, though, a B-spline is a type of easily describable and computable function which can take many locally smooth but globally arbitrary shapes. This makes them very nice for interpolation. SPM2 has ditched sinc interpolation in all of its resampling/interpolation functions (like normalization or coregistration - anything involving resampling and/or reslicing). Instead, it's now using B-spline interpolation, improving both computational speed and accuracy.
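
A small example of B-spline interpolation with scipy, showing the property that makes it useful for resampling: the interpolant is locally smooth but passes exactly through the original samples (the sampled signal here is arbitrary):

```python
import numpy as np
from scipy.interpolate import make_interp_spline

# A coarsely sampled signal.
x = np.arange(10, dtype=float)
y = np.sin(x)

# Cubic B-spline interpolation: smooth between samples, exact at them.
spline = make_interp_spline(x, y, k=3)
fine_x = np.linspace(0, 9, 91)
fine_y = spline(fine_x)
```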

band-pass filter: The combination of a high-pass filter and low-pass filter. Band-pass filters only allow through a certain "band" of frequencies, while attenuating or knocking out everything outside that band. A well-designed band-pass filter would be great for fMRI experiments, because fMRI experiments generally have most of their frequencies in a certain band that's separable from the frequencies of fMRI noise. So if you could focus a band-pass filter on your experimental frequencies, you could knock out almost all of your noise. In practice, though, it's tricky to design a really good band-pass filter, and since most of the noise in fMRI is low-frequency, using only a high-pass filter works almost as well as band-pass filtering.
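
A sketch of band-pass filtering with scipy, keeping an in-band "task" frequency while knocking out slow drift. The TR, band edges, and signal frequencies are illustrative choices, not recommendations:

```python
import numpy as np
from scipy.signal import butter, filtfilt

tr = 2.0                      # seconds per volume
fs = 1.0 / tr                 # sampling rate in Hz
t = np.arange(500) * tr

# In-band "task" signal plus slow, out-of-band drift.
task = np.sin(2 * np.pi * 0.05 * t)
drift = np.sin(2 * np.pi * 0.002 * t)
signal = task + drift

# 4th-order Butterworth band-pass keeping 0.01-0.1 Hz, applied
# forward and backward for zero phase shift.
b, a = butter(4, [0.01, 0.1], btype='bandpass', fs=fs)
filtered = filtfilt(b, a, signal)
```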

baseline: A) The point from which deviations are measured. In a signal measure like % signal change, the baseline value is the answer to, "Percent signal change from what?" It's the zero point on a % signal change plot.

B) A condition in your experiment that's intended to contain all of the cognitive tasks of your experimental condition - except the task of interest. In fMRI, you generally can only measure differences between two conditions (not anything absolute about one condition). So an fMRI baseline task is one where the person is doing everything you're not interested in, and not doing the thing you're interested in. This way you can look at signal during the baseline, subtract it from signal during the experimental condition, and be left with only the signal from the task of interest. Designing a good baseline is crucially important to your experiment. Resting with the eyes open is a common baseline for certain types of experiment, but inappropriate for others, where cognitive activity during rest may corrupt your results. In order to get good estimates of the shape of your HRF, you need to have a baseline condition (as opposed to several experimental conditions). Check out DesignFaq for more.
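
Sense (A) of baseline is just arithmetic: percent signal change is always measured relative to some baseline value. A tiny sketch with made-up numbers:

```python
import numpy as np

# A toy voxel timeseries: baseline scans and task scans.
baseline_scans = np.array([500.0, 502.0, 498.0, 500.0])
task_scans = np.array([510.0, 512.0, 508.0, 510.0])

# "Percent signal change from what?" -- here, from the mean of the
# baseline scans, which becomes the zero point of the plot.
base = baseline_scans.mean()
psc = 100.0 * (task_scans - base) / base
```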

basis function: One way to look for fMRI activation in the brain is to assume you know the exact shape of the HRF, and look for signals that match that shape. This is the most common way to analyze fMRI data. It suffers, though, in the case where the HRF may not be exactly the same shape from one subject, one region, or even one task, to the next - which we know is true to some degree. Another way is to assume you know nothing about the shape of the HRF and separately estimate its value at every timepoint at every voxel. This is a finite impulse response model, and it's more common these days. But it suffers because it gives up many degrees of freedom in order to estimate a ton of parameters. A third way is to assume you know something about the shape of the response - maybe something as simple as "it's periodic," or something as complicated as "it looks kind of like one of these three or four functions here." This is the basis function approach, and the basis functions are the things you think "look" kind of like the HRF you want to estimate. They could be sines or cosines of different periods, which assumes very little about the shape except its periodicity, or they could be very-HRF looking things like the temporal and dispersion derivatives of the HRF. The basis function approach is kind of a middle way between the standard analysis and the FIR model. You only estimate parameters for each of your basis functions, so you get more power than the FIR model. But you aren't assuming you know the exact shape of your HRF, so you get more efficiency and flexibility than the standard analysis. You allow the HRF to vary somewhat - within the space defined by your basis functions - from voxel to voxel or condition to condition, but you still bring some prior knowledge about the HRF to bear to help you. Check out DesignFaq and HrfFaq for more info on the basis function approach.

batch, batch script: Analysis programs with graphical interfaces are nice. But sometimes you don't want to have to push sixteen buttons and type in fourteen options to have to analyze every individual subject in your experiment. It takes a bunch of your time, and you'll probably screw it up and have to start over at some point. So many programs - SPM, AFNI, BrainVoyager - offer a "batch mode," where you can enter in the options you'd like in some sort of scripting language and then just set it to run the program in an automated function, according to the instructions in your batch script. If you'd like to write your own batch scripts, the Gablab has some nice templates for SPM available. Check out MatlabProgramming for more on scripting.

BayleShanks: The extremely nice man who set up the programming for this Wiki site for us a couple years back.

beta image: Also called a parameter image. It's a voxel-by-voxel summary of the beta weight for a given condition. Usually it's written as an actual image file or sub-dataset, so you could look at it just like a regular brain image, exploring the beta weight at each voxel. In SPM, you get one of these written out for every column in your design matrix - one for each experimental effect for which you're estimating parameter values.

beta weights: Also called parameter weights, parameter values, etc. This is the value of the parameter estimated for a given effect / column in your design matrix. If you think of the general linear model as a multiple regression, the beta weight is the slope of the regression line for this effect. The parameter gets its name as a "beta" weight from the standard regression equation: Y = BX + E. Y is the signal, X is the design matrix, E is error, and B is a vector of beta weights, which estimate how much each column of the design matrix contributes to the signal. Beta weights can be examined, summed, and contrasted at the voxel-wise level for a standard analysis of fMRI results. They can also be aggregated across regions or correlated between subjects for a more region-of-interest-based analysis. Check out RoisFaq for more info on beta weights and ROIs.
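
The regression equation Y = BX + E can be solved directly with least squares. A noiseless sketch (the design matrix and true betas are made up, so the estimates should recover them exactly):

```python
import numpy as np

rng = np.random.default_rng(0)

# Design matrix X: 100 timepoints, two effect columns plus a constant.
n = 100
X = np.column_stack([rng.random(n), rng.random(n), np.ones(n)])

# Simulate one voxel's signal from known beta weights (no noise term,
# so estimation is exact).
true_betas = np.array([2.0, -1.5, 100.0])
Y = X @ true_betas

# Beta weights: the least-squares estimate of B in Y = XB + E --
# the "slope of the regression line" for each effect.
betas, *_ = np.linalg.lstsq(X, Y, rcond=None)
```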

block design: A type of experiment in which different types of trials are not intermixed randomly, but rather happen in blocks. So you might have 30 seconds in a row of condition A, followed by 30 seconds of condition B, followed by 30 seconds of A again, etc. Used even with shorter trials - that 30 seconds might be looking at a single flashing checkerboard, or it might be six trials of faces to look at. Block designs were the earliest type of design for fMRI and PET, and remain among the simpler designs to analyze and interpret. They have very high power, because the summing of HRF responses across repeated trials means you can often get higher peaks of activation during a block than for isolated shorter trials. They suffer from very low efficiency (ability to estimate the shape of the HRF). Check out DesignFaq, especially Tom Liu's papers, for the tradeoffs between block designs and other designs.

BOLD (or blood oxygen level-dependent) signal, BOLD response: The type of signal that fMRI actually measures. Check out FmriPhysicsFaq for a primer on fMRI signal, but the nutshell version is this: When neurons fire (or increase their firing rate), they use up oxygen and various nutrients. The brain's circulatory system responds by flooding the firing region with more highly-oxygenated blood than it needs. The effect is that the blood oxygen level in the activated region increases slightly. Oxygenated blood has a slightly different magnetic signature than de-oxygenated blood, due to the magnetic characteristics of hemoglobin. So with the right pulse sequence, an MRI scanner can detect this difference in blood oxygen level. The signal that is thus read in fMRI is called BOLD, or blood oxygen level-dependent. MRI can be used to measure other things in the brain as well - perfusion being among them - but BOLD signal is the primary foundation of most fMRI research. Check out FmriPhysicsFaq for more details.

bootstrapping: A statistics method used when you have to test a distribution without knowing much about its true underlying variance or mean or anything. The skeleton of the method is essentially to build up a picture of the possible space of the distribution by re-shuffling the elements it's made up of to form new, random distributions. Bootstrapping is widely used in many quantitative scientific domains, but it's only recently become of interest in neuroimaging analysis. Some papers have argued that under certain conditions, bootstrapping and other nonparametric ways of testing hypotheses make the most sense to test statistical hypotheses in fMRI. Permutation testing is the neuroimaging concept most related to bootstrapping, and it's explored in PthresholdFaq by Nichols and Holmes.
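
The related permutation-test idea - re-shuffling labels to build up a null distribution - fits in a few lines of numpy. A sketch with made-up groups (not the Nichols & Holmes implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

group_a = np.array([5.1, 5.3, 4.9, 5.2, 5.0, 5.4])
group_b = np.array([6.0, 6.2, 5.9, 6.1, 6.3, 5.8])
observed = group_a.mean() - group_b.mean()

# Permutation test: reshuffle the group labels many times to see how
# big a mean difference chance alone produces.
pooled = np.concatenate([group_a, group_b])
n_perm = 2000
count = 0
for _ in range(n_perm):
    rng.shuffle(pooled)
    diff = pooled[:6].mean() - pooled[6:].mean()
    if abs(diff) >= abs(observed):
        count += 1
p_value = count / n_perm
```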

BrainVoyager: A widely-used software package for fMRI analysis. The latest version, BrainVoyager QX, is written in C++ and runs natively on Windows, UNIX-based systems, and Mac OS. The package was originally developed by ?RainerGoebel, and is now actively developed by Brain Innovation B.V. BrainVoyager has perhaps the slickest interface of any of the major analysis packages, as well as having nicer support for surface mapping and inflation analysis techniques than most packages. It's unfortunately not free, but it is available at Stanford in the Brain Imaging Analysis Center.

Brodmann area: An area of the brain that is distinct at the cytoarchitectonic (cellular) level from those around it. There are 52 Brodmann areas, originally defined by Korbinian Brodmann. Many of them map onto various distinct anatomical structures, but many also simply subdivide larger gyri or sulci. Mark Dubin at the University of Colorado has a great map of the areas: Brodmann map. They are often used as anatomical ROIs, but be careful: they have significant variability from person to person in location and function. It's not clear how well functional activation maps onto most Brodmann areas. See RoisFaq for more.


callback: The name given by Matlab to any function or script that is called by pressing a button or clicking a radio button or doing anything else in a Matlab graphical interface. A uicontrol callback is basically synonymous, since almost all callbacks are called from uicontrols. If a callback creates an error, a lot of times Matlab will spit out a totally uninformative "uicontrol callback" error, which only tells you something went wrong during the running of the callback. Often, though, you can coax more info about the error out of Matlab. See ?CommonErrors and MatlabDebugging for more.

canonical HRF: A model of an "average" HRF. Intended to describe the shape of a generic HRF; given this shape and the design matrix, an analysis package will look for signals in the fMRI data whose shape matches the canonical HRF. The different analysis packages (SPM, AFNI, BrainVoyager, etc.) use slightly different canonical HRFs, but they all share the same basic features - a gradual rise up to a peak around six seconds, followed by a more gradual fall back to baseline. Some programs model a slight undershoot; some don't. See HrfFaq for more.
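
One common family of canonical HRFs is the double-gamma shape: a gamma-function peak minus a smaller, slower gamma undershoot. The sketch below uses the commonly cited SPM-style defaults (peak delay 6, undershoot delay 16, ratio 1/6); it is an approximation for illustration, not spm_hrf itself:

```python
import numpy as np
from scipy.stats import gamma

# Double-gamma HRF: rise to a peak around 5-6 s, then a small undershoot.
t = np.arange(0, 32, 0.1)          # time in seconds
hrf = gamma.pdf(t, 6) - gamma.pdf(t, 16) / 6.0
hrf /= hrf.max()                   # normalize peak to 1

peak_time = t[np.argmax(hrf)]
```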

canonical single-subject: see single-subject canonical.

chronometry: A technique in psychology in which the experimenter tries to figure out something about the processes underlying a task by the time taken to do the task and various portions of it. Some of the original chronometric experiments were done with reaction times, having subjects do various stages of an experiment to see whether some parameter might vary the reaction time for one stage and not another. Chronometric experiments have just started cropping up in fMRI. They attempt to determine not just the location of activations, but their sequence as well. This is generally done by getting an extremely accurate estimate of the shape of the HRF and exactly when it begins during the task. See MentalChronometryFaq for more.

cluster: A group of active voxels that are all adjacent, without any breaks. Clusters may include holes, but there has to be a contiguous link (vertical, horizontal or diagonal) from any voxel in the cluster to any other voxel in the cluster. Clusters are often taken to represent a set of neurons all involved in some single computation. They can also serve as the basis for functional ROIs.
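
Finding clusters in a thresholded map is a connected-components problem; scipy can do it directly. A sketch using 26-connectivity, i.e. vertical, horizontal, or diagonal contact (the toy volume is made up):

```python
import numpy as np
from scipy import ndimage

# A thresholded 3-D activation map: 1 = active voxel.
active = np.zeros((5, 5, 5), dtype=int)
active[0, 0, 0] = active[1, 1, 1] = 1   # diagonal neighbors: one cluster
active[4, 4, 4] = 1                     # isolated voxel: its own cluster

# 26-connectivity: any face, edge, or corner contact joins voxels.
structure = np.ones((3, 3, 3), dtype=int)
labeled, n_clusters = ndimage.label(active, structure=structure)
```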

colored noise: Noise that is totally random and uncorrelated from point to point is called white noise. So as a play on that phrase, the term colored noise is used to describe any noise that isn't totally random - noise in which the value of one point is at least partially predictable by knowing the value of other noise points. fMRI noise is, in general, heavily colored to start, which poses a problem for standard analyses which assume noise is white. Check out BasicStatisticalModelingFaq for more info.

coloring: Noise that's not white but colored noise poses a problem if you're assuming white noise. On the other hand, if you know exactly how your colored noise differs from white noise, then you could conceivably just remove the effect of the coloration by modeling it away. So having noise with a known "color" is almost as good as having pure white noise. Earlier in fMRI analysis history, when the true colors (ha!) of fMRI noise weren't known well, one thought about dealing with the unknown correlation in the noise was to swamp it with known correlations. In other words, if you colored the noise yourself, then you'd know almost exactly what color it was, and you wouldn't have to worry so much about what color it started as (to extend this metaphor to the breaking point). This was part of the strategy behind low-pass filtering in SPM99 and earlier - to impose a well-described correlational structure on the fMRI signal, and hence overwhelm any unknown structure. Unfortunately, this strategy worked very badly, and is generally not used any longer for fMRI analysis. Check out TemporalFilteringFaq for more on the whole debate.

coregistration: The process of bringing two brain images into alignment. Ideally, you'd like them lined up so that their edges line up and the point represented by a given voxel in one image represents the same point in the other image. Coregistration generally refers specifically to the problem of aligning two images of different modalities - say, T1 MRI images and PET images, or anatomical MRI scans and functional MRI scans. It goes for some of the same goals as realignment, but it generally uses different algorithms to make it more robust. See CoregistrationFaq for more.

con image, contrast image: A voxel-by-voxel summary of the value of some contrast you've defined. This is often created as a voxel-by-voxel weighted sum of beta images, with the weights given by the value of the contrast vector. In SPM, it's actually written out as a separate image file; in other programs, it's usually written as a separate sub-bucket or the equivalent. It shouldn't be confused with the statistic image, which is a voxel-by-voxel map of the test statistic associated with each contrast value. (In SPM, those statistic images are labeled spmT or spmF images.) Only the contrast images - not the statistic images - are suitable for input to a second-level group analysis. See ContrastsFaq for more info on contrasts, and RandomAndFixedEffectsFaq for more info on group analyses.

conjunction analysis: A way of combining contrasts, to look for activations that are shared between two conditions as opposed to differing between two conditions. It's implemented in SPM and other packages as essentially a logical AND-ing of contrasts - a way of looking for all the areas that are active in both one contrast and another. It's tricky to implement at the group level, though. Look at ContrastsFaq for more info, and possibly RandomAndFixedEffectsFaq as well.

contrast: The actual signal in fMRI data is unfortunately kind of arbitrary. The numbers at each voxel in your functional images don't have a whole lot of connection to any physiological parameter, and so it's hard to look at a single functional image (or set of images) and know the state of the brain. On the other hand, you can easily look at two functional images and see what's different between them. If those functional images are taken during different experimental conditions, and the difference between them is big enough, then you know something about what's happening in the brain during those conditions, or at least you can probably write a paper claiming you do. Which is good! So the fundamental test in fMRI experiments is not done on individual signal values or beta weights, but rather on differences of those things. A contrast is a way of specifying which images you want to include in that difference. A given contrast is specified as a vector of weights, one for each experimental condition / column in your design matrix. The contrast values are then created by taking a weighted sum of beta weights at each voxel, where the weights are specified by the contrast vector. Those contrast values are then tested for statistical significance in a variety of ways. Check out ContrastsFaq for more info on contrasts in fMRI.
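
The "weighted sum of beta weights at each voxel" is a single matrix product. A sketch with made-up beta images (flattened to four voxels each) and an "A minus B" contrast vector:

```python
import numpy as np

# Beta images for three design-matrix columns, flattened to 4 voxels each.
betas = np.array([[2.0,   1.0,   0.5,   3.0],    # condition A
                  [1.0,   1.0,   2.0,   0.5],    # condition B
                  [100.0, 100.0, 100.0, 100.0]]) # constant term

# Contrast vector for "A minus B", ignoring the constant.
c = np.array([1.0, -1.0, 0.0])

# The contrast image is the voxelwise weighted sum of the beta images.
con_image = c @ betas
```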

cutoff period: The longest length of time you want to preserve with your high-pass filter. A high-pass filter attenuates low frequencies, or slow oscillations; everything that repeats with a period slower than two minutes, say, you might reject as being clearly unrelated to your experiment. The cutoff period would be two minutes in the above example; it's the longest length of time you could possibly be interested in for your experiment. You generally want to set it to be wayyy longer than an individual trial or block, but short enough to knock out most of the low-frequency noise. See TemporalFilteringFaq for more.
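
One way to implement a high-pass filter with a cutoff period is to regress out a discrete cosine basis covering all periods longer than the cutoff, which is in the spirit of SPM's high-pass filtering (the basis-count formula below is a simplified assumption, and the drift/task signals are made up):

```python
import numpy as np

n, tr = 200, 2.0            # 200 scans, TR = 2 s
cutoff = 128.0              # cutoff period in seconds
t = np.arange(n)

# Cosine regressors for every period longer than the cutoff, plus a
# constant column.
n_basis = int(2 * n * tr / cutoff)
dct = np.column_stack(
    [np.ones(n)] +
    [np.cos(np.pi * k * (2 * t + 1) / (2 * n)) for k in range(1, n_basis + 1)]
)

# A slow linear drift plus a faster "task" signal.
drift = np.linspace(0, 1, n)
task = np.sin(2 * np.pi * t / 10)
y = task + drift

# High-pass filter: remove whatever the low-frequency basis can explain.
resid = y - dct @ np.linalg.lstsq(dct, y, rcond=None)[0]
```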

cytoarchitectonic: Relating to the look/type/architecture of individual cells. Not all neurons look exactly the same, and they're not all organized in exactly the same way throughout the brain. You can look in the brain and find distinct places where the "type" of neuron changes from one to another. You might theorize that a cell-level architecture difference might relate to some difference in the functions subserved by those cells. That's exactly what Brodmann theorized, and his Brodmann areas are based on cytoarchitectonic boundaries he found in the brain. Check out RoisFaq for how cytoarchitectonic differences can be used.


DCM: see dynamic causal modeling (DCM).

deconvolution: A mathematical operation in which the values from one function are removed from the values of another. In fMRI, where the signal is generally interpreted to be the result of a neuronal timeseries (which is modeled by the design matrix) convolved with a hemodynamic response function (which is modeled by a canonical HRF, basis functions, or a finite impulse response model), the operation is usually used to separate the contributions of those two functions. The AFNI function 3dDeconvolve (and its Matlab port, RoiDeconvolve) takes as input a design matrix and comes up with an estimate of the HRF. SPM's psychophysiological interaction function attempts to model the interaction of neuronal timeseries (as opposed to fMRI timeseries) by first deconvolving the canonical HRF and then checking the interaction at the neuronal, rather than hemodynamic, level.
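
The FIR-style deconvolution that 3dDeconvolve performs can be sketched as ordinary regression on lagged copies of the stimulus. This is the idea, not AFNI's implementation; the stimulus and "true" response shape are simulated, and with noiseless data the estimate recovers the response exactly:

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_lags = 200, 8

# A random event timeseries and a "true" response shape.
stim = (rng.random(n) < 0.15).astype(float)
true_hrf = np.array([0.0, 0.3, 0.8, 1.0, 0.7, 0.3, 0.1, 0.0])

# The data are the stimulus convolved with the response.
y = np.convolve(stim, true_hrf)[:n]

# Deconvolution: a design matrix of lagged copies of the stimulus,
# solved for the response value at each lag (an FIR model).
X = np.zeros((n, n_lags))
for lag in range(n_lags):
    X[lag:, lag] = stim[:n - lag]
est_hrf, *_ = np.linalg.lstsq(X, y, rcond=None)
```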

design matrix: A model of your experiment and what you expect the neuronal response to it to be. In general represented as a matrix (funnily enough), where each row represents a time point / TR / functional image and each column represents a different experimental effect. It becomes the model in a multiple regression, following the vector equation: Y = BX + E. Y is a vector of length a (equal to nframes from the scanner), usually representing the signal from a single voxel. B is a vector of length b, representing the effect sizes for each of b experimental conditions. E is an error vector the same length as Y. X is your design matrix, of size a x b. Check out BasicStatisticalModelingFaq for more...

detrend, detrending: There are multiple sources of noise in fMRI - head movement, transient scanner noise, gradual warming of the RF coils, etc. Many of them are simple, gradual changes in signal over the course of the session - a drift that can be linear, quadratic, or some higher polynomial that has very low frequency. Assuming that you don't have any experimental effect that varies linearly over the whole experiment, then, simply removing any very low-frequency drifts can be a very effective way of knocking out some noise. Detrending is exactly that - the removal of a gradual trend in your data. It often refers simply to linear detrending, where any linear effect over your whole experiment is removed, but you can also do a quadratic detrending, cubic detrending, or something else. Studies have shown that you're not doing much good after a quadratic detrending - most of the gradual noise is modeled well by a linear and/or quadratic function.
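
Quadratic detrending is just a polynomial fit and subtraction. A sketch with a simulated signal plus a slow quadratic drift (all values made up):

```python
import numpy as np

t = np.linspace(0, 1, 300)
task = np.sin(2 * np.pi * 10 * t)     # faster signal of interest
drift = 0.5 * t + 2.0 * t ** 2        # slow linear + quadratic drift
y = task + drift

# Quadratic detrending: fit a 2nd-order polynomial over the whole run
# and subtract it, leaving the faster signal largely intact.
coeffs = np.polyfit(t, y, 2)
detrended = y - np.polyval(coeffs, t)
```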

Diffusion Tensor Imaging: A relatively new technique in MRI that highlights white matter tracts rather than gray matter. It can be used to derive maps showing the prevailing direction of white matter fibers in a given voxel, which has given rise to a good deal of interest in using it to derive connectivity data. Check out ConnectivityFaq for more...

dispersion derivative: The derivative with respect to the dispersion parameter in a gamma function. In SPM, the dispersion derivative of the canonical HRF looks a lot like the HRF but can be used as a basis function, to model some uncertainty in how wide you expect the HRF to be at each voxel.

drift, drifts: Some noise in an fMRI signal that is extremely gradual, usually varying linearly or quadratically over the course of a whole run of the scanner. This noise is usually called a drift, or a scanner drift. Sources of drifts are generally from the scanner - things like gradual warming of the magnet, gradual expansion of some physical element, etc. - but can also come from the subject, as in a gradual movement of the head downwards. Drifts often comprise a substantial fraction of the noise in a session, and can often be substantially removed by detrending.

dropout: The fMRI signal is contingent on having an extremely even, smooth, homogeneous background magnetic field and a precisely calculated gradient field. If anything distorts the background field or the gradient field in a localized fashion, the signal in that region can drop to almost nothing due to the distortions. This is called dropout or signal dropout. This is most common in regions of high susceptibility - brain regions near air/tissue interfaces, where the differing magnetic signatures of the two materials cause major local distortions. In those regions, it's difficult to get much signal from the scanner, and signal-to-noise ratio shrinks drastically, meaning it's hard to find activations there. A good deal of research has been done to ameliorate dropout; recently, it's been shown that spiral in-out imaging does a pretty good job avoiding dropout in the traditionally bad regions. See ScanningFaq for more, and stay tuned for more studies on the subject.

DTI: See Diffusion Tensor Imaging (DTI).

dynamic causal modeling: A new statistical analysis technique for making inferences about functional connectivity. Designed primarily by ?KarlFriston, it's included in SPM2 and (currently) no other software package. It allows the user to specify a small set of functional ROIs and a design matrix, and then given some data, produces a set of connectivity parameters. These parameters include both a "default" measure of connectivity between the ROIs, as well as a dynamic measure of how that connectivity changed across the experiment - specifically, whether any experimental effect changed the connectivity between regions. Has been used, for example, to investigate whether category effects in vision are modulated by bottom-up or top-down pathways. See ConnectivityFaq and ConnectivityPapers for much more.


EEG: Stands for electroencephalogram. A neuroimaging technique in which electrodes are pasted to the skull to directly record the electrical oscillations caused by neuronal activity - sometimes called "brain waves". Allows the recording of electrical activity at millisecond resolution, far better than PET or fMRI, but suffers from a lack of regional specificity, as it's extremely difficult to tell where in the brain a given EEG signal originated. The exact nature of the neuronal activity that gives rise to the EEG signal is not entirely clear, but active efforts are underway at several facilities to combine EEG and fMRI to try and get excellent spatial and temporal resolution in the same experiment. See also ERP below.

effective connectivity: A term introduced by ?KarlFriston in order to highlight the difference between "correlational" methods of inferring brain connectivity and the actual concept of causal connection between brain areas. The distinction made is one between correlation and causation. Effective connectivity (EC) stands in contrast to functional connectivity (FC), which goes more with correlation. EC between brain areas is defined as "the influence one neural system exerts over another either directly or indirectly." It doesn't imply a direct physical connection - simply a causative influence. It's a lot harder to establish that two regions are effectively connected than it is to establish that they're functionally connected, but EC supports more interesting inferences than FC does.

efficiency: A statistical concept in experimental design, used to describe how accurately one can model the shape of a response. It's at the other end of a tradeoff with power, which is used to describe how well you can detect any effect at all. Block experiments are very low in efficiency; because the trials come on top of each other, it's difficult to tell how much signal comes from one trial and how much from another, so the shape is muddled. Fully-randomized event-related experiments have high efficiency; you can sample many different points of the HRF and know exactly which HRF you're getting. Experiments that have very high power must necessarily have lower efficiency - you can't be perfect at both. This tradeoff is explored well by Tom Liu's papers in DesignPapers. Check out DesignFaq for more on the efficiency/power tradeoff. Also check out JitterFaq for how to maximize efficiency in your experiment.

epoch: Synonymous with block in block design experiments. Epoch-related and block design experiments are the same thing. Epoch is just a fancier name for some extended block of trials or slightly longer (i.e., non-instantaneous) trial. SPM used to use different canonical HRFs for epochs and events (instantaneous trials), but SPM2 did away with that. Now they've gone to the model used by most other packages, where the same canonical HRF is used for zero-length trials and longer trials - it's just scaled slightly.

EPI (or echo-planar imaging): A type of pulse sequence in which lines of k-space are sampled in order. This is the more conventionally-used pulse sequence around the world, and has some advantages over other sequences, being slightly easier to analyze and pretty fast. It is quite susceptible to various artifacts and distortions, though, and at Stanford various spiral sequences are preferred. Check out ScanningFaq for more. ScanningPapers also has a nice handout with some info.

ERP (or event-related potential): A variation on EEG in which you focus not on the ongoing progression of activity, but rather electrical activity in response to a particular stimulus (or lack thereof). Instead of looking at a whole EEG timecourse or frequency spectrum, you take a small window of time (1 second, say) after each presentation of an A trial, and average those windows together to get the average response to your A stimulus. This creates a peristimulus timecourse, not unlike that for an HRF in fMRI. You can then compare the time-locked average from one condition to that from another condition, or analyze a single time-locked average for its various early and late components. ERPs and the advent of event-related designs in fMRI allow the same designs to be used in both EEG and fMRI, presenting the promise of combining the two into one super-imaging modality which will grow out of control and destroy us all. Or not.

event-related design: An experimental design in which different trial types are intermixed throughout the experiment, usually in random or pseudo-random fashion. Contrasts with a block design, where trials of the same type are collected into chunks. Event-related designs sacrifice power in exchange for higher efficiency, as well as psychological unpredictability, which allow new kinds of paradigms in fMRI. Check out DesignFaq for way more about event-related designs, and JitterFaq for why randomization is all the rage amongst the kiddies.


F-contrast: A type of contrast testing an F-statistic, as opposed to a t-statistic or something else. Allows you to test several linear constraints on your model at once, joining them in a logical OR. In other words, it would allow you to test the hypothesis that A and B are different OR A and C are different OR B and C are different at a given voxel. Another way of describing that would be to say you're testing whether there are any differences among A, B and C at all. F-contrasts can be tricky (if not impossible) to bring forward to a random-effects group analysis. See ContrastsFaq and RandomAndFixedEffectsFaq for more.

False Discovery Rate: A statistical concept expressing the fraction of accepted hypotheses in some large dataset that are false positives. Benjamini & Hochberg developed a procedure in 1995 to tweak your statistical threshold over a large dataset to control the false discovery rate (FDR), a procedure which Genovese et al. imported to neuroimaging in 2002. The idea in controlling FDR instead of family-wise error is that you accept the near-certainty of a small number of false positives in your data in exchange for a more liberal, flexible, reasoned correction for multiple comparisons. Since most researchers accept the likelihood of a small amount of false positives in fMRI data anyways, FDR control seems like an idea whose time may have arrived in neuroimaging. AFNI, SPM and BrainVoyager all allow FDR thresholding now. Check out PthresholdFaq for more, and PthresholdPapers for links to the FDR papers...
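The Benjamini-Hochberg step-up procedure itself is short enough to sketch in numpy (a simplified illustration; real packages handle ties and dependence structure more carefully):

```python
import numpy as np

def fdr_threshold(pvals, q=0.05):
    """Benjamini-Hochberg step-up procedure (simplified sketch):
    returns the largest p-value threshold controlling FDR at level q,
    or 0.0 if no test survives."""
    p = np.sort(np.asarray(pvals))
    n = len(p)
    # find the largest rank k (1-based) with p(k) <= (k/n) * q
    below = p <= (np.arange(1, n + 1) / n) * q
    if not below.any():
        return 0.0
    return p[below.nonzero()[0].max()]

# Made-up p-values from seven "voxels":
pvals = [0.001, 0.008, 0.039, 0.041, 0.27, 0.6, 0.9]
thresh = fdr_threshold(pvals, q=0.05)
```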

family-wise error correction: In a dataset of tens of thousands of voxels, how do you decide on a statistical threshold for true activation? The scientific standard of setting the statistic such that p < 0.05 isn't appropriate on the voxel level, since with tens of thousands of voxels you'd be virtually guaranteed hundreds of false positives - voxels whose test statistic was highly improbable just by chance. So you'd like to correct for multiple comparisons, and you'd like to do it over the whole data set at once - correcting the family-wise error. Family-wise error correction methods allow you to set a global threshold for false positives; if your family-wise threshold is p < 0.05, you're saying there's a 95% chance there are NO false positives in your dataset. There are several accepted methods to control family-wise error: Bonferroni, various Bonferroni-derived methods, Gaussian random fields, etc. FWE stands in contrast to false discovery rate thresholding, which thresholds the expected fraction of false positives in the data, rather than the chance of any false positives in the data. See PthresholdFaq for more.
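The simplest FWE correction, Bonferroni, is just a division - a trivial sketch with made-up numbers:

```python
# Bonferroni correction: to keep the family-wise error rate at alpha
# across n independent tests, test each voxel at alpha / n.
def bonferroni_alpha(alpha, n_voxels):
    return alpha / n_voxels

# e.g. 20,000 voxels at a family-wise alpha of 0.05 (invented numbers):
per_voxel = bonferroni_alpha(0.05, 20000)
```

(In practice Bonferroni is quite conservative for fMRI, since neighboring voxels are far from independent - hence the Gaussian random field and FDR alternatives.)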

FDR: See False Discovery Rate (FDR).

FIR (or Finite Impulse Response) model: A type of design matrix which assumes nothing about the shape of the hemodynamic response function. With an FIR model, you don't convolve your design matrix with a canonical HRF or any basis functions. Instead, you figure out how long an HRF you'd like to estimate - maybe 10 or 15 TRs following your stimulus. You then have a separate column in your design matrix for every time point of the HRF for every different condition. You separately estimate beta weights for every time point, and then line them up to form the timecourse of your HRF. The advantage is that you can separately estimate an unbiased HRF at every voxel for every condition - tremendous flexibility. The disadvantage is that the confidence in any one of your estimates will drop, because you use so many more degrees of freedom in estimation. Full FIR models may not be useable for very complex experiments or certain types of designs. Check out PercentSignalChangeFaq for more on FIR models, as well as RoiDeconvolve and the 3dDeconvolve manual in PercentSignalChangePapers.
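Building an FIR design matrix for one condition is pretty mechanical - here's a hypothetical numpy sketch (onsets and lengths are invented):

```python
import numpy as np

def fir_design(onsets, n_scans, hrf_len):
    """Build an FIR design matrix for one condition (a sketch):
    one column per post-stimulus time point, so the fitted betas
    trace out the HRF with no assumed shape."""
    X = np.zeros((n_scans, hrf_len))
    for onset in onsets:
        for lag in range(hrf_len):
            t = onset + lag
            if t < n_scans:
                X[t, lag] = 1.0
    return X

# Hypothetical run: 40 scans, trials at scans 2, 15 and 28,
# estimating 10 post-stimulus time points of the HRF.
X = fir_design([2, 15, 28], n_scans=40, hrf_len=10)
```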

fishing expedition: What happens when your data doesn't really offer any compelling or interpretable story about your task... so you try every conceivable way of analyzing it and every conceivable contrast possible to find something interesting looking. Then, of course, it behooves you to write your paper as if you'd been looking for that all along.

fixed-effects analysis: An analysis that assumes that the subjects (or scanning sessions, or scanner runs, or whatever) you're drawing measurements from are fixed, and that the differences between them are therefore not of interest. This allows you to lump them all into the same design matrix, and consider only the variance between timepoints as important. This allows you to gain in power, due to the increased number of timepoints you have (which leads to better estimates and more degrees of freedom). The cost is a loss of inferential power - you can only make inferences in this case about the actual group of subjects (or scanner sessions, or whatever) that you measured, as opposed to making inferences about the population from which they were drawn. Making population inferences requires analyzing the variance between subjects (/scanner sessions... you get the idea) and treating them as if they were drawn randomly from a population - in other words, a random-effects analysis. Check out RandomAndFixedEffectsFaq for more.

fixed ISI: Stands for fixed inter-stimulus interval. A type of experiment in which the same time separates the beginning of all stimuli - trials needn't be all exactly the same length, but the onsets of stimuli are all separated by exactly the same amount of time. Event-related or block design experiments can be fixed-ISI. Fixed-ISI event-related experiments, though, are pretty bad at both efficiency and power, especially as the ISI increases. In general, several empirical studies have shown that for event-related designs, variable-ISI is the way to go. For block designs, the difference is fairly insignificant, and variable-ISI can make the design less powerful, depending on how it's used. See JitterFaq for more on the difference between fixed and variable.

flattening: One inconvenient thing about mapping the brain is the way that it's all folded and scrunched into that little head like so much wadded-up tissue. Voxels that appear to be neighboring, for example, might in fact be widely separated on the cortical sheet, but have that distance obscured by the folds of a gyrus in between them. In order to study the spatial organization of a particular cortical region, it may then be useful to "unfold" the brain and look at it as if the cortical sheet had been flattened out on a table. Indeed, some phenomena like retinotopy are near-impossible to find without cortical flattening. Several software packages, then, allow you to create a surface map of the brain - a 3D graphical representation of the cortical surface - and then apply several automated algorithms to flatten it out, and project your functional activations onto the flattened representation. BrainVoyager is best known for this type of analysis; AFNI, with SUMA, also allows it. SPM doesn't have a good surface mapping module to date.

fMRI: stands for functional magnetic resonance imaging. The small 'f' is used to distinguish functional MRI, often used for scanning brains, from regular old static MRI, used for taking pictures of knees and things. (This is a convention which irritates the heck out of ?BobCox.) Check out FmriPhysicsFaq for more info on the physics and theory behind fMRI, or ScanningFaq for useful (with any luck) answers about how to set parameters for your experiment.

FSL: A software package for fMRI analysis, developed by a group at Oxford. It runs natively on UNIX-based systems. I actually don't know a ton about it - even what it's written in - but I hear pretty good things...

Fourier basis set: A particular and special type of basis function. Instead of using a standard design matrix, an analysis with a Fourier basis set simply uses a set of sines or cosines of varying frequency for the design matrix columns for each condition. Because a combination of cosines can be used to model almost any periodic function at all, this design matrix is extremely unbiased - in particular as to when your activations took place, since you don't have to specify any onsets. You simply let your software estimate the best match to the periodic parts of your signal (even if they're infrequent). This allows you, like an FIR model, to estimate a separate HRF for every voxel and every condition, as well as come up with detailed maps of onset lag at each voxel and other fun stuff. The disadvantages of this model include relatively lower power, due to how many degrees of freedom are used in the basis set, and some limitations on what functions can be modeled (edge effects, etc.). It also requires you to use an F-contrast to test it, since the individual parameters have no physiological interpretation.
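Here's a rough numpy sketch of what such a basis set looks like for one condition (sizes are invented; real implementations differ in details like how many frequencies to include):

```python
import numpy as np

def fourier_basis(n_scans, n_freqs):
    """A sketch of a Fourier basis set: paired sine and cosine columns
    at the first n_freqs frequencies of the run. The columns are
    mutually orthogonal over the run."""
    t = np.arange(n_scans)
    cols = []
    for k in range(1, n_freqs + 1):
        arg = 2 * np.pi * k * t / n_scans
        cols.append(np.sin(arg))
        cols.append(np.cos(arg))
    return np.column_stack(cols)

# 64 scans, 4 frequencies -> 8 design matrix columns for this condition.
X = fourier_basis(n_scans=64, n_freqs=4)
```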

function: in Matlab, a type of .m file that can take arguments and give back output. Maintains its own workspace rather than having access to the base workspace. Most of the software packages written in Matlab are a collection of functions, with only a couple scripts here and there. MatlabProgramming has more information on these.

functional connectivity: A term introduced by ?KarlFriston to highlight the differences between "correlational" methods of inferring brain connectivity and the causational concepts and inferences that you might want to make. The difference is between correlation and causation; functional connectivity is more correlational. Brain regions which are functionally connected merely must have some sort of correlation in their signal, rather than having any direct causal influence over each other. This is in contrast to effective connectivity, which demands some causation be included. Functional connectivity is rather easier to establish, but supports perhaps less interesting inferences. Most methods out there looking at connectivity are good only for functional connectivity, with TMS being a notable exception. See ConnectivityFaq for more.

functional ROI: Any region-of-interest (ROI) that is generated by looking at functional brain activation data is considered a functional ROI. It may also have reference to anatomical information; you may be looking for all active voxels within the amygdala, say. That would be both an anatomical and functional ROI. Any subset of voxels generated from a list of functionally active voxels, though, can comprise a functional ROI. See RoisFaq for ways you can use 'em.


Gaussian random field: Whoo, that's a heck of a way to start a letter. Essentially, a type of random field that satisfies a Gaussian distribution, I guess. As it applies to fMRI, the key thing to know is that SPM's default version of family-wise error correction operates by assuming your test statistics make up a Gaussian random field and are therefore subject to several inferences about their spatial distribution. FWE correction based on Gaussian random fields has been shown to be conservative for fMRI data that has not been smoothed rather heavily. See PthresholdFaq for more info.

general linear model: The general linear model is a statistical tool for quantifying the relationship between several independent and several dependent variables. It's a sort of extension of multiple regression, which is itself an extension of simple linear regression. The model assumes that the effects of different independent variables on a dependent variable can be modeled as linear, and that they sum in a standard linear fashion. The standard GLM equation is Y = XB + E, where Y is signal, X is your design matrix, B is a vector of beta weights, and E is error unaccounted for by the model. Most neuroimaging software packages use the GLM as their basic model for fMRI data, and it has been a very effective tool at testing many effects. Other forms of discovering experimental effects exist, notably non-model-based methods like principal components analysis. Check out BasicStatisticalModelingFaq for more info on how the GLM is used in fMRI analysis.
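A minimal numpy illustration of fitting the GLM by ordinary least squares (all numbers invented):

```python
import numpy as np

# Simulate one voxel: a boxcar task regressor plus a baseline term.
rng = np.random.default_rng(1)
n_scans = 50
X = np.column_stack([
    np.tile([1.0] * 5 + [0.0] * 5, 5),  # boxcar task regressor
    np.ones(n_scans),                   # constant (baseline) column
])
B_true = np.array([1.5, 100.0])         # made-up effect size and baseline
Y = X @ B_true + 0.1 * rng.standard_normal(n_scans)

# Ordinary least squares via the pseudoinverse: B_hat = (X'X)^(-1) X'Y.
B_hat = np.linalg.pinv(X) @ Y
E = Y - X @ B_hat                       # residual error
```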

glassbrain, glassbrain3: Scripts used in the GablabToolbox to dump the coordinates of all activated voxels for a given contrast into a text file. SPM by default only displays the coordinates for local peaks of activation, so these scripts serve to extract the coordinates for every suprathreshold voxel. Glassbrain dumps only the MNI coordinates; glassbrain3 dumps those coordinates as well as p-value information for the voxel, cluster and set level.

GLM: See general linear model (GLM).

global effects: Any change in your fMRI signal that affects the whole brain (or whole volume) at once. Sources of these effects can be external (scanner drift, etc.) or physiological (motion, respiration, etc.). They are generally taken to be non-neuronal in nature, and so generally you'd like to remove any global effects from your signal, since it's extremely unlikely to be caused by any actual neuronal firing. See PhysiologyFaq and RealignmentFaq for thoughts on how to account for global effects in your dataset.

global scaling: An analysis step in which the voxel values in every image are divided by the global mean intensity of that image. This effectively makes the global mean identical for every image in the analysis. In other words, it effectively removes any differences in mean global intensity between images. This is different than grand mean scaling! Global scaling (also called proportional scaling) was introduced in PET, where the signal could vary significantly image-to-image based on the total amount of cerebral blood flow, but it doesn't make very much sense to do generally in fMRI. The reason is because if your activations are large, the timecourse of your global means may correlate with your task - if you have a lot of voxels in the brain going up and down with your task, your global mean may well be going up and down with your task as well. So if you divide that variation out by scaling, you will lose those activations and possibly introduce weird negative activations! There are better ways to take care of global effects in fMRI (see PhysiologyFaq for some), considering that moment-to-moment global variations are very small in fMRI compared to PET. They can be quite large session-to-session, though, so grand mean scaling is generally a good idea (see below).

grand mean scaling: An analysis step in which the voxel values in every image are divided by the average global mean intensity of the whole session. This effectively removes any mean global differences in intensity between sessions. This is different than global scaling! This step makes a good deal of sense in fMRI, because differences between sessions can be substantial. By performing it at the first (within-subject) level, as well, it means you don't have to do it at the second (between-subject) level, since the between-subject differences are already removed as well. This step is performed by default by all the major analysis software packages.
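To make the contrast with global scaling concrete, here's a numpy sketch of both operations on made-up data (scaling the session mean to an arbitrary target of 100):

```python
import numpy as np

# data is (n_scans, n_voxels); the values are invented.
rng = np.random.default_rng(2)
data = 1000.0 + rng.standard_normal((20, 100))

# Global (proportional) scaling: divide each image by ITS OWN global mean,
# so every image ends up with the same mean - risky in fMRI, since it can
# divide task-correlated signal out of the data.
global_scaled = data / data.mean(axis=1, keepdims=True) * 100.0

# Grand mean scaling: divide every image by the SESSION-average global mean,
# preserving image-to-image variation while equating sessions.
grand_scaled = data / data.mean() * 100.0
```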

Granger causality, Granger causality modeling: A statistical concept imported from econometrics intended to provide some new leverage on tests of functional connectivity. Granger causality is somewhat different from regular causality; testing Granger causality essentially boils down to testing whether information about the values or lagged values of one timecourse gives you any ability to predict the values of another timecourse. If they do, then there's some degree of Granger causality. The concept is still somewhat controversial in econometrics, and the same goes for neuroimaging. What's clear is the test is still effectively a correlational test, though far more sophisticated than just a standard cross-correlation. So establishing Granger causality between regions is enough to establish functional connectivity and some degree of temporal precedence, but probably not enough to establish effective connectivity between those regions. BrainVoyager is attempting to implement a connectivity module based on Granger causality. Check out ConnectivityFaq for more.
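The core idea is easy to sketch in numpy - this bare-bones version just compares residual variance with and without the other region's lagged timecourse, nothing like a full implementation:

```python
import numpy as np

def granger_gain(x, y, lag=1):
    """A bare-bones sketch of the Granger idea: does the lagged value
    of x improve a linear prediction of y over y's own lagged value?
    Returns the reduction in residual variance (clearly > 0 suggests
    x Granger-causes y). Real tests use an F statistic and more lags."""
    y_now, y_past, x_past = y[lag:], y[:-lag], x[:-lag]
    # restricted model: y_now ~ y_past + constant
    Xr = np.column_stack([y_past, np.ones_like(y_past)])
    rr = y_now - Xr @ np.linalg.lstsq(Xr, y_now, rcond=None)[0]
    # full model: y_now ~ y_past + x_past + constant
    Xf = np.column_stack([y_past, x_past, np.ones_like(y_past)])
    rf = y_now - Xf @ np.linalg.lstsq(Xf, y_now, rcond=None)[0]
    return rr.var() - rf.var()

# Toy timecourses where y is driven by x one step earlier.
rng = np.random.default_rng(3)
x = rng.standard_normal(500)
y = np.roll(x, 1) * 0.8 + 0.1 * rng.standard_normal(500)
gain = granger_gain(x, y, lag=1)
```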


hand-waving: An explanatory technique frequently used in fMRI research to obscure the fact that no one really knows what the hell is going on.

hierarchical model: A type of mixed effects model in which both random and fixed effects are modeled but separated into different "compartments" or "levels" of the modeling. The standard group model approach in fMRI is hierarchical - you model all the fixed (within-subjects) effects first, then enter some summary of those fixed effects (the beta weights or contrast images) into a random effects model, where all the random (between-subject) effects are modeled. This allows separate treatment of the between- and within-subject variance. Check out RandomAndFixedEffectsFaq for more info.

high-pass filter: A type of frequency filter which "passes through" high frequencies and knocks out low frequencies. Has the effect, therefore, of reducing all very low frequencies in your data. Since fMRI noise is heavily weighted towards low frequencies, far lower than the frequencies of common experimental manipulations, high-pass filters can be a very effective way of removing a lot of fMRI noise at little cost to the actual signal. Setting the cutoff period is of crucial importance in high-pass filter construction. Contrasts with low-pass filters and band-pass filters. See TemporalFilteringFaq for more info.
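A crude FFT-based sketch of high-pass filtering in numpy (real packages typically use cosine basis regressors or proper filter design instead; the 128 s cutoff period is SPM's customary default):

```python
import numpy as np

def highpass(ts, tr, cutoff_s=128.0):
    """Crude FFT high-pass sketch: zero out frequency components
    slower than 1/cutoff_s Hz, keeping the DC (mean) term."""
    freqs = np.fft.rfftfreq(len(ts), d=tr)
    spectrum = np.fft.rfft(ts)
    spectrum[(freqs < 1.0 / cutoff_s) & (freqs > 0)] = 0.0
    return np.fft.irfft(spectrum, n=len(ts))

# Toy data (TR = 2 s, 256 scans): a fast task-like signal plus a slow drift.
t = np.arange(256) * 2.0
fast = np.sin(2 * np.pi * t / 32.0)    # 32 s period: passes through
slow = np.sin(2 * np.pi * t / 512.0)   # 512 s period: removed
filtered = highpass(fast + slow, tr=2.0, cutoff_s=128.0)
```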

HRF (or hemodynamic response function): When a set of neurons in the brain becomes more active, the brain responds by flooding the area with more highly-oxygenated blood, enabling an MRI scanner to detect the BOLD signal contrast in that region. But that "flooding" process doesn't happen instantaneously. In fact, it takes a few seconds following the onset of neuronal firing for BOLD signal to gradually ramp up to a peak, and then several more seconds for BOLD signal to diminish back to baseline, possibly undershooting the baseline briefly. This gradual rise followed by gradual fall in BOLD signal is described as the hemodynamic response function. Understanding its shape correctly is crucial to analyzing fMRI data, because the neuronal signals you're looking to interpret aren't directly present in the data; they're all filtered through this temporally extended HRF. A great deal of statistical thought and research has gone into understanding the shape of the HRF, how it sums over time and space, and what physiological processes give rise to it. Check out HrfFaq for more about how it's modeled in fMRI analysis.
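The commonly-used "double-gamma" approximation to the canonical HRF can be sketched like this (the parameters are approximate, SPM-ish values chosen for illustration, not pulled from any package's code):

```python
import numpy as np
from math import gamma

def gamma_pdf(t, shape, scale=1.0):
    """Gamma probability density, defined to be 0 for t <= 0."""
    t = np.asarray(t, dtype=float)
    out = np.zeros_like(t)
    pos = t > 0
    out[pos] = (t[pos] ** (shape - 1) * np.exp(-t[pos] / scale)
                / (gamma(shape) * scale ** shape))
    return out

def canonical_hrf(t):
    """Double-gamma HRF sketch: a positive gamma peaking around 5-6 s
    minus a later gamma (around 15-16 s) scaled to 1/6 the size,
    giving the characteristic post-stimulus undershoot."""
    return gamma_pdf(t, 6.0) - gamma_pdf(t, 16.0) / 6.0

t = np.arange(0, 32, 0.5)   # seconds of peristimulus time
hrf = canonical_hrf(t)
```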


ICA: see independent components analysis (ICA).

image mask: see mask image.

independent components analysis: A statistical technique for analyzing signals that are presumed to have several independent sources mixed into the single measured signal. In fMRI, it's used as a way of analyzing data that doesn't require a model or design matrix, but rather breaks the data down into a set of statistically independent components. These components can then (hopefully) be localized in space in some intelligible way. This enables you, theoretically, to discover what effects were "really" present in your experiment, rather than hypothesizing the existence of some effects and testing the significance of your hypothesis. It's been used more heavily in EEG research, but is beginning to be applied in fMRI, although not everything about the results it gives is well understood. Its use in artifact detection is clear, though. It differs from principal components analysis, an algorithm with similar goals, because the components it chooses have maximal statistical independence, rather than maximizing the explained variance of the dataset.

inflation: Related to flattening. A downer about superimposing activation results on the brain is that brains are kind of inconveniently wrinkled up. This makes it difficult to see the exact spatial relationship of nearby activations. Two neighboring voxels might well be separated by a large distance on the cortical sheet, but one is buried deep in a sulcus and one is on top of a gyrus. Inflation and flattening are visualization techniques that aim to work around that problem. Inflation works by first doing surface mapping to construct a 3-D model of the subject's cortical surface, and then applies graphics techniques to slowly blow up the brain, as if inflating it. This gradually reduces the wrinkling, spreading out the sulci and gyri until, ultimately, you could inflate the brain all the way to spherical shape. Usually inflation stops when most of the smaller sulci and gyri are flattened out, as this allows much nicer visualization of phenomena like retinotopy. BrainVoyager has supported inflation methods for a long time, and I think it's the only program right now that does very much of it.

IRF (or impulse response function): In linear systems theory, you can predict a system's response to any arbitrary stimulus if you a) assume that its response to stimuli obeys certain assumptions about linearity (summation, etc.) and b) you know how the system responds to a single instantaneous impulse stimulus. The system's response in this case is called the IRF, or impulse response function. Many analyses - the general linear model, primarily - of the brain's response to stimuli proceed along linear systems methods, assuming that the IRF is equivalent to the hemodynamic response function (HRF). This HRF can be measured or simply assumed. IRF and HRF are sometimes used interchangeably in fMRI literature.
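The linear-systems idea is easy to demonstrate with a convolution - here the IRF is a made-up short kernel, not a measured HRF:

```python
import numpy as np

# Linear-systems sketch: the predicted response to any stimulus train
# is the stimulus convolved with the impulse response function (IRF).
irf = np.array([0.0, 0.4, 1.0, 0.7, 0.3, 0.1])   # invented kernel
stimulus = np.zeros(30)
stimulus[[3, 12, 15]] = 1.0                      # three instantaneous events

# Responses to nearby events simply sum, per the linearity assumption.
predicted = np.convolve(stimulus, irf)[:len(stimulus)]
```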

ISI (or inter-stimulus interval): The length of time in between trials in an experiment. Usually measured from the onset of one trial to the onset of the next. The length and variability of your ISI are crucial factors in determining how much power and efficiency your experimental design provide, and thus how nice your results will look. See DesignFaq and JitterFaq for info about figuring out the proper length of your ISI.

isotropic, isotropy: The same size in all directions. A sphere is isotropic. An ovoid is not. Isotropy is the degree to which something is isotropic. Smoothing kernels are often isotropic, but they don't have to be - they can be anisotropic. Voxels are often anisotropic originally, but are resampled to be isotropic later in processing.


jitter: A term used to describe varying the inter-stimulus intervals during your experiment, in order to increase efficiency in the experimental design. Can also be used (although less frequently these days) to describe offsetting the TR by a small amount to avoid trial lengths being an exact multiple of the TR. Used as a noun - "I made sure there was some jitter in my design" - or a verb - "We're going to jitter this design a little." Check out JitterFaq for all the gory details.


k-space: One way to take a 3-D picture would be to sample various points in space for the intensity of light there, and then reassemble those samples into a volume - an easy reassembly process, since the sampled intensity is exactly what you want to see. But that's not how MRI scanners take their pictures. Instead of sampling real space for the intensity of light at a given point, they sample what's called k-space. A given point in k-space describes both a frequency and a direction of oscillation. Very low frequencies correspond to slow oscillations and gradual changes in the picture at that direction; higher frequencies correspond to fast oscillations and sharp changes (i.e., edges) in the picture at that direction. The points in k-space don't correspond to any real-world location! They correspond only to frequency and direction. This is the space that the MRI scanner samples. A really good explanation of k-space kind of requires some pictures, but fortunately Philippe Goldin has provided some in his nice handout on the ScanningPapers page. Definitely, definitely check that out for more info. K-space can be sampled in different patterns; these correspond to different pulse sequences at the scanner.
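You can see the image/k-space relationship directly with a 2-D FFT - a cartoon of reconstruction that ignores everything about actual pulse sequences:

```python
import numpy as np

# An image and its k-space representation are related by a 2-D Fourier
# transform, so fully sampling k-space determines the image once an
# inverse FFT ("reconstruction") is applied.
rng = np.random.default_rng(4)
image = rng.standard_normal((64, 64))      # stand-in for real anatomy

kspace = np.fft.fft2(image)                # what the scanner samples
reconstructed = np.fft.ifft2(kspace).real  # image-space reconstruction

# The center of k-space holds the low frequencies (overall contrast);
# the edges hold the high frequencies (sharp detail).
```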

kernel: see smoothing kernel.


linear drift, linear drifts: see drifts.

localizer: One way of dealing with the sizeable differences in brain anatomy between subjects is to use an analysis that focuses on regions of interest, rather than individual voxels. The danger in using anatomically defined regions of interest is that the mapping between function and anatomy varies widely between subjects, so one subject might activate the whole calcarine sulcus during a visual stimulus and another might only activate a third of it. One way around this variability is to use functionally-defined regions of interest. A localizer task is one designed to find these functional ROIs. The idea is to design a simple task that reliably activates a particular region in all or most subjects, and use the set of voxels activated by that localizer task as an ROI for analyzing another task. The simple task is called a localizer because it is designed to localize activation to a particular set of voxels within or around an anatomical structure. See RoisFaq for more on the region-of-interest approach.

long event-related design: An experimental design in which single trials are the basic unit, and those single trials are separated by enough time to allow the HRF to fully return to baseline before the next trial - usually 20-30 seconds. This design is a subtype of event-related designs, contrasting with the other subtype, rapid event-related designs. Long event-related designs have the advantage of being very straightforward to analyze, and incredibly easy to extract timecourses from. They have the disadvantage, though, of having many fewer trials per unit time than a block design or rapid event-related design, and so long event-related designs are both very low-powered and very inefficient. They're not widely used in fMRI any more, unless the experiment calls for testing assumptions about HRF summation or something. See DesignFaq for more.

low-pass filter: A type of filter that "passes through" low frequencies and suppresses high frequencies. This has the effect of smoothing your data in the temporal (rather than spatial) domain - very fast little jiggles and quick jumps in the signal are suppressed and the timecourse waveform is smoothed out. If temporal-domain noise is random and independent across time, low-pass filtering helps increase signal-to-noise ratio in the same way spatial smoothing does. But, unfortunately, fMRI temporal-domain noise is highly colored, and so low-pass filtering usually ends up suppressing signal. Check out TemporalFilteringFaq for lots more on the low-pass filtering controversy.
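
A crude low-pass filter can be sketched by zeroing out the high-frequency bins of a timecourse's Fourier transform (the 10-bin cutoff and simulated signal here are arbitrary illustration choices, not recommended fMRI settings):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
t = np.arange(n)
signal = np.sin(2 * np.pi * t / 50)       # slow oscillation: the "signal"
noise = 0.5 * rng.standard_normal(n)      # fast random jiggle: white noise
timecourse = signal + noise

# FFT-based low-pass: keep only the lowest-frequency bins.
cutoff = 10                               # hypothetical cutoff, in bins
spectrum = np.fft.rfft(timecourse)
spectrum[cutoff:] = 0
smoothed = np.fft.irfft(spectrum, n)

# With white noise, smoothing brings us closer to the true signal...
err_raw = np.mean((timecourse - signal) ** 2)
err_smooth = np.mean((smoothed - signal) ** 2)
print(round(err_raw, 3), round(err_smooth, 3))
```

This works here precisely because the simulated noise is white; with the colored noise typical of real fMRI timecourses, the same operation can suppress signal instead.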


magnetic susceptibility: see susceptibility.

MarsBaR: A slick little region-of-interest (ROI) analysis package for SPM, developed by a team led by Matt Brett. Aimed at functional ROI analysis, and allows for ROI creation, manipulation, and statistics - i.e., direct statistics on the ROI timecourse, including model estimation, timecourses, etc. See RoisFaq for info on the ROI approach. MarsBaR is available through the GablabToolbox, if you've got it, or you can download it from the original source.

mask, mask image: A special type of image file used in SPM (and other programs) which is used to specify a particular region of the brain. Every voxel in that region has intensity 1; everything outside of that region has intensity 0. You might have an ROI mask, to specify the location of an ROI, or you might have a brain mask, where the mask shows you where all of the in-brain voxels are (so that you can analyze only the in-brain voxels, for example). Truncation will turn a regular image file into a mask file, and most ROI programs that create image files create masks. SPM standardly creates a mask image file based on intensity thresholds during model estimation, and only estimates voxels within its brain mask; in the Gablab, we often use specmask (see GablabScripts) to replace that mask with a more inclusive one.
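
The logic of a mask image is simple enough to sketch in a few lines of numpy (the volume and the intensity threshold here are invented; real masks come from segmentation, thresholding, or ROI-drawing tools):

```python
import numpy as np

# Toy 3-D "image" volume and an intensity-based mask.
vol = np.random.default_rng(1).uniform(0, 100, size=(4, 4, 4))
mask = (vol > 20).astype(np.uint8)   # 1 inside the region, 0 outside

# Restrict an analysis to in-mask voxels only:
in_brain = vol[mask == 1]
print(in_brain.size == mask.sum())   # every in-mask voxel, nothing else
```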

mat file (or dot-mat file, .mat file, etc.): 1) A MATLAB file format which contains saved Matlab variables, and allows you to save variables to disk and load them into the workspace again from disk. The format is binary, so it's not accessible with text editors. See MatlabBasics for more.

2) One special kind of .mat file in SPM is the .mat file which can go along with an ANALYZE format .img/.hdr pair. A .mat file with the same filename as a .hdr/.img pair is interpreted in a special way by SPM; when that image file is read, SPM looks into the .mat file for a matrix specifying a position and orientation transform of the image. In this way, SPM can save a rigid-body transformation of the image (rotation, zoom, etc.) without actually changing the data in the .img file. Almost every SPM image-reading function automatically reads the .mat file if it's present, and many functions which move the image around (realignment, slice timing, etc.) give you the option to save the changes as a .mat file instead of actually re-slicing the image. So sometimes you'll hear people refer to an image's .mat file, which is this special kind.

MATLAB: The dominant software package in scientific and mathematical computing and visualization. Originally built to do very fast computations and manipulations of very large arbitrary matrices; now includes things like a scripting language, graphical user interface builder, extensive mathematical reference library, etc. Check out MatlabBasics for a quick intro and MatlabProgramming for more. Written by The MathWorks, comin' at ya straight out of the great town of Natick, Massachusetts. Word!

mental chronometry: See chronometry or MentalChronometryFaq.

microanatomy: A level of anatomical detail somewhere around and above the cytoarchitectonic level, but smaller than the standard anatomical structures. This level of detail refers to things like cell type, or the organization of cell layers and groups. See RoisFaq for information on using microanatomical detail in your study.

MINC, MINC format: Stands for Medical Image NetCDF. An image format designed for brain imaging based on the NetCDF data format. Originally developed at the Montreal Neurological Institute, but now supported more globally. Unlike Analyze-format images, MINC images are only one file, usually with the extension .mnc. MINC files have their coordinate system saved - you can always tell whether a MINC image is in radiological or neurological convention, unlike with Analyze format. There are many libraries of MINC tools out there, and at least one SPM MINC tools library. SPM2 now publishes its template images in MINC format, and new extensions in MINC may make it a growing image format in the future.

mixed effects, mixed effects model: A model which combines both fixed effects and random effects. Most fMRI group effects models are mixed effects models of a special type; they are generally hierarchical, where the fixed effects and random effects are partitioned and evaluated separately. Check out RandomAndFixedEffectsFaq for more info.

.MNC format: See MINC format.

MNI space, MNI templates: The Montreal Neurological Institute (MNI) has published several "template brains," which are generic brain shapes created by averaging together hundreds of individual anatomical scans. The templates are blurry, due to the averaging, but represent the approximate shape of an "average" human brain. One of these templates, the MNI152, is used as the standard normalization template in SPM. This differs from Talairach normalization, which uses the Talairach brain as a template. So normalized SPM results aren't quite in line with Talairach-normalized results, a state of affairs which the mni2tal and tal2mni scripts aim to account for.

mni2tal: A script written by Matthew Brett and embedded in the GablabToolbox, which aims to convert a set of XYZ coordinates from a given point in the Montreal Neurological Institute (MNI) standard template brain into the same anatomical point in the Talairach atlas brain. The MNI brain, which is used as the normalization template by SPM, differs slightly from the Talairach brain in several ways, particularly in the inferior parts of the brain. In order to use facilities like the Talairach Daemon or other Talairach-coordinate lookups, or in order to report normalized SPM results in Talairach coordinates for ease of reference, it's necessary to convert the MNI coordinates into Talairach space with this script. It's not a perfect mapping, but it's widely used, and it's the best that's out there right now... See RoisFaq and NormalizationFaq for more.

motion correction: see realignment.

MRIcro: An excellent medical-image viewing and manipulation program. Converts to and from ANALYZE format from a variety of other 2-D and 3-D formats, and allows easy hand-drawing of ROIs. Written by Chris Rorden.

mrVista: The software package developed and used by Stanford's Vision, Imaging Science and Technology Activities (VISTA) lab. Comprised of a variety of statistical analysis tools (mrVista), segmentation tools (mrGray), flat-mapping and inflation code (mrFlatMesh), and others. Coded by a veritable army of intrepid developers roaming around the psych department at Stanford...

multifiltering: An advanced spatial smoothing technique suggested for fMRI use by Skudlarski et al. Multifiltering consists of running two parallel analysis paths with a given dataset - one smoothed, one unsmoothed - and then making final contrast images by averaging the smoothed and unsmoothed results together. Empirical results presented in Skudlarski et al. (1999) (see SmoothingPapers) suggest that multifiltering may do a better job than any given smoothing kernel at preserving small activations while getting some of the benefit smoothing usually provides in highlighting large activations. Check out that paper for more.

mutual information: A concept imported from information theory into image analysis. If you have two random variables, A and B, and would like to quantify the amount of statistical dependence between them, one way you might do it is by asking: how much more certain are you about the value of B if you know the value of A? That amount is the amount of mutual information between A and B. In more precise terms, it's the distance (measured by a K-L statistic) between the joint probability distribution P(ab) and the product of their individual distributions, P(a) * P(b). It comes up in fMRI primarily in coregistration. Mutual information-based methods provide a much more robust way of lining up two images than simple intensity-based methods do, and so most current coregistration programs use it or a measure derived from it. See CoregistrationFaq for more info...
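
As a rough sketch, mutual information can be estimated from a joint intensity histogram of two images (the bin count and toy images here are arbitrary choices, not what any particular coregistration package uses):

```python
import numpy as np

def mutual_information(a, b, bins=16):
    """MI of two images, via the K-L distance between the joint intensity
    distribution P(ab) and the product of the marginals P(a) * P(b)."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pab = joint / joint.sum()
    pa = pab.sum(axis=1, keepdims=True)     # marginal P(a)
    pb = pab.sum(axis=0, keepdims=True)     # marginal P(b)
    nz = pab > 0                            # avoid log(0)
    return np.sum(pab[nz] * np.log(pab[nz] / (pa @ pb)[nz]))

rng = np.random.default_rng(2)
img = rng.uniform(size=(32, 32))
same = mutual_information(img, img)                          # aligned: high MI
diff = mutual_information(img, rng.uniform(size=(32, 32)))   # unrelated: low MI
print(same > diff)
```

Coregistration algorithms exploit exactly this behavior: as two images come into alignment, knowing one image's intensity at a voxel tells you more and more about the other's, so MI rises.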


neurological convention: Radiological images (like fMRI) that are displayed where the left side of the image corresponds to the left side of the brain (and vice versa) are said to be in "neurological convention" or "neurological format." In neurological convention, left is left and right is right. This contrasts with radiological convention. Some image formats - notably Analyze - do not contain information saved as to what convention they're in, and so side flipping can be an issue with those images. So be careful.

normalization: A spatial preprocessing technique in which anatomical and/or functional MRI images are warped in order to more closely match a template brain. This is done in order to reduce intersubject variability in brain size and shape. The warping can be affine in nature or nonlinear, and can be done on a voxelwise basis or with respect to the surfaces of the brains only. All the major neuroimaging packages support some form of normalization, but there are many questions about how much variability it actually removes. See NormalizationFaq for more answers than you can shake a stick at, and even more questions than that.


onset vector, onsets: In order to create a design matrix for your experiment, you need to know when, in time, each of your trials started and how long they lasted. The beginning of a trial is commonly called an onset. An onset vector is a list of starting times for the trials of a particular condition. If you have 15 trials in condition A, your onset vector for condition A will have 15 numbers, each one specifying the moment in time when a particular trial started. The times are usually specified in either seconds or TRs. Generally all neuroimaging software packages require you to enter your onset vectors somehow, or construct a design matrix from them, as input before they can estimate a model. Check out BasicStatisticalModelingFaq for more.
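
A minimal sketch of turning an onset vector into a boxcar regressor sampled once per TR (the TR, onsets, and duration are invented for illustration; real packages then convolve this with an HRF to build the design matrix column):

```python
import numpy as np

TR = 2.0                      # seconds per volume (hypothetical)
n_scans = 30
onsets = [4.0, 20.0, 44.0]    # onset times in seconds for condition A
duration = 6.0                # each trial lasts 6 seconds

# Turn the onset vector into a boxcar: 1 during a trial, 0 elsewhere.
time = np.arange(n_scans) * TR
boxcar = np.zeros(n_scans)
for onset in onsets:
    boxcar[(time >= onset) & (time < onset + duration)] = 1.0

print(int(boxcar.sum()))  # number of TRs falling inside a trial -> 9
```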

outlier: Any point in a dataset (of any kind) whose value lies wayyyyy outside the distribution of the rest of the points. Outliers are often removed from datasets in many scientific domains, because their extreme values can give them undue influence over the description of the data distribution; as one example, outliers can severely skew statistics like mean or variance. Figuring out just how far an outlier need be from the center of the distribution to be removed, though, is a tricky procedure, and often extremely arbitrary. Outlier detection and removal is one key aim of ArtifactDetection schemes and programs, so check that page out for more...

orthogonal, orthogonalize, orthogonality: Orthogonal means perpendicular. Two things that are orthogonal to each other are perpendicular, to orthogonalize two things means to make them orthogonal, etc. The terms, though, are generally used less for real lines in space than for vectors. Any list of numbers can be taken to represent a point or a line in some space, and those lists of numbers can thus be made orthogonal by tweaking their elements such that the lines they represent become perpendicular. In more common terms, this corresponds to removing correlations between two lists of numbers. Two lists are "collinear" to the degree that they have some correlation in their elements, and they are orthogonal to the degree that they have no correlation whatsoever in their elements. Two perfectly orthogonal lists have values that are totally uncorrelated with one another, and vice versa. Having columns in a design matrix, or elements in two contrasts, not be orthogonal can pose problems for estimating the proper beta weights for those columns or contrasts, so many programs either require certain structures be orthogonal or do their own orthogonalization when the issue comes up. Check out BasicStatisticalModelingFaq and ContrastsFaq for more info.
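
The orthogonalization itself is just a vector projection. A small numpy sketch (note that orthogonality corresponds exactly to zero correlation only for mean-centered vectors; the example vectors are made up):

```python
import numpy as np

def orthogonalize(b, a):
    """Remove from b its projection onto a, leaving b orthogonal to a."""
    return b - a * (a @ b) / (a @ a)

a = np.array([1.0, 2.0, 3.0, 4.0])   # e.g., one design matrix column
b = np.array([1.0, 1.0, 1.0, 5.0])   # a second, correlated column

b_orth = orthogonalize(b, a)
print(np.isclose(a @ b_orth, 0.0))   # dot product is now zero
```

One design note: the result depends on which vector you orthogonalize with respect to which, so packages that orthogonalize regressors must pick (or let you pick) an ordering.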


p-threshold: A particular probability value which is used as a threshold for deciding which voxels in a contrast are active and which are not. The contrast image is rendered in terms of some statistic, like a T or F, at each voxel, and each statistic can then be assigned a particular p-value - the likelihood that such a value would occur under the null hypothesis of no real activation. Voxels with p-values smaller than the threshold are declared active; other voxels are declared inactive. P-thresholds can be manipulated to account for multiple comparisons, spatial and temporal correlation, etc. See PthresholdFaq for lots, lots more.
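
As a toy illustration of thresholding and one simple multiple-comparisons correction (Bonferroni here; the simulated z-statistics are made up, and real packages offer more sophisticated corrections):

```python
import numpy as np
from math import erfc, sqrt

rng = np.random.default_rng(3)
z = rng.standard_normal(10000)   # null z-statistics, one per "voxel"
z[:5] += 6.0                     # five truly active voxels

# One-tailed p-value for each voxel under a standard normal null.
p = np.array([0.5 * erfc(v / sqrt(2)) for v in z])

alpha = 0.05
uncorrected = (p < alpha).sum()            # hundreds of false positives
bonferroni = (p < alpha / z.size).sum()    # corrected for 10000 tests
print(bonferroni <= uncorrected)
```

The uncorrected threshold declares roughly 5% of the null voxels active; the Bonferroni-corrected threshold keeps essentially only the truly active ones, at the cost of reduced sensitivity.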

parameter weights: see beta weights.

partial volume, partial-voluming: In doing segmentation, a major problem in assigning a particular voxel to a tissue-type category or anatomical structure is that tissue and structure boundaries rarely line up exactly with voxel boundaries. So a given voxel might contain signal from two or more different tissue types. If one of the assumptions of segmentation is that different tissue types give off different signals (usually MR intensity), voxels with a mixture of tissue types pose a problem, because their intensity may lie in between the canonical intensity of any one tissue type. Oftentimes segmentation algorithms simply make a guess based on which tissue type the voxel seems closest to, but this can pose a problem in calculating, say, the total volume of gray matter in a brain. If half of your "white-matter" voxels have some gray matter in them, but you count them only as white matter, you're missing a whole lot of gray matter in your volume calculation. This is the partial volume problem, and a partial-voluming effect is this type of tissue mixing. See SegmentationFaq for more.

path: An ordered list of directories which Matlab searches when it's looking for a script or filename. Includes all the Matlab program directories by default, as well as the SPM program directories when you start Matlab in the Gablab with an spm* command (spm99-6-devel, spm2-devel, etc.). You can personalize your Matlab path or change it on the fly - check out MatlabPaths for how.

PCA: see principal components analysis (PCA).

peak voxel: The most active voxel in a cluster, or the voxel in a cluster that has the highest test statistic (T-stat or F-stat or whatever). Often the coordinates of only the peak voxel are reported for a cluster in papers, and sometimes timecourses or beta weights are extracted only from the peak voxel. See RoisFaq and PercentSignalChangeFaq for more info on why that would be.

percent signal change: A measure of signal intensity that ignores the arbitrary baseline values often present in MR signal. A timecourse of signal can be viewed as a timecourse of changes from some baseline value, rendered in units of percent of that baseline value. The baseline is then chosen on a session-specific basis in some reasoned way, like "the mean of the timecourse over the whole session," or "the mean of the signal during all rest periods." This gets around the problem that MR signal is often scaled between sessions by some arbitrary value, due to how the scanner feels at that moment and the physiology of the subject. Two signal timecourses that are identical except for an arbitrary scaling factor will be totally identical when converted to percent signal change. Percent signal change timecourses are thus used to show intensity timecourses from a given region or voxel during some experimental manipulation. PercentSignalChangeFaq has everything you ever wanted to know about the measure, or at least everything I could think of before noon.
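
The conversion itself is one line, and the scaling-invariance claim is easy to check (the signal values and the session-mean baseline choice here are illustrative):

```python
import numpy as np

raw = np.array([1000., 1003., 1010., 1005., 998., 1002.])  # MR signal, arbitrary units

baseline = raw.mean()                        # one common baseline choice
psc = 100.0 * (raw - baseline) / baseline    # percent signal change

# An arbitrary scanner scaling factor drops out entirely:
scaled = 2.5 * raw
psc_scaled = 100.0 * (scaled - scaled.mean()) / scaled.mean()
print(np.allclose(psc, psc_scaled))  # True
```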

peristimulus, peristimulus timecourse: Means "with respect to the stimulus." A peristimulus timecourse is one that starts at the onset of a given stimulus. Sometimes a peristimulus timecourse will start with negative time and count down to a zero point before counting up again; the zero point is always the onset of a given stimulus. This is the same as a time-locked average timecourse. See PercentSignalChangeFaq for more on why you would want to look at these.
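
A time-locked average is simple to sketch: cut a window out of the timecourse after each onset and average the windows together (the onsets, window length, and fake "response" here are all invented):

```python
import numpy as np

rng = np.random.default_rng(4)
timecourse = 0.1 * rng.standard_normal(100)   # noisy baseline signal
onsets = [10, 40, 70]                         # onset indices (in TRs)
for o in onsets:
    timecourse[o:o + 5] += [0., 1., 2., 1., 0.]   # a small "response"

# Peristimulus (time-locked average) timecourse: average the
# post-onset windows together.
window = 8
segments = np.array([timecourse[o:o + window] for o in onsets])
peristimulus = segments.mean(axis=0)
print(peristimulus.argmax())  # averaged response peaks 2 TRs after onset
```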

perfusion: A type of fMRI imaging which doesn't look at BOLD contrast. Instead, blood is magnetically "labeled" just before it gets to the brain, and it's then tracked through the brain over time. Perfusion imaging has several advantages over BOLD - a different and flatter noise profile, possibly less variability over subjects, and a readily interpretable physiological meaning for the absolute units are chief among those. The major disadvantage is that signal-to-noise ratio is significantly smaller in perfusion imaging, at least in single subjects. This probably makes it less suitable for most current fMRI designs, but it may be a better option for novel designs (blocks lasting several minutes, for example). See ScanningFaq for a fuller discussion of the pros and cons of each.

permutation test, permutation testing: A type of statistical test, like a T-test or F-test, but one which assumes much less about the distribution of the random variable in question. This is a type of nonparametric test related to bootstrapping. It has significant advantages over standard parametric tests under certain conditions, like low degrees of freedom, as in a group analysis. Nichols & Holmes give a terrific introduction to permutation testing in PthresholdPapers, and PthresholdFaq delves into more detail about how they stack up to other types of tests.
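
A bare-bones permutation test for a two-group difference of means might look like this (group sizes, effect size, and permutation count are all arbitrary illustration choices):

```python
import numpy as np

rng = np.random.default_rng(5)
group_a = rng.normal(2.5, 1.0, 12)   # e.g., 12 subjects' effect sizes
group_b = rng.normal(0.0, 1.0, 12)   # 12 subjects in the other group

observed = group_a.mean() - group_b.mean()

# Shuffle the group labels many times and count how often a difference
# at least as large as the observed one arises by chance alone.
pooled = np.concatenate([group_a, group_b])
count = 0
n_perm = 5000
for _ in range(n_perm):
    perm = rng.permutation(pooled)
    if perm[:12].mean() - perm[12:].mean() >= observed:
        count += 1
p_value = (count + 1) / (n_perm + 1)
print(round(p_value, 4))
```

The only distributional assumption is that the labels are exchangeable under the null, which is why the approach is attractive with the low degrees of freedom of a group analysis.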

PET (or positron emission tomography): An imaging method in which subjects are injected with a slightly radioactive tracer, and an extremely sophisticated and sensitive radiation detector is used to localize increased areas of blood metabolism during some experimental task. PET offers better spatial resolution than EEG, but not as much as fMRI - on the order of tens of millimeters at best. Its temporal resolution is pretty poor, as well - within tens of seconds at best, making block designs the only feasible design for PET studies. As well, PET scanners are very expensive, and so aren't around at many institutions. Recently, though, studies have demonstrated one extremely useful aspect of PET - the ability to selectively label particular neurotransmitters, like dopamine, and hence get a chemically-specific picture of how one neurotransmitter is being used. SPM was originally developed for use with PET.

phantom: Any object you scan in an MRI machine that's intended only to help you calibrate your scanner. Phantoms can range from very simple (a tank of water) to very complicated (a plastic skull with a gelatin brain controlled by several motors to simulate head movements). The fact that they don't have brain responses is the key; you can use them to check your scanner or preprocessing paradigm, or introduce fake signal into a phantom scan and know that you won't be corrupted by real brain responses.

Philippe: A very nice guy who knows a lot about AFNI. Also, a very nice stuffed otter.

power: A statistical concept which quantifies the ability of your study to reliably detect an effect of a particular size. Studies with higher power can reliably detect smaller effects. A tremendous number of factors influence your study's power, from the ordering of your stimuli presentation to the noise characteristics of the scanner, but the one that's most under your control is your experimental design. High power is very desirable for fMRI studies, where effect sizes can often be extremely small, but it doesn't come without a cost; increasing the power of your study requires decreasing the efficiency, which can also be seen as assuming more information about the shape of your response. See DesignFaq (and JitterFaq) for tons more on power and efficiency and how to manipulate them both.

PPI: see psychophysiological interaction (PPI).

pre-whitening: A process by which signals that are corrupted by non-white noise - i.e., colored noise, or noise that is more prevalent at some frequencies than others - can be improved, by making the noise "whiter." This involves estimating the autocorrelation function of the noise, and then removing the parts of the noise that are influenced by previous noise values, leaving only independent or white noise. Whatever analysis is to be done on the signal is then carried out. Because this process makes "colored" noise into white noise, it's called whitening, and the "pre" part is because it happens before the model estimation (or other analysis) is done on the signal. This is a standard technique in many signal processing domains, but it's only recently been introduced into fMRI analysis. SPM2 and BrainVoyager have pre-whitening modules, and both appear to have significant effects on reducing noise, at least at the individual subject level. See BasicStatisticalModelingFaq for more details.
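
A sketch of the core idea, assuming the noise is AR(1) with a known autocorrelation (real packages estimate the autocorrelation from residuals rather than knowing it in advance):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 500
phi = 0.5   # AR(1) autocorrelation of the noise (assumed known here)

# Simulate AR(1) "colored" noise: each sample depends on the last.
noise = np.zeros(n)
for t in range(1, n):
    noise[t] = phi * noise[t - 1] + rng.standard_normal()

# Pre-whitening: subtract the predictable part of each sample,
# leaving (approximately) independent white noise.
whitened = noise[1:] - phi * noise[:-1]

def lag1_autocorr(x):
    x = x - x.mean()
    return (x[:-1] * x[1:]).sum() / (x * x).sum()

print(abs(lag1_autocorr(whitened)) < abs(lag1_autocorr(noise)))
```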

preprocessing: Any manipulation of your data done before you estimate your model. Usually this refers to a set of spatial transformations and manipulations like realignment, normalization, or smoothing done to decrease noise and increase signal strength. There are various preprocessing steps you can take in the temporal domain as well, like temporal filtering or pre-whitening. In SPM, "preprocessing" often refers to the specific set, in order, of slice timing correction, realignment, normalization, smoothing, which are grouped together in the interface and generally comprise the first steps of any analysis. CategoryFaq has a list of F.A.Q pages for all these steps and more.

present working directory: Matlab always has a present working directory - the directory that any operation or script will, by default, run in. Given a script name or variable file to look for, Matlab will always look first in the present working directory, before any directory in its path. You can always check which directory you're currently in by typing pwd at a Matlab prompt. See MatlabPaths and MatlabBasics for more info.

principal components analysis: A statistical technique for identifying components of your signal that explain the greatest amount of variance. In fMRI, it's used as a way of analyzing data that doesn't require a model or design matrix, but rather breaks the data down into a set of distinct components, which can be interpreted in some cases as distinct sources of signal. These components can then (hopefully) be localized in space in some intelligible way. This enables you, theoretically, to discover what effects were "really" present in your experiment, rather than hypothesizing the existence of some effects and testing the significance of your hypothesis. It's been used more heavily in EEG research, but is beginning to be applied in fMRI, although not everything about the results it gives is well understood. Its use in artifact detection is clear, though. It differs from independent components analysis, an algorithm with similar goals, because the components it chooses explain the maximum amount of variance in the dataset, rather than maximizing the statistical independence of the components.
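
A minimal PCA sketch via the SVD of a mean-centered time-by-voxel data matrix (the two "sources" and their mixing weights are fabricated to give the method something to find):

```python
import numpy as np

rng = np.random.default_rng(7)
n_time, n_voxels = 100, 50

# Two underlying "sources" of signal, mixed into all voxels plus noise.
source1 = np.sin(np.linspace(0, 6 * np.pi, n_time))
source2 = np.linspace(-1, 1, n_time)
data = (np.outer(source1, rng.uniform(1, 2, n_voxels)) +
        np.outer(source2, rng.uniform(1, 2, n_voxels)) +
        0.1 * rng.standard_normal((n_time, n_voxels)))

# PCA via SVD of the mean-centered data: columns of u are component
# timecourses, ordered by the variance they explain.
centered = data - data.mean(axis=0)
u, s, vt = np.linalg.svd(centered, full_matrices=False)
explained = s ** 2 / np.sum(s ** 2)
print(explained[:2].sum() > 0.9)   # two components dominate, as built in
```

The rows of `vt` play the role of spatial maps: they show how strongly each voxel loads on each component, which is what lets you "localize" a component in the brain.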

prospective motion correction: A form of realignment that is performed within the scanner, while the subject is actually being scanned. Rather than waiting until after the scan and trying to line up each functional image with the previous after the fact, prospective motion correction techniques aim to line up each functional image immediately after it is taken, before the next image is taken. Since TRs are typically on the order of a few seconds, these algorithms must operate very fast. Standard methods call for an extra RF pulse or two to be taken during one TR's pulse sequence, essentially to quantify how much the subject has moved during the TR. These algorithms can avoid some of the major problems of standard realignment algorithms, like biasing by activation and warping near susceptible regions. That extra functionality comes at the cost of time - it usually takes tens of milliseconds per TR to perform, which might mean taking one fewer slice or two. See RealignmentPapers for details of one prospective algorithm.

pulsatility: A type of artifact induced by the cardiac cycle. The beating of the heart pushes blood through the arteries and into the brain, and the rhythmic influx of blood actually causes small swellings and deflations in brain tissue, as well as other small movements, all timed to the heartbeat. As the heartbeat is often faster than, but on roughly the same timescale as, the TR, signal changes induced by cardiac movements can be unpredictable and difficult to quantify and remove. Gary Glover, among others, has developed a nice algorithm to help knock out cardiac pulsatility artifact, although it's not without its cautions. See PhysiologyFaq for more on physiological sources of artifacts.

pulse sequence: fMRI works by stimulating the brain with rapid magnetic pulses in an intense baseline magnetic field. The exact nature of those rapid pulses determines exactly what kind of fMRI signal you're going to get out. Many things about those pulses are standardized, but not all, and you can use different pulse sequences to take functional images, depending on your scanner characteristics and different parameters of your experiment. EPI and spiral are two well-known functional pulse sequences; there are many others for other types of scans. Check out ScanningFaq and FmriPhysicsFaq for a little bit more.

psychophysiological interaction: A term invented by Karl Friston and the SPM group to describe a certain type of analysis for functional connectivity. They have argued that looking at simple correlations of signal between two regions may not be as interesting as looking at how those correlations change due to the experiment; i.e., does condition A induce a closer connection between two regions than condition B does? If so, these regions have a psychophysiological interaction (or PPI) - an interaction influenced both by psychological factors (the experimental condition) and physiological factors (the brain signal from another region). Doing a PPI analysis is available right now automatically only in SPM2, but it's relatively easy to perform one by hand in any neuroimaging package. Check out ConnectivityFaq for more.
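
Constructing the interaction regressor itself is simple: it's the product of the (centered) psychological regressor and the seed region's timecourse. A sketch with an invented block design and seed signal (SPM2's implementation additionally works at the deconvolved neuronal level, which this skips):

```python
import numpy as np

rng = np.random.default_rng(8)
n = 120

# Psychological regressor: condition A (+1) vs. condition B (-1), in blocks.
psych = np.tile(np.repeat([1.0, -1.0], 10), 6)
# Physiological regressor: timecourse extracted from a seed region.
seed_signal = rng.standard_normal(n)

# The PPI regressor is the product of the two, each mean-centered.
ppi = (psych - psych.mean()) * (seed_signal - seed_signal.mean())

# A full PPI model regresses a target region's timecourse on all three
# (plus a constant); a significant ppi beta is the interaction effect.
design = np.column_stack([psych, seed_signal, ppi, np.ones(n)])
print(design.shape)  # (120, 4)
```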


quadratic drift, quadratic drifts: see drifts.


radiological convention: Radiological images (like fMRI) that are displayed where the left side of the image corresponds to the right side of the brain (and vice versa) are said to be in "radiological convention" or "radiological format." In radiological convention, left is right and right is left. Those crazy radiologists. This contrasts with neurological convention. Some image formats - notably Analyze - do not contain information saved as to what convention they're in, and side flipping can be an issue with those images. So be careful.

random-effects analysis: An analysis that assumes that the subjects (or scanning sessions, or scanner runs, or whatever) you're drawing measurements from are randomly drawn from some distribution. The differences between them must thus be accounted for in accounting for the average effect size. This generally means evaluating effects within each subject (session/run/etc.) separately, to allow for the possibility of differential responses, which means separate design matrices and estimations. This costs you a significant amount of power relative to a fixed-effects analysis, because you only end up having as many degrees of freedom in your test as you have subjects (sessions/runs/etc.), which is generally far smaller than the number of measurements (i.e., functional images). The advantage is a gain in inferential power: a random-effects analysis allows you to make inferences about the population from which the subjects were drawn, not just the subjects themselves. Fixed-effects analyses of any kind do not allow this type of inference. The analyses generally done in neuroimaging programs are technically mixed-effects analyses, because they include both fixed and random effects. Check out RandomAndFixedEffectsFaq for more.

rapid event-related design: Any event-related design in which trials occur too fast for the HRF to return to baseline in between trials. This generally corresponds to an inter-stimulus interval of less than 20-30 seconds or so. These designs contrast with long event-related designs. They are more difficult to analyze than long event-related designs, because you have to make assumptions about the way that the hemodynamic response to different events adds up. They compensate for this difficulty by having much more power and efficiency than long event-related designs - so long as the mean ISI in the design is properly varied or jittered. This gain comes from the increased number of trials per unit time, but necessitates proper jitter. See DesignFaq for more, and JitterFaq for a good deal about rapid designs specifically.

realignment: Also called motion correction. A spatial preprocessing step in which functional images are lined up together, so a single voxel in the grid corresponds to the same anatomical location during the whole experiment. This step is needed due to subtle head motions from the subjects; even with a bite bar or head mount, subjects move their head slightly during an experiment, and so the functional images that are taken end up being slightly out of register with each other. Realignment aims to line them back up again. See RealignmentFaq for much much more.

reference slice: A term used in slice timing correction to denote the slice of the brain that no correction is done on. All other slices of each functional image will have their voxels' timecourses slightly shifted in the temporal domain so that they take on the values they "would have had" if the whole brain had been sampled at the same moment as the reference slice. See SliceTimingFaq for more, and for how to choose a reference slice.

region of interest: Any subset of voxels within the brain that you want to investigate further. They might comprise an anatomical structure, or a cluster of activated voxels during your task. ROIs needn't be spatially contiguous, although they often are. Subtypes are anatomical ROIs and functional ROIs. They can be identified before or after a standard general linear model analysis, and they often represent some area of pre-existing theoretical interest. They're often saved as either lists of coordinates (all coordinates in the list make up the ROI) or image masks, a special type of image file where every voxel in the ROI has intensity 1 and every voxel not in the ROI has intensity 0. Several further analyses can be performed once you've identified some regions of interest; see RoisFaq for some thoughts on them.

render, rendering: A three-dimensional object like the brain can be difficult to visualize in a two-dimensional picture. Several graphics packages provide facilities to make a three-dimensional picture of the brain that shows the folds of the surface, and often allows zooming and rotation of the whole 3-D object. This process of making a 3-D image is called rendering. All the major neuroimaging software packages provide some rendering package; BrainVoyager's is the nicest looking in general, but they all have them. They all allow you to superimpose patterns of activation on those 3-D objects, to allow a better visualization of the 3-D nature of the activations. Rendering is often connected with other 3-D visualization methods, like inflation or flattening.

retinotopy: A phenomenon in the brain in which the spatial map of where light falls on the retina corresponds with an activation map of which occipital neurons in a certain region are activated. In other words, an object appears in the upper left corner of my visual field, and within V1 (a region in the occipital lobe), neurons in the upper-left corner of that region become active. As the object moves to the right, the activation moves to the right in the brain. In short, there is a spatial mapping between regions of activation and the actual, real-world visual field. Commonly used to identify sub-regions in visual cortex; retinotopic boundaries (i.e., where one map of retinotopy ends and another begins) are often used to draw functional boundaries between regions. Some other sensory modalities show this sort of effect; auditory cortex shows some tonotopy (mapping based on high-to-low tones), and somatosensory cortex shows somatotopy for tactile activation (neighboring areas of skin activate neighboring neurons, etc.). Retinotopy can be difficult to visualize on the 3-D brain because it follows the folds of the cortex, so methods like inflation and flattening are often used to display it better.

retroicor: A script written by ?GaryGlover to calculate and remove noise in fMRI images caused by physiological motion, like heartbeat and respiration. Available at Stanford as part of the reconstruction utilities to make Analyze images from raw scanner output. The script is based on gating techniques and requires you to measure physiological data at the scanner; it's very effective for cardiac and respiratory noise. There are concerns about its use in paradigms where physiological factors might correlate with the task - powerfully affect-laden stimuli, or heightened anticipation, for example, might drive heartbeat higher during a certain condition, and retroicor then might remove some signal as well as some noise. See PhysiologyFaq for more info and PhysiologyPapers for the original paper.

reverse normalization: After normalization, you have some set of transformation parameters which specify how the individual subject's brain was warped and shifted to match the standard template brain. One thing you could do at that point would be to identify some functional regions of interest in the normalized group results, or some anatomical regions of interest on a standard brain like the MNI template or Talairach brain. Reverse normalization would entail, then, inverting the transformation matrix of normalization and applying the reversed matrix to some anatomical or functional ROI made at the normalized, standard brain level. This reverse-normalized ROI would then be warped to fit your individual subject's brain, and you could then analyze any non-normalized images you had of theirs with it. Given that normalization induces some interpolation errors and localization problems into your images, this might be a great way to save labor on hand-drawing ROIs but still look at non-normalized results. See RoisFaq for more info on why you'd want to analyze data at the individual level, and NormalizationFaq for more about the normalization process.

ROI: see region of interest (ROI).

roi_deconvolve: A program in the GablabToolbox, based on (some might say totally ripped off from) the 3dDeconvolve program in AFNI. It produces condition-specific timecourses from experiments that have rapid event-related designs and thus aren't suitable for time-locked averaging (like roi_percent uses). roi_deconvolve uses a finite impulse response model to remove the effect of each condition from every other condition's timecourse, and give you a picture of what the timecourse for that condition would look like in isolation. As you might expect from the name, timecourses are calculated within an ROI. The script can handle multiple ROIs, multiple subjects, arbitrary experimental designs, and has many temporal preprocessing options included. It's even got its own documentation page and FAQ at RoiDeconvolve...

roi_percent: A program in the GablabToolbox, used to calculate condition-specific timecourses within a given ROI. It uses time-locked averaging to make these timecourses, a method which is only really suitable for long event-related designs and, to a degree, block designs. For rapid event-related designs, you might be better off using RoiDeconvolve. roi_percent can handle multiple ROIs, multiple subjects, and arbitrary experimental designs. It's got documentation and an FAQ at RoiPercent.

roi_stats: A program in the GablabToolbox, used to extract average intensity and other descriptive statistics from a given ROI over several images. It calculates mean intensity, variance in intensity, min and max intensity, and number of voxels, for each of an arbitrary timecourse of images (which are generally functional, but could be contrast images or beta images or whatever), within a given ROI or set of ROIs. See GablabToolbox for more documentation and overview.

roi_extract: A program in the GablabToolbox, used to extract average intensity from a given ROI over several images. Given an arbitrary timecourse of functional images (or contrast images or beta images or what have you), returns a vector of average intensity in the ROI for each image, and will write that vector out to a text file if so desired. See GablabToolbox for more documentation and overview.

run: A term used to describe a single pass-through of a given experimental paradigm, which generally corresponds with a single chunk of time between turning the scanner on and turning it off. A given experiment for one subject often consists of several runs, which are often all modeled together in a fixed-effects analysis. Generally, it does not mean the whole time a subject is in the scanner if there are several chunks of scanning time in there. Often used interchangeably (and confusingly) with session. SPM generally says "session" where AFNI or BrainVoyager say "run."


scanner drift: see drifts.

SCM: see stimulus-correlated motion.

SCR (or skin conductance response): A physiological measure of autonomic arousal. Also called galvanic skin response (GSR). A measure in which electrodes are pasted to the skin - oftentimes on a finger or some part of the hand/arm - and the electric conductance of the skin area is recorded, measuring a sort of pre-sweaty-palms arousal. Possibly a better physiological measure to correlate with brain activity than measures like heartbeat or respiration rate, because it does not induce any known physiological motion artifacts (unlike the cardiac and respiratory cycles). See PhysiologyFaq for more on physiological responses and data collection.

script: in Matlab, a type of .m file that doesn't take arguments or give output, but merely operates in the base workspace. Essentially, a script is just a text file containing a bunch of Matlab commands, exactly as if you'd typed them in order at the Matlab prompt when you ran the script. Scripts are contrasted with functions, which have their own workspaces and don't have access to the base workspace. Most SPM sub-programs are functions, as are most of the GablabToolbox, but not all of it. See MatlabProgramming for more on functions and scripts.

segmentation: A spatial preprocessing step in which an automated algorithm classifies a brain image into different tissue types. Standard segmentation programs start with an MRI image - generally, but not always, an anatomical scan - and give out images of all the gray matter in the brain, all the white matter, and all the cerebrospinal fluid (CSF). Each voxel is thus labeled uniquely as being one of the three standard tissue types. Those images can then be used to make mask images (to restrict analysis to gray matter only, for example) or to do voxel-based morphometry, or a lot of other things. Segmentation can be pretty inexact, due to problems like partial voluming and other issues, so advanced segmentation algorithms these days sometimes do a "soft classification," where voxels are labeled only with a probability of being a certain tissue type, rather than a definite label. Other segmentation algorithms go farther and use anatomical information to classify voxels into different structures as well as different tissue types. See SegmentationFaq for lots more.

SEM: see structural equation modeling (SEM).

session: An ambiguous term usually used to denote the exact same thing as run: the chunk of time in an experiment between turning the scanner on and turning it off, during which you have one pass of your experimental paradigm. Oftentimes, the experiment on one subject will have several sessions, which might all be the same paradigm or different ones. Unfortunately, this term has also been used to denote the whole single-subject experiment; i.e., one scanning session is the whole time you have the person in the scanner, which might include several different runs. On this site, we use "session" as SPM does, interchangeably with "run." When we mean the whole experiment, we'll say "the whole experiment." Don't even get me started on the word "scan."

signal dropout: see dropout.

signal-to-noise ratio: One of the most self-explanatory terms out there. If you can quantify the amount of signal you have in a measurement and the amount of noise, then you divide the former by the latter to get a ratio - specifically, your signal-to-noise ratio, or SNR. Your SNR is a far more valuable measure of how much power your measurement will have than, say, average intensity; if the measurement is brighter, that could mean more signal or more noise. Things like smoothing change average intensity unpredictably, but always aim to increase SNR. Calculating SNR can be tricky, because it requires some determination (or at least estimation) of how much noise your measurement has, which may not be known. But things like phantom measurements can help. See ScanningFaq for a little bit of commentary on how your scanning parameters can tweak your SNR.
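The arithmetic itself is simple once you have estimates of the two quantities; here is a minimal Python sketch (the definitions of "signal" and "noise" below are illustrative assumptions, not a convention from any particular package):

```python
import numpy as np

rng = np.random.default_rng(0)
signal = np.sin(np.linspace(0, 4 * np.pi, 200))   # idealized task-related signal
noise = rng.normal(scale=0.5, size=200)           # additive measurement noise
measured = signal + noise

# One common convention: ratio of signal variability to noise variability.
snr = signal.std() / noise.std()
```

In practice you rarely have direct access to the noise term; phantom measurements or resting baselines usually stand in for it.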

sinc, sinc interpolation: The sinc function is a function related to the sine function - it is actually shorthand for "sine cardinal," and is commonly calculated as sinc(x) = sin(x) / x. Sinc functions are commonly used in many signal processing domains for interpolation, due to their smoothness and ease of computation and other nice mathematical properties. Most neuroimaging programs use (or used) sinc interpolation for almost any task needing resampling or interpolation - warping or moving an image, time-shifting a timeseries, etc. Like any interpolation procedure, it introduces slight errors into the signal; Grootonk et al., for example, argue that much of the residual effect of motion left over after realignment is related to interpolation errors (see RealignmentPapers). SPM2 has changed; it now uses B-spline interpolation for most resampling tasks.
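As a sketch of the idea (not any package's actual implementation), here is Whittaker-Shannon sinc interpolation used to time-shift a sampled signal by half a sample; note that numpy's sinc is the normalized convention, sin(pi*x)/(pi*x):

```python
import numpy as np

def sinc_interp(samples, t_sample, t_query):
    """Whittaker-Shannon sinc interpolation of uniformly sampled data."""
    dt = t_sample[1] - t_sample[0]
    # One sinc weight per (query point, sample point) pair:
    weights = np.sinc((t_query[:, None] - t_sample[None, :]) / dt)
    return weights @ samples

t = np.arange(10.0)                  # sample times, e.g. one per TR
x = np.sin(2 * np.pi * 0.1 * t)      # a slow sine sampled at those times
t_half = t[:-1] + 0.5                # query points half a sample later
x_shift = sinc_interp(x, t, t_half)  # approximates the sine at shifted times
```

With only a short window of samples, the truncated sinc sum is slightly off near the edges; those are exactly the interpolation errors the entry mentions.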

single-subject canonical: An image distributed with SPM that is a very clear anatomical scan of a single brain (as opposed to the average scan of many brains, which is how brain templates like the MNI brain are made). The single-subject canonical is often used as a background to superimpose normalized results onto, because the brain is roughly average in shape and more or less lines up with the MNI template. It's also a very, very clear scan (made by averaging many scans of the same brain together) and so is much clearer than a standard in-plane anatomical scan for a single subject might be. However, the single-subject canonical is not an exact map onto the MNI or Talairach templates; activation which appears to be in one structure on the canonical image may not lie in that structure in either template brain. This image is generally found in the SPM directory, in the /canonical subdirectory; it is Analyze-format in SPM99 and earlier and MINC format in SPM2.

slice timing, slice timing correction: A preprocessing step which aims to correct for the fact that not all slices of a functional volume are sampled at the same instant. Functional images aren't acquired instantly - they are sampled across the whole TR, so with a descending pulse sequence and a 2-second TR, the bottom of the brain is sampled almost two seconds after the top of the brain. If every voxel in the brain is analyzed with exactly the same model, then the onsets you've specified are going to be correct for some parts of the brain and wrong for others. If you say a trial happens at time 1, in the above example, and the TR starts right then, your onset is almost 2 seconds off for voxels at the bottom of the brain, because by the time you sample them, they're 2 seconds into their hemodynamic response already. Slice timing correction aims to fix this problem by simply time-shifting or interpolating all the voxels in the brain to line up with a reference slice. The methods for doing this are fairly uncontroversial and generally accepted as necessary for all event-related designs. See SliceTimingFaq for more.

slice thickness: Generally when you take a functional MRI sequence, your voxels aren't isotropic - there is a given matrix within a slice (often 64x64 voxels), and a certain set of slices (usually ranging from a few to a few dozen). Your slice thickness is exactly what it sounds like - how thick, in millimeters, your slices are. This is also called the through-plane resolution of your voxels - voxels are often thicker between slices than within a slice. Sometimes you'll leave a gap between slices; this is called the "skip" distance and isn't factored into your slice thickness.

small-volume correction: If you have a pre-existing hypothesis about a particular region in the brain - an anatomical or functional ROI from another study, say - then you might want to search within only that region for activation. This helps avoid the multiple-comparison problem for thresholding; instead of correcting your threshold for the tens of thousands of voxels in the whole brain, you can say you're only looking within a small region and correct for only the hundreds or thousands of tests within a much smaller region. This is called small-volume correction. It's available in SPM through the results interface's S.V.C. button. This button is also used sometimes to merely save a cluster or region as a functional ROI in SPM, rather than actually looking at the corrected statistics. See PthresholdFaq for more on thresholding.
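The effect on the threshold is easy to see with a Bonferroni-style correction (a simplification - SPM's S.V.C. actually uses Gaussian random field theory - and the voxel counts here are made-up example numbers):

```python
alpha = 0.05
whole_brain_voxels = 50_000   # hypothetical whole-brain search volume
roi_voxels = 500              # hypothetical small-volume search region

# Correcting over fewer tests gives a far more lenient per-voxel threshold.
whole_brain_threshold = alpha / whole_brain_voxels   # 1e-06
svc_threshold = alpha / roi_voxels                   # 1e-04
```

The smaller search region buys you two orders of magnitude of leniency in this toy example, which is exactly why a pre-existing regional hypothesis is so valuable.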

smoothing: A spatial preprocessing step in which your functional images are blurred slightly. Each voxel's intensity is replaced with a weighted average of its own intensity and some voxels around it; this is accomplished by convolving a Gaussian function - the smoothing kernel - with the intensity at each voxel. The amount of blurring is determined by the size of the kernel. Smoothing can greatly increase your signal-to-noise ratio, as well as increase the chance of getting group activations (by increasing the size and hence overlap of functional regions) and validating the assumptions of Gaussian random field theory if you're doing that sort of family-wise error correction. The downside of smoothing is, well, it makes your data blurrier. This is a problem if you're trying to decide whether one voxel or its neighbor is active, or if you're worried about smearing activation across anatomical or functional boundaries in the brain. It effectively reduces the resolution of your images. SmoothingFaq has tons, tons more on why to smooth and why not to smooth.

smoothing kernel: A generally Gaussian function which is convolved with voxel intensities in a given functional image during smoothing. The "size" of the kernel is the FWHM (full-width half-maximum) measurement of the Gaussian function. Common kernel sizes for fMRI range between 2 and 12 mm, depending on what you're looking for. See SmoothingFaq for more on choosing a kernel size.
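The FWHM relates to the Gaussian's standard deviation by a fixed constant, which matters when a program asks for sigma rather than FWHM; a quick sketch (the 8 mm kernel and 2 mm voxels are just example values):

```python
import numpy as np

def fwhm_to_sigma(fwhm_mm):
    # For a Gaussian, FWHM = sigma * 2 * sqrt(2 * ln 2), about sigma * 2.355
    return fwhm_mm / (2.0 * np.sqrt(2.0 * np.log(2.0)))

sigma_mm = fwhm_to_sigma(8.0)   # an 8 mm FWHM kernel -> sigma of about 3.4 mm
sigma_vox = sigma_mm / 2.0      # in voxel units, for 2 mm isotropic voxels
```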

smoove-ing: A process that can only be carried out by Smoove B, Love Man. Damn.

SNR: see signal-to-noise ratio (SNR).

spatial frequency: Like any other signal, images can be analyzed in terms of their frequency. A gross simplification might be looking at the image intensities of neighboring voxels as a timecourse, and finding the frequencies of the waveforms contained within. In real life, finding spatial frequency is a little trickier, but the idea is the same. Low spatial frequency equals slow change in intensity; areas with low spatial frequency in an image are largely homogeneous, smooth, and less-varying. High spatial frequency equals fast change in intensity; areas of high spatial frequency in an image are often edges, or choppy patterns. K-space is a way to view images in terms of their spatial frequency. In ScanningPapers, there are some nice explanations of k-space and even some pictures to do it justice.

spatial preprocessing: See preprocessing; this term refers specifically to spatial transformations done before analysis, like normalization, smoothing, slice timing correction or realignment, and excluding temporal manipulations like high-pass filtering or pre-whitening.

spatial smoothness: A measure of spatial frequency. Spatial smoothness just measures the amount of low-spatial-frequency information in an image or a local region of an image. This is a way of quantifying how smoothly an image varies across the whole volume or a small chunk of it. Images have to have a relatively high spatial smoothness to satisfy the assumptions of Gaussian random field theory and be eligible for Gaussian-random-field family-wise error correction. Increasing their spatial smoothness can be accomplished with, of all things, smoothing. Crazy. See SmoothingFaq and PthresholdFaq for the relationship between smoothness and thresholding.

spiral imaging: A particular pulse sequence in which k-space is sampled in a spiraling trajectory, rather than in discrete lines. Spiral imaging avoids some of the common artifacts that can plague other sequences like EPI: geometric distortions, ghosting, or radical displacement. Spiral artifacts tend to be simply blurring of greater and lesser degree. Some spiral sequences can be more susceptible to dropout, but spiral in-out sequences seem to recover a great deal of signal from all parts of the brain. See ScanningFaq for a bit more on spiral sequences.

spiral-in, spiral-out, spiral in-out, spiralio: Different variations of spiral pulse sequences. In spiral-in, k-space is sampled in an inward-spiraling trajectory during the TR; in spiral-out, k-space is sampled in an outward-spiraling trajectory. Spiral in-out (also called spiralio) sequences do both, sampling k-space on an inwards spiral followed by an outwards spiral during the same TR and averaging the two images together. Spiral in-out sequences in particular do an excellent job at avoiding dropout in many areas of the brain traditionally thought to be difficult to image due to dropout. Check out ScanningPapers and ScanningFaq for more.

SPM (or SPM99, SPM2, etc.): A software package for neuroimaging analysis, written in Matlab and distributed freely. Stands for "Statistical Parametric Mapping." Probably the most widely-used package worldwide currently. SPM2 is the latest version; SPM99 was the most recent until about 2003, and before that was SPM98 and SPM96. Developed and supported by the Wellcome Department of Imaging Neuroscience, among them ?KarlFriston and others. Has an easy-to-learn interface combined with some of the most sophisticated statistical modeling available.

Spoiled Grass: A particular pulse sequence used for high-quality 3-dimensional anatomical scans.

stim function, stimulus function: The analogue to an onset vector for programs like AFNI and others. A way of specifying when trials of a particular condition started and stopped. Instead of listing the onset times in scans or seconds, a stimulus function is a vector with as many elements as the experiment has TRs - one element for each functional image or timepoint. Everywhere a given condition was happening has a 1 in the vector; everywhere else has a zero. So the stimulus function for a particular condition specifies when a trial was happening by putting ones in the timecourse during that condition's trials and zeros elsewhere - sort of like a mask image for time instead of space. Forming the functions in this way makes them easy to assemble into a design matrix. See BasicStatisticalModelingFaq for more on those.
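Building such a vector is straightforward; a Python sketch with made-up onsets and durations (both in TRs):

```python
import numpy as np

n_trs = 20
onsets = [2, 7, 13]     # hypothetical trial onsets, in TRs
duration = 2            # hypothetical trial length, in TRs

# Ones while the condition is "on", zeros everywhere else:
stim = np.zeros(n_trs, dtype=int)
for onset in onsets:
    stim[onset:onset + duration] = 1
```

Written out to a text file one element per line, this is exactly the sort of .1D stimulus file AFNI's 3dDeconvolve expects.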

stimulus-correlated motion: Head motion during an experiment is a big enough problem to start with. But random head motion can be dealt with by realignment and including your motion parameters in your design matrix, to eliminate any signal correlated with head motion. So why doesn't everyone just do that? Because if your subject moved their head in correlation with your task paradigm, removing motion-correlated signal will also remove task-correlated signal - which is what you're looking for. So stimulus-correlated motion is a big problem because it prevents you from regressing out motion-related activity. Evaluating your SCM should be a priority for anyone who includes motion parameters in their design matrix, particularly if you don't use a bite bar or if you have an emotionally-intense paradigm. Check out RealignmentFaq for more.

structural equation modeling: A statistical method for analyzing functional connectivity. Structural equation modeling (SEM) allows you to start with a set of ROIs and figure out what the connection strengths between them are, via a model-fitting process. It can't be used to determine the directionality of connection, but it can do a good job describing which connections are strong and which are weak, which can be crucial in ruling out certain theoretical constructs. See ConnectivityFAQ for lots more on this.

stub: A page on a Wiki site that's been created, but doesn't have any real content on it yet. Stubs cry out for someone to lovingly elaborate them and fill them with beautiful content. If you find one, don't let it cry all alone... add a little something.

SUMA: An add-on package for AFNI which adds the ability to do surface mapping analyses. This can allow far better visualization of how information is organized on the cortex. Like all the AFNI stuff, well-supported and run out of NIH. Don't know if it does inflation and/or flattening like BrainVoyager, but several recent posters have argued surface mapping in itself does a better job finding activations than standard voxel-based analyses.

surface mapping: The cortex, where much of the brain's processing takes place, could be flattened out into a flat sheet. But in the head, it's all crinkled up into sulci and gyri. If you ignore the folds of the brain and simply analyze it like it's all one homogeneous shape - as traditional voxel-based analyses do - then you may well miss important principles of how activation is organized, and you might even miss real activations in general. Surface mapping techniques are related to inflation and flattening techniques, and surface mapping is in fact a necessary prerequisite for those. Surface mapping simply starts with a high-quality anatomical scan, and builds a three-dimensional model of the folds and curves of the brain, which is then linked to particular voxels in the functional analysis. This allows the activation from the functional images to be mapped not just to a particular voxel, but to a particular point on the surface of the cortex. This surface can then be manipulated and visualized in far more interesting ways than simple voxel-based pictures allow.

susceptibility: Also called magnetic susceptibility. Used to describe regions where magnetic fields are generally more distorted, chopped up, and subject to dropout, due to the tissue characteristics of a region. Usually, regions of high susceptibility (try typing that five times fast) are near tissue/air interfaces, or interfaces between two different types of tissue, where the magnetic differences between the two materials causes distortions in the local field. High-susceptibility regions traditionally include the orbitofrontal cortex, medial temporal lobe, and many subcortical structures. Spiral in-out imaging has shown good promise at dealing with susceptibility-induced dropout. See ScanningFaq for more.

SVC: see small-volume correction (SVC).


tal2mni: A script written by Matthew Brett and embedded in the GablabToolbox, which aims to convert a set of XYZ coordinates from a given point in the Talairach atlas brain into the same anatomical point in the Montreal Neurological Institute (MNI) standard template brain. The Talairach brain, which is used as the normalization template for AFNI, BrainVoyager, and other programs, differs slightly from the MNI brain in several ways, particularly in the inferior parts of the brain. In order to use facilities like the Talairach Daemon or other Talairach-coordinate lookups to make ROIs for normalized SPM results, or in order to report Talairach data in MNI coordinates, it's necessary to convert the Talairach coordinates into MNI space with this script. It's not a perfect mapping, but it's widely used, and it's the best that's out there right now... See RoisFaq and NormalizationFaq for more.

Talairach, Talairach space, Talairach brain, Talairach atlas: In 1988, Talairach and Tournoux published a widely-cited paper creating a common reference coordinate system for use in the human brain. The paper set forth axis labels and directions, an origin at the anterior commissure, and anatomical and cytoarchitectonic labeling for many individual coordinate points within the brain. The coordinate system is based on one reference brain they dissected, sometimes referred to as the Talairach brain. The coordinate system has been widely adopted, and many algorithms have sprung up to normalize arbitrary brains to the Talairach reference shape. Coordinates in the reference system are said to be in Talairach space, and the full listing of coordinates and their anatomical locations is called the Talairach atlas. (Tournoux pretty much got the short end of this whole stick.) Although the coordinate system has been widely used and has proven very valuable for standardized reporting of results, it has drawbacks: the Talairach brain itself is a fairly unrepresentative single subject (and differs significantly from a more average template brain - see tal2mni), it ignores left-right hemispheric differences as only one hemisphere was labeled, and there are no MRI images of it available for direct comparison. Some programs, like SPM, have avoided using the Talairach brain for normalization, but Talairach labeling is pretty much inescapable at this point. Check out RoisFaq for a little more on all this, as well as NormalizationFaq.

Talairach Daemon: A very nice software package hosted by UT-San Antonio and developed by Lancaster et al., the Talairach Daemon takes in a set of coordinates in Talairach space and spits out a set of anatomical labels for each point - hemisphere, anatomical area, Brodmann area, tissue type - based on the Talairach atlas. This allows you, in an automated fashion, to label your results in a common space with many other researchers. The Talairach Daemon database is also available to do lookup based on labels - and so the GablabToolbox "Generate Tal Rois" script generates ROI files in Talairach space from the same database, based on anatomical criteria. The Talairach Daemon can be queried over the web or downloaded in its entirety for your network.

task-correlated motion: see stimulus-correlated motion.

temporal derivative: Derivative of a function with respect to time. In SPM, the temporal derivative of the canonical HRF can be included as an additional basis function, to model a degree of uncertainty as to the exact onset of the HRF. At least one empirical study has found that including the temporal derivative significantly reduces power in the study; see HrfPapers for more (as well as HrfFaq).

temporal filter: A filter applied in the temporal domain to some signal to help cut noise. Temporal filters knock out some frequencies in a given signal while allowing others to pass through; some types include high-pass, low-pass, and band-pass. In fMRI, applying some temporal filtering is a terrifically good idea, because noise is heavily concentrated in some parts of the frequency spectrum. See TemporalFilteringFaq for more.
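One way to high-pass filter an fMRI timecourse is to regress out a discrete cosine basis, roughly in the spirit of SPM's high-pass filter - though this sketch is not SPM's actual code, and the 128 s cutoff is just a conventional example value:

```python
import numpy as np

def dct_highpass(y, tr, cutoff_s=128.0):
    """Remove drift slower than cutoff_s by regressing out low-frequency cosines."""
    n = len(y)
    k = int(np.floor(2.0 * n * tr / cutoff_s))   # number of components slower than cutoff
    t = np.arange(n)
    cols = [np.ones(n)]                          # constant term
    for j in range(1, k + 1):
        cols.append(np.cos(np.pi * (2 * t + 1) * j / (2.0 * n)))
    basis = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(basis, y, rcond=None)
    return y - basis @ beta                      # residual = high-passed signal

tr = 2.0
t_sec = np.arange(200) * tr
y = 0.01 * t_sec + np.sin(2 * np.pi * t_sec / 40.0)  # linear drift + 40 s task cycle
clean = dct_highpass(y, tr)        # drift removed, task frequency left intact
```

The slow drift is absorbed by the cosine regressors while the faster task signal passes through - the defining behavior of a high-pass filter.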

Tesla: The Tesla is the standard international (or metric system, if you must) unit of magnetic flux density. It's abbreviated simply as T. It measures, in a nutshell, the strength of a standing magnetic field at a given point. It's named after Nikola Tesla, the engineer who discovered the rotating magnetic field back in the 19th century. The strength of an MRI scanner is measured in Tesla. Most scanners in current use for humans are rated as 1.5T or 3T; human scanners up to 7T can be found. For comparison, the Earth's standing magnetic field is around 2.5 x 10^-5 T.

threshold: see p-threshold

through-plane resolution, thruplane resolution: see slice thickness.

time-locked averaging: A technique in signal processing for signals where some repeating signal is corrupted by random noise. If you know the timepoints in the timeseries when the signal starts and can choose a window of time following the start to look at - say, 30 seconds - then you could take the 30-second chunk following each signal onset and average all those 30-second chunks together. If you have five signal onsets, then you have 5 windows; you average the first timepoint in each window together (all 5 of them), then the second timepoint in each window together (all 5 of those), then the third timepoint, etc. This creates an average time window - the average response following a signal onset. If the noise is roughly random, it should average to zero, and you'll get a clearer picture of your signal than from any individual response. The resulting peristimulus timecourse is called "time-locked" because it always describes a given time following the onsets - it's locked in time to the condition's onsets. This technique has long been used in EEG, and with the advent of event-related designs, it began to be used in fMRI as well. RoiPercent is a script in the GablabToolbox that does time-locked averaging for fMRI. The technique may not be appropriate for rapid event-related designs; when the window following an onset overlaps the onset of other signals, the final timecourse can be muddled by other signals. RoiDeconvolve might be used in that case.
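The procedure the entry describes can be sketched in a few lines (the onsets, window length, and triangular "response" below are all invented for illustration):

```python
import numpy as np

def time_locked_average(timeseries, onsets, window):
    """Average fixed-length windows following each onset (onsets are indices)."""
    chunks = [timeseries[o:o + window] for o in onsets
              if o + window <= len(timeseries)]
    return np.mean(chunks, axis=0)

ts = np.zeros(100)
onsets = [10, 40, 70]
for o in onsets:                          # identical response after each onset
    ts[o:o + 5] += [0.0, 1.0, 2.0, 1.0, 0.0]
ts += np.random.default_rng(1).normal(scale=0.1, size=100)  # random noise

avg = time_locked_average(ts, onsets, window=8)  # noise averages toward zero
```

The averaged window recovers the shared response shape; with more onsets, the residual noise in the average shrinks further.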

timecourse, timeseries: A list of numbers that are taken to represent some measurement sampled over time. Each point in the timeseries represents a specific point in time; neighboring points represent neighboring moments, later points represent later points in time, etc. Many scientific domains deal with timeseries data, and so a good deal of research has been done on how to deal with any peculiar characteristics they might have - autocorrelation, etc. In fMRI, the most common timeseries would be the series of measurements from a specific voxel across all the functional images - that repeated measurement represents a series of samples over time in (we hope) one unique point in the brain.

Timeseries Explorer: A program in the GablabToolbox which enables some AFNI-esque capabilities for SPM users - the ability to click around a brain and instantly see the whole timecourse from the clicked point. It also lets you view average timecourses from specific ROIs, and will do time-locked averaging on the fly, allowing you to see condition-by-condition effects at a moment's notice. Can be a handy tool for ArtifactDetection. Check out GablabToolbox for more.

truncate, truncation: A simple image manipulation which turns an ordinary image into a mask image. You choose some intensity threshold to truncate your image with - 0.2, say - and then every voxel whose intensity is above that threshold is given a new intensity of 1, and every voxel whose intensity is below that threshold is given a new intensity of 0. Often used on ROI mask files after smoothing, to make sure the boundaries of the ROI aren't too extended by smoothing.
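As a sketch, the whole operation is a single thresholding step (the function name is ours; the 0.2 threshold is the example value from the text):

```python
import numpy as np

# Minimal sketch of truncation: voxels above the threshold become 1,
# everything else becomes 0.
def truncate(image, threshold=0.2):
    return (image > threshold).astype(np.int8)

smoothed_roi = np.array([[0.05, 0.3],
                         [0.9,  0.1]])
mask = truncate(smoothed_roi, threshold=0.2)
```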


uicontrol: The name in Matlab for any graphical interface element that you can interact with - buttons, check boxes, text edit boxes, etc. Gives rise to the ever-popular "uicontrol callback" error - see CommonSpmErrors for details.

unwarping: A relatively new preprocessing technique, available only in SPM2 right now, which attempts to eliminate some of the residual effects of head motion after realignment. The method estimates a map of the inhomogeneities in the magnetic field, and thus maps regions of high susceptibility; once the head motion parameters are known, it can calculate how those regions might have distorted the data around them. Unfortunately, this method is currently available only for EPI functional data, not spiral (and probably always will be), due to the models of geometric distortion used. See RealignmentFaq for a bit more.


variable ISI: Stands for variable inter-stimulus interval. A type of experiment in which a varying amount of time separates the beginning of all stimuli - trials can be all the same length or all different, but the onsets of stimuli aren't all the same length of time apart. Variable-ISI studies most often have ISIs that are randomized within certain extremes, not just arbitrarily variable. Generally only event-related designs are variable-ISI, although there's no reason why you couldn't have a limited-variability-ISI block design experiment. Variable-ISI event-related experiments, though, are much better than fixed-ISI experiments at both efficiency and power, especially as the ISI increases. In general, several empirical studies have shown that for event-related designs, variable-ISI is the way to go. For block designs, the difference is fairly insignificant, and variable-ISI can make the design less powerful, depending on how it's used. See JitterFaq for more on the difference between fixed and variable.

voxel: One "dot" in a 3-D picture. Like "pixel" for 2-D pictures, but it's three-dimensional. Voxels have a given size, usually a few millimeters in any direction (although they can be isotropic or anisotropic). Their size is specified in millimeters generally, like 2x2x5; the third dimension is generally the through-plane size or slice thickness. In a given brain, you'll often have tens of thousands of voxels, even if you haven't resampled your voxels to be smaller during preprocessing. Voxels are specified on a coordinate system that's different than the millimeter coordinate system; millimeters coordinates have their origin in the middle of the image (and so can be negative), whereas voxel coordinates start counting in one corner of the image and are always positive.

voxel-based morphometry (or VBM): A type of analysis which doesn't look at functional images, but instead looks at the differences between subjects' anatomy. Anatomical images are segmented into different tissue types, and the measurements generally looked at are the total volume of gray matter (or white matter or CSF) in a given anatomical structure. This type of analysis is about the form, or morphometry of the brain, and it's based not on the surface of the brain or any dissection of it but on arbitrarily-sampled voxels - hence the name. See SegmentationFaq for more.


white noise: Noise which is random and independent from measurement to measurement. In other words, it is equally strong in all frequencies. White noise is nice because it tends to average to zero, which enables the use of many simple smoothing techniques to get rid of it, although it tends to defeat filtering techniques. It's in contrast to colored noise, which has some correlation from timepoint to timepoint. fMRI noise tends to be pretty white in the spatial domain (with some exceptions) and severely colored in the temporal domain. See SmoothingFaq and TemporalFilteringFaq for a little more.

whitening: see pre-whitening.

Wiki: The type of website you're on right now! The word is Hawaiian for "fast." Wiki sites all share the ability to create and edit pages simply and easily from the actual site itself, and anyone's encouraged to collaborate on them. AboutWiki is a nice little intro to the concept. It's essentially a way of turning a particular site's design and content into an open-source project. There are several great Wikis out there; the Wikipedia is maybe the most useful...

Wikipedia: One of the neatest sites on the Web right now. A vast, collaborative, multi-language, free encyclopedia. Over 250,000 articles to date. Go there and love them.


X-beta: A program in the GablabToolbox used to extract beta weights from regions of interest. Can handle multiple subjects, multiple ROIs and multiple conditions, and give you back out nice little text files suitable for spreadsheets and all organized. See GablabToolbox for more.




1.5T, 3T, 4T (etc.): Ratings of different strengths of MRI scanners. T is the abbreviation for Tesla, the international standard unit for magnetic flux density.

3dDeconvolve: An exquisite program in AFNI which uses a finite impulse response model to spit out condition-specific timecourses for a given subject and produce great output datasets. Totally ripped off to create RoiDeconvolve for SPM. Written by ?DougWard. Check out the manual at PercentSignalChangePapers; you'll learn more about fMRI analysis in general than you could possibly imagine.


Frequently Asked Questions - Your Friend, The Hemodynamic Response Function

1. What is the 'canonical' HRF?

The very simplest design matrix for a given experiment would represent the presence of a given condition with 1's and its absence with 0's in that condition's column. That matrix would model a signal that was instantly present at its peak level at the onset of a condition and instantly offset back to baseline with the offset of a trial. That may be a decent model of neuronal activity, but since fMRI measures BOLD signal rather than neuronal activity directly, it's clearly not a good model of the hemodynamic response that BOLD signal represents.

If we assume the hemodynamic response system is linear, linear systems theory tells us that if we can figure out the hemodynamic response to an instantaneous impulse stimulus, we can treat our real paradigm as the conglomeration of many instantaneous stimuli of various kinds and the hemodynamic response should sum linearly. Tests over the last ten years suggest that the brain does, in fact, largely behave this way, so long as stimuli are spaced more than a couple hundred milliseconds apart. So the canonical HRF is a mathematical model of that impulse response function. It's a function that describes what the BOLD signal would theoretically be in response to an instantaneous impulse. Once your design matrix is described, your analysis software convolves it with a canonical HRF, so that your matrix now represents a gradual rise in activity and gradual offset that lines up with a 'typical' HRF.

Most of the common neuroimaging programs use similar canonical HRFs - a mixture of gamma functions, originally described by Boynton's group. This function has been found to be a roughly good model of hemodynamic response - at least in visual cortex - in most subjects. It models a gradual rise to peak (about 6 seconds), long return to baseline (another 10 seconds or so) and slight undershoot (around 10-15 seconds), the whole thing lasting around 30 seconds or so.
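A minimal Python sketch of this kind of double-gamma canonical HRF and its convolution with an impulse design. The gamma parameters below are illustrative choices that reproduce the rough timing described above (peak near 6 s, late undershoot), not the exact function any particular package uses:

```python
import numpy as np
from math import gamma as gamma_fn

# Sketch of a double-gamma canonical HRF; shape parameters (6 and 16)
# are illustrative, in the spirit of the Boynton-style function.
def gamma_pdf(t, shape, scale=1.0):
    t = np.asarray(t, dtype=float)
    out = np.zeros_like(t)
    pos = t > 0
    out[pos] = (t[pos] ** (shape - 1) * np.exp(-t[pos] / scale)
                / (gamma_fn(shape) * scale ** shape))
    return out

def canonical_hrf(tr, duration=32.0):
    t = np.arange(0, duration, tr)
    hrf = gamma_pdf(t, 6.0) - gamma_pdf(t, 16.0) / 6.0  # peak minus undershoot
    return hrf / hrf.sum()                              # normalize to sum 1

# Convolve an impulse design (1 at each stimulus onset) with the HRF,
# as analysis software does to each design-matrix column.
tr = 2.0
design = np.zeros(40)
design[[5, 20]] = 1.0                                   # two instantaneous events
predicted = np.convolve(design, canonical_hrf(tr))[:len(design)]
```

The `predicted` column now rises gradually after each onset and returns slowly to baseline, rather than jumping instantly between 0 and 1.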

2. When should you use the canonical in your model? When should you use different response functions? (HRF, HRF w/ derivatives, etc.) What's the difference?

Generally, the canonical HRF is a decent fit to the true HRF for many normal subjects in many cortical and subcortical regions. If your analysis is intended primarily to test a hypothesis about neural activity, looking only for size and place of activation, and you believe your subjects to be reasonably normal, the canonical makes good sense. If you're looking to find out more about your activation than simply where it is and how big it is, though, you'll need to get a little fancier. Using the canonical HRF will tell you how much the canonical HRF (convolved with your design matrix) needed to be scaled to account for your signal. But you might be interested in more detail - how much variance there was in the onset of the HRF, or how much in its length, or the true shape of the HRF for your subjects. Or you might not think the canonical HRF is a good enough fit to your subjects or your region of interest (and there's certainly evidence to make that thought reasonable).

In those cases, you may want to complicate your model a bit. At one extreme, you could forgo any guess at the HRF's shape and instead estimate it directly, fitting a separate parameter for every timepoint following your stimulus - a finite impulse response (FIR), or deconvolution, model. We'll talk about those in more detail later in the course. Alternatively, you might choose to model your neural activity as a linear combination of basis functions like sines and cosines - this will guarantee you can get an excellent fit to your true HRF and use a different HRF at every voxel, thus avoiding the problem of regionally-different HRFs. This is a Fourier basis set model. As an intermediate step between the FIR/basis-set-type models and the pure canonical HRF models, you might try modeling your activation as a combination of two or three functions - say, the canonical HRF and its temporal derivative, or dispersion derivative. This will separately estimate the contribution of the canonical HRF and how much variability it has in time of onset (temporal deriv.) or shape (dispersion deriv.).

There are tradeoffs for using the more complicated models: for FIR and Fourier models, the interpretation of any given parameter value becomes very different, and it becomes much more difficult to design contrasts. Using the intermediate steps gets you back physical interpretability, but at the price of decreased degrees of freedom in your data, without a guarantee of better fit overall - and, in fact, Della-Maggiore et al. (HrfPapers) find that the HRF w/ temporal derivative has significantly decreased power relative to the canonical alone in a typical experimental design. So use at your own risk...
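A toy sketch of what an FIR design matrix looks like: one column per post-stimulus lag, so each estimated beta is simply the HRF's value at that lag, with no assumed shape. All names and numbers here are invented for illustration:

```python
import numpy as np

# Toy FIR (deconvolution) design matrix: one column per post-stimulus
# lag; fitting it by least squares estimates the HRF's value at each
# lag directly, with no assumed shape.
def fir_design(onsets, n_scans, n_lags):
    X = np.zeros((n_scans, n_lags))
    for onset in onsets:
        for lag in range(n_lags):
            if onset + lag < n_scans:
                X[onset + lag, lag] = 1.0
    return X

X = fir_design(onsets=[2, 10], n_scans=16, n_lags=4)

# Simulate data from a known "HRF" and recover it with least squares.
true_hrf = np.array([0.0, 1.0, 0.5, 0.1])
y = X @ true_hrf
beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # beta recovers true_hrf
```

With non-overlapping trials like these the recovery is exact; in a rapid event-related design the overlapping columns are what the least-squares fit disentangles.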

3. When does it make sense to do a regionally-specific HRF scan?

If you're particularly interested in a region that's not primary visual cortex and you'd like to get a very good fit of your model to the data, it may make sense to try and get an HRF that's specific to a different region. The canonical HRF is derived from measurements in V1 of some subjects, and studies like Miezin et al. (HrfPapers) have demonstrated that between-region variability within a subject in one scan can be significant.

However, if you're worried about this, rather than taking an entirely separate scan or task to estimate impulse response in a region, it probably makes more sense to use an analysis type that models the response of each voxel separately - like an FIR or Fourier basis set model - which doesn't assume that you have the same HRF at every voxel.

4. When does it make sense to do a subject-specific HRF scan?

If you're looking to study intersubject variability, or if you're looking to improve the fit of your model by a good chunk and you can afford the extra time in the scanner. Several studies, like Aguirre et al. and Miezin et al. (HrfPapers), have demonstrated that there is a significant amount of variability between subjects in several parameters of the HRF - time to onset, time to peak, amplitude, etc.

Perhaps more importantly, the canonical HRF is based on measurements from normal, adult cortex. It's becoming clear that populations like children, the elderly, or patients of various kinds may have HRFs that differ significantly from the canonical. Particularly in any sort of between-group study in which these populations are being compared to normal subjects, it's crucial to ensure that any effect you see isn't driven by the difference in fit of the HRF between groups - you would expect the canonical HRF to provide a better fit (and hence more and larger activations) to normal adults than it would to patients or non-standard populations. In cases like these, using a subject-specific HRF may be necessary or at least desirable.

5. Which regions have particularly different HRFs?

Probably a lot of 'em. But in particular there are questions about the extent to which the canonical HRF or others measured from cortical neurons maps onto HRFs for subcortical structures like the basal ganglia. Definitive answers about this sort of thing await further study. Logothetis & Wandell (2004; HrfPapers) discuss some reasons why regions might differ in HRF - from increased white matter density to differences in vascular density to, um, complicated physiological things. But they have one clear point: we know HRF can vary from region to region, and there is no accurate way currently to convert the absolute BOLD magnitude to any neural measure. Which means comparing absolute BOLD effect between regions, even nearby, is simply not justified theoretically. In their words:

"It seems that with our current knowledge there is no secure way to determine a quantitative relationship between a hemodynamic response amplitude and its underlying neural activity in terms of either number of spikes per unit time per BOLD increase or amount of perisynaptic activity." - Logothetis & Wandell (2004)

So be cautious about comparing absolute magnitude between regions...

6. Which populations have particularly different HRFs?

Surprisingly few studies have been published on this subject. Ongoing studies at Stanford (Moriah Thomason & Gary Glover) suggest children of at least a certain age have important differences in their HRF, and it's clear the same is true of elderly subjects. But at present, if you're interested in using any non-standard population of subjects, it's not a crazy idea to assume they have significant differences in their HRF from the standard canonical.

7. What's the difference between an 'epoch' HRF and an 'event' HRF?

If you're using AFNI/BrainVoyager/SPM2, there isn't any. So don't worry about it. But if you're using SPM99 or earlier, the story above about the canonical HRF is actually oversimplified. In SPM99 (and earlier), epoch-related and event-related studies had the same underlying design matrix form - the onset of a trial was marked with a 1, but the actual trial itself was all 0's. Event-related studies were simply modeled by convolving those with a canonical HRF, but epoch-related studies clearly needed to account for the length of the trial. So there was a separate, epoch-related, canonical HRF, likewise based on a mixture of gamma functions, but specifically scaled to account for the length of the trial - so the HRF that was convolved with the design matrix was different for a 12-second epoch and a 30-second epoch. The epoch HRF generally looked like a wider, fatter canonical HRF, and represented a model of linearly summed HRFs over the course of the trial.

With a re-vamping of data structures in SPM2, and further study, this difference was scrapped. Epochs are now modeled differently at the level of the original, pre-convolved design matrix, with 1's down the whole length of the trial, and events and epochs are convolved with the same canonical HRF. This seems to provide an equally good or better fit to real data, as well as simplifying many aspects of calculation.

8. What relationship does the BOLD have to the underlying neuronal activity?

(This should probably be higher up the question list.) The BOLD signal is produced by an influx of oxygenated blood to a local area of neuronal activity, to compensate for increased energy usage. But neurons use energy for a lot of things, both pre- and post-synaptic: action potentials, increased membrane potentials, cleaning up neurotransmitter, putting out neurotransmitter, etc. In order to correctly interpret the BOLD in a given region, we need to know if it's caused by, say, increased firing (i.e., output) or increased post-synaptic activity (i.e., increased input) or some combination. What aspect of the underlying electrophysiology does the BOLD correlate with?

Logothetis & Wandell (HrfPapers) review a good deal of this work. Several electrophysiology studies have found tight correlations between BOLD and local field potential (LFP), a lower-frequency electrical measure summed over many neurons (but still with good spatial resolution). Similarly, fast ERP amplitude seems to vary linearly with BOLD (Arthurs & Boniface (2003), RoisPapers). Slow ERPs, which are believed to arise from postsynaptic potentials, correlate with BOLD in parietal cortex (Schicke et al. (2006), HrfPapers). Some studies have also found linear relationships between spike rates and BOLD, and spiking activity likely also correlates with BOLD, although perhaps not as robustly (Logothetis et al., 2001).

All this goes to suggest that BOLD may originate less in actual neuronal spiking and more in low-frequency potentials or increased excitability of neurons: in other words, BOLD reflects input to an area more than output. Clearly, input and output are often correlated for neurons; excitatory input will increase spiking outputs. But they aren't always; if this hypothesized connection is true, it means an increased BOLD could reflect increased inhibitory input to a region, or summing of both inhibitory and excitatory inputs (resulting in no change in spiking). Logothetis & Wandell also cite examples where BOLD might be different from single-unit recording - e.g., an area may appear highly direction-sensitive with BOLD, even if single-unit recordings show it not to be, because it is highly interconnected with a very direction-sensitive area. Attention effects, which are difficult to find in V1 with single-unit recordings but show up in fMRI, might be another example. So we should have caution in using models that require BOLD signal to directly index increased spiking outputs.

This is not to say it's impossible to map single-unit firing onto BOLD signal; retinotopy in visual cortex, for example, happens at the single-unit level, and is easily detectable with BOLD. But merely note: many factors, from input activity to vascular density (see above), etc., can affect regional BOLD response.

9. How does the hemodynamic response change with stimulus length?

In very nonlinear ways. See Glover (1999) in PercentSignalChangePapers (and Logothetis & Wandell, 2004, HrfPapers). You're on very shaky ground if you attempt to model stimuli longer than 6 sec by convolving a standard HRF with a boxcar. You may be better off using shorter stimuli, or stimuli separated by longer intervals.


HRF Papers

Useful Papers - Your Friend, The Hemodynamic Response Function


Aguirre et al. (1998), "The variability of human BOLD hemodynamic responses," NeuroImage 8, 360-369 PDF

Summary: In order to evaluate how much variability exists in the shape of HRFs collected from different sessions, days, and subjects, Aguirre et al. tested sets of 40 subjects on various days and in various sessions. They found a good deal of variance accounted for by differences in subjects, and significant differences for many subjects between different scanning days. Within the same day and subject, though, the HRF seemed relatively stable.

Bottom line: Shows that subject-to-subject and day-to-day variance in HRF can be high, but within a day across runs, the HRF is relatively stable.


Miezin et al. (2000), "Characterizing the hemodynamic response: Effects of presentation rate, sampling procedure, and the possibility of ordering brain activity based on relative timing (Ed. - whew!)," NeuroImage 11, 735-759 PDF

Summary: A long paper describing a number of different studies/analyses looking at linearity of the HRF (as stimulus presentation rate and sampling procedure varies) and the variance between regions, subjects, and different aspects of the HRF (amplitude, time-to-onset, etc.).

Bottom line: Excellent look at the factors influencing the HRF and how stable all its aspects are. Demonstrates HRF remains stable and linear within a subject and certain timing parameters, but outside those, less so.

Della-Maggiore et al. (2002), "An empirical comparison of SPM preprocessing parameters to the analysis of fMRI data," NeuroImage 17, 19-28 PDF

Summary: Another find-the-best-analysis-parameters study, this time using Monte Carlo simulations on simulated data to study power and false-positive rate as various SPM parameters changed. Of particular interest here is the contrast between using the canonical HRF alone vs. the HRF with temporal derivative. The HRF with temporal derivative is found to significantly reduce power in many circumstances.

Bottom line: The canonical HRF by itself does better than it does with the temporal derivative added in much of the time.

Neumann et al. (2003), "Within-subject variability of BOLD response dynamics," NeuroImage 19, 784-796 PDF

Summary: Neumann et al. investigate how variable the HRF is within a subject over several scanning sessions spanning days or weeks. They look at the stability of a number of parameters - time to onset, time to peak, time to baseline, etc. - and use a couple of different methods to extract HRFs. Time-to-peak is the most stable parameter they found, and voxels activated in all sessions had much lower variability than those only activated in some.

Bottom line: At least a subset of activated voxels within a subject has quite a stable hemodynamic response, with time-to-peak being the most stable aspect.


Schicke et al. (2006), "Tight covariation of BOLD signal changes and slow ERPs in the parietal cortex in a parametric spatial imagery task with haptic acquisition," European Journal of Neuroscience 23, 1910-1918 DOI

Summary: Participants learned an object map by touch and then imagined discs flying from object to object during ERP and fMRI acquisition. ERP amplitude parametrically varied with disc-flying distance, as did BOLD in left parietal cortex; source modeling suggested the ERP originated from that left parietal source.

Bottom line: Slow ERPs, thought to arise from postsynaptic activity, correlated with BOLD in parietal cortex, suggesting the BOLD response may reflect increased input to an area rather than spiking activity directly.

Logothetis & Wandell (2004), "Interpreting the BOLD signal," Annual Review of Physiology 66, 735-69 DOI

Summary: Comprehensive and excellent review of the complete chain of events in the head leading to the BOLD signal, including a bit of physics and a lot of neurophysiology. Several studies containing both cellular and BOLD recordings are reviewed, as are studies of neural vasculature and other regional differences influencing the HRF, and some suggestions are made for interpreting the BOLD signal.

Bottom line: BOLD correlates somewhat with summed local spiking, but somewhat better (and in places a great deal better) with lower-frequency electrical activity (often postsynaptic) summed over many neurons. In other words, BOLD results from a combination of pre- and post-synaptic activity - that is, outputs from an area and inputs into it - and probably correlates better with the input side. Comparing absolute BOLD amplitude between regions isn't justified, theoretically.

Liu & Newsome (2006), "Local field potential in cortical area MT: Stimulus tuning and behavioral correlations," Journal of Neuroscience 26, 7779-7790 DOI

Summary: LFP and multi-unit activity were recorded simultaneously from speed- and direction-sensitive areas of MT to try to get a handle on the spatial sensitivity of the LFP signal; presumably, if its spatial resolution were very low, it wouldn't exhibit the same kind of direction- or speed-sensitivity found at the single- or multi-unit level. LFP activity did exhibit both speed and direction tuning, but only at higher frequencies (gamma band), suggesting that spatially localized information is mostly carried in the upper bands. Speed tuning was present only at higher frequencies than direction tuning, perhaps reflecting the finer-grained spatial organization of speed-tuned neurons.

Bottom line: The LFP (which, as noted above, contributes to BOLD and probably correlates better with it than local spiking does) has good spatial resolution in higher frequencies, at least to the level of cortical columns, suggesting that BOLD driven by LFP signals is not being contaminated by long-distance signals.

Robinson et al. (2006), "BOLD responses to stimuli: Dependence on frequency, stimulus form, amplitude, and repetition rate," NeuroImage 31, 585-599 DOI

Summary: Pretty mathematical paper using current models of the BOLD-neuronal firing relationship and parameters derived from the ERP literature to derive the theoretical BOLD response for various shapes of neuronal responses (damped sinusoids, like ERPs). Also derives optimal block-stimulation frequencies.

Bottom line: For short stimuli, maximum BOLD peak is roughly proportional to the time integral (signed area under the curve) of neuronal activity. When neuronal response shapes differ (i.e., oscillate more or less), you might have higher firing-rate peaks for one stimulus but lower BOLD if the area under the curve is different (due to oscillating firing rates). As well, to maximize BOLD response, a 7-sec-on, 7-sec-off block frequency is suggested based on the derived BOLD response.


Jitter FAQ

Frequently Asked Questions - Jitter

Jittering is heavily mixed in with experimental design and setting your scanning parameters, so be sure to check out the other design-related pages.

1. What is jittering?

It's the practice of varying the timing of your TR relative to your stimulus presentation. It's also often connected to, or even identified as, the practice of varying your inter-trial interval. The idea in both of these practices is the same. If your TR is 2 seconds, and your stimulus is always presented exactly at the beginning of a TR and always 10 seconds long, then you'll sample the same point in your subject's BOLD response many times - but you might miss points in between those sampling points. Those in-between points might be the peak of your HRF, or an inflection point, or simply another point that will help you characterize the shape of your HRF. If you made your TR 2.5 seconds, you'd automatically get to sample several other points in your response, at the expense of sampling each of them fewer times. That's "jittering" your TR. Alternatively, you might keep your TR at two seconds, and make the time between your 10-second trials (your inter-stimulus interval, or ISI) vary at random between 0 and 4 seconds. You'd accomplish the same effect - sampling many more points of your HRF than you would with a fixed ISI. You'd also get an added benefit - you'd "uncover" a whole chunk of your HRF (the chunk between 10 seconds and 14 seconds) that you wouldn't sample at all with a fixed ISI. That lower portion can help you better determine the shape of your whole HRF and find a good baseline from which to evaluate your peaks. This added benefit is why most people go the second route in trying to "jitter" their experiment - varying your ISI gets you all the benefits of an offset TR, plus more.
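A toy numerical illustration of the sampling argument above: with a fixed ISI that's a multiple of the TR, every onset sits at the same phase of the TR grid, so the scans always hit the same post-stimulus timepoints; jittered onsets hit several different phases. The onset times here are made up:

```python
# Toy illustration: the phase of each onset within the TR determines
# which post-stimulus timepoints the scans sample. Onset times invented.
def sampled_phases(onsets_sec, tr):
    return sorted({round(onset % tr, 3) for onset in onsets_sec})

tr = 2.0
fixed_onsets = [0.0, 10.0, 20.0, 30.0]     # fixed 10 s ISI (multiple of TR)
jittered_onsets = [0.0, 9.0, 19.5, 30.0]   # ISIs of 9, 10.5, and 10.5 s

phases_fixed = sampled_phases(fixed_onsets, tr)        # only one phase
phases_jittered = sampled_phases(jittered_onsets, tr)  # several phases
```

Each distinct phase is a new set of points on the HRF that the experiment gets to sample.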

2. Why would I want to do this in my experiment?

If all you care about is the amplitude of your response, you probably wouldn't. In this case, you're assuming a certain shape to the hemodynamic response, and all you care about is how "high" the peak of the HRF was at each voxel for each condition. You'd want a design with very high statistical power - the ability to detect amplitude. On the other hand, you might not want to assume that every voxel had the same HRF shape for every condition. If you'd like to know more about the shape, without assuming anything (or less than everything, at least), you need a design with high statistical efficiency - the ability to accurately estimate shape parameters, without assuming a shape. See Liu and Dale (JitterPapers) for more on the tradeoff between power and efficiency.

Varying your ISI is a strategy to increase the efficiency of your estimates at the expense of your power. Clearly, because you'll be sampling each point of your HRF fewer times, you'll necessarily have less confidence in the accuracy of any given estimate. But because you'll have so many more points to sample, you'll have much more confidence about the true shape of your HRF for that condition. This is critical when you believe your experiment may induce HRFs of different shape in different regions, or if knowing the shape (lag, onset time, offset time, etc.) of your response is important (as it is in mental chronometry). With a variable-ISI design, you can run a rapid event-related design and pack many more trials into a given experimental time than you would for a fixed-ISI design that sampled the whole HRF, or you can sample much more of the HRF than a fixed-ISI design could with the same number of trials.

3. What are the pros and cons of jittering / variable-ISI experiments? When is it a good/bad idea?

Variable-ISI experiments are a way of making the tradeoff between power and efficiency in an experiment. Fixed-ISI designs are extremely limited in their potential efficiency. They can have high power by clustering the stimuli together: this is a standard block design. But in order to get decent efficiency in an experiment, you need to sample many points of the HRF, and that means variable ISI. Any experiment that needs high efficiency - say, a mental chronometry experiment, or one where you're explicitly looking for differences in HRF shape between regions - necessarily should be using a variable-ISI design. By contrast, if you're using a brand new paradigm and aren't even sure if you can get any activation at all with it, you're probably better off using a block design and a fixed ISI to maximize your detection power.

From a psychological standpoint as well, the big advantage of variable-ISI designs is that they seem far more "random" to subjects. With a fixed ISI, anticipation effects can become quite substantial in subjects just before a stimulus appears, as they catch on to the timing of the experiment. Variable ISIs can decrease this anticipation effect to a greater or lesser degree, depending on how variable they are. Liu (JitterPapers) explores the concept of "conditional entropy" as a measure of randomness in how an experiment "seems," and it is, predictably, intertwined with the power/efficiency tradeoff.

4. How do I decide how much to jitter, or what my mean ISI should be?

Great question. Depends a lot on what your experimental paradigm is - how long your trials are, what psychological factors you'd like to control - as well as what type of effect you're looking for. Dale (JitterPapers) lays out some fairly intelligible math for calculating the potential efficiency of your experiment. Probably even easier, though, is to use something like Tom Liu's experimental design toolbox (JitterLinks) or, even better, AFNI's 3dDeconvolve -noinput option, which will take a given experimental paradigm and calculate how good it is in terms of power and efficiency. Once you're within the ballpark for the type of paradigm you like, these tools can be an invaluable way to optimize your design's jitter / ISI variation, and are highly recommended for use.
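The efficiency calculation behind these tools is commonly written as eff = 1 / trace(C (X'X)^-1 C') for a design matrix X and contrast matrix C (after Dale's formulation). A minimal sketch, with an invented two-column design:

```python
import numpy as np

# Sketch of the usual design-efficiency measure:
#   eff = 1 / trace(C (X'X)^-1 C')
# The design below (one on/off regressor plus a constant) is invented
# purely for illustration.
def efficiency(X, C):
    xtx_inv = np.linalg.pinv(X.T @ X)
    return 1.0 / np.trace(C @ xtx_inv @ C.T)

rng = np.random.default_rng(0)
n_scans = 100
regressor = rng.integers(0, 2, n_scans).astype(float)  # toy on/off predictor
X = np.column_stack([regressor, np.ones(n_scans)])
C = np.array([[1.0, 0.0]])                             # main effect contrast

e = efficiency(X, C)   # higher values mean more efficient estimation
```

Comparing `e` across candidate onset schedules is essentially what the design-optimization toolboxes automate.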

5. But how do I get better temporal resolution than my TR?

Simple: don't always sample the same points of your response. If you always sample the BOLD response 2 seconds and 4 seconds and 6 seconds after your stimuli are presented, for your whole experiment, you'll have a very impoverished picture of the shape of your HRF. But if, for example, you sampled 2 sec. and 4 sec. and 6 sec. post-stimulus for half the experiment, then cut one second between trials and sampled 1 sec. and 3 sec. and 5 sec. for the rest of your experiment - why, then, you'd have a better picture. The cost, of course, is reduced power and expanded confidence intervals at the points you've sampled.
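Numerically, the trick is just that the two halves of the experiment sample interleaved post-stimulus grids. A toy illustration (the 2 s TR and the specific 1 s offset are hypothetical):

```python
TR = 2.0  # seconds between scans

# First half of the experiment: stimuli aligned with the scans, so every
# scan falls a whole number of TRs (2, 4, 6 s) after a stimulus.
aligned = {TR * k for k in (1, 2, 3)}

# Second half: trial onsets shifted 1 s relative to the scans, so the same
# scans now fall 1, 3, 5 s post-stimulus.
shifted = {t - 1.0 for t in aligned}

# Pooled over the experiment, the HRF gets sampled on a 1 s grid, even
# though no single trial was ever sampled faster than the 2 s TR.
combined = sorted(aligned | shifted)
print(combined)  # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
```

Each point on the pooled grid is estimated from only half the trials, which is exactly the power cost mentioned above.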

With a good picture of the shape of your HRF, though, you could then compare HRFs from two different regions and see which one had started first, or which one had reached its peak first. If HRF timing is connected in some reliable way to neuronal activation, you then don't need to sample the whole experiment at a super-fast rate - you could infer from only a limited-sample picture of one part of the HRF where neuronal activity had started first and where it had started second, which allows you to rule out certain flows of information.


Jitter Papers

Useful Papers - Jitter

Also check out DesignPapers, ScanningPapers, FmriPhysicsPapers and PhysiologyPapers...


Liu (2004), "Efficiency, power and entropy in event-related fMRI with multiple trial types, Part II: design of experiments," NeuroImage 21, 401-413 PDF

Summary: The more intelligible part in a two-part manifesto. Liu expands on his 2001 paper (see DesignPapers) to demonstrate that the tradeoff between efficiency and power is fundamental for multiple-trial-type experiments. Crucially, he also explores which specific experimental designs bring you closest to the theoretical best power for a given efficiency (or vice versa). Permuted block designs look like great happy mediums.

Bottom line: 2-block designs and m-sequence designs are at the opposite ends of the power/efficiency spectrum, each getting you close to the theoretical limit of how much power/efficiency you can get. Permuted block designs give you excellent tradeoffs between them.


Dale (1999), "Optimal experimental design for event-related fMRI," Human Brain Mapping 8, 109-114 PDF

Summary: Perhaps the best early defense of variable-ISI studies; Dale attempts to end the debate between long event-related designs and rapid event-related designs by showing that variable-ISI designs tremendously improve on the power of rapid designs. A mathematical argument about statistical efficiency is used to show that random-ISI (i.e., jittered) designs can outperform fixed-ISI designs with the same mean ISI by more than a factor of 10.

Bottom line: Once and for all: if possible, you should definitely be using a variable-ISI rapid design in any event-related study. The math is all here...

Friston et al. (1999), "Stochastic designs in event-related fMRI," NeuroImage 10, 607-619 PDF

Summary: Another look at which designs make the most sense - block, fixed-ISI, or variable-ISI (connected to the "stochastic" design idea). Authors make the point that the best type of design will depend very heavily on what type of effect you're looking for - differential effects between conditions do best with a different design than do evoked responses changing from baseline.

Bottom line: An early distinction between efficiency and power is drawn, as a difference between looking for differential or evoked effects. The idea of constructing variable-ISI designs from "null events" is brought forth.


Matlab Basics

Matlab Basics

Here's a very quick intro to some of the basic concepts and syntax of Matlab. The hope is that after covering some of these basics, and those described in the other Matlab docs pages above, you'll be able to troubleshoot SPM and Matlab errors to some degree. Experienced Matlab programmers may find most of this redundant, but there's always more you can pick up in Matlab, so please, if you experienced folks have tips and tricks to suggest, add them onto these pages!

This page looks specifically at the most basic of basics. For info on how to read m-files and m-file programming concepts, check out MatlabProgramming. As well, MatlabPaths answers questions about modifying your search paths, and MatlabDebugging touches specifically on strategies for troubleshooting - how to get more information, what parts of error messages to look for, and how you can dig through a script to find what's gone wrong.

If you take nothing else away from this page, take this: The Matlab tutorial, contained within the program's help section, is terrific. It'll teach you a lot of the basics of Matlab and a lot of things you wouldn't think to learn. The program help is always a great place to go for more info or to learn something new. Nothing will help your troubleshooting and programming skills like curiosity - when you hit something you don't know or recognize, spend a couple minutes trying to figure out what it is, with the Matlab and SPM help. If the help within the program isn't working, it's all online at the Mathworks website.

As well, if you type help command into Matlab, where command is some Matlab function name, you'll almost always get a quick blurb telling you something about the function and how it's supposed to operate. It's invaluable for reference or learning on the fly.

In the following, text in fixed-width font indicates commands that you can type directly into Matlab.


Here are some quick basics on Matlab code and what it all means. You don't need to be expert on all of this, but it may help you understand what an error message or piece of script means - a sort of Rosetta stone, if you will.

Matlab Debugging

Matlab Debugging - Troubleshooting and Errors

This page looks specifically at how to troubleshoot Matlab problems - how to get more information, how to read and interpret error messages, and how to dig through m-files to find what's gone wrong. For the most basic of basics, check out MatlabBasics. As well, MatlabPaths answers questions about modifying your search paths, and MatlabProgramming talks about how to read m-files and how to program with them.

In the following, text in fixed-width font indicates commands that you can type directly into Matlab.

Troubleshooting SPM and Matlab

So there you are, tootling along estimating your model or coregistering your anatomy or what have you. And then, boom: computer beeps, error message pops up, everything stops, fire and brimstone come from the sky. What can you do about it? Lots, as it turns out. Armed with a sense of the Matlab basics and understanding of Matlab code, you can often divine a lot about what's going on in your error from your crash - often enough to fix the problem, but at least enough to narrow down the possibilities. The two crucial entry strategies to troubleshooting are: knowing where the problem happened, which frequently boils down to interpreting your error messages, and knowing what was happening when the problem hit, which relies on Matlab's debugging package. We'll tackle them separately.

  • Non-errors

    Before we jump into errors, a quick word about non-error problems. Warnings sometimes crop up in Matlab, and they allow the program to continue. They're not generally anything worth worrying about - they'll let the program run - but they're worth noting, and trying to understand what they mean, particularly when they precede an error. The programmer puts warnings in for a reason, to notify you of something weird happening, so listen to her...

    Also, people sometimes ask whether they can tell if an SPM program is crashed or just running for a long time. In the non-graphical version, it's difficult to tell - the Linux command "top" can sometimes tell you something - but in the graphical version, the status bar at the bottom of the window is a good indicator. If the program's working on something, it'll say "Busy" down there; if it's not, it won't. In general, Matlab rarely crashes outright without generating some kind of error message, so if something's stopped responding and you've got a "busy" message but no error message, it's probably still working. If it seems like it's been going for an inordinate amount of time, though, check in with somebody.

  • Where did the error happen? or, Everything you need to know about error messages

    SPM errors can happen in terribly uninformative fashion. Everyone who's used SPM has seen the infamous "Error during uicontrol callback," which is a generic Matlab error that means "Something went wrong in the function called by pushing whatever button you just pushed." Fortunately, Matlab has relatively well-designed error-detection and bug-tracking facilities. You can learn a great deal about your error just by interpreting your error message.

    Sometimes in SPM you'll get a full error message of several lines, but oftentimes you'll get the one-line "uicontrol callback" error. However, if you close the panel that's home to the button that generated the error, you can often get the full underlying error message. So if you hit "volume" in the results section and get a uicontrol callback error, closing the results panel will usually generate further information about the error. You may also be able to generate more information simply by going to the Matlab window and hitting return a couple of times. Always try and get a full error message when you have an error! Even if you can't figure out the problem, it's almost totally useless to another troubleshooter if you come to them and say, "I have an error!" or "I have a uicontrol callback error!" without any more information about the problem. Gather as much information about the error as you can - the Matlab window is your friend.

    Matlab error messages are of varying helpfulness, but they all follow the same structure, so let's take a look at one:

    ??? Error using ==> spm_input_ui
    Input window cleared whilst waiting for response: Bailing out!

    Error in ==> c:\documents and settings\jeff cooper\my documents\gablab\spm99\spm_input.m
    On line 77 ==> [varargout{:}] = spm_input_ui(varargin{1:ib-1});

    Error in ==> c:\documents and settings\jeff cooper\my documents\gablab\spm99\spm_get_ons.m
    On line 146 ==> Cname{i} = spm_input(str,3,'s',sprintf('trial %d',i),...

    Error in ==> c:\documents and settings\jeff cooper\my documents\gablab\devel_99\spm_fMRI_design.m
    On line 225 ==> [SF,Cname,Pv,Pname,DSstr] = ...

    Error in ==> c:\documents and settings\jeff cooper\my documents\gablab\devel_99\spm_fmri_spm_ui.m
    On line 289 ==> [xX,Sess] = spm_fMRI_design(nscan,RT);

    ??? Error while evaluating uicontrol Callback.

SPM errors will often look like this one: a "stack" of errors, although the stack may be bigger or smaller depending on the error. So what does it mean?

First, the order to read it in. The top of the stack - the first error listed - is the most immediate error. It's what caused the crash. Matlab will tell you what function the error was in - in this case, it's spm_input_ui - and some information about what the error was. That error message may be written by the function's programmer (in which case it's often quite specific) or be built-in to Matlab (in which case it's usually not). Sometimes that top error may also give you a line number on which the error was generated within the function.

The rest of the stack is there to tell you where the function that crashed was called from. In this case, the next one down the stack was in spm_input - so our error was in spm_input_ui, which was called by spm_input. The line in spm_input that calls spm_input_ui is highlighted for us - it's line 77 within spm_input.m. The next error down tells us that spm_input was itself called from within spm_get_ons, on line 146, and so on. The last error tells us that the very first function that was called was spm_fmri_spm_ui - that's the function that was called when we first hit a button.

From only this limited information, you can often tell a lot about the error. One important distinction to draw is whether the function where the error happened was an SPM function or a Matlab function. Built-in Matlab functions are generally pretty stable, and so if they crash, it's usually because you, the user, made a mistake somewhere along the line. Functions that commonly error in Matlab include:

  • load - if the files it's trying to load don't exist or have been renamed or moved, or if they're corrupted.
  • save - if the directory you're trying to save to doesn't have write permission, or if you're out of disk space, etc.
  • copyfile, delete, movefile - these sometimes crash because of bugs in Matlab.

When the function that generated the error is an SPM function, or a Gablab-written function, you can usually tell - it'll have the prefix 'spm_' or 'roi_' on it, or generally sound like something from fMRI analysis and not Matlab generally. In this case, the error is sometimes in the code and sometimes a mistake earlier in your analysis. The error messages can sometimes tell you enough to figure out what's going on: perhaps it says that there's an index out of range, and you realize it's because you're trying to omit the 200th scan from a session that only has 199. Just looking at the line that generated the error and what the actual error message says can give you a lot of info off the bat. Study the message carefully and look at the function stack it came from and see if you can figure out what the program was doing when it crashed.

It's not always enough to look at the problem after it's crashed, though. Sometimes you need to figure out what's happening when it actually crashed, and in that case, you need Matlab's debugging package.

  • Matlab Debugging

    Matlab's debugging package is pretty good. Hard-core debugging is a skill that's acquired only after a lot of programming, but even someone brand-new to reading code can use some debugging techniques to make sense of what's going on with their data, and hopefully learn some details about Matlab code at the same time...

    You'll need to be running the graphical version of Matlab to take real advantage of Matlab's debugging; without it the text editor won't open and you won't be able to see the details of the function you're debugging.

    The starting point for any bug-finding expedition is to answer the question, "What was happening in the program when it crashed?" The hope is that answering that will tell you why it crashed, and how to keep it from crashing again. When errors ordinarily occur in Matlab, they end the execution of the program and close the workspace of the error-ing function, losing all its variables and so forth. The debugging package allows you, instead, to wait for an error and then "freeze" the program when one happens, holding it still so you can examine the state of all its variables and data.

    If you want to try debugging an error, figure out when your error happens first (during model estimation or when you push "load" or whatever). Then type "dbstop if error" at the Matlab prompt. This will engage the debugger in "waiting" mode. Then run your program again and do whatever caused the error. This time, you'll get an error message, but the prompt will change to K>>, and a text window will pop open containing the function that caused the error. The K prompt means you have Keyboard control - the program is frozen, and you can poke around as much as you like in it. To end the program when you're done debugging, type "dbquit" at the K>> prompt, and the program will end.

    Once you have the program frozen, it's time to look around. Check out the line that generated the error and see if you can figure out what the error message means. If it's called by an error function, is it inside an if block? If so, what condition is the if block looking for? If the error is caused by an index out of range, can you figure out what the index variable is? What is its value? What is the array that it's indexing? What sort of information is in there? Type whos to figure out what variables are currently in the workspace and what types they are, and type the name of variables to find out their values.

    It may not be immediately clear why a variable has the value it does or what it represents. Try reading line-by-line backwards through the function and find out when the variable got its current value assigned. Where did it come from? Are there any comments around it telling you what it is? Are there any comments at the top of the function describing it? This is where the programmer's cheat sheet (soon to be posted, ask Jeff for copies) can be particularly helpful.

    Sometimes a variable may have a strange value passed into it by whatever function called the error-ing function. The Matlab debugger has frozen that function as well, and you can move 'up' into its workspace to take a look around in the same way. Type "dbup" at the K>> prompt, and you'll switch into the workspace of the function that called the current one. You can go up all the way to the base workspace, and "dbdown" will let you switch back down through the stack. Remember that each workspace is separate - their variables don't interact, at least when all the m-files are functions.

    There are other debugging commands as well, more useful for you in writing your own scripts. Check out the Matlab help for keyboard, dbcont, dbstep and things like that.

    There's not much more specific debugging advice to give, besides: go explore the function. Every error is a little different, and it's impossible to tell what caused it until you get into the guts of the program. Good luck!

Matlab Paths

Matlab Paths

This page looks specifically at how to read and modify your search path. For the most basic of basics, check out MatlabBasics. As well, MatlabProgramming talks about how to read m-files and program with them, and MatlabDebugging touches specifically on strategies for troubleshooting - how to get more information, what parts of error messages to look for, and how you can dig through a script to find what's gone wrong.

In the following, text in fixed-width font indicates commands that you can type directly into Matlab.


Every time you type a name in Matlab - of a variable, a function, a script, anything - the program goes through a certain procedure to figure out what you're referring to. It'll first look in its workspace for variables of that name; if it doesn't find anything, it will assume you're trying to refer to a function, script, or .mat file. Matlab always looks first in the present working directory for the .m file or .mat file of the given name, but if it doesn't find one there, it has a specified sequence of directories it looks in.

This directory list is called the path. It's similar to the Linux concept of the same name. Maintaining and manipulating your path can be crucial to running Matlab programs and keeping track of different versions of functions. Fortunately, Matlab makes it pretty easy to work with paths.

You can always look at your path in Matlab by typing path; Matlab searches the output list from top to bottom. The easiest way to work with your path is by typing "pathtool" at the Matlab prompt. That'll open up a graphical interface that will display your current path, let you easily add or remove directories, or re-order what's in there already.

If you're curious where on your path a particular m-file is, you can type which filename; this will tell you the location of the first copy of that file Matlab hits on the path, and hence which version of the file it will run. If you type open filename and filename is on your path, open will open the first copy it hits on the path. These two commands can be super helpful in figuring out whether you've got the right version of code, as well as figuring out where functions are located.

Matlab has a whole bunch of directories it adds into the path when it starts up by default, and the Gablab-specific commands to start Matlab - spm99-6-devel, spm2, spm2-devel, etc. - are essentially distinguished only by what directories they add to the path. If you call spm99-6-devel, the only difference from calling spm99-6 is that you'll get the spm99/devel directory on the top of your path above the standard spm99 directory. This means Matlab will find the development code first when it searches for functions, and it'll run the most current version of SPM that's stored in that directory. spm2-devel and spm99-6-devel differ only because they add different spm version directories.
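The first-hit-wins lookup that makes this shadowing work can be mimicked in a few lines. This is a toy resolver, not Matlab's actual implementation, and it's written in Python purely for illustration:

```python
import os
import tempfile

def resolve(name, path_dirs):
    # Return the first match found while walking the path in order -
    # the same rule Matlab uses to pick which m-file of a given name to run.
    for d in path_dirs:
        candidate = os.path.join(d, name)
        if os.path.isfile(candidate):
            return candidate
    return None

# Two directories both containing spm_input.m; whichever comes earlier on
# the "path" shadows the other. Prepending a directory is essentially all
# the devel startup commands do.
devel, stable = tempfile.mkdtemp(), tempfile.mkdtemp()
for d in (devel, stable):
    open(os.path.join(d, "spm_input.m"), "w").close()

found = resolve("spm_input.m", [devel, stable])
print(found == os.path.join(devel, "spm_input.m"))  # True
```

Note that the stable copy is never consulted at all once a match is found - which is why `which filename` is so useful for checking which version you're actually running.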

In general, any manipulations of the path you do are good only for your current session of Matlab; when you shut it down and restart it, you'll go back to your default path. The default path in Matlab is stored in a file called pathdef.m, and in the Gablab, you won't be able to change that, which means clicking "save" in pathtool won't do much for you. But you can change your own default path! All you need is a startup.m file.

Matlab Programming

Matlab Programming - Functions, Scripts, M-files

This page looks specifically at how to read m-files and m-file programming - functions, scripts, etc.. For the most basic of basics, check out MatlabBasics. As well, MatlabPaths answers questions about modifying your search paths, and MatlabDebugging touches specifically on strategies for troubleshooting - how to get more information, what parts of error messages to look for, and how you can dig through a script to find what's gone wrong.

In the following, text in fixed-width font indicates commands that you can type directly into Matlab.

M-files and Programming

Mental Chronometry FAQ

Frequently Asked Questions - Mental Chronometry

Also check out JitterFaq for more info on variable-ISI experiments...

1. What is mental chronometry?

As important as (or more so than) finding out where your activation happened is finding out when it happened. Where did the information flow during the processing of your stimuli? Which structures were active before other structures? Which structures fed output into other structures, and which structures processed end results? Such questions suggest a fairly crude schematic of brain processing, but still a useful one: if you could answer all of those accurately about a given task, tracking the millisecond-by-millisecond flow of information through the brain, you'd have a much fuller picture of the information processing than from a static activation picture. So mental chronometry experiments attempt to attack those questions. First done with reaction time data, using experiments that would add or remove certain stages of a task and find out whether reaction time sped up, stayed the same, or what, chronometric experiments have moved into fMRI, with moderate success. Formisano & Goebel (below) review recent developments in fMRI chronometry, with a thorough overview of potential pitfalls. They examine several studies in which the flow of information is pinned down to within a couple hundred milliseconds - not quite the temporal resolution of EEG, but much better than previously thought achievable with fMRI.

2. But how do I get better temporal resolution than my TR?

Simple: don't always sample the same points of your response. If you always sample the BOLD response 2 seconds and 4 seconds and 6 seconds after your stimuli are presented, for your whole experiment, you'll have a very impoverished picture of the shape of your HRF. But if, for example, you sampled 2 sec. and 4 sec. and 6 sec. post-stimulus for half the experiment, then cut one second between trials and sampled 1 sec. and 3 sec. and 5 sec. for the rest of your experiment - why, then, you'd have a better picture. The cost, of course, is reduced power and expanded confidence intervals at the points you've sampled.

With a good picture of the shape of your HRF, though, you could then compare HRFs from two different regions and see which one had started first, or which one had reached its peak first. If HRF timing is connected in some reliable way to neuronal activation, you then don't need to sample the whole experiment at a super-fast rate - you could infer from only a limited-sample picture of one part of the HRF where neuronal activity had started first and where it had started second, which allows you to rule out certain flows of information.
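As a sketch of that inference, here's a hypothetical pair of responses sampled once per second, where a three-point parabolic fit around the peak recovers a half-second latency difference - finer than the sampling interval. The gamma-shaped curve is a stand-in for illustration, not a real HRF model:

```python
import numpy as np

def hrf(t, onset=0.0):
    # A gamma-shaped curve standing in for an HRF (a hypothetical shape,
    # not SPM's canonical HRF); it peaks about 5 s after its onset.
    s = np.clip(t - onset, 0.0, None)
    return s ** 5 * np.exp(-s)

def peak_time(t, y):
    # Three-point parabolic interpolation around the largest sample,
    # a standard trick for estimating a peak at sub-sample resolution.
    i = int(np.argmax(y))
    y0, y1, y2 = y[i - 1], y[i], y[i + 1]
    offset = 0.5 * (y0 - y2) / (y0 - 2 * y1 + y2)
    return t[i] + offset * (t[1] - t[0])

t = np.arange(0.0, 20.0, 1.0)      # one sample per second
region_a = hrf(t)                  # response beginning at 0 s
region_b = hrf(t, onset=0.5)       # response beginning 0.5 s later

latency_diff = peak_time(t, region_b) - peak_time(t, region_a)
print(latency_diff)  # recovers roughly the true 0.5 s offset
```

With real, noisy data you'd fit a smooth basis to each HRF rather than interpolating raw samples, but the principle - sub-sample latency from coarse samples of a smooth curve - is the same.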

3. How does variable ISI relate to mental chronometry?

In order to do a mental chronometry experiment, you need to absolutely maximize your statistical efficiency - your ability to pin down the shape of your HRFs. You're going to be comparing HRFs from several regions, and you need to look for the differences between them, which means you can't assume much (if anything) about what shape they're going to be, or else you're going to bias your results. Assuming nothing about your HRF and still getting a good idea of its shape, as Liu describes, means you need an experiment with very high efficiency - and, as Dale demonstrates, those are precisely the designs with variable ISIs (both on MentalChronometryPapers). Only by randomizing (or pseudo-randomizing) your ISIs can you pack enough trials into an experiment for sufficient power and still have enough statistical flexibility to get a good look at the shape of your HRFs.

4. What are some pitfalls in mental chronometry with fMRI, then?

The big one is a crucial assumption mentioned above: that HRF timing is connected in some reliable way to neuronal activation. We assume you're not interested so much in the HRF for its own sake, but rather as an indicator of neuronal activity. So let's say you get two HRFs, one from visual cortex and one from motor cortex, and the motor HRF starts half a second later than the visual. Is that because neuronal activity started half a second later in motor cortex? Or is it because the coupling between neuronal activity and BOLD response is just slower in motor cortex in general? Or is it because the coupling in that particular subject just happens to be looser for motor than for visual - and maybe it'll be different for your other subjects! Clearly, no matter how good a look you get at your HRF, questions like these will dog your chronometric experiment unless you're careful about validating your assumptions. Several excellent studies have examined the issue of variability of HRF between regions, subjects, and times, and those studies are crucial to check out before drawing conclusions from this sort of data (check out HrfPapers for some).

Of course, other more mundane issues may well torpedo chronometric conclusions: if you don't have high enough efficiency in your experiment, you won't be able to distinguish one HRF's shape from another with high accuracy, and you'll have a hard time telling which one started first or second anyways.

5. If I want to do a mental chronometry experiment, how should I design it?

Chronometric experiments depend crucially on determining HRF shape. So start with maximizing that - you need a design that will absolutely maximize your efficiency, no matter what the power cost. M-sequence or permuted block designs are good ways to start (see Liu, MentalChronometryPapers). It should be obvious that with designs like that, you need experimental tasks that will generate fairly reliable activations; your experiment will suffer in terms of power from its focus on finding HRF shape and not using shape assumptions. Choosing a task with a reasonably long latency is also important - even with the best possible design, fMRI noise is such that resolution below a couple hundred milliseconds is simply not possible for now. So if your task only lasts half a second, you may not be able to get much information about the chronometric aspects with fMRI. As well, in an experiment like this, having more samples is always better - so you want to have the shortest possible TR. If you can focus your experiment to a smaller segment of the brain than the whole thing, you can get a good number of slices and still have very fast TRs.
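For the m-sequence option, the sequence itself is easy to generate with a linear feedback shift register. Here's a minimal sketch - the 5-bit register and tap choice are just one standard example, giving a 31-element binary sequence you'd then map onto your trial types:

```python
def m_sequence(n=5, taps=(5, 3), seed=1):
    # Fibonacci LFSR. With taps (5, 3) the output obeys the recurrence
    # s[t] = s[t-3] XOR s[t-5], i.e. the primitive polynomial x^5 + x^2 + 1,
    # so the sequence has the maximal period 2**n - 1 = 31.
    state, seq = seed, []
    for _ in range(2 ** n - 1):
        seq.append(state & 1)  # output the low bit each step
        fb = ((state >> (taps[0] - 1)) ^ (state >> (taps[1] - 1))) & 1
        state = ((state << 1) | fb) & ((1 << n) - 1)
    return seq

seq = m_sequence()
print(len(seq), sum(seq))  # 31 elements, balanced: 16 ones to 15 zeros
```

The near-perfect balance and flat autocorrelation of m-sequences are what give them their extreme efficiency; longer registers (more bits, different primitive taps) give longer sequences for longer experiments.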

One thing to try and avoid in doing chronometry is to toss it in as a fishing-expedition analysis: if your experiment isn't designed with doing chronometric analysis in mind, you'll almost certainly have trouble finding reliable latency differences in your subjects. Unless you've got an eye on this from the start, it's probably not worth doing. But if you do, you can get some pretty sweet looks at the temporal flow of activation and information around your subjects' heads.


Mental Chronometry Papers

Useful Papers - Mental Chronometry


Formisano & Goebel (2003), "Tracking cognitive processes with functional MRI mental chronometry," Current Opinion in Neurobiology 13, 174-181 PDF

Summary: A thorough review of current advances in mental chronometry with fMRI. Formisano & Goebel clearly lay out the objectives of chronometric experiments and the major hurdles with fMRI (heterogeneity of BOLD response, accuracy of estimates, etc.) and discuss some potential optimization steps; they also present the results of a couple recent studies.

Bottom line: Yup, it's possible. With a properly designed experiment, you can follow the flow of information through the brain with fMRI in a single task.

Menon et al. (1998), "Mental chronometry using latency-resolved functional MRI," Proceedings of the National Academy of Sciences USA 95, 10902-10907 PDF

Summary: The first article directly addressing mental chronometry with fMRI. Menon et al. investigate some of the necessary assumptions - such as the accuracy of relative timing between areas - and present some early results tracking information from visual to motor areas.

Bottom line: A good look at the underlying assumptions of fMRI chronometry and how you can begin to verify them.


Liu (2004), "Efficiency, power and entropy in event-related fMRI with multiple trial types, Part II: design of experiments," NeuroImage 21, 401-413 PDF

Summary: The more intelligible part in a two-part manifesto. Liu expands on his 2001 paper (see DesignPapers) to demonstrate that the tradeoff between efficiency and power is fundamental for multiple-trial-type experiments. Crucially, he also explores which specific experimental designs bring you closest to the theoretical best power for a given efficiency (or vice versa). Permuted block designs look like great happy mediums.

Bottom line: 2-block designs and m-sequence designs are at the opposite ends of the power/efficiency spectrum, each getting you close to the theoretical limit of how much power/efficiency you can get. Permuted block designs give you excellent tradeoffs between them.

Dale (1999), "Optimal experimental design for event-related fMRI," Human Brain Mapping 8, 109-114 PDF

Summary: Perhaps the best early defense of variable-ISI studies; Dale attempts to end the debate between long event-related designs and rapid event-related designs by showing that variable-ISI designs tremendously improve on the power of rapid designs. A mathematical argument about statistical efficiency is used to show that random-ISI (i.e., jittered) designs can outperform fixed-ISI designs with the same mean ISI by more than a factor of 10.

Bottom line: Once and for all: if possible, you should definitely be using a variable-ISI rapid design in any event-related study. The math is all here...

Friston et al. (1999), "Stochastic designs in event-related fMRI," NeuroImage 10, 607-619 PDF

Summary: Another look at which designs make the most sense - block, fixed-ISI, or variable-ISI (connected to the "stochastic" design idea). Authors make the point that the best type of design will depend very heavily on what type of effect you're looking for - differential effects between conditions do best with a different design than do evoked responses changing from baseline.

Bottom line: An early distinction between efficiency and power is drawn, as a difference between looking for differential or evoked effects. The idea of constructing variable-ISI designs from "null events" is brought forth.


Normalization HOWTO

How-Tos - Normalization

How do I...

Normalize in SPM?

The suggested protocol below applies only if you've already co-registered your inplane anatomy to the functionals. See CoregistrationHowTos for directions. Don't use this protocol if you haven't done that!

First step: Normalize the anatomy.

The ordering of directions below applies to SPM2. In SPM99, the directions are the same, with two small differences: you explicitly enter the number of subjects, and you enter the template image last. Otherwise identical.

Hit "Normalize" in the main SPM interface.

Which option? Select the default, Determine Parameters and Write Normalized.

Template image? Select the appropriate anatomical template - T1.mnc if the inplane is T1-weighted (generally the case) or T2.mnc if it's T2-weighted. In SPM99, the template images are .img files instead of .mnc files.

Source image, subj 1? Select your anatomy/inplane/V001.img file and click DONE. (This is called "image to determine parameters from" in SPM99).

Images to write, subj 1? Select your anatomy/inplane/V001.img file and click DONE.

Source image, subj 2? You can click "Done" here if you only want to normalize one person at a time. Or, you can repeat the above steps for as many subjects as you like. When you're done, just hit done at a source image prompt.

The normalization will now proceed for the anatomical image. This will find the transformation that maps your inplane anatomy (and anything coregistered with it, like your functionals) into the MNI template brain space. The prefix for normalization in SPM2 is "w," for "warped." The output will thus be a normalized wV001.img anatomy file, as well as a new V001_sn.mat file in the same directory, containing the transformation parameters. (In SPM99, the prefix for normalized is "n", so it'll be an nV001.img file, and the parameter file will be called V001_sn3d.mat.) These will be later applied to all functionals.

Second step: Apply the computed transformation parameters to your functionals.

This is the same for SPM99 and SPM2.

Click the "Normalize" button.

Select Write Normalized Only.

Parameters, Subj 1? Select your anatomy/inplane/V001_sn.mat file and click DONE. (For SPM99, this is the _sn3d.mat file.)

Images to write normalized, subj 1? : SCAN1/aV*.img, SCAN2/aV*.img, etc., DONE.

Parameters, Subj 2? You can click "Done" here, or choose other subjects to process by repeating the above steps. In SPM99, this question won't come up - you can only write one set of images at once.

(In SPM99 only - SPM2 uses trilinear by default) Which interpolation method? Select Sinc Interpolation(9x9x9).

This step typically takes a longish time, from about an hour to several hours, depending on the number of functional files and the computer you are using. It will also create many new files - wV*.img (or nV*.img with SPM99) files, which will take a considerable amount of space, so it is a good idea before starting this process to make sure that there is enough space on the partition you are using.

Set/choose my defaults for normalization?

SPM2 and SPM99 actually have detailed help information about what all the defaults mean, so you should check that and read it. Type help spm_normalise_ui at the Matlab prompt if you're running SPM2, or help spm_sn3d if you're running SPM99. A couple of notes on defaults to pay particular attention to:
  • The "weight source images when registering" (or "mask object brain when registering" in SPM99) default allows you to specify a mask to tell the program to avoid "looking at" certain regions when normalizing - this can be helpful if the subject has a focal lesion or tumor.
  • In SPM99, you can specify whether you want SPM to do any nonlinear normalization at all, or if you just want an affine normalization. This eliminates all of the nonlinear warping, which can be helpful if you find your images being strangely distorted in particular areas. In both SPM99 and SPM2, you can also reduce the amount of nonlinear warping by reducing the number of nonlinear iterations that are done. In SPM2, you can't turn off nonlinear normalization altogether - only reduce the number of iterations.
  • The amount of nonlinear regularization (from "light" to "heavy", with the default at "medium") allows you to set the smoothness of the deformation field - the "heavier" it gets, the smoother the nonlinear deformations will become, and the less focused any warping will be. If your images are coming out heavily warped in particular regions, you may want to increase the regularization.
  • Voxel Size: in the "writing normalized" defaults, you can change the voxel size that the normalized images are re-sliced to. By default this is 2x2x2 mm, but if you don't care about having voxels that small, you can change it to another isotropic value or specify your own anisotropic value (including your original voxel size, if you like).
  • SPM2 only: SPM2, by default, uses trilinear interpolation for normalization, which is fast. If you want to slightly decrease the potential for interpolation errors, you can re-set the interpolation method to a B-spline method - 2nd or 3rd degree should be plenty accurate. This will significantly slow down the normalization writing process, though.
  • SPM99 only: The "Affine Starting Estimates" in the "Parameter Estimation" section determine whether or not your images are "flipped" during normalization. If your images are coming out of the scanner in radiological convention and you want them flipped over, then leave this at the default. But if you want to leave them in radiological, or if they come out of the scanner in neurological, then change this default to "neurological convention" - that won't flip the images.

Check my normalization?

You can use the "Check Reg" button in SPM's main interface to display two or more images at once with the same axes. Hit that, and then select the template image (usually T1.mnc or T1.img) and the normalized anatomical (and a normalized functional or two, if you'd like). Click around the outlines of the brains and see how well they line up. Deciding what constitutes a "good" normalization is a little arbitrary, but if a normalization has gone wrong, it'll usually go really wrong, and you'll be able to tell on visual inspection pretty easily.

Fix a bad normalization?

Couple ways. It depends a little bit on what's gone wrong. You can try tweaking the defaults (see above) to change the amount and smoothness of nonlinear transformations being applied. If the problem is that local areas are being too heavily warped - or, alternatively, that the gross shape of the brain is okay, but major local landmarks like the central sulcus are totally the wrong shape - those strategies may help. Increasing the regularization and decreasing the number of nonlinear iterations will help in the former case; the opposite changes will help in the latter. If, instead, the problem is that certain types of tissue or areas - like the subcortical gray matter structures - just look "wrong," then it may help to normalize only the gray matter in the brain; see below for directions on that. You might also try some of the methods used in fixing a bad coregistration (see CoregistrationHowTos) - getting better starting estimates for your normalization by manually zooming, rotating, or translating the source brain to better fit the template.

Whatever direction you go in, make sure you delete (or move to another directory) any existing sn.mat or sn3d.mat files that are around. You don't have to delete any wV*.img or nV*.img files - those will be overwritten.

Normalize only the gray matter in my brain?

Normalizing from the gray matter may be more accurate, especially for subcortical gray matter structures such as the basal ganglia. So if you want, here's how...

First segment the brain into gray matter, white matter and CSF. See SegmentationHowTos for directions. Your output will be: inplane/V001_seg1.img - the estimated gray matter; inplane/V001_seg2.img - the estimated white matter; inplane/V001_seg3.img - the estimated CSF.

Check to see how the gray matter was segmented with "Display." If you can see skull around the gray matter, create a skull-stripped brain with the Extract Brain module (see SegmentationHowTos for directions). Once you have the skull-stripped image, say, brain_V001.img, you can use ImCalc to intersect it with the gray matter image and get a skull-less gray matter image:

Hit "ImCalc" in the main SPM interface.

Select images to work on : Select inplane/V001_seg1.img, inplane/brain_V001.img, DONE.

Output filename : Type V001_seg1_noskull.

Evaluated function : Type i1.*i2. This will multiply, voxel-by-voxel, the value of the gray matter image with the value of the skull-stripped brain, effectively producing an intersection of the two (since any zero in one will zero out the voxel in the final image).

This will produce a new file, inplane/V001_seg1_noskull.img, in anatomy/inplane. Display the image to check how it looks. You can also compare it to inplane/V001_seg1.img using the "Check Reg" button.
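
The ImCalc function above is just a voxel-by-voxel multiplication; a minimal numpy sketch (with small hypothetical arrays standing in for the two images) shows the intersection logic:

```python
import numpy as np

# Hypothetical stand-ins for the two input images: gray matter
# estimates (V001_seg1) and a skull-stripped brain (brain_V001),
# which is zero everywhere outside the brain.
gray = np.array([[0.9, 0.8, 0.0],
                 [0.7, 0.2, 0.5]])
brain = np.array([[1.0, 0.0, 1.0],
                  [1.0, 1.0, 0.0]])

# ImCalc's "i1.*i2": a zero in either image zeroes that voxel in the
# output, leaving gray matter restricted to the skull-stripped brain.
noskull_gray = gray * brain
print(noskull_gray)
```

Because any voxel that is zero in either input becomes zero in the output, the product acts as an intersection of the two images.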

Now you can normalize the inplane's gray matter to the MNI template's gray matter image. For this step, use either inplane/V001_seg1.img, or inplane/V001_seg1_noskull.img if you created the latter and it looks better.

First step: Normalize the gray matter image.

Hit "Normalize" in the main SPM interface. The order below is for SPM2; for SPM99, the template image will be selected last and you'll explicitly specify the number of subjects.

Which Option?...: Determine Parameters and Write Normalized

(SPM99 only) Number of subjects : 1

Template images : Go into the /usr/fmri_progs/matlab/spm2/vanilla/apriori directory and select the gray.mnc, DONE. (For SPM99, replace /spm2/ with /spm99/ and choose gray.img.)

Source image, subj 1 : Select anatomy/inplane/V001_seg1_noskull.img, then click DONE. (Or V001_seg1.img if you didn't skull-strip.) (In SPM99, this is called the "image to determine parameters from.")

Images to write, subj 1: Same as the source image.

Source image, subj 2: DONE.

The image will now be normalized. This will produce a new inplane/wV001_seg1_noskull.img, as well as a file with the normalization parameters, V001_seg1_noskull_sn.mat, which will be applied later to all functional images. In SPM99, the output is nV001_seg... and V001..._sn3d.mat.

Second step: Normalize all the functionals with your gray matter parameters.

Hit "Normalize."

Which Option? : Write Normalized Only

Parameter set, subj 1 : Choose anatomy/inplane/V001_seg1_noskull_sn.mat, then click DONE. (For SPM99, this is the _sn3d.mat file.)

Images to write normalized, subj 1 : Select SCAN1/aV*.img, SCAN2/aV*.img, etc., then DONE.

Parameter set, subj 2: DONE.

(SPM99 only) Interpolation Method? : Sinc Interpolation(9x9x9)

The functional images will now be normalized, a longish process which may take an hour or a few. The process will also create many new files - wV*.img (or nV*.img with SPM99) files, which will take a considerable amount of space, so it is a good idea before starting this process to make sure that there is enough space on the partition you are using.

Normalize an elderly brain?

The standard normalization templates are created from healthy young adults, and attempting to normalize an elderly subject's brain to this template may result in a pretty poor normalization. A better way to go is to normalize to a template that's devised from elderly brains, and fortunately the good folks at UCLA have put together such a template. In order to normalize to this template, simply download it and select it instead of the T1.mnc file you'd select in the normal normalization. You may want to coregister and reslice the image first to the standard T1 template. Thanks to Arul Thangavel for tracking this down...

Normalize a brain with a lesion or tumor?

Normalizing a brain with a focal region that's abnormal isn't too tricky - you'd just like to ignore that focal region. Brett et al. describe the theory behind this in NormalizationPapers, but it's relatively easy to do by tweaking the defaults. The normalization defaults allow you to specify a masking image that masks out the focal region, and then that region won't be taken into account during normalization.

First create the masking image. This should be a standard mask image, with only 0 and 1 as values, and it should include the whole brain as 1, with the lesioned/tumor area specified as 0. You can start with a whole brain mask (say, by segmenting the brain and then adding the three output images together - see SegmentationHowTos for directions). You'll then probably need to hand-draw the lesion area with MRIcro or some other image editing utility.
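
A minimal numpy sketch of that mask (the volume size and lesion coordinates here are hypothetical; in practice the lesion region comes from your hand drawing):

```python
import numpy as np

# Start from a whole-brain mask of ones - in practice, the binarized
# sum of the three segmentation output images.
mask = np.ones((4, 4, 4), dtype=np.uint8)

# Zero out the hand-drawn lesion region (hypothetical coordinates).
mask[1:3, 1:3, 0:2] = 0

# The result is a valid weighting image: 1 = use this voxel when
# estimating the normalization, 0 = ignore it.
print(int(mask.sum()))  # number of voxels that still count
```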

Then go to "Defaults" in the main SPM interface. Choose "Spatial Normalization," and then choose "Parameter Estimation." You'll go through a series of defaults; you can leave most of them. The one you're looking for is called "Weight source images when registering?" (in SPM2) or "Mask object brain when registering?" (in SPM99). In either case, change it to allow weighting/masking. Then run the normalization. You'll be asked to supply a weighting/mask image during your choices, and you should give it the mask image. When the normalization is done, inspect it carefully to make sure it looks all right.


Normalization Papers

Useful Papers - Normalization


Salmond et al. (2002), "The precision of anatomical normalization in the medial temporal lobe using spatial basis functions," NeuroImage 17(1), 507-12

Summary: The authors evaluate how closely they can get hand-drawn anatomical markers in MTL to line up as a function of their normalization parameters. They vary their number of nonlinear basis functions, their degree of regularization, and play around with templates a little bit. Results indicate that increasing constraints on the normalization actually increased their degree of anatomical precision.

Bottom line: A smaller number of nonlinear basis functions (4 x 5 x 4) is better for normalizing than a larger set (7 x 8 x 7). Increasing regularization (from medium to high) made the smaller basis set worse, but didn't improve the larger set's performance. Template choice (between a pediatric template and an adult) didn't make a lot of difference, even for pediatric data - but this is only in MTL...

Wilke et al. (2002), "Assessment of spatial normalization of whole-brain magnetic resonance images in children," Human Brain Mapping 17(1), 48-60

Summary: Using structural scans from 150 or so children of varying ages, Wilke et al. constructed a pediatric template and then normalized their pediatric scans to both the new pediatric template and an adult template. They then examined the extent of deformation in each scan to find correlations with age and/or region as to how much deformation was needed to bring each scan into line with each template.

Bottom line: Lots of conclusions, but some main ones: brain size doesn't change much with age, but head size does, a lot; using the pediatric template required quite a bit less deformation overall, and there were several regions particularly affected by normalization to the adult template (prefrontal, temporal pole, etc.); even with the pediatric template, several regions are more variable than others in children (precuneus, prefrontal).


Crivello et al. (2002), "Comparison of spatial normalization procedures and their impact on functional maps," Human Brain Mapping 16(4), 228-50

Summary: The authors compared the performance of several normalization algorithms (affine, AIR's nonlinear, SPM, and a full-mesh method) in bringing segmented structural scans precisely in line with segmented templates, and on PET functional data.

Bottom line: The full-mesh method did best at bringing gray and white matter precisely into line with the templates, but the AIR nonlinear method and SPM both have the advantage of preserving local relationships between sulci and gyri more precisely. Functionally, although results differed between all four algorithms, none appeared to be significantly better than any others at relatively high PET resolution.

Ashburner & Friston (1999), "Nonlinear spatial normalization using basis functions," Human Brain Mapping 7(4), 254-66

Summary: Perhaps the classic paper about nonlinear normalization, and the defining paper for reference on how SPM does normalization. Also an excellent survey of the major issues about doing normalization, including their case for using voxel-based (as opposed to tensor-based) methods in SPM. Pretty technical, but highly recommended.

Bottom line: Any image can be transformed into any other image given enough time and an unconstrained nonlinear transformation. But a precise transformation doesn't always get you what you want, which is overlap of functional areas. So the question is what constraints/priors will give you the best functional overlap and best performance.

Kochunov et al. (2000), "Evaluation of octree regional spatial normalization method for regional anatomical matching," Human Brain Mapping 11(3), 193-206

Summary: Can't say I got to this one, but it lays out a different method than the nonlinear-basis-function method that's standard these days.

Brett et al. (2001), "Spatial normalization of brain images with focal lesions using cost function masking," NeuroImage 14(2), 507-12

Summary: Can't say I got to this one either, but it's easily summarized: trying to line up images with big holes in them with templates is problematic when you allow nonlinear transformations, because the algorithm will tend to distort the lesion area and whole brain. An easy way to avoid that problem is by simply masking out the lesioned voxels when calculating your cost function, and Brett et al. demonstrate that it works.


Normalization FAQ

Frequently Asked Questions - Normalization

1. What is normalization?

Inconveniently, brains come in all different sizes and shapes. But the standard statistical algorithms all assume that a given voxel - say (10, 10, 10) - samples exactly the same anatomical spot in every subject. So you'd really like to be able to squash every subject's brain into exactly the same shape and exactly the same space in your image. That's normalization - the process by which pictures of subjects' brains are squashed, stretched and squeezed to look like they're about the same shape.

2. How does normalization work?

A lot like realignment. Normalization algorithms, just like realignment algorithms, search for transformations of the source image that will minimize the difference between it and the target image - transformations that, as much as possible, will make the source image 'look like' the target template. The difference is that realignment algorithms restrict themselves to rigid-body transformations - moving and turning the brain, but not changing its shape. Normalization algorithms allow nonlinear transformations as well - these actually change the shape of the brain, squeezing and stretching certain parts and not other parts, to make the source brain 'fit' the target brain. Different types of nonlinear transformations can be applied - some use sine/cosine basis functions, some use viscous fluid models or meshes - but all normalization can be thought of this way.

An important point about normalization is that any algorithm, if allowed to make changes on a fine enough scale, can precisely transform one brain into another, exactly. Sometimes, though, that's not what you want - if you're interested in looking at differences of gray matter in children vs. in adults, you'd like to normalize the general anatomy, but not at such a fine scale you remove exactly the difference you're looking for! Other times, though, you'd love to match up every point in a subject's brain exactly with the identical point in another subject's brain. Care should still be taken, though - normalization algorithms can align structural anatomy precisely, but can't guarantee the subjects' functional anatomies will align perfectly.
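
The "search for transformations that minimize the difference" idea from the answer above can be sketched with a toy one-dimensional example. Real algorithms search continuous rigid-body and nonlinear parameters with gradient-based methods rather than brute-forcing integer shifts, so treat this only as an illustration of registration-as-optimization:

```python
import numpy as np

# Toy 1-D "images": a template, and a source that is the same
# intensity profile shifted two voxels to the right.
template = np.array([0, 0, 1, 3, 1, 0, 0, 0], dtype=float)
source   = np.array([0, 0, 0, 0, 1, 3, 1, 0], dtype=float)

def ssd(a, b):
    """Sum-of-squared-differences cost the algorithm tries to minimize."""
    return float(np.sum((a - b) ** 2))

# Try every candidate shift and keep the one with the lowest cost -
# a stand-in for searching over transformation parameters.
best = min(range(-3, 4), key=lambda s: ssd(np.roll(source, s), template))
print(best)  # the shift that best maps the source onto the template
```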

3. Why would I want to normalize? What are the drawbacks and/or advantages?

The advantage is simple: Brains aren't all the same size and shape. The simplest and most widespread method of statistical analysis of brain data is to look at each voxel across all your subjects and compare some measure in that voxel. For that method to be reasonable, equivalent voxels in each subject's images should match up with equivalent locations in each subject's brain. Since brain structures can be quite variable in size and shape, we need some way to 'line up' those structures. If you want to do any kind of voxel-based statistical analysis - not just of activation, but also of anatomy, as in voxel-based morphometry (VBM) - across a group, normalizing can remove a huge source of error from your data by removing variance between brain shapes.

The disadvantage is just as simple: Like any preprocessing, normalization isn't perfect. It generally requires interpolation, which introduces small errors into the images, and even with normalization, anatomies may not line up as well as you'd hope. It can also be slow - depending on the methods and programs used, normalizing a run of functional images can take hours or days. Still, to use voxel-based statistics, it's a necessary evil...

4. When is it unhelpful to normalize?

If you're running an analysis that's not voxel-based - say, one that's based on region-of-interest-specific timecourses - then normalization makes a lot less sense. An alternative to voxel-based methods is to compare some measure of activation in particular structures hand-drawn (or automatically drawn) on individual subjects' images. Since a method like this gets summary statistics out from each subject individually, without requiring that any statistical images be laid on top of each other, normalization is totally unnecessary. Some researchers choose to preprocess their data on two parallel paths, one with normalization and one without, using the non-normalized data for region-of-interest analysis and the normalized for traditional voxel-based methods.

As well, several factors can make normalization difficult. Lesions, atrophy, different developmental stages, neurological disorders, and other problems can make standard normalization impossible. Some of these problems can be easily addressed (see Brett et al. in NormalizationPapers), and some can't be. Anyone using patient populations with significant neurological differences from their normalization templates should be advised to explore the literature on normalizing patients before proceeding.

5. How important is it to align images to AC-PC before normalizing?

This varies between programs. For AFNI and BrainVoyager, it's pretty important. The nonlinear transformations can account for non-aligned images in theory, but if you start the images off in a non-aligned state, the algorithm is more likely to get caught in a local minimum of the search space, and give you strange normalization parameters. If you aren't realigning before normalizing, it's best to make sure to examine the normalized brains afterwards to make sure that your normalization ran okay. SPM's normalization algorithm has a realignment phase built in that runs automatically before the nonlinear transformations are examined, so doing realignment beforehand isn't necessary. It can't hurt, particularly when realigning runs of functional data, and it's still wise to examine the normalized image afterwards as a sanity check...

6. How important is it to make sure your segmentation is good before normalizing to the gray template?

Very important. The gray template contains, in theory, only gray-matter voxels. Normalization algorithms find their transformations by trying to minimize the voxel-by-voxel intensity differences between images, and white matter, CSF and gray matter all have notably different intensity profiles. So if you have left-over fringes of white matter or CSF or occasional speckles of white matter included by error in your gray-matter image, they'll be treated as error voxels even if they're in the right place. The algorithm may still converge to the best gray-matter solution, but you can greatly increase your chances of getting a good gray-matter normalization by making sure your segmentation is clean and only includes gray-matter voxels.

7. Should you use the inplane anatomy or the high-res anatomy to determine parameters?

There's not a perfect answer, but probably if you have a high-res anatomy, you should use it. In theory, the high-res anatomy should provide you a better match, because it has more detail. However, if you have significant head movement between the high-res scan and the functionals, there will be an additional source of error in the high-res (even after realignment) that may not be there in the inplane if there's less movement between the inplane and the functionals. In general, though, the increased resolution of the high-res will probably provide better precision for your normalization parameters. In practice, the difference will probably be small, but every little bit helps...

8. When in my analysis stream should I normalize?

There are two obvious points when you can normalize - a) in the individual subject analysis, before you estimate your model / do your stats, or b) after you've done your stats and calculated contrast images for each subject, but before you do your group analysis. In case a), you'll normalize all your functional images (usually after estimating parameters from an anatomical image); in case b), you'll normalize only your contrast images (always after estimating parameters from an anatomical image). In general, the standard is a), but I'm not sure exactly why. One problem with b) might be that interpolation errors are being introduced directly into your summary statistics, rather than in the functional images they're derived from. To the extent that contrast images are less smooth than functional images, this will tend to disadvantage b). As well, those interpolation errors are then going to be averaged over far fewer observations in b) - when you're combining only one contrast image for each person - than in a) - when you're often combining several hundred functional images for each person. Not sure whether this will make much difference, though... This is a test that should be run.

9. How can you tell how good your normalization is?

There are possible automated ways you can use to determine quantifiably how close your normalization has gotten - see Salmond et al. (NormalizationPapers) for one - but, in general, the easiest way to go is just to compare the template image and your normalized image by looking at them side-by-side (or overlaid). Check to make sure that the gross structures and lobes line up reasonably well, and if you have any particular area of interest - hippocampus, V1, etc. - check those in detail to make sure they line up okay. If they don't, you may want to align your source image differently before normalizing, or try normalizing just your gray matter.

10. What’s the difference between linear and nonlinear transformations?

Roughly, linear (affine) transformations are those that apply uniformly across the whole image, without locally distorting the shape of anything inside it relative to the rest: rotations, translations, and scalings all fall in this category (rotations and translations alone are the strictly rigid-body transformations). Nonlinear transformations are any transformations that don't respect those constraints; these include any transformations that squeeze parts of the image, stretch other parts, or generally distort the shape of the head locally.
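
A small numpy sketch of the distinction (all numbers hypothetical): a linear transformation is one matrix plus translation applied identically to every coordinate, while a nonlinear transformation adds a displacement that varies from place to place:

```python
import numpy as np

# Hypothetical coordinates of three points in the source brain.
pts = np.array([[0.0, 0.0],
                [1.0, 0.0],
                [0.0, 2.0]])

# Linear piece: the same matrix + translation everywhere.
A = np.array([[1.1, 0.0],
              [0.0, 0.9]])          # global scaling
t = np.array([5.0, -2.0])           # translation
linear = pts @ A.T + t

# Nonlinear piece: a smooth, spatially varying displacement field,
# so different parts of the image are stretched by different amounts.
nonlinear = linear + 0.1 * np.sin(linear)
print(linear)
print(nonlinear)
```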

11. How do I normalize children’s brains? How about aging brains?

There is a monster literature out there on normalizing various non-standard brains, too large to survey easily; the Wilke et al. paper is a good start for children and contains some good citations into that literature, and the Brett et al. paper contains some similarly nice citations for aging brains (both on NormalizationPapers). Anyone who knows this literature well is invited to contribute links and/or citations...


Nutshell SPM

SPM is a software package designed to analyze brain imaging data from PET or fMRI and output a variety of statistical and numerical measures that tell you, the researcher, what parts of your subjects' brains were significantly "activated" by different conditions of your experiment. There are a couple of phases of analyzing data with SPM: spatial preprocessing, model estimation, and results exploration. This page aims to give you a nutshell explanation of what's actually happening in each of those phases (particularly model estimation) and what some of the files floating around your results directories are for. Links to pages with more detail about each aspect of the analysis are down at the bottom of this page.

Spatial preprocessing is conceptually the most straightforward part of SPM analysis. During this phase, you can align your images with each other, warp them (normalize) so that each subject's anatomy is roughly the same shape, correct them for differences in slice time acquisition, and smooth them spatially.

These steps are used for a couple of reasons. Realignment aims to line images from a single subject up with each other (since subjects' heads move slightly during the experiment), and normalization aims to stretch and squeeze the shape of the images so that their anatomy roughly matches a standard template; both of these aim to make localizing your activations easier and more meaningful, by making individual voxels' locations in a given image file match up in a standard way to a particular anatomical location. Slice timing correction and smoothing both enable SPM to make certain assumptions about the data images - that each whole image occurred at a particular point in time (as opposed to slices being taken over the course of an image acquisition, or TR), and that noise in an image is distributed in a relatively random and independent fashion (as opposed to being localized).

Model estimation is the heart of the SPM program, and it's also the most conceptually complex. What you as a researcher want to know from your data is essentially: what (if any) parts of my subject's brain were brighter during one part of my experiment relative to another part? (For details on why "brighter" is a measure of activation, check out our FmriFaq page). Another way of putting this question might be this: You as a researcher have a hypothesis or model of what happened in an experiment; you have a list of different conditions and when each of them took place, and your model of the person's brain is that there was some kind of reaction in the brain for every stimulus that happened. How good a fit, then, does your hypothetical model provide to the actual MRI data you saw from the person's brain? Specifically, are there particular locations in the brain where your model was a very good fit, and others where it wasn't a good fit? The main work SPM does is to try and find those locations, because locations where your hypothesis proves to be a good fit can be described as "responding" somehow to the conditions in your experiment.

When SPM estimates a model, what it's doing is essentially a huge multiple regression at each voxel of your subject's brain, to see how well the data across the experiment fits your hypothesis, which you describe to SPM as a design matrix. When you tell SPM what your conditions are, what your onset vectors are, etc., it sets up this matrix as a guess at what contribution each condition might make to every image in your experiment. As part of that guess, it automatically convolves the effects of the hemodynamic response function with your stimulus vectors, as well as doing some temporal filtering to make sure it ignores changes in the data that aren't relevant to your conditions (see FilteringFaq for more on this).

Once the design matrix has been set up, SPM walks through each voxel in the brain, and does a multiple regression on the data at that point that estimates how much of a contribution every condition in your experiment made to the data and how much error was left over after all the conditions you specified are taken into account. The fit of this regression line is important - how much error is left at the voxel tells you how good your model was - but for most purposes, the actual slope of the regression line is more important. A large positive or negative partial regression slope for a given condition tells us that that condition had a large influence in determining the data at that voxel. This slope, called the beta weight or parameter weight for that condition, is saved by SPM in the beta_* images - one for each column of the design matrix, where each voxel gives the beta weight for that condition at that point.
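A toy version of that voxelwise regression, with a made-up design matrix and simulated data for a single voxel (the regressors, beta values, and noise level are all invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy design matrix X: one (already HRF-convolved) regressor per condition,
# plus a constant column.
n_scans = 100
X = np.column_stack([
    rng.random(n_scans),           # condition A regressor
    rng.random(n_scans),           # condition B regressor
    np.ones(n_scans),              # constant (baseline) term
])

# Simulated data at one voxel: A contributes strongly, B weakly.
true_betas = np.array([2.0, 0.3, 100.0])
y = X @ true_betas + 0.1 * rng.standard_normal(n_scans)

# The voxelwise multiple regression: a least-squares fit of the design
# matrix to the data, recovering one beta weight per column.
betas, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
```

In SPM, the fitted `betas` at every voxel are what end up in the beta_* images - one image per column of the design matrix.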

After the model estimation is complete, you now have a set of data telling you how big an effect each condition of your experiment had at each voxel. By itself, this information may not be useful - in most fMRI experiments, any condition by itself accounts for a tiny portion of the variance. What is useful to know, though, is if one condition made a significantly greater contribution than another condition did. This is where results analysis, and specifically contrast analysis, comes in. When you evaluate your results, SPM asks you to specify a contrast in terms of weights for each condition. If you have only two conditions in your experiment, assuming your design matrix was (A B), then a contrast vector of (1 -1) would tell SPM you wanted to see at which voxels A had a significantly larger contribution to brain activity than B did. SPM takes this contrast vector and literally uses it to make a weighted sum of the beta images it's just created; this new image, created by giving each beta image the weight you specified and adding them together, is a con_* image. SPM then looks across the con image at the distribution of weighted parameter values, and combines them with its estimate of the leftover variance from the model estimation, and assigns every voxel a T-statistic (creating an spmT_* image). When you ask SPM to only show you the voxels that are significantly active at a certain p-threshold, it looks at that T-stat image and finds only the voxels whose t-statistics are so large as to fit above that probability threshold - voxels where their weighted parameter values were so large as to be statistically unlikely at your specified level of significance.
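The contrast step can be sketched the same way; this toy example fits an (A B constant) design at one simulated voxel and computes the (1 -1 0) contrast value and its t-statistic (all numbers are invented):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy (A B constant) design and simulated data at one voxel.
n = 120
X = np.column_stack([rng.random(n), rng.random(n), np.ones(n)])
y = X @ np.array([1.5, 0.5, 50.0]) + 0.5 * rng.standard_normal(n)

# Fit the model and measure the leftover (residual) variance.
beta = np.linalg.pinv(X) @ y
resid = y - X @ beta
df = n - np.linalg.matrix_rank(X)
sigma2 = resid @ resid / df            # error variance at this voxel

# Contrast (1 -1 0): where did A contribute more than B?
c = np.array([1.0, -1.0, 0.0])
con = c @ beta                         # the value stored in the con_* image
se = np.sqrt(sigma2 * (c @ np.linalg.inv(X.T @ X) @ c))
t_stat = con / se                      # the value thresholded at your p-level
```

The weighted sum of betas gives the contrast value, and dividing by its standard error (built from the leftover variance) gives the t-statistic - the same two quantities SPM writes out per voxel.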

Those voxels are where brain activity in your experiment was heavily influenced by one condition in your experiment more than another condition, so heavily influenced as to make it unlikely that activity was just noise. Those voxels were brighter / more intense in one condition of your experiment than they were in another with great reliability, and so they're considered active.

For more detail about a particular aspect of spatial preprocessing, check out the individual FAQ pages at:

For more detail about a particular aspect of model estimation or results analysis, check out some FAQ pages at:

P threshold FAQ

Frequently Asked Questions - P-thresholds

1. What is the multiple-comparison problem? What is familywise error correction (FWE)?

To start, Nichols and Hayasaka (PthresholdPapers) provide an excellent introduction to the issue of FWE in neuroimaging in very readable fashion. You're encouraged to check it out.

Many scientific fields have had to confront the problem of assessing statistical significance in the context of multiple tests. With a single statistical test, the standard convention dictates a statistic is significant if it is less than 5% likely to occur by chance - a p-threshold of 0.05. But in fields like DNA microarrays or neuroimaging, many thousands of tests are done at once. Each voxel in the brain constitutes a separate test, which usually means tens of thousands of tests for a given subject. If the conventional p-threshold of 0.05 is applied on a voxelwise basis, then, just by chance you're almost guaranteed to have many hundreds of false-positive voxels. In order to avoid any false positives, then, researchers generally correct their p-threshold to account for how many tests they're performing. This type of correction prevents Type I error across the whole family of tests you're doing - a familywise error correction, or FWE correction.

The standard approach to FWE correction has been the Bonferroni correction - simply divide the desired p-threshold by the number of tests, and you'll maintain correct control over the FWE rate. In general, the Bonferroni correction is a pretty conservative correction, and it suffers from a fatal flaw with neuroimaging data. The Bonferroni correction demands that all the tests be independent from each other, and that demand is manifestly not fulfilled in neuroimaging data, where there is a complex, substantial and generally unknown structure of spatial correlations in the data. Essentially, the Bonferroni correction assumes there are more spatial 'degrees of freedom' than there really are; one voxel is not independent from the next, and so you only need to correct for the 'true' number of independent tests you're doing. Estimating that 'true' number, though, is tricky, and so a good deal of theory has been developed on ways around Bonferroni-type corrections that still control the FWE at a reasonable level.
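The Bonferroni arithmetic itself is one line; here's a sketch with an illustrative voxel count:

```python
# Bonferroni: divide the desired familywise alpha by the number of tests.
# The voxel count is illustrative - a typical whole-brain analysis.
n_voxels = 50_000
alpha_fwe = 0.05

bonferroni_threshold = alpha_fwe / n_voxels      # per-voxel p-threshold

# For comparison: the expected number of false positives at p < 0.05
# uncorrected, if every voxel were pure noise.
expected_false_positives = 0.05 * n_voxels
```

A per-voxel threshold of one in a million is exactly the kind of severity that motivates the alternatives below.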

2. What is Gaussian random-field theory and how does it apply to FWE?

Worsley et al. (PthresholdPapers) is one of the first papers to link random-field theory with neuroimaging data, and that link has been tremendously productive in the years since. Random-field theory (RFT) corrections attempt to control the FWE rate by assuming that the data follow certain specified patterns of spatial variance - that the distributions of statistics mimic a smoothly varying random field. RFT corrections work by calculating the smoothness of the data in a given statistic image and estimating how unlikely it is that voxels (or clusters or patterns) with particular statistic levels would appear by chance in data of that local smoothness. The big advantages of RFT corrections are that they adapt to the smoothness in the data - with highly correlated data, Bonferroni corrections are far too severe, but RFT corrections are much more liberal. RFT methods are also computationally extremely efficient.

However, RFT corrections make many assumptions about the data which render the methods somewhat less palatable. Chief among these is the assumption that the data must have a minimum level of smoothness in order to fit the theory - at least 2-3 times the voxel size is recommended at minimum, and more is better. For those researchers unwilling to pay the cost in resolution that smoothing imposes, RFT methods are problematic. As well, RFT corrections are only available for statistics whose distributions in a random field have been laboriously calculated and derived - the common statistics fall in this category (F, t, minimum t, etc.), but ad hoc statistics can't be corrected in this manner. Finally, it's become clear (and Nichols and Hayasaka show in PthresholdPapers), that even with the assumptions minimally satisfied, RFT corrections tend to be too conservative.

Random-field theory corrections are available by default in SPM; in SPM99 or earlier, choosing a "corrected" p-threshold means using an RFT correction, while in SPM2, choosing the "FWE" correction to your p-threshold uses these methods. I don't believe corrections of this sort are available in AFNI or BrainVoyager.

3. What is false discovery rate (FDR)? How is it different from other types of multiple-comparison correction?

RFT methods may have their flaws, but some researchers have pointed out a different problem with the whole concept of FWE correction. FWE correction in general controls the error rate for the whole family; it guarantees that there's only a 5% chance (for example) of any false positives appearing in the data. This type of correction simply doesn't fit the intuition of many neuroimaging researchers, because it suggests that every voxel activated is a true active voxel, and most researchers correctly assume there's enough noise in every stage of the process to make a few voxels here and there look active just by chance. Indeed, it's rarely of crucial interest in a particular study whether one particular voxel is necessarily truly or falsely positive - most researchers are willing to accept that some of their signal is actually noise - but that level of inference is precisely what FWE corrections attempt to license.

Benjamini & Hochberg, faced with this conundrum, developed a new idea. Rather than controlling the FWE rate, what if you could control the amount of false-positive data you had? They developed a method to control the false discovery rate, or FDR. Genovese et al. (PthresholdPapers) recently imported this method specifically into neuroimaging. The idea in controlling the FDR is not to guarantee you have no false positives - it's to guarantee you only have a few. Setting the FDR control level to 0.05 will guarantee that no more than 5% of your active voxels are false positives. You don't know which ones they might be, and you don't even know if fully 5% are false positive. But no more than 5% are falsely active.
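Benjamini & Hochberg's step-up procedure is simple enough to sketch in a few lines. This is the standard independence-assumption version, run on made-up p-values:

```python
def fdr_threshold(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure (independence assumption).

    Returns the largest p-value cutoff; tests at or below it are declared
    significant with the FDR controlled at level q.
    """
    m = len(p_values)
    p_sorted = sorted(p_values)
    thresh = 0.0
    # Find the largest k such that p(k) <= q * k / m.
    for k, p in enumerate(p_sorted, start=1):
        if p <= q * k / m:
            thresh = p
    return thresh

# Toy example: a couple of strong "activations" among mostly null p-values.
pvals = [0.001, 0.008, 0.039, 0.041, 0.26, 0.51, 0.62, 0.75, 0.88, 0.97]
cutoff = fdr_threshold(pvals, q=0.05)
n_significant = sum(p <= cutoff for p in pvals)
```

Note how the cutoff adapts: the second-smallest p-value only has to beat 2/10ths of q, not q/m as Bonferroni would demand.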

The big advantage of FDR is that it adapts to the level of signal present in the data. With small signal, the correction is very liberal. With huge signal, it's relatively more severe. This adaptation renders it more sensitive than an RFT correction if there's any signal present in the data. It allows a much more liberal threshold to be set than RFT, at a cost that most researchers have already mentally paid - a few false positive voxels. It requires almost no computational effort, and doesn't require laborious derivations to be used with new statistics.

FDR is not a perfect cure-all - it does require some assumptions about the level of spatial correlation in the data. At the outer bound, allowing any arbitrary correlation structure, it is only slightly more liberal than the equivalent RFT correction. But with looser assumptions, it's a great deal more liberal. Genovese et al. have argued that fMRI data in many situations fits a very loose set of assumptions, enabling a pretty liberal correction.

The latest edition of every major neuroimaging program provides some methods for FDR control - SPM2 and BrainVoyager QX have it built-in, and AFNI's 3dFDR program does the same work. Tom Nichols has predicted FDR methods will essentially replace most FWE correction methods within a few years, and they are beginning to be widely used throughout neuroimaging literature.

4. What is permutation testing? How is it different from other types of multiple-comparison correction?

Permutation testing is a form of non-parametric testing, and Nichols and Holmes give an excellent introduction to the field in their paper (PthresholdPapers), a much better treatment than I can give it here. But here's the extreme nutshell version. Permutation tests are a sensitive way of controlling FWE that make almost no assumptions about the data, and are related to the stats/CS concept of 'bootstrapping.'

The idea is this. You hope your experimental manipulation has had some effect on the data, and to the extent that it has, your design matrix is a model that explains the data pretty well, with large beta weights for the conditions of interest. But what if your design matrix had been different? What if you randomly re-labeled your trials, so that a trial that was actually an A trial in the real experiment was re-labeled as a B, and put into the design matrix as a B, and a B trial was re-labeled and modeled as a C trial, and a C as an A, and so forth. If your experiment had a big effect, the new, randomly mixed-up design matrix won't explain it well at all - if you re-ran your model using that matrix, you'd get much smaller beta weights. Of course, on the null hypothesis, there wasn't any effect at all due to your manipulation, which means the random design matrix should explain it just as well.

And now that you've re-labeled your design matrix and re-run your stats, you mix up the design matrix again, differently and do the same thing. And then do it again. And again, until you've run through all the possible permutations of the design matrix (or at least a lot of them). You'll end up with a distribution of beta weights for that condition from possible design matrices. And now you go back and look at the beta weight from your real experiment. If it's at the extreme end of that distribution you've created - congrats! You've got a significant effect for that condition. The idea in permutation testing is you don't make any assumptions about what the statistic distribution could be - you go out and empirically determine it, from your own real data.

But how does that help you with the multiple-comparison problem? One nice thing about permutation testing is that you aren't restricted to testing significance for stats with known distributions, like t or F. We can use these on any ad hoc statistic we like. So let's do it across the design matrices, using as our statistic the maximal T: the value of the maximum T-statistic in the whole image for that design matrix. We come up with a distribution, just like before, and we can find the t-statistic that corresponds to the 5% most extreme parts of the maximal T distribution. And now, the clever bit: we go back to our real experiment's statistical map, and threshold it at that 5% level from the maximal T. Hopefully the t-statistics from our real experiment are generally so much higher than those from the random design matrices as to mean a lot of voxels in our real experiment will have t-statistics above that level - and we don't need to correct their significance at all, because anything in that extreme part of the maximal T distribution is guaranteed to be among the most extreme possible t-statistics for any voxel for any design matrix.
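Here's a toy maximal-statistic permutation test in that spirit, using a plain difference of means as the statistic (rather than a full GLM refit) and exhaustively relabeling eight scans across three made-up "voxels":

```python
import itertools

# Toy data: 8 scans x 3 voxels, with condition labels A A A A B B B B.
# Voxel 0 carries a real A > B effect; voxels 1 and 2 are noise. All invented.
data = [
    [5.1, 1.0, 2.0],
    [4.8, 1.2, 1.8],
    [5.3, 0.8, 2.2],
    [4.9, 1.1, 2.1],
    [1.0, 0.9, 2.0],
    [1.2, 1.3, 1.9],
    [0.8, 1.0, 2.2],
    [1.1, 1.2, 2.0],
]
labels = ["A"] * 4 + ["B"] * 4

def max_stat(labeling):
    """Maximal statistic over voxels; the statistic here is mean(A) - mean(B)."""
    diffs = []
    for v in range(3):
        a = [row[v] for row, lab in zip(data, labeling) if lab == "A"]
        b = [row[v] for row, lab in zip(data, labeling) if lab == "B"]
        diffs.append(sum(a) / len(a) - sum(b) / len(b))
    return max(diffs)

observed = max_stat(labels)

# Empirical null: the maximal statistic under every distinct relabeling.
null = [max_stat(perm) for perm in set(itertools.permutations(labels))]

# FWE-corrected p-value: how extreme is the real labeling in that distribution?
p_value = sum(s >= observed for s in null) / len(null)
```

Because the null distribution is built from the maximal statistic across all voxels, any voxel beating its extreme tail is significant with no further correction needed.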

Permutation tests have the big advantages of making almost no (but not totally none - see Nichols and Holmes for details) assumptions about your data, which means they work particularly well with low degrees of freedom, where other methods' assumptions about the shape of their statistic's distribution can be violated. They also are extremely flexible - any true or ad hoc statistic can be tested, such as maximal T, or size of structure, or voxel's favorite color - anything. But they have a big disadvantage: computational cost. Running a permutation test involves re-estimating at least 20 models to be able to guarantee a 0.05 significance level, and so in SPM for individual data, that cost can be prohibitive. For other programs, the situation's not as bad, but the wait can still be pretty long. Permutation tests are available at least in SPM99 with the SnPM toolbox, and in AFNI with the 3dMonteCarlo program. Not sure about BrainVoyager.

5. When should I use different types of multiple-comparison correction?

Nichols and Hayasaka's paper (PthresholdPapers) does an explicit review of various FWE correction methods (as well as FDR) on simulated and real data of a variety of smoothness levels and degrees of freedom, to judge how conservative or liberal different methods were. Their main findings are:

Accordingly, the nutshell recommendations are as follows:

6. What is small-volume correction?

All the FWE correction methods here adapt to the number of tests performed. The fewer tests, the less severe the correction, and in neuroimaging, the number of tests performed corresponds to the number of voxels or the volume corrected. So it's to your advantage when doing FWE correction to minimize the volume you're testing. If you have an a priori hypothesis about where you might see activation, like a particular anatomical structure or a particular area found to be active in another study, you might restrict your correction to only that area and be perfectly valid in only performing FWE correction there. In practice, this is often done when a particular activation is above the uncorrected threshold, but you'd like to report corrected statistics. You might also try it when you're using a corrected threshold to start, but not seeing any activation where you might expect some - you could restrict your correction to a smaller volume than the whole brain and suddenly get activation popping up above the new, small-volume-corrected threshold.

SPM has a shortcut to this sort of volume restriction - the small volume correction (or S.V.C.) button in the results interface. It'll let you re-calculate corrected p-statistics for a specified region only - an ROI mask image, or a sphere around a point, etc. This change won't change the uncorrected p-statistics for any activations, but it will make the corrected p-statistic for any activations in that region significantly better, depending on how big your specified region is.

Note that if you're using an uncorrected threshold to start, using S.V.C. won't show you anything new. This correction only re-jiggers the corrected p-statistic for a given region.
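The volume dependence is easiest to see with Bonferroni-style arithmetic (RFT corrections scale with volume in a subtler way, but the direction is the same; all numbers here are illustrative):

```python
# Why restricting the volume helps: the same voxel, corrected over two volumes.
p_uncorrected = 0.00005            # one voxel's uncorrected p-statistic

whole_brain_voxels = 50_000
roi_voxels = 500                   # a small a priori anatomical region

# Bonferroni-style corrected p: uncorrected p scaled by the number of tests,
# capped at 1.0.
p_corrected_whole = min(1.0, p_uncorrected * whole_brain_voxels)
p_corrected_roi = min(1.0, p_uncorrected * roi_voxels)
```

Corrected over the whole brain, this voxel is nowhere near significance; corrected over the small region, it clears p < 0.05 - while the uncorrected statistic never changes, just as with S.V.C.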

7. What do all the different reported values in my SPM table mean (p-corrected, p-uncorrected, cluster, set, etc.)? How are they calculated?

SPM reports a pair of p-statistics for each voxel, a p-statistic for each cluster, and a p-statistic for each set. At the voxel level, these are relatively self-explanatory. The p-uncorrected statistic is the probability that, by itself, a voxel with that t- (or F-)statistic would occur just by chance. This is the statistic that's used to threshold the brain using the uncorrected threshold, or "None" correction in SPM2. The p-corrected statistic is the probability of that same t-statistic, but corrected for FWE using Gaussian RFT methods. This statistic reflects the volume that's being corrected (and hence changes in small-volume-corrected regions).

The cluster and set values are more obscure and less useful - they're explained in detail in Friston et al. (PthresholdPapers). Briefly, the cluster-level p-statistic is the probability that a cluster of that size would occur just by chance in data of the given smoothness. The key difference is that the activation of a cluster doesn't imply that any particular voxel in the cluster is active - you can't use that statistic to license inference that any one voxel in the cluster is above some threshold. The set-level p-statistic is similar, at the level of the whole brain; it's the probability that a pattern of activation of that size (number of clusters) would occur in data of the given smoothness. But it doesn't mean that any given cluster is active - it only tells you that there's some particular pattern of activation happening, in a regionally unspecific manner. Because both of these statistics are derived from Gaussian RFT theory, they're both, by definition, corrected p-statistics. But because neither of them licenses inference to any particular voxel, they're not widely used or cited.

8. What should my p-threshold be for some analysis X?

p < 0.05, corrected, remains the gold standard for any neuroimaging analysis. Because RFT corrections are so severe, though (and because other methods aren't widespread enough to challenge them), a de facto standard of p < 0.001 seems to be in operation these days a lot of the time. Depending on the type of analysis, you may be able to go even looser - group-level regressions are sometimes seen more loosely, such as p < 0.005, although there's not a particularly good reason for this.

Using FDR control instead of FWE correction is relatively new, so by default an FDR of 0.05 seems to be the current standard, but Benjamini & Hochberg, among others, have argued that a more liberal threshold in some situations may be reasonable - as high as 0.1 or even a bit higher.

For any type of non-voxel-based analysis, such as correlations of beta weights, etc., p < 0.05 is still the magic number for most reviewers.

9. What should my p-threshold be for conjunction analyses?

A good question. Check out the conjunction papers in ContrastsPapers for more detail, but the basic argument is simple. If a voxel that's active in a conjunction analysis simply has to be active in all of the component analyses, and you're thresholding the conjunction (not any component analyses), then the component analyses should have more liberal thresholds than the conjunction. Specifically, if you wanted to threshold the conjunction at p < 0.001, and you had two components to the conjunction, then you should threshold each of the components at sqrt(0.001), about 0.032. A voxel active in both components at that level is jointly much less likely, so you can threshold each component at a very liberal level and be sure the conjunction's threshold will be quite stringent. In short - the conjunction threshold is the product of the component thresholds.
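The threshold arithmetic checks out directly, for both the two-component and the three-component case:

```python
import math

# Two independent components, desired conjunction threshold p < 0.001:
conj_p = 0.001
component_p = math.sqrt(conj_p)        # each component thresholded at ~0.032

# Chance that a null voxel survives both components at once:
joint = component_p ** 2               # back to 0.001

# Three components: each can be as liberal as p < 0.1
# and the conjunction threshold is still 0.001.
three_way = 0.1 ** 3
```

The product rule only holds if the components are independent - the caveat taken up in the next paragraph.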

Many researchers, however, disagree with this line of reasoning. First, obviously, this argument depends on all of the components being independent - if they're dependent at all, then the product of the individual thresholds will be more stringent than the true conjunction threshold. Even if they're all independent, though, it's clear that using this line of argument means that any active voxel in the conjunction is a voxel that may well not be active at a "reasonable" threshold in any of the components. This problem is exacerbated with more than two components - with three, say, each component could be thresholded at p < 0.1 uncorrected, and the conjunction could have a threshold of p < 0.001. This flies in the face of what many people try to argue about their conjunctions, which is that they represent areas that are activated in all of their components. So many researchers use the strategy of simply thresholding their individual components at some liberal but reasonable threshold - p < 0.001, or p < 0.005 - and then simply assess the intersection of the active areas as the conjunction. This clearly results in extremely significant p-statistics in the conjunction, but it at least gets closer to the idea of "conjunction" that most researchers seem to have.

10. What should my p-threshold be for masked analyses?

If you're masking one analysis with the results of another analysis, you're basically doing a conjunction (see above), so you can liberalize your threshold at least a bit. If you're masking your analysis with a region of interest mask, anatomical or otherwise, you might also consider using a small volume correction and using p < 0.05 corrected as a threshold. If you're doing some other crazy kind of mask... well, you're kind of in uncharted waters. Start with something reasonable and go from there, and good luck to you.


P threshold Papers

Useful Papers - P-thresholds


Genovese et al. (2002), "Thresholding of statistical maps in functional neuroimaging using the false discovery rate," NeuroImage 15, 870-878 PDF

Summary: Landmark paper applying Benjamini and Hochberg's original concepts (below) specifically to neuroimaging. Briefly reviews the concept of FDR, its mathematical background, and methods to control it, then demonstrates its use on sample and real fMRI datasets. Extremely readable and short - should be required reading for anyone using FDR control.

Bottom line: Indispensable citation for anyone interested in using FDR - and you should be.

Friston et al. (1996), "Detecting activations in PET and fMRI: levels of inference and power," (first couple of pages, at least) NeuroImage 4, 223-235 PDF

Summary: A good reference on Gaussian RFT methods in neuroimaging; reviews some of the ideas of FWE control in general and has a little bit of math, but not too much, on how Gaussian methods work. Set-, cluster- and voxel-level inferences are introduced, with power analyses for all and some discussion of when each is appropriate.

Bottom line: The voxel-level info is useful, as is the Gaussian RFT review.

Nichols & Holmes (2001), "Nonparametric permutation tests for functional neuroimaging: a primer with examples," (first ten pages) Human Brain Mapping 15, 1-25 PDF

Summary: A specific attempt to bring permutation tests to the masses, this is a clear and comprehensive introduction to permutation testing, with a concise but thorough review of the concepts, and, crucially, three fully worked-out examples showing how permutation testing is applied to PET and fMRI data.

Bottom line: Also indispensable, this time for anyone interested in using permutation testing.


Nichols & Hayasaka (2003), "Controlling the familywise error rate in functional neuroimaging: a comparative review," Statistical Methods in Medical Research 12, 419-446 PDF

Summary: The best overview of FWE correction (with an excellent section on FDR) I've seen. Somewhat technical, but comprehensive. The authors review the mathematical background for several FWE correction methods (RFT, permutation, Bonferroni, etc.), and then, crucially, perform a variety of tests comparing the different methods in simulated and real data with various characteristics.

Bottom line: Comprehensive look at the advantages and disadvantages of every major method of p-threshold control, demonstrating the usefulness of FDR in low-smoothness data, the excellent performance of permutation testing, and the troubles of RFT methods outside high smoothness. Tremendously useful.

Benjamini & Hochberg (1995), "Controlling the false discovery rate: a practical and powerful approach to multiple testing," Journal of the Royal Statistical Society Series B 57, 289-300 PDF

Summary: The original paper describing the current method of FDR control. Authors review the concepts of FWE correction and discuss why they're not always appropriate, then describe a simple mathematical procedure to control FDR, using some examples to show how it may be used. They use simulations to show that the gain in power over FWE methods may be substantial.

Bottom line: Good background on FDR - the original paper introducing the concept.

Worsley et al. (1996), "A unified statistical approach for determining significant signals in images of cerebral activation," Human Brain Mapping 4, 58-73 (see Jeff for paper copies - PDF coming soon)

Summary: Not the first paper applying Gaussian RFT methods to neuroimaging data, but one of the most important ones. Worsley et al. bring together several lines of research on Gaussian RFT methods and tie up a number of loose threads to create a single statistical system for correcting FWE in neuroimaging (at the time, generally PET) data. This paper's approach is the foundation of SPM96 and SPM99's FWE correction.

Bottom line: A landmark in Gaussian RFT methods as applied to neuroimaging.

Bullmore et al. (1999), "Global, voxel, and cluster tests, by theory and permutation, for a difference between two groups of structural MR images of the brain," IEEE Transactions on Medical Imaging 18, 32-42 PDF

Summary: An experimental paper showing the viability of permutation testing for a variety of different statistics in a group-analysis setting. Bullmore et al. look at practical issues surrounding permutation testing of various different statistics in real structural data.

Bottom line: Nice look at permutation testing in a practical setting.

Percent Signal Change FAQ

Frequently Asked Questions - Percent Signal Change

Check out RoisFaq for more info about region-of-interest analysis in general...

1. What’s the point of looking at percent signal change? When is it helpful to do that?

The original statistical analyses of functional MRI data, going way back to '93 or so, were based exclusively on intensity changes. It was clear from the beginning of fMRI studies that raw intensity numbers wouldn't be directly comparable across scanners or subjects or even sessions - average means of each of those things varies widely and arbitrarily. But simply looking at how much the intensity in a given voxel or region jumped in one condition relative to some baseline seemed like a good way to look at how big the effect of the condition was. So early block experiments relied on averaging intensity values for a given voxel in the experimental blocks, doing the same for the baseline block, and comparing the two of 'em. Relatively quickly, fancier forms of analysis became available, and it seemed obvious that correcting that effect size by its variance was a more sensitive analysis than looking at it raw - and so t-statistics came into use, and the general linear model, and so forth.

So why go back to percent signal change? For block experiments, there are a couple reasons, but basically percent signal change serves the same function as beta weights might (see RoisFaq for more on them): a numerical measure of the effect size. Percent signal change is a lot more intuitive a concept than parameter weights are, which is nice, and many people feel that looking at a raw percent signal change can get you closer to the data than looking at some statistical measure filtered through many layers of temporal preprocessing and statistical evaluation.

For event-related experiments, though, there's a more obvious advantage: time-locked averaging. Analyzing data in terms of single events allows you to create the timecourse of the average response to a single event in a given voxel over the whole experiment - and timecourses can potentially tell you something completely different than beta weights or contrasts can. The standard general linear model approach to activation assumes a shape for the hemodynamic response, and tests to see how well the data fit that model, but using percent signal change as a measure lets you actually go and see the shape of the HRF for given conditions. This can potentially give you all kinds of new information. Two voxels might both be identified as "active" by the GLM analysis, but one might have an onset two seconds before the next. Or one might have a tall, skinny HRF and one might have a short but wide HRF. That sort of information may shed new light on what sort of processing different areas are engaging in. Percent signal change timecourses in general also allow you to validate your assumptions about the HRF, correlate timecourses from one region with those from another, etc. And, of course, the same argument about percent signal change being somehow "closer" to the data still applies.

Timecourses are rarely calculated for block-related experiments, as it's not always clear what you'd expect to see, but for event-related experiments, they're fast becoming an essential element of a study.

2. How do I find it?

Good question, and very platform dependent. In AFNI and BrainVoyager, whole-experiment timecourses are easily found by clicking around, and in the Gablab the same is available for SPM with the Timeseries Explorer. Peristimulus timecourses, though, usually require some calculation. In SPM, you can get fitted responses through the usual results panel, using the plot command, but those are in arbitrary units and often heavily smoothed relative to the real data. The simplest way these days for SPM99 is to use the Gablab Toolbox's roi_percent code. Check out RoiPercent for info about that function. That creates timecourses averaged over an ROI for every condition in your experiment, with a variety of temporal preprocessing and baseline options. In SPM2, the new Gablab roi_deconvolve is sort of working, although it's going to be heavily updated in coming months. It's based off AFNI's 3dDeconvolve function, which is the newest way to get peristimulus timecourses in AFNI. That's based on a finite impulse response (FIR) model (more on those below). BrainVoyager's ROI calculations will also automatically run an FIR model across the ROI for you.

3. How do those timecourse programs work?

The simplest way to find percent signal change is perfectly good for some types of experiments. The basic steps are as follows:

  • Extract a timecourse for the whole experiment for your given voxel (or extract the average timecourse for a region).
  • Choose a baseline (more on that below) that you'll be measuring percent signal change from. Popular choices are "the mean of the whole timecourse" or "the mean of the baseline condition."
  • Divide every timepoint's intensity value by the baseline, multiply by 100, and subtract 100, to give you a whole-experiment timecourse in percent signal change.
  • For each condition C, start at the onset of each C trial. Average the percent signal change values for all the onsets of C trials together.
  • Do the same thing for the timepoint after the onset of each C trial, e.g., average together the onset + 1 timepoint for all C trials.
  • Repeat for each timepoint out from the onset of the trial, out to around 30 seconds or however long an HRF you want to look at.

You'll end up with an average peristimulus timecourse for each condition, and even a timecourse of standard deviations/confidence intervals if you like - enough to put confidence bars on your average timecourse estimate. This is the basic method, and it's perfect for long event-related experiments - where the inter-trial interval is at least as long as the HRF you want to estimate, so every experimental timepoint is included in one and only one average timecourse.
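The steps above can be sketched in a few lines of NumPy. This is an illustrative toy, not any toolbox's actual code - the function name and arguments are made up for the example:

```python
import numpy as np

def peristimulus_average(timecourse, onsets, n_points, baseline=None):
    """Average the response across trials, time-locked to each onset.

    timecourse : 1-D array of intensity values, one per TR
    onsets     : trial-onset indices (in TRs) for one condition
    n_points   : how many TRs of peristimulus timecourse to keep
    baseline   : value to compute percent signal change against
                 (defaults to the mean of the whole timecourse)
    """
    timecourse = np.asarray(timecourse, dtype=float)
    if baseline is None:
        baseline = timecourse.mean()
    # convert the whole run to percent signal change
    psc = 100.0 * timecourse / baseline - 100.0
    # stack one n_points-long segment per trial, then average across trials
    segments = np.stack([psc[t:t + n_points] for t in onsets])
    return segments.mean(axis=0), segments.std(axis=0)
```

The returned standard deviations are what you'd use for the confidence bars mentioned above; note that this simple stacking only makes sense when trials don't overlap, exactly as described for long event-related designs.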

This method breaks down, though, with short ISIs - and those are most experiments these days, since rapid event-related designs are hugely more efficient than long event-related designs. If one trial onsets before the response of the last one has faded away, then how do you know how much of the timepoint's intensity is due to the previous trial and how much due to the current trial? The simple method will result in timecourses that have the contributions of several trials (probably of different trial types) averaged in, and that's not what you want. Ideally, you'd like to be able to run trials with very short ISIs, but come up with peristimulus timecourses showing what a particular trial's response would have been had it happened in isolation. You need to be able to deconvolve the various contributions of the different trial types and separate them into their component pieces.

Fortunately, that's just what AFNI's 3dDeconvolve, BrainVoyager QX, and the Gablab's roi_deconvolve all do. SPM2 also allows it directly in model estimation, and Russ Poldrack's toolbox allows it to some degree, I believe. They all use basically the same tool - the finite impulse response model.

4. What's a finite impulse response model?

Funny you should ask. The FIR model is a modification of the standard GLM which is designed precisely to deconvolve different conditions' peristimulus timecourses from each other. The main modification from the standard GLM is that instead of having one column for each effect, you have as many columns as you want timepoints in your peristimulus timecourse. If you want a 30-second timecourse and have a 3-second TR, you'd have 10 columns for each condition. Instead of having a single model of activity over time in one column, such as a boxcar convolved with a canonical HRF, or a canonical HRF by itself, each column represents one timepoint in the peristimulus timecourse. So the first column for each condition codes for the onset of each trial; it has a single 1 at each TR that condition has a trial onset, and zeros elsewhere. The second column for each condition codes for the onset + 1 point for each trial; it has a single 1 at each TR that's right after a trial onset, and zeros elsewhere. The third column codes in the same way for the onset + 2 timepoint for each trial; it has a single 1 at each TR that's two after a trial onset, and zeros elsewhere. Each column is filled out appropriately in the same fashion.

With this very wide design matrix, one then runs a standard GLM in the multiple regression style. Given enough timepoints and a properly randomized design, the design matrix then assigns beta weights to each column in the standard way - but these beta weights each represent activity at a certain temporal point following a trial onset. So for each condition, the first column tells you the effect size at the onset of a trial, the second column tells you the effect size one TR after the onset, the third columns tells you the effect size two TRs after the onset, and so on. This clearly translates directly into a peristimulus timecourse - simply plot each column's beta weight against time for a given condition, and voila! A nice-looking timecourse.
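The column-building scheme described above can be sketched directly in NumPy. This is a minimal illustration with made-up names, not AFNI's or SPM's actual implementation:

```python
import numpy as np

def fir_design_matrix(onsets_by_condition, n_scans, n_lags):
    """Build an FIR design matrix: one column per condition per lag.

    onsets_by_condition : one array of onset indices (in TRs) per condition
    n_scans             : total number of TRs in the run
    n_lags              : peristimulus timepoints to model per condition
    """
    n_cond = len(onsets_by_condition)
    X = np.zeros((n_scans, n_cond * n_lags))
    for c, onsets in enumerate(onsets_by_condition):
        for lag in range(n_lags):
            col = c * n_lags + lag
            # put a 1 at every TR that is exactly `lag` TRs after an onset
            for t in onsets:
                if t + lag < n_scans:
                    X[t + lag, col] = 1.0
    return X
```

Fitting is then ordinary least squares - something like `betas = np.linalg.lstsq(X, data, rcond=None)[0]` - and plotting each condition's block of `n_lags` betas against time gives the peristimulus timecourse.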

FIR models rely crucially on the assumption that overlapping HRFs add up in linear fashion, an assumption which seems valid for most tested areas and for most inter-trial intervals down to about 1 sec or so. These timecourses can have arbitrary units if they're used to regress on regular intensity data, but if you convert your voxel timecourses into percent signal change before they're input to the FIR model, then the peristimulus timecourses you get out will be in percent signal change units. That's the tack taken by the Gablab's new roi_percent. Some researchers have chosen to ignore the issue and simply report the arbitrary intensity units for their timecourses.

By default, FIR models include some kind of baseline model - usually just a constant for a given session and a linear trend. That corresponds to choosing a baseline for the percent signal change of simply the session mean (and removing any linear trend). Most deconvolution programs include the option, though, to add other columns to the baseline model, so you could choose the mean of a given condition as your baseline.

There are a lot of other issues in FIR model creation - check out the AFNI 3dDeconvolve manual for the basics and more.

5. What are temporal basis function models? How do they fit in?

Basis function models are a sort of transition step, representing the continuum between the standard, canonical-HRF, GLM analysis, and the unconstrained FIR model analysis. The standard analysis assumes an exact form for the HRF you're looking for; the FIR places no constraints at all on the HRF you get. But sometimes it's nice to have some kinds of constraints, because it's possible (and often happens) that the unconstrained FIR will converge on a solution that doesn't "look" anything like an HRF. So maybe you'd like to introduce certain constraints on the type of HRFs you'll accept. You can do that by collapsing the design matrix from the FIR a little bit, so each column models a certain constrained fragment of the HRF you'd like to look for - say, a particular upslope, or a particular frequency signature. Then the beta weight from the basis function model represents the effect size of that part of the HRF, and you can multiply the fragment by the beta weight and sum all the fragments from one condition to make a nice smooth-looking (hopefully) HRF.
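The "multiply each fragment by its beta weight and sum" step can be made concrete with a tiny NumPy sketch. The basis set and beta values here are entirely made up for illustration - real basis sets (e.g., SPM's canonical HRF plus derivatives) look different:

```python
import numpy as np

# hypothetical basis set: three "fragments" of an HRF sampled at 10 timepoints
t = np.arange(10)
basis = np.stack([
    np.exp(-(t - 3) ** 2 / 2.0),   # early, narrow component
    np.exp(-(t - 5) ** 2 / 4.0),   # later, wider component
    np.ones_like(t, dtype=float),  # constant offset
])

# beta weights the GLM estimated for one condition (made-up values)
betas = np.array([1.2, 0.4, -0.1])

# the fitted HRF is just the beta-weighted sum of the fragments
fitted_hrf = betas @ basis
```

Because every fragment is smooth, any beta-weighted sum of them is smooth too - that's exactly the constraint on allowable HRF shapes that the unconstrained FIR model lacks.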

Basis function models are pretty endlessly complicated, and the interested reader is referred to the papers by Friston, Poline, etc. on the topic - check out the Friston et. al, "Event-related fMRI," here: ContrastsPapers.

6. How do you select a baseline for your timecourse? What are pros and cons of possible options? Do some choices make particular comparisons easier or harder?

Good question. Choosing a particular baseline places a variety of constraints on the shape of possible HRFs you'll see. The most popular option is usually to simply take the mean intensity of the whole timecourse - the session mean. The problem with that as a baseline is that you're necessitating that there'll be as much percent signal change under the baseline as over it. If activity is at its lowest point during the inter-trial interval or just before trial onset, then, that may lead to some funny effects, like the onset of a trial starting below baseline, and dramatic undershoots. As well, if you've insufficiently accounted for drifts or slow noise across your timecourse, you may overweight some parts of the session at the expense of others, depending on what shape the drift has. Alternatively, you could choose to have the mean intensity during a certain condition be the baseline. This is great if you're quite confident there's not much response happening during that condition, but if you're not, be careful. Choosing another condition as the baseline essentially calculates what the peristimulus timecourse of change is between the two conditions, and if there's more response at some voxels than you thought in the baseline condition, you may seriously underestimate real activations. Even if you pick up a real difference between them, the difference may not look anything like an HRF - it may be constant, or gradually increase over the whole 30 seconds of timecourse. If you're interested in a particular difference between two conditions, this is a great option; if you're interested in seeing the shape of one condition's HRF in isolation, it's iffier.

With long event-related experiments, one natural choice is the mean intensity in the few seconds before a trial onset - to evaluate each trial against its own local baseline. With short ISIs, though, the response from the previous trial may not have decayed enough to show a good clean HRF.

7. What kind of filtering should I do on my timecourses?

Generally, percent signal analysis is subject to the same constraints in fMRI noise as the standard GLM, and so it makes sense to apply much of the same temporal filtering to percent signal analysis. At the very least, for multi-session experiments, scaling each session to the same mean is a must, to allow different sessions to be averaged together. Linear detrending (or the inclusion of a first-order polynomial in the baseline model, for the AFNI users) is also uncontroversial and highly recommended. Above that, high-pass filtering can help remove the low-frequency noise endemic to fMRI and is highly recommended - this would correspond to higher-order polynomials in the baseline model for AFNI, although studies have shown anything above a quadratic isn't super useful (Skudlarski et. al, TemporalFilteringPapers). Low-pass filtering can smooth out your peristimulus timecourses, but can also severely flatten out their peaks, and has fallen out of favor in standard GLM modeling; it's not recommended. Depending on your timecourse, outlier removal may make sense - trimming the extreme outliers in your timecourse that might be due to movement artifacts.
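The two uncontroversial steps - session scaling and linear detrending - are simple enough to sketch. A toy version, assuming NumPy (function name and target mean are made up for the example):

```python
import numpy as np

def scale_and_detrend(tc, target_mean=100.0):
    """Scale one session's timecourse to a common mean, then remove
    any linear trend while keeping that mean."""
    tc = np.asarray(tc, dtype=float)
    tc = tc * (target_mean / tc.mean())        # session scaling
    x = np.arange(len(tc), dtype=float)
    slope, intercept = np.polyfit(x, tc, 1)    # first-order (linear) fit
    return tc - (slope * x + intercept) + target_mean
```

After this, every session sits at the same mean and any linear drift is gone, so sessions can be averaged together; high-pass filtering or higher-order polynomials would be layered on top of this same idea.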

8. How can you compare time courses across ROIs? Across conditions? Across subjects? (peak amplitude? time to peak? time to baseline? area under curve?) How do I tell whether two timecourses are significantly different? How can you combine several subjects’ ROI timecourses into an average? What’s the best way?

All of these are great questions, and unfortunately, they're generally open in the literature. FIR models generally allow contrasts to be built just as in standard GLM analysis, so you can easily do t- or F-tests between particular aspects of an HRF or combinations thereof. But what aspects make sense to test? The peak value? The width? The area under the curve? Most of these questions aren't super clear, although Miezin et. al (PercentSignalChangePapers) and others have offered interesting commentary on which parameters might be the most appropriate to test. Peak amplitude is the de facto standard, but faced with questions like whether the tall/skinny HRF is "more" active than the short/fat HRF, we'll need a more sophisticated understanding to make sense of the tests.

As for group analysis of timecourses, that's another area where the literature hasn't pushed very far. A simple average of all subjects' condition A, for example, vs. all subjects' condition B may well miss a subject-by-subject effect because of differing peaks and shapes of HRFs. That simple average is certainly the most widely used method, however, and so fancier methods may need some justification. One fairly uncontroversial method might be simply analogous to the standard group analysis for regular design matrices - simply testing the distribution across subjects of the beta weight of a given peristimulus timepoint, for example, or testing a given contrast of beta weights across subjects.


Percent Signal Change Papers

Useful Papers - Percent Signal Change


Dale & Buckner (1997), "Selective averaging of rapidly presented individual trials using fMRI," Human Brain Mapping 5, 329-340 PDF

Summary: One of the first series of event-related fMRI papers, this might have been the first fMRI paper to explore event-related averaging in parallel fashion to ERPs. The authors test the linearity of the addition of HRFs in a visual stimulation paradigm and find that overlapping HRFs sum in a roughly linear fashion - a key to the sort of deconvolution done today.

Bottom line: Event-locked averaging can be done even with overlapping trials, and single events can generate a strong enough response to be analyzed with fMRI.

Glover (1999), "Deconvolution of impulse response in event-related BOLD fMRI," NeuroImage 9, 416-429 PDF

Summary: The other side of the linearity issue. For longer stimuli (longer than 4-5 seconds), Glover shows that the HRF does not track the standard boxcar-convolved-with-hrf model at all. A variant of the full FIR model, Wiener deconvolution, is used to show the effectiveness of linear deconvolution with stimuli separated by at least 4 seconds and with a subject-specific deconvolution filter.

Bottom line: Linear deconvolution works - at least up to a point. For longer stimuli, boxcars convolved with an HRF aren't really appropriate.


Ward, AFNI 3dDeconvolve manual (a good reference to skim, particularly early parts on theory) PDF

Summary: A great overview of the theory behind the FIR model and many of the issues influencing construction of those models.

Aguirre et. al (1998), "The variability of human BOLD hemodynamic responses," NeuroImage 8, 360-369 PDF

Summary: We saw this guy earlier... Aguirre et. al tested 40 subjects on various days and in various sessions. They found a good deal of variance accounted for by differences in subjects, and significant differences for many subjects between different scanning days. Within the same day and subject, though, the HRF seemed relatively stable.

Bottom line: Shows that subject-to-subject and day-to-day variance in HRF can be high, but within a day across runs, the HRF is relatively stable.

Miezin et. al (2000), "Characterizing the hemodynamic response: Effects of presentation rate, sampling procedure, and the possibility of ordering brain activity based on relative timing (Ed. - whew!)," NeuroImage 11, 735-759 PDF

Summary: Ditto on seeing this earlier. Excellent look at the factors influencing the HRF and how stable all its aspects are. Demonstrates HRF remains stable and linear within a subject and certain timing parameters, but outside those, less so.


Physiology and fMRI FAQ

Frequently Asked Questions - Physiology and fMRI

This section is intended to address design-related questions that focus primarily on how physiological factors can affect your scanning - things like heart rate, breathing, the types of artifacts generated by those movements, etc. Obviously, physiological factors are mixed in heavily with your experimental design, so be sure to check out some other design-related pages:

1. Why would I want to collect physiological data?

No, the real question should be: why wouldn't you want to collect physiological data? And the only answer is: because you hate freedom. Haha! Phew.

Actually, the main reason is because physiological effects can be a significant source of noise in your data. The pulsing of blood vessels with the cardiac cycle can move brain tissue, jostle ventricles and alter BOLD signal in specific regions; respiration can induce magnetic inhomogeneities and create head movements. If you measure physiology, you can, at least in part, account for those sources of noise and remove their confounding effects from your data, boosting your signal to noise ratio.

Secondarily (well, primarily, for some), for many studies, physiological measures like heart rate or respiration rate can be an important source of data themselves, providing an important test of autonomic arousal that may supplement self-report data or other measures. All those uses are beyond the scope of this discussion, but it's something to keep in mind.

2. How do I do it?

Thankfully, scanner designers have generally had the foresight to think someone might want to collect this info, and so most scanners have instruments built in to record at least two main physiological measures - heartrate/cardiac cycle (usually with a photoplethysmograph - a small clip that goes on the finger) and respiration (usually with a pneumatic belt - a thin belt that wraps around the chest). For scanners without these instruments built in, several companies manufacture MR-compatible versions of these instruments.

There may be other measures that you may want to collect - galvanic skin response, for example - and here at Stanford, those sensors are relatively easily available. Other measures have less of an effect on creating noise in the signal, however, and so we'll primarily address those two measures.

3. How might the cardiac cycle influence my signal?

Several ways. The pulsation of vessels creates a variety of other movements, as CSF pulses along with it to make room for incoming blood, tissue moves aside slightly as vessels swell and shrink, and waves of blood (and the accompanying BOLD signal) move through the head. This sounds like the effects can be relatively small, but large structures of the brain can move significant amounts in the neighborhood of large vessels, and the resulting motion can significantly change your signal. Generally, the term "pulsatility" is used to describe the process that generates artifacts through cardiac movement. As well, because TRs for many experiments are slower than the cardiac cycle, these effects can occur image-to-image in an unpredictable way, as various points in the cycle are sampled in an irregular fashion.

These effects are more significant in some regions of the brain than in others; Dagli et. al (PhysiologyPapers) address the question of where pulsatility artifact is worst. Perhaps unsurprisingly, areas near large vessels tend to rank among those regions, but other areas - such as those near the sinuses - are also significantly affected.

4. How might respiration influence my signal?

Also a couple ways. Respiration, by its nature, can cause the head and particular parts of it (sinus cavities, etc.) to move slightly, which can induce motion-related changes in signal. Perhaps more significantly, the inflating and deflating of the lungs changes the magnetic signature of the human body, and that signature change can induce inhomogeneities in the baseline magnetic field (B0) that you've carefully tuned with your shim. Those inhomogeneities can be unpredictable and can affect your signal in unpredictable ways. Breathing rate can also be significantly less predictable than cardiac cycle - many subjects take spontaneous deeper breaths at irregular intervals, for example. Van de Moortele et. al (PhysiologyPapers) address the sources of respiration artifact in some detail.

5. What can I do to account for these changes?

Thought you'd never ask. There are several ways. Pfeuffer et. al (PhysiologyPapers) present a navigator-based method in k-space that adjusts largely for global effects, mostly those due to respiration changes. Perhaps the most prevalent ways, though, are retrospective, and rest on the fact that generally, cardiac cycle and respiration cycle take place at a time scale far faster than stimulus presentation for most experiments. Isolating signal changes that take place at the appropriate frequency, then, can help isolate those sources of noise.

This isn't as easy as it sounds, due to the potential aliasing of this noise because of the difference between TR and cardiac cycle time. But it is possible, with some work. Glover et. al (PhysiologyPapers) present one of the industry-standard ways of doing this correction - by sorting images according to their point in the cardiac or respiration cycle, the appropriate amount of signal change due to those sources can be identified and removed from the image. This correction happens at the point of reconstruction of images from raw data, and is available automatically at Stanford in the makevols program.
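The core idea behind this sort of phase-based retrospective correction - assign each image a phase within the cardiac cycle, then regress out a low-order Fourier series of that phase, as in Glover et. al's RETROICOR - can be sketched roughly as follows. The function and its arguments are illustrative, not the published implementation, and scan times must fall between the first and last recorded pulse peaks:

```python
import numpy as np

def cardiac_phase_regressors(scan_times, peak_times, order=2):
    """For each image acquisition time, compute the phase within the
    current cardiac cycle (0 to 2*pi between successive pulse peaks),
    then build cosine/sine regressors of that phase."""
    peak_times = np.asarray(peak_times, dtype=float)
    phases = np.empty(len(scan_times))
    for i, t in enumerate(scan_times):
        prev = peak_times[peak_times <= t].max()  # last peak at or before t
        nxt = peak_times[peak_times > t].min()    # next peak after t
        phases[i] = 2 * np.pi * (t - prev) / (nxt - prev)
    # low-order Fourier series of the phase: cos(m*phi), sin(m*phi)
    regs = [f(m * phases) for m in range(1, order + 1)
            for f in (np.cos, np.sin)]
    return np.column_stack(regs)
```

Including these columns as nuisance regressors (or projecting them out of the data) removes signal locked to the cardiac cycle regardless of how it aliases across TRs - which is exactly why this works even when the TR is slower than the heartbeat.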

Care should be taken with this option, though. As with any type of algorithm that removes "confounding" signals (realignment, say), this correction can't account for the extent to which physiological noise is correlated with the task. If there is significant correlation between your task and your physiological measures, removing noise due to the physiological sources will also remove task-related signal. This could happen with rapid event-related tasks if subjects breathe in time with the task, or in any sort of experiment that might induce arousal - emotionally arousing stimuli may increase heart rate and respiration rate and thus change those noise profiles in a task-correlated way. In this case, Glover et. al suggest using only resting-state images to calculate this correction; this is always a significant consideration in physiological artifacts.

6. Are there other sources of physiological noise I might want to worry about?

Possibly. Peeters and Van der Linden (PhysiologyPapers) address the question of longer-term physiological changes, such as those induced by pharmacological manipulations or sudden environmental changes. As a drug begins to be absorbed, for example, there can be a gradual change in vasoconstriction or blood oxygenation in general that can look like a global signal drift but is, in fact, a changing of the sources of the signal. Researchers interested in looking at these sorts of changes should look with care at these sorts of corrections.


Physiology and fMRI Papers

Useful Papers - Physiology and fMRI

Also check out DesignPapers, JitterPapers, FmriPhysicsPapers and ScanningPapers...


Glover et. al, "Image-based method for retrospective correction of physiological motion effects in fMRI: RETROICOR," Magnetic Resonance in Medicine 44, 162-167 PDF

Summary: Glover et. al describe a fast and effective retrospective algorithm to identify physiological noise in raw data and remove it; the algorithm is based on sorting images based on their point in the cardiac and/or respiratory cycles. They show it to be more effective than a popular k-space correction program.

Bottom line: Unless correlation of physiological noise with your task is a concern, you should be using this correction. At Stanford, it's built into the makevols reconstruction program.

Dagli et. al, "Localization of cardiac-induced signal change in fMRI," NeuroImage 9, 407-415 PDF

Summary: The group uses a retrospective gating method in real data to find voxels that are significantly affected by cardiac-cycle noise, to see if that noise affects parts of the brain worse than others. Unsurprisingly, the noise is found to be especially bad near major arteries, as well as near the sinus regions.

Bottom line: fMRI signal without some kind of correction will be significantly degraded in large but spatially organized regions, focused on major arteries near the medial areas of the brain, as well as near sinus regions.


Pfeuffer et. al, "Correction of physiologically induced global off-resonance effects in dynamic echo-planar and spiral functional imaging," Magnetic Resonance in Medicine 47, 344-353 PDF

Summary: Presents a correction method (innovatively named DORK, thanks to our own Gary Glover) that operates using a navigator echo to correct off-resonance artifacts largely induced by respiration (see Van de Moortele et. al, below). Can also be used with slightly reduced effectiveness without the navigator. Best for correcting global effects - less effect on cardiac, as those artifacts are more local.

Bottom line: A fast and simple correction for global artifacts in the respiration (and other) frequency band. Probably less useful than RETROICOR, in the final analysis, but more general for global effects.

Van de Moortele et. al, "Respiration-induced B0 fluctuations and their spatial distribution in the human brain at 7 Tesla," Magnetic Resonance in Medicine 47, 888-895 PDF

Summary: A companion paper to Pfeuffer et. al above, this discusses the sources of those respiration-induced global frequency shifts. The authors examined data at 7T to mathematically model the changes in susceptibility and B0 induced by respiration, and describe a previously published mathematical model that can model those changes well.

Bottom line: Good summary description of the sources of respiration-induced artifacts and how they differ from cardiac-induced artifacts.

Peeters & Van der Linden, "A data post-processing protocol for dynamic MRI data to discriminate brain activity from global physiological effects," Magnetic Resonance Imaging 20, 503-510 PDF

Summary: Authors attempt to devise a method to help distinguish long-term gradual global changes - like those induced by a sudden temperature change and vasoconstriction, or those induced by gradual "kicking in" of a pharmacological agent - from focal changes, to enable use of those factors in studying global and local interactions. The method is simple and probably easily confounded, but low in calculation effort and straightforward.

Bottom line: Probably too crude a method to be much use for physiological correction, but how often do you see a study that does MRI on anaesthetized carp? I mean, that's awesome.



Frequently Asked Questions - Region-of-Interest (ROI) Analysis


1. What's the point of region-of-interest (ROI) analysis? Why not just use the basic voxelwise stats? Are you too good for them or something? Huh, Mr. Fancy Pants?

Whoa, now, no need to get touchy. A lot of people think ROI analysis is a really good idea, possibly where the real future of fMRI lies. Nieto-Castanon et. al (RoisPapers) make the argument kind of like this: brain imaging is concerned, among other things, with analyzing how mental functions are connected to brain anatomy. Note that the word "voxel" didn't enter into that statement. Voxels aren't, on their own, a particularly useful concept for us, but the standard statistical model does all of its preprocessing and analysis on that level. That analysis path tends to completely blur anatomical boundaries, often by a great deal, smearing our resolution to hell and preventing us from making good clean associations between structure and function. So ROI analysis offers us a way to get around individual anatomical variability and sharpen our inferences.

As well, even if we don't start our statistical analysis at the ROI level, ROIs offer us a reasoned way to extract measures that differ from the standard voxelwise t-statistic. Measures like percent signal change timecourses or fit coefficients are additional information that you can extract from your data only by looking within a particular region. These measures can shed light on otherwise obscured aspects of your study - temporal characteristics, particular sizes and directions of effects, or correlations with behavior. Seen from this perspective, ROI analysis is a valuable parallel tool to the standard voxelwise GLM analysis and can often provide new and interesting pieces of the puzzle of your data.

2. How should I generate ROIs? What are the pros and cons of each way?

Funny you should ask; I just happened to have this little grid lying around, which has been helpfully converted into a sort of tree for the Digi-Web. The methods of ROI definition can be split along two axes - the type of brain ROIs are defined on (individual, group, atlas) and the features used to define it (microanatomy/cytoarchitecture, macroanatomy, function). The breakdown goes like this:

  • Microanatomy / cytoarchitecture ROIs

    • Individually-defined: This would mean drawing (by hand or automated method) ROIs on the cytoarchitectonic map for an individual. The pros of this method: you're quite close to the neuronal level of organization, you've got very high resolution on your ROI, and the mapping between cytoarchitectonic regions has generally proven to be close (see Brett et. al on RoisPapers). Cons: Not all functional info is represented (columnar structure, for example, is sometimes represented in cytoarchitecture and sometimes not), and, of course, getting the cytoarchitectonic map for a living human non-invasively is practically impossible at this point. That may change in the future, though - some groups are attempting to mine this data from fMRI maps.
    • Group-defined: This is almost never done - it would mean drawing cytoarchitectonic maps on a template or something.
    • Atlas-defined: This means getting coordinates for particular cytoarchitectonic regions from an atlas, and generally this means using the Talairach Daemon or some similar tool to get Brodmann Areas for particular atlas coordinates. Pros of this: the Tal Daemon's easy to get, easy to use, fast, popular, and can be easily used for MNI-space coordinates with a simple transform. Cons: Lots of sources of error in labeling. The original Talairach BA labeling is very crude (see Brett et. al, RoisPapers) and based on eyeball. The Talairach - MNI transform isn't perfect. And perhaps biggest of all, there is enormous individual variability of the shape and size of cytoarchitectonic regions - what's BA32 on the atlas may well be deep into BA10 on your subject. Some of those problems are solved by getting a better cyto. atlas - Zilles et. al are working on one - but the problem of individual variability remains.
  • Macroanatomy / anatomically-defined ROIs

    • Individually-defined: This means drawing ROIs on each individual subject's anatomy, based on anatomical markers like sulci, gyri, and other features viewable by the naked eye. This drawing can be by hand or by an automated or semi-automated program. (See SegmentationFaq and SegmentationPapers for more info on those - they're getting pretty good.) Pros: Gets around problem of individual anatomical variability - by extracting each measure from an individually defined structure, you eliminate the issue of whether everyone's structures line up. Allows you to keep your data unsmoothed (smoothing is often done to avoid problems of anatomical variability), and so gives you higher resolution without sacrificing power and even while gaining some. Better mapping of function to structure. Cons: Can be hard to get - requires either good use of an automated algorithm or a lot of labor and experience in hand-drawing ROIs. Makes it more difficult to report activation coordinates (but note Swallow et. al suggest you can normalize your data before defining and be okay - RoisPapers). Most importantly, the relationship between function and macroanatomy is very unclear in many regions of the brain, particularly associational cortex and prefrontal areas. Just because you're extracting from everybody's superior frontal gyrus doesn't mean you're going to get at all the same function. So this path may not gain you any power at all and might lose you some.
    • Group-defined: Again, this is rarely done - it would mean drawing your ROI on some group average. Probably all of the cons of above, with fewer pros.
    • Atlas-defined: This means taking some atlas system's description of anatomical features, like using the Talairach-defined amygdala or Talairach superior frontal gyrus, or possibly drawing an ROI on the MNI template brain. Often the ROIs are then reverse-normalized to fit the subject's non-normalized functional data. Pros: Very easy to get - probably the most-used method for defining anatomical ROIs. Very standardized, and so easily comparable across studies. There's not as much of an issue with labeling as above - the Talairach and Tournoux atlas is very accurate at picking out coordinates for macroanatomical features, so you get to leverage their skill at drawing ROIs for your own study. Cons: The overlap of the atlas-defined region with your subjects' is only as good as your normalization - and Nieto-Castanon et. al (RoisPapers) offer a scathing commentary on how good standard normalization algorithms are, showing that even post-normalization, they had huge variations in the shape of particular gyri. Because of that, atlas-defined regions will always tend to obscure differences in anatomical variability, even with reverse normalization. As well, this method is still subject to the problem of unclear structure-function relationships in many areas of the brain.
  • Functional ROIs

    • Individually-defined: This means choosing functionally-activated voxels from each individual's results - either from a localizer task or from the actual study task. This is so popular these days we've actually broken it out to its own page, FunctionalLocalization.
    • Group-defined: This means choosing functionally-activated voxels from your group results and using that voxel set as an ROI to extract from individuals. Pros: Gets to function, as above. Can be a better guarantee that you've got some effect in a given region, since presumably voxels activated in the group had some decent activations in most or all subjects. A good chunk less labor than individually-defining ROIs. Cons: Swallow et al. (RoisPapers) demonstrate these ROIs are not as reliable (i.e., are more variable) than the individually-created ones. This method doesn't account at all for individual functional variability, which can be considerable (i.e., different voxels can be included or excluded wrongly in every subject).
    • Atlas-defined: Probably the closest thing to this is using some other study's reported sites of functional activation - choosing ROIs that correspond to the Tal or MNI coords of other studies' activations. Pros: Avoids the problem of defining localizers, allows direct comparison between studies. Cons: Ignores differences in subject pool and anatomical and functional variability in your own subjects. Also introduces whatever error the other study might have had in localization into your study - if they got their spot wrong, you might be even wrong-er...

Lotta options. All of 'em have their pros and cons - which one you choose will depend largely on the type of questions you want to ask.

3. When can I just look at peak voxels vs. whole regions?

This is still an open question in the literature. The argument for averaging across an ROI is that it should enhance signal to noise; the timecourse from a single voxel can be quite noisy and could, indeed, be some kind of outlier in the ROI. Averaging might give you a better picture of what's happening over the whole ROI. The argument for using a peak voxel, though, is that we know the peak voxel - the voxel that shows the most correlation to the task relative to its variance - is guaranteed to show the best effect of any voxel in the ROI. Additionally, since we know our resolution is blurred by the vascular structure in the region, any spatial smoothing we may have done, and registration and normalization errors, it's entirely possible that some of our ROI's activation isn't reflecting "true" neuronal activity but simply an echo or blurring of activity elsewhere in the ROI. So to average those timecourses together may well wash out our effect, which is after all calculated in voxelwise fashion.

Nieto-Castanon et al. (RoisPapers) choose to look at whole ROIs, and that's arguably the prevailing sentiment in the literature. Particularly with false discovery rate p-threshold correction rising in prominence (see PthresholdFaq), the risk of any given voxel being a false positive might seem too high.

On the other hand, at least one (and possibly more than one) empirical study - Arthurs & Boniface, RoisPapers - has found that peak-voxel activity correlates better with evoked scalp electrical potentials than does activity averaged across an ROI. They cite a couple other studies that have examined similar issues in animal models, and suggest that in mammal cortex in general, the brain may "water the garden for the sake of one thirsty flower," i.e., ROI activity may only reflect true neuronal changes in a few voxels of the ROI. So the question remains open...
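To make the mean-vs.-peak tradeoff concrete, here's a toy sketch in Python. The data, the task regressor, and the one-strong-voxel setup are all made up purely for illustration (a "thirsty flower" ROI):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ROI: 50 voxels x 100 TRs. One "thirsty flower" voxel
# carries a strong task effect; a few others carry a weak one.
task = np.sin(np.linspace(0, 8 * np.pi, 100))   # idealized task regressor
data = rng.normal(0, 1, size=(50, 100))         # baseline noise
data[0] += 3.0 * task                           # strong effect in one voxel
data[1:10] += 0.3 * task                        # weak effect in a few others

roi_mean = data.mean(axis=0)                    # average-across-ROI timecourse

# "Peak" voxel: the one whose timecourse correlates best with the task
corrs = np.array([np.corrcoef(v, task)[0, 1] for v in data])
peak = data[corrs.argmax()]

print(np.corrcoef(roi_mean, task)[0, 1])        # modest correlation
print(np.corrcoef(peak, task)[0, 1])            # much stronger
```

In this contrived case the averaging washes out most of the effect, exactly as described above; in a homogeneous ROI the comparison would come out much closer.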

4. What sort of measures can I get out of ROIs?

The two big ones that are usually looked at are:

  • beta weights (also called parameter weights or fit coefficients), which are voxel-by-voxel slope values from the multiple regression of your statistical model and correspond with effect sizes of particular conditions.
  • percent signal change or timecourse information - literally looking at the TR-by-TR image intensity at a particular voxel or ROI to get a timecourse of intensities in a particular area. That timecourse is often trial-averaged to come up with time-locked average timecourses, which correspond to the average intensity change following the onset of a particular trial type - essentially an empirical look at the shape of the hemodynamic response to different trial types in a given region.

Other measures are occasionally taken - for voxel-based morphometry, for example, where you might look at percentage of gray matter in a given voxel - but those two are the biggies. Beta weights are used in block-related and event-related experiments; percent signal change is usually more important for event-related experiments, although it's occasionally used for blocks as well.
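The trial-averaging described above is simple to sketch. Here's a toy Python version with made-up onsets and a hypothetical TR of 2 seconds:

```python
import numpy as np

# Hypothetical setup: an ROI/voxel timecourse sampled at TR = 2 s,
# with trial onsets of one condition given as scan indices.
TR = 2.0
timecourse = np.random.default_rng(1).normal(100, 2, 200)
onsets = [10, 40, 70, 100, 130]    # scan index of each trial onset
window = 16                        # 16 scans = 32 s of peristimulus time

# Time-locked average: stack the window after each onset, then average
epochs = np.stack([timecourse[o:o + window] for o in onsets])
tla = epochs.mean(axis=0)          # one value per peristimulus TR

print(tla.shape)                   # (16,)
```

Plotting `tla` against peristimulus time gives the empirical hemodynamic response shape for that trial type.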

5. What’s the point of looking at percent signal? When is it helpful to do that? How do I find it?

For everything you could want to know about percent signal change, check out PercentSignalChangeFaq.

6. What are beta weights / parameter weights / fit coefficients? When is it helpful to look at them? What types of analyses can I do with them?

When you run a general linear model to estimate your effect sizes (see BasicStatisticalModelingFaq for info on this), you're essentially running a giant multiple regression on your data, with the columns of your design matrix as the regressors. Each of those columns corresponds to a particular effect, and each of them is assigned by the GLM a particular parameter value: the B in the equation Y = XB + E, where Y is the signal, X is the design matrix, and E is error. That parameter value corresponds to how large an effect the particular condition had in influencing brain activity.

Importantly, beta weights are not an index of how well your condition's design matrix fit the brain activity - it is not the r or r-squared value for the regression. It's the slope of the regression. This means you could conceivably have a very small effect that fit the model incredibly well, or a very large effect with a great deal of noise in the response. This slope corresponds better to the idea of 'level of activation for a particular condition' that we want to find. As an example, a design matrix column that was all zeros might predict the brain activity perfectly - there might be essentially no change in a given voxel down the whole timecourse. In that case, our r-squared would be very high for that column of the regression - but we wouldn't want to say that voxel was active, because it was totally insensitive to any experimental manipulation. It makes more sense to look at how big the effect size was - whether a given voxel seemed to respond very highly to a given trial type - and then, if we're concerned about noise, we can normalize the effect size by some measure of the effect variance, to get a t-statistic. That's generally what's done in most neuroimaging programs these days.

The big reason to extract beta weights at all is that they give you a numerical estimate of the effect size of a particular condition. If the beta for A at a given point for one subject is three times that at the same point for another subject, you know that A had three times as big an effect in the first subject. This can be an ideal measure to use in regressions against some behavioral measure. For example, you might want to know if a given subject's self-reported difficulty with a task correlated with the size of the effect of that condition in a particular region. You can extract the beta weights from that region for each subject, run a simple regression, and find how significant it comes out.
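A quick sketch of that kind of across-subjects regression, with purely hypothetical betas and behavioral scores (the t-from-r formula is the standard one for testing a Pearson correlation):

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical across-subjects data: one beta weight per subject,
# extracted from the same ROI, plus a behavioral score per subject.
n = 20
difficulty = rng.normal(5, 1, n)                  # e.g. self-reported difficulty
betas = 0.8 * difficulty + rng.normal(0, 0.5, n)  # betas track behavior + noise

r = np.corrcoef(betas, difficulty)[0, 1]
t = r * np.sqrt((n - 2) / (1 - r ** 2))           # t-test on the correlation
print(f"r = {r:.2f}, t({n - 2}) = {t:.2f}")
```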

You can also use beta weights to correlate with each other as a crude way of indexing connections between regions. If subjects with anterior cingulates that responded more to condition A also had cerebellums that responded less to condition A, and that correlation is significant across your subject pool, that may tell you something about how the cingulate and cerebellum relate in your task. Check out ConnectivityFaq for more info that direction.

Essentially, beta weights can be used in a myriad of ways - any time you'd like to have some numerical estimate of a given effect or contrast size, rather than simply a statistical measure of the activation.

7. How do I find beta weights / etc.?

Depends on the program, but every major neuroimaging program can create, in the process of running a GLM, an image of the voxel-by-voxel beta weights for each condition. In SPM, these are the beta_00*.img files produced by estimating a model; in AFNI, they're the fit-coefficient or parameter images that can be produced as part of the bucket dataset output. Other programs generally have similar names for these images. Getting the beta weights is simply a matter of using some image extraction utility - something like roi_extract for SPM - to get the voxel-by-voxel intensity values in your desired ROI. These can then be averaged across the ROI, or you can look only at the peak beta value, etc. Check out RoisFaq for more.

8. How do I combine information from an ROI across the whole thing?

The most common strategy in dealing with multiple voxels in an ROI is simply to average your measure across all voxels in the ROI. This has the advantage of being simple to do and simple to explain in a paper. Friston et al. (2006) point out this method may be too conservative; if the ROI has any heterogeneity (say, half of it activates and half of it de-activates), you'll tend to miss things. More complicated methods can be used to identify different subsets of voxels within the ROI with separable responses. Taking the first eigenvariate of the response across voxels from a principal components analysis of the ROI is a simple version of this (and supported in SPM).
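Here's a toy Python sketch of both approaches on a deliberately heterogeneous ROI (half activates, half de-activates, so the plain average nearly cancels). The eigenvariate's sign is arbitrary, and the scaling below is one arbitrary choice:

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical heterogeneous ROI: 40 voxels x 120 TRs
signal = np.sin(np.linspace(0, 6 * np.pi, 120))
data = rng.normal(0, 0.5, size=(40, 120))
data[:20] += signal                            # half activates...
data[20:] -= signal                            # ...half de-activates

roi_mean = data.mean(axis=0)                   # ~flat: the halves cancel

# First eigenvariate: dominant temporal pattern from an SVD/PCA
centered = data - data.mean(axis=1, keepdims=True)
u, s, vt = np.linalg.svd(centered, full_matrices=False)
eigenvariate = vt[0] * s[0] / np.sqrt(len(data))   # scaled 1st right singular vector

print(np.abs(np.corrcoef(roi_mean, signal)[0, 1]))      # near zero
print(np.abs(np.corrcoef(eigenvariate, signal)[0, 1]))  # near one
```

The eigenvariate recovers the shared response that straight averaging misses - which is exactly the heterogeneity argument above.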


ROI Papers

Useful Papers - Region-of-Interest (ROI) Analysis

Also check out SegmentationPapers (e.g., Yushkevich et al. (2006)) for automated ROI-generation methods.


Nieto-Castanon et al. (2003), "Region of interest based analysis of functional imaging data," NeuroImage 19, 1303-1316 PDF

Summary: Arguing that standard voxelwise statistical methods provide no guarantees about the mapping between function and a particular brain (as opposed to voxel) location, Nieto-Castanon and colleagues propose a GLM-based statistical analysis that operates on signal from whole, anatomically specified, ROIs, rather than individual voxels. They point out, crucially, that even in normalized brains there is little to no overlap between anatomically-marked ROIs over more than a pair of people.

Bottom line: If one point of brain imaging is associating function with particular anatomical locations, why aren't we analyzing data in terms of anatomical locations? Here's how it can be done in a reasoned statistical fashion.

Brett et al. (2002), "The problem of functional localization in the human brain," Nature Reviews Neuroscience 3, 243-249 PDF

Summary: A nice review of many of the problems plaguing any kind of regional labeling of functional activations. Brett et al. introduce a taxonomy of labels - cytoarchitectonic, macro-anatomic, etc. - and review the issues with Talairach space, Brodmann areas, anatomical connection with function, etc. that are currently clouding the issue of how we should be labeling our activation sites. The connection to normalization is also highlighted.

Bottom line: A great overview of some outstanding issues in localizing activity, and almost as importantly, labeling it.


Swallow et al. (2003), "Reliability of functional localization using fMRI," NeuroImage 20, 1561-1577 PDF

Summary: Some researchers have made persuasive arguments for using only functionally-defined ROIs in your analysis, but how should you best make them? Swallow et al. examine two key steps in functional ROI generation - normalization (i.e., can you define your ROIs after normalization?) and group averaging (i.e., can you define your ROIs at the group level and have them hold for the individual?). Short answers, respectively: normalization is fine, group averaging is not.

Bottom line: Functional ROIs are fine to define on normalized individual data, but not at the group level.


Arthurs & Boniface (2003), "What aspect of the fMRI BOLD signal best reflects the underlying electrophysiology in human somatosensory cortex?," Clinical Neurophysiology 114, 1203-1209 PDF.

Summary: The authors correlate BOLD signal in some ROIs resulting from electric nerve stimulation with ERPs from the same paradigm, and find that the BOLD signal from the peak voxel of a cluster correlates better with the electrophysiology than the average BOLD signal from the whole cluster does. They have some suggestions about why, including quoting the picturesque turn of phrase, "watering the garden for the sake of a single thirsty flower."

Bottom line: Averaging across a (small) ROI and taking the peak voxel are about the same, but peak voxels might correlate slightly better with the underlying activity.

Friston et al. (2006), "A critique of functional localisers," NeuroImage 30, 1077-1087 DOI.

Summary: Generally argues for the use of factorial designs rather than functional localizer tasks, on the account that they do many of the same things and additionally allow specific tests of interactions. See FunctionalLocalization for a bit more on details of the argument. Also argues for using other measures of signal from the ROI than simple averaging - looking at first eigenvariates, for instance.

Saxe et al. (2006), "Divide and conquer: a defense of functional localizers," NeuroImage 30, 1088-1096 DOI.

Summary: Argues for the use of functional localizers (a response to Friston et al.). Some reasons pass each other in the night a bit, but Saxe et al. also express some ambivalence about factorial designs because a) they may in fact add a bunch of fairly uninteresting comparisons (why add all the extra cells of the design if you don't care about them?) and b) they use the same data to identify regions as to estimate effects (as opposed to different datasets). Off the top of my head, this latter seems like a spurious criticism to me, due to the orthogonality of the design...

Friston & Henson (2006), "Commentary on: Divide and conquer; a defense of functional localisers," NeuroImage 30, 1097-1099 DOI.

Summary: Dammit, British people spell it with an "s", not a "z"! Oh, there's more, too. They agree with the intuition that main-effect constraints do not, in fact, bias the estimation of other main effects or interactions. Localizer tests statistically correspond to split-half tests, which aren't as efficient as full likelihood ratio tests (of the kind you get when everything's included in the same model). Also, Karl is pissed he got rejected from PLoS-B.

Bottom line: If your questions of interest will support it (and you'll have useful theories about all the cells), a factorial design is usually better than a separate localizer scan. But separate localizers can be useful for well-characterized context-independent regions with known anatomical-functional mappings - like retinotopic areas of visual cortex, MT, etc. Even if you do a separate localizer, including it in your main model is probably justified and may allow better variance estimates.


ROI Percent

This page attempts to answer some frequently asked questions about the "% signal change" script, roi_percent.m, in the GablabToolbox.

For general questions about percent signal change as a measure, check out PercentSignalChangeFaq.

You might also check out RoiDeconvolve for answers to common questions about the "sister" script to roi_percent. The two are used for exactly the same thing, but are best at differing types of experimental designs.


The full text documentation for this program can be found here: percent_signal_change_readme.txt

When running roi_percent, the program pops out 4 files, one of which is a .txt file with four rows (pct, std, conint, and Y). What do all of these columns represent? How is the Std calculated, and how would one convert that to standard error values?

The .txt file you're talking about is the percent_signal_condition.txt file. The output in this file is intended to allow you to compare percent signal responses to different conditions in your experiment. Each condition in your experiment should have its own section in the file, with the first row of each section being the name of your condition. The basic output is the pct row, which is a timecourse of percent signal change values. The timecourse covers 32 seconds (the number of TRs it spans depends on your TR length). That timecourse is the average percent signal change response in that region following the onset of that condition, averaged across all onsets of the condition in the whole experiment. It's also called a time-locked average, or a peristimulus timecourse. The idea is that you can plot this to look at the hemodynamic response timecourse for a particular condition.

The other rows operate on the same timescale, but provide different measures on that time axis. std is the standard deviation of the percent signal change at that peristimulus timepoint across all occurrences of that condition; conint is 1.96 times the standard error for that timepoint, or one-half the 95% error bar value. Y is a peristimulus timecourse of the scaled intensities that the percent signal change is calculated from; depending on how you chose your temporal filtering options, the absolute numbers may vary widely here.
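In code terms, the relationships between those rows look something like this (made-up epochs; the std-to-standard-error conversion is just a divide by the square root of the number of trials):

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical percent-signal epochs: 30 occurrences of one condition,
# each a 16-TR (32 s at TR = 2) peristimulus window of % signal change.
epochs = rng.normal(0.5, 0.4, size=(30, 16))

pct = epochs.mean(axis=0)              # mean % change per timepoint
std = epochs.std(axis=0, ddof=1)       # across-occurrence std
sem = std / np.sqrt(epochs.shape[0])   # standard error: std / sqrt(n trials)
conint = 1.96 * sem                    # half-width of the 95% error bar

print(pct.shape, std.shape, conint.shape)   # (16,) (16,) (16,)
```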

But my trials are only 10 seconds long. What's the rest of that timecourse mean?

Not much. The timecourse that's output covers 32 seconds no matter what, because that's the length of the canonical HRF in SPM. But if your trials are shorter than that, the remainder of that timecourse is likely to be a mishmash of responses to whatever trials follow the condition you're looking at.

Example: if your trials are 10 seconds long with a TR of 2, and you're examining the peristimulus timecourse for condition X, only the first five values in that timecourse happened during an X trial. The rest of the values in that timecourse reflect the intensities from volumes after an X trial ended, presumably from some other types of trials, and so they're probably not interpretable. If your trials are very short compared to your TR - say, only one or two TRs long - you're probably better served by trying to deconvolve your neural signal and looking at percent signal change from that deconvolved timecourse.

ROI Toolbox

In the Gabrieli Lab, one major package of custom programs we've written is the ROI Toolbox, which is available to anyone working on a Gablab machine (and someday on the lab website).

This page has a quick description of what all the scripts in the toolbox do, and at some point will hopefully link to their documentation as well (when I get around to updating them and putting them on the web). A couple of the scripts already have their own FAQ pages up - scroll down to find those links.

However, you can always access the documentation for any script from the ROI Toolbox itself, in the upper right-hand corner. Make sure you're running the graphical version of Matlab, or else you won't be able to get to them. They're also available in the Gablab file structure at


You can run the Toolbox by running spm99-6-devel from a terminal prompt at a Gablab machine (which will bring up Matlab), and then typing "roimod1" at the Matlab prompt. The Toolbox is compatible with Matlab 6.0 (R12) or 6.5 (R13), so far as I know.

Toolbox Overview

Global Variate (artdetect4.m)

More info at ArtDetect...

An interactive tool to identify and repair motion-related outlier images in your experiment. The tool displays a plot of global intensity values for each scan, z-scores for each of those intensity values, and plots the realignment movement parameters for each scan, so you can identify scans whose intensity values are way outside the mean and which occurred at the same moment as a large head movement. The tool then allows you to repair the timecourse by replacing outlier scans with a mean functional image or with an interpolated image created from the outlier’s neighboring scans.

Movement Parameters (plot_move.m)

A simple display tool to look at the realignment movement parameters for a given scan session. Parameters are plotted on two sets of axes; the first displays x,y,z motion for the head in mm, while the second plots pitch, roll, and yaw motion for the head in radians.

Movie of Images (spm_movie.m)

Runs through every image in a given timecourse as a movie, which allows quick viewing of all the scans. Useful to detect bizarre outlier scans that automated methods might miss.

ROI stats (roi_stats.m)

Given an ROI .img file and a set of data or beta images to extract from, this function extracts the number of non-masked voxels in the ROI in each image, the average intensity value of all voxels in the ROI from each image, the variance of intensities across all voxels in the ROI from each image, and the min and max intensities in the ROI from each image, and returns a data structure containing vectors of all those values.

ROI Extract (roi_extract.m)

Just like ROI stats, but this function only extracts the mean. Given an ROI .tal and a set of data images to extract from, this function extracts the average intensity value across all voxels in the ROI from each image. It optionally writes those values to a text file.

% Signal Change (roi_percent.m)

More info at RoiPercent.

This function takes ROI .tal files and a set of data files and extracts the % value that the mean intensity of all the voxels in the ROI differs at each scan from the mean ROI intensity across the whole data set (or for a particular condition). It optionally applies a number of temporal preprocessing options to the data set before the values are extracted, to clean up the values. It can also be set to an individual voxel mode, in which % signal change is extracted from a single voxel. It writes the values for the whole timecourse out to a text file, as well as average signal change values for each condition of the experiment.

Display ROIs (display_rois.m)

This button pops up the standard SPM interactive display screen, with three orthogonal views, and then allows the user to superimpose up to three ROI images on top of the background image in different colors.

Display Slices (display_slices.m)

This button asks for a set of background and ROI images to display, then asks the user to select what sort of image (structural or blob) each is, as well as the desired orientation for displayed slices and a range to pick slices from; it then pops up a non-interactive multiple-slice viewing window with any ROI images displayed as blobs, suitable for printing.


Activates the SPM render facility to create a rendered image of the brain on which an ROI image may be superimposed; the results are displayed in a non-interactive window suitable for printing.

img2txt, txt2img (roi_list.m, mm2img.m)

These functions are used to change .img files into .tal files and back again. Changing an image into a text file is pretty self-explanatory, but txt2img, used to convert a .tal into a .img, is a little trickier; since the coordinates in .tal files are listed in millimeters, changing that coordinate list into a voxel-based .img file requires knowing what the voxel size and origin coordinates should be. So txt2img requires a “template image” – a .img which defines the space in which the new .img will be made.
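To see why the template's voxel size and origin matter, here's a minimal mm-to-voxel sketch with hypothetical values (3 mm isotropic voxels, origin at voxel [27, 38, 18]; real image affines can also flip or rotate axes, which this ignores):

```python
import numpy as np

# Hypothetical template geometry: voxel size and origin together define
# how a millimeter coordinate maps onto voxel indices.
voxel_size = np.array([3.0, 3.0, 3.0])
origin = np.array([27, 38, 18])     # voxel index of mm-coordinate (0, 0, 0)

def mm_to_voxel(mm):
    return np.round(np.asarray(mm) / voxel_size + origin).astype(int)

print(mm_to_voxel([0, 0, 0]))       # [27 38 18]
print(mm_to_voxel([30, -15, 9]))    # [37 33 21]
```

Without a template supplying those two pieces of information, the mm coordinate list in a .tal file is ambiguous - hence the required “template image.”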

mni2tal, tal2mni (mni2talgui.m, tal2mnigui.m)

These functions are used to change ROIs, in the form of .tal files, from MNI space to Talairach space and back again. These create new .tal files in the desired output space. The mni2tal function appends _tal to the filename in its translation process; the tal2mni function appends _mni.

Generate Tal ROIs (tal_roi.m)

Uses the Talairach Daemon database to generate ROI .img files based on various anatomical landmarks. ROIs can be generated by intersecting or connecting any gyri, Brodmann areas, hemispheres, tissue types, etc. desired – typical results would be “left amygdala” or “intersection of right BA 10 and inferior frontal gyrus.” These .img files are in Talairach space, so before they are directly applied to SPM results, they should be converted into MNI space with tal2mni.

XYZ_rois (roi_xyz.m)

Allows the user to generate cubical ROIs based on specified x, y, and z limits in millimeters. The initial output is a .tal file called “roi.tal,” but the program automatically enters into the txt2img facility to allow creation of a .img file from the roi.tal file.

roi_process (roi_process.m)

Combines a sequence of steps intended to be applied to .imgs that have come right out of the Talairach Daemon or “Generate Tal ROIs” button – the idea would be to run that to specify anatomical regions of interest, then immediately run roi_process on these raw .imgs to prepare them for SPM. Roi_process takes a set of ROI .img files and converts them into MNI space (by running them through a .img-to-.tal conversion, a Talairach-to-MNI conversion and a .tal-to-.img conversion back to images), then smooths the ROIs with a specified Gaussian kernel and finally truncates them, converting them into black-and-white images suitable for SPM statistical use.

Smooth (spm_smooth_ui.m)

Activates the SPM smoothing facility to allow spatial smoothing of ROI .img files.


Truncate (roi_truncate.m)

After an ROI is smoothed, its image intensities likely no longer consist of only ones and zeros; it may also have been enlarged by the smoothing process. Truncation allows the user to select an intensity threshold, then sets all voxels whose values are below the threshold equal to zero and all voxels above the threshold to one. This creates a black-and-white image suitable for use as a mask.
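Truncation itself is just a thresholding operation - a toy one-dimensional sketch with made-up smoothed-mask values:

```python
import numpy as np

# A smoothed ROI mask no longer contains only ones and zeros.
# Truncation binarizes it at a chosen intensity threshold.
smoothed = np.array([0.02, 0.15, 0.48, 0.81, 0.97, 0.76, 0.31, 0.05])

threshold = 0.5
mask = (smoothed >= threshold).astype(np.uint8)
print(mask)    # [0 0 0 1 1 1 0 0]
```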

Reverse Norm (reverse_norm.m)

Given a .tal file containing ROI coordinates and an “_sn3d.mat” file of the sort output by SPM’s normalization process, this function inverts the normalization parameters used to normalize a particular image and applies the inverted parameters to the .tal file. This can be used to take an ROI in standard, MNI space and convert it to one which is precisely fitted to a particular subject’s anatomy.

ROI Time Series Analysis

An interactive tool which allows the display of intensity and % signal change timecourses from particular ROIs and/or particular voxels; these timecourses can be for a complete experiment, or a given condition or number of conditions, and can be updated interactively to examine the effects of temporal preprocessing on the data. Sort of like % signal change above, but interactive.

MARSBAR (marsbar_wrap.m)

An outside ROI package developed by Matthew Brett and others which contains some of the functionality of this Toolbox as well as a number of other facilities. See MARSBAR documentation for details.

Generate func ROIs (spm_results.m)

This button calls the SPM results facility (just as if you’d hit “Results” in SPM), in order to call up activations for a particular contrast and threshold level. The “S.V.C.” button in the results control panel can be used to isolate a particular cluster and save it out as a .tal file.

SPM Tal Stats (glassbrain.m)

This button can only be used if an SPM results window is currently up. If SPM results are being displayed and this button is selected, a file (possibly more than one) is generated which contains the complete list of the coordinate positions of all the activated voxels in the current results, converted into Talairach space – effectively a way to “dump” voxel locations from SPM into a text file.

Tal stats summary (sum_coord.m)

This function summarizes an output file generated by the Talairach Daemon program, giving you a file which tells you how many voxels were in each location.

Display_Tal_Space (TSU_wrap.m)

This function displays selected functional clusters on an illustrated Talairach atlas, allowing them to be lined up precisely with Talairach-space anatomy, and allows them to be rendered into 3-D on the Talairach brain.

tbx_roi (tbx_roi_wrap.m)

This button calls up Russ Poldrack’s ROI Toolbox, an outside package developed to work with SPM’s functional ROI capabilities. Provides a number of ways to generate multiple functional ROIs and ROIs of different shapes based around functional clusters. See tbx_roi documentation for more details.

Random and Fixed Effects FAQ

Frequently Asked Questions - Random and Fixed Effects in fMRI

1. What is a random-effects analysis? What's a fixed-effects analysis? What's the difference?

Random-effects and fixed-effects analyses are common concepts in social science statistics, so there are a lot of good intros to them out on the web. But in a very small nutshell: A fixed-effects analysis assumes that the subjects you're drawing measurements from are fixed, and that the differences between them are therefore not of interest. So you can look at the variance within each subject all lumped in together - essentially assuming that your subjects (and their variances) are identical. By contrast, a random-effects analysis assumes that your measurements are some kind of random sample drawn from a larger population, and that therefore the variance between them is interesting and can tell you something about the larger population.

Perhaps the most fundamental difference between them is of inference. A fixed-effects analysis can only support inference about the group of measurements (subjects, etc.) you actually have - the actual subject pool you looked at. A random-effects analysis, by contrast, allows you to infer something about the population from which you drew the sample. If the effect size in each subject relative to the variance between your subjects is large enough, you can guess (given a large enough sample size) that your population exhibits that effect - which is crucial for many group neuroimaging studies.

2. So what does the difference between them mean for neuroimaging data?

If you're interested in making any inferences about the population at large, you essentially are required to do some kind of random-effects analysis at some point in your stream. Not all studies demand this - some types of patient studies, for example - but in general, a random-effects analysis will take place at some point. However, random-effects analyses tend to be less powerful for neuroimaging studies, because they only have as many degrees of freedom as you have subjects. In most neuroimaging studies, you have vastly more functional images per subject than you do subjects, and so you have vastly more degrees of freedom in a fixed-effects analysis.

3. In what situations are each appropriate for neuroimaging analysis?

Generally, a neuroimaging study with more than one or two subjects will have a place for both types of analysis. The typical study proceeds with a type of model called the hierarchical model, in which both fixed and random effects are considered, but the two types of factors are limited and entirely separable. Single-subject analyses are generally carried out with a fixed-effects model, where only the scan-to-scan variance is considered. Those analyses generally yield some type of summary measure of activation, be it a T-statistic or beta weight or other statistic. Once those summary measures are collected for each subject, then, a random-effects analysis can be performed on the summaries, looking at the variance between effect sizes as a random effect. Again, only a single source of variance is considered at a single time.

For the most part, the rule of thumb is: single-subject analyses should be fixed-effects (to leverage the greater power of a fixed-effects model) and any analysis involving a group of subjects that you'd like to express something about the population should be random-effects.

4. How do I carry out a fixed-effects analysis in AFNI/SPM/BrainVoyager?

Generally, the standard single-subject model in all neuroimaging software is a fixed-effects model. Only a single source of variance is considered - the variance between scans (or points in time). If you include several subjects' functional images in a single fMRI model (as opposed to basic model) in SPM, for example, the program will run fine - you'll just get a fixed-effects model over several subjects at once. Any program that produces summary statistic images from single subjects will generally be a fixed-effects model: the standard GLM analysis in SPM and BrainVoyager, for example, or 3dFIM+ or 3dDeconvolve in AFNI. All of these apply a fixed-effects model of your experiment to look at scan-to-scan variance for a single subject. Other subjects could be included, as mentioned, but the variance between subjects will not generally be considered.

5. How do I carry out a random-effects analysis in AFNI/SPM/BrainVoyager?

Until a few years ago, this was a trickier question, but the Holmes & Friston paper (RandomAndFixedEffectsPapers) highlighted the need for random-effects models in group neuroimaging studies, and since then (and before, in some cases), every major neuroimaging program has made the hierarchical model the default for group analysis. The idea is built into every program and quite simple: once you've got summary images of the effect sizes from each of your subjects (from single voxels or ROIs or whatever), you simply throw those effect-size summaries into a 'basic' statistical test across subjects. The simplest is a one-sample t-test, but more complicated models can also be used: regressions, ANOVAs, etc. In SPM and BrainVoyager, the 'basic models' button or menu will take you to these sorts of group tests; in AFNI, 3dttest is a simple group t-test program, and 3dRegAna will do group regressions.
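At a single voxel or ROI, those 'basic' group tests boil down to something very simple. A sketch of the one-sample case (the per-subject contrast values here are made up for illustration):

```python
import numpy as np

# Hypothetical per-subject summaries: one contrast value (weighted sum of
# beta weights) per subject at a single voxel, e.g. from con_00*.img files.
con_values = np.array([0.8, 1.2, 0.4, 0.9, 1.5, 0.2, 1.1, 0.7])

n = con_values.size
mean = con_values.mean()
se = con_values.std(ddof=1) / np.sqrt(n)    # between-subject variance only
t = mean / se                               # df = n - 1

print(f"t({n - 1}) = {t:.2f}")              # t(7) = 5.67
```

The between-subject variance in the denominator is what makes this a random-effects test; an SPM 'basic models' one-sample t-test or AFNI's 3dttest does the same thing at every voxel at once.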

6. Which files should I include in my random-effects analysis? Contrast images? T-statistic images? F-statistics? Why one and not the others?

This is an important point, and explained better by Holmes' random-effects manual, which should be required reading for anyone doing a random-effects test. In general, you want to include whatever image is a summary of your effect size, not a measure of the significance of your effect size. Evaluating the significance of a group of significances is a layer beyond the statistics you're interested in - you want your measurements to really reflect how big the effect was at the ROI or voxel, not anything about the rest of the variance across the brain. So in general, for a t-test contrast, the image you want to include is the contrast image - the weighted sum of your beta weights - or a raw beta image. In SPM, that's the con_00*.img files, or the beta files themselves.

As a side note, for tests of more than one constraint at once - such as F-tests - the proper summary image is actually kind of tricky - simply including the ESS (Extra Sum of Squares) image into a standard random-effects test is not the way to go. SPM has a multivariate toolbox that may be of help in handling group F-tests directly, but more usually, the approach is to figure out which constraint in the F-test is driving the effect and use that constraint's contrast image in the group analysis.


Random and Fixed Effects Papers

Useful Papers - Random and Fixed Effects in fMRI


Holmes (2000), SPM99 random effects manual TXT

Summary: An excellent summary of the distinction between random-effects and fixed-effects models in neuroimaging: what the big deal is, why you'd want to use a random-effects model in some situations, and then a great deal of detail on how to do it in SPM. The level of detail isn't so SPM-specific as to prevent its use as a guide for other programs - an identical approach will work in any typical neuroimaging analysis package for a standard hierarchical analysis.

Bottom line: Required reading for group analysis in SPM or other programs.

Friston et. al (1999), "How many subjects constitute a study?," NeuroImage 10, 1-5 PDF

Summary: An interesting look at how a paper can end up being cited for reasons completely counter to why it was written. Often cited in papers describing why they used a random-effects analysis, this is actually a theoretical treatment of how fixed-effects models and conjunction analyses with relatively small numbers of subjects can be used to establish an effect as "typical" - a conceptual analogue to "average" that the authors argue supports some of the same inferences as "average." This is not, on the whole, an argument that's been bought much in the literature at large.

Bottom line: Generally, it's cited because of the sideline admission by the authors that some forms of inference can only be supported by random-effects analyses. Turns out those are the kinds of inference that people actually want to make. Weird.

Miller et. al (2002), "Extensive individual differences in brain activations associated with episodic retrieval are reliable over time," Journal of Cognitive Neuroscience 14:8, 1200-1214 PDF

Summary: A fairly extensive critique of group analyses and averaging as the sole criterion of activation in a group study. Authors review a number of studies on individual variability and episodic retrieval, and demonstrate that when six subjects from a 2001 study were retested with the same paradigm, their patterns of activation were extremely consistent with those from the earlier test, suggesting that individual differences among them aren't simply noise. Some exploration of the relationship between individual brain pattern and real-life performance.

Bottom line: Just looking at your group averages isn't good enough. It's important to look at your individual subjects' data, because differences between them may be meaningful and might be related to performance in a variety of ways.


Holmes & Friston (1998), "Generalisability, random effects and population inference," NeuroImage 7, S754 PDF

Summary: Really the original paper on random-effects tests in neuroimaging. Argues that population inference can only truly be supported by random-effects analyses, and presents the strategy SPM99 and forward would take in doing group testing for random effects: the hierarchical, separable model in which each subject is modeled separately and summary images of their activations are then analyzed at a new level.

Bottom line: Turns out you need random effects for population inference; no more getting away with a whole bunch of scans on two or three people...

McGonigle et. al (2000), "Variability in fMRI: an examination of intersession differences," NeuroImage 11, 708-734 PDF

Summary: A look not at individual differences between subjects, but at difference in patterns of activation within a given subject between sessions. This paper provides a good example of how the same data can be analyzed both with fixed-effects and random-effects models, and of the different conclusions that can be drawn from activations found with each type of model. Includes a nice look at what types of variability show up with each type of model (Fig. 8, p. 729).

Bottom line: Random effects analyses raise the threshold of variability necessary for significance, but with the bonus of supporting potentially more interesting kinds of inference.

Aguirre et. al (1998), "The variability of human BOLD hemodynamic responses," NeuroImage 8, 360-369 PDF

Summary: An excellent look at the underlying assumption of identical shape of HRF across subject, day and scan session. Similar to the above McGonigle et. al in its use of fixed- and random-effects models to make different inferences about variability due to different factors.

Bottom line: Another interesting example of the usefulness of both types of models in the same study to demonstrate different effects or influences.


Realignment FAQ

Frequently Asked Questions - Realignment

1. What is realignment / motion correction?

In a perfect world, subjects would lie perfectly still in the scanner while you experimented on them. But, of course, in a perfect world, I'd be six foot ten with a killer JumpHook, and my car would have those hubcaps that spin independently of the wheels. Sadly, I only have three of my original Camry hubcaps, and subjects are too darned alive to hold perfectly still. When your subject moves in the scanner, no matter how much, a couple things happen:

  • Your voxels don't move with them. So the center of your voxel 10, 10, 10, say, used to be just on the front edge of your subject's brain, but now they've moved a millimeter backwards - and that same voxel is now sampling from outside the brain. If you don't correct for that, you're going to get a blurry-looking brain when you average your functional effect over time. Think of the scanner as a camera taking a really long exposure - if your subject moves during the exposure, she'll look blurry.
  • Their movement generates tiny inhomogeneities in the magnetic field. You've carefully prepared your magnetic field to be perfectly smooth everywhere in the scanner - so long as the subject is in a certain position. When she moves, she generates tiny "ripples" in the field that can cause transient artifacts. She also subtly changes the magnetic field for the rest of the experiment; if she gradually moves forward, for example, you may see the back of her brain get gradually brighter over the experiment as she changes the field and moves through it. If you don't account for that, it'll look like the back of her brain is getting gradually more and more active.

    Realignment (also called motion correction - they're the same thing) mainly aims to help correct the first of these problems. Motion correction algorithms look across your images and try to "line up" each functional image with the one before it, so your voxels always sample the same location and you don't get blurring. The second problem is a little trickier - see the question below on including movement parameters in your design matrix, as well as the e-mails in RealignmentPapers for further discussion. This issue also comes up in correcting for physiological movement, so check out PhysiologyFaq as well.

2. How do the major programs' algorithms work? How do they perform relative to each other?

SPM, AFNI, BrainVoyager, AIR 3.0, and most other major programs, all essentially use modifications of the same algorithm, which is the minimization of a least-squares cost function. The algorithm attempts to find the rigid-body movement parameters for each image that minimizes the voxel-by-voxel intensity difference from the reference image.
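A toy one-dimensional sketch of that least-squares idea - not any package's actual implementation, which estimates six rigid-body parameters (three translations, three rotations) with sub-voxel interpolation:

```python
import numpy as np

# Toy 1-D analogue of least-squares realignment: find the integer shift
# that minimizes the summed squared intensity difference from a reference.
reference = np.array([0., 0., 1., 4., 9., 4., 1., 0., 0., 0.])
moved = np.roll(reference, 2)            # "subject" shifted by 2 voxels

def cost(shift):
    # The least-squares cost function the realignment programs minimize.
    return np.sum((np.roll(moved, -shift) - reference) ** 2)

best = min(range(-4, 5), key=cost)
print(best)   # recovers the 2-voxel shift
```

Note that the cost function only looks at intensity differences - which is exactly why task-related or global intensity changes can fool it (see question 9 below).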

The particular implementation of the algorithm varies widely between programs, though. Ardekani et. al (below) does a detailed performance breakdown of SPM99 vs. AFNI98 vs. AIR 3.0 vs. TRU. AFNI is by far the fastest and also the most accurate at lower SNRs; SPM99, though slower, is the most accurate at higher SNRs. See below for more detail...

3. How much movement is too much movement?

Tough to give an exact answer, but Ardekani et. al find that SPM and AFNI can handle up to 10mm initial misalignment (summed across x/y/z dimensions) without significant trouble. Movement in any single dimension greater than the size of a single voxel between two images is probably worth looking at, and several images with greater than one-voxel motion in one run is a good guideline for concern.
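Applying those guidelines to a set of motion parameters might look like this sketch (the translation values and the 3mm voxel size are made up for illustration; rotations are omitted for brevity):

```python
import numpy as np

# Hypothetical realignment parameters: rows = scans, columns = x/y/z
# translations in mm (as in the first three columns of an rp_*.txt file).
trans = np.array([[0.0, 0.0, 0.0],
                  [0.1, 0.2, 0.0],
                  [3.5, 4.0, 3.0],    # big movement here
                  [0.2, 0.1, 0.1]])

voxel_size = 3.0                       # mm, per-dimension guideline
summed = np.abs(trans).sum(axis=1)     # summed across x/y/z

flag_summed = summed > 10.0            # the ~10mm summed guideline
flag_single = (np.abs(trans) > voxel_size).any(axis=1)
print(flag_summed.tolist(), flag_single.tolist())
```

Either flag going off on several scans in one run is the "good guideline for concern" mentioned above.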

4. How should you correct motion outliers? When should you just throw them out?

Attempting to correct extreme outliers is a tricky process. On the one hand, images with extremely distorted intensities due to motion can have a measurable distortion effect on your results. On the other hand, distinguishing intensity changes due to motion as opposed to task-related signal is by no means an easy process, and removing task-related signal can also measurably worsen your results (relative to both Type I and II errors).

Our current thinking in the lab is that outlier correction should be attempted only when you can find isolated scans that show significantly distorted global intensities (several standard deviations away from the mean) within a TR or two of a significant head movement. A significant head movement without a global intensity change is probably handled best by the realignment process itself; a significant intensity change without head motion may have nothing to do with motion. The artdetect (Global Variate button) script is designed to do this for SPM data; see ArtifactDetection for step-by-step instructions, which may be useful also for other programs that have easy ways to view intensities and motion parameters.

Another option is to simply censor (i.e., not use) the images identified as iffy; this is easier in AFNI than in SPM. This has the disadvantage of possibly distorting your trial balancing in a given session if whole trials are removed, as well as losing whatever task signal there may be in that scan. It has the advantage of being more statistically valid - outlier correction with interpolation obviously introduces new temporal correlation into your timeseries.

Several things might make a particular session entirely unusable: several isolated scans with head motion of greater than 10mm (summed across x/y/z); several scans with head motion in a single direction greater than the size of a single voxel; a run of several scans in a row with significant motion and significant intensity change; high correlation of your motion parameters with your task (see below). All subjects should be vetted for these problems before their results are taken seriously...
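The outlier logic above - a big global intensity change within a TR or two of a big movement - can be sketched like this (not the actual artdetect code; the intensity values, the 2-SD threshold, and the movement index are all made up for illustration):

```python
import numpy as np

# Global mean intensity per scan (hypothetical), and the index of a scan
# where a large head movement was found in the motion parameters.
globals_ = np.array([100., 101., 99., 100., 115., 100., 101., 99.])
movement_scan = 4

# Flag scans whose global intensity is several SDs from the run mean
# AND that fall within 2 TRs of the head movement.
z = (globals_ - globals_.mean()) / globals_.std(ddof=1)
intensity_outlier = np.abs(z) > 2.0
near_movement = np.abs(np.arange(globals_.size) - movement_scan) <= 2
candidates = np.where(intensity_outlier & near_movement)[0]
print(candidates)   # [4]
```

Scans flagged this way are the candidates for correction or censoring; an intensity spike far from any movement, by contrast, wouldn't be blamed on motion.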

5. How can you tell if it's working? Not working?

Realignment in general is pretty robust; the least-squares algorithm will always produce some solution. It may, however, get caught in a non-optimal solution, particularly with scans that have a lot of motion and/or a big intensity change from the one before. It's difficult to evaluate realignment's effects post hoc just by looking at your results; the best way to make sure it's worked is visual inspection. SPM's "Check Reg" button will allow you to simultaneously display up to 15 images at once, side-by-side with a crosshair placed identically in each of them, to make sure a given voxel location lines up among several images. You may want to look particularly at scans you've identified with significant head motion, as well as comparing the first and last images in your run...

6. Should I include movement parameters in my design matrix? Why or why not?

The e-mail threads on RealignmentPapers are the best examination of that issue from the people who know. In a nutshell, even after realignment, various effects like interpolation errors, changes in the shim from head movement, spin history effects, etc. can induce motion-correlated intensity changes in your data. Including your motion parameters in your design matrix can do a very good job of removing these changes; including values derived from these parameters, such as their temporal derivative, or sines of their values (see Grootonk et. al at RealignmentPapers) can do an even better job.

The reason not to include these parameters is that there's a pretty good chance you also have actual task-related signal that happens to correlate with your motion parameters, and so removing all intensity changes correlated with motion may also significantly decrease your sensitivity to task-related signal. Before including these parameters in your matrix, you're probably wise to check how much your motion correlates with your task to make sure you're not inadvertently removing the signal you want to detect.

7. What is 'unwarping'? Why is it included as a realignment option in SPM2? And when should I use it?

The follow-up e-mail thread on RealignmentPapers is a good overview of the issue - thanks to Trey Hedden for bringing this issue up. Head motion can cause artifacts for a variety of reasons - gradual changes in the shim, changes in dropout, changes in slice interpolations, spin-history effects - but certainly one of the big ones is motion-by-susceptibility interactions. In areas typically vulnerable to susceptibility-induced artifacts - artifacts caused by magnetic field distortion due to air-tissue interfaces - head motion can cause major changes in those susceptibility regions, and intensities around the interface can change in unpredictable ways as the head moves. The guys at SPM describe it as being like a funhouse mirror near the interface - there's usually some spot of blackout (signal dropout) where susceptibility is really bad, but right around it, the image is distorted in weird ways, and sliding around can change those distortions in unpredictable ways.

Motion-by-susceptibility interaction is certainly one of the biggest sources of motion-related artifact, and some people think it's THE biggest, and so the "unwarp" utility in SPM2 is an attempt to specifically address it. Even if you get the head lined up image-to-image (realignment), these effects will remain, and so you can try and remove them additionally (unwarping). This is essentially a slightly less powerful but very much more specific version of including your motion parameters in the design matrix - you'll avoid almost all the problems about task-correlated motion that go with including your motion parameters as correlates, but (hopefully) you'll get almost all the same good effects. The benefits will be particularly noticeable in high-susceptibility regions (amygdala, OFC, MTL).

One BIG caveat about unwarping, though - as it's currently implemented, I believe it's ONLY usable for EPI images, NOT for spiral. So if you use spiral images, you shouldn't use this. But if you use EPI, it can be worth a try, particularly if you're looking at high-susceptibility regions. Check the follow-up e-mail thread and paper for more info.

8. What's the best base image to realign to? Is there any difference between realigning to an image at the beginning of the run and one in the middle of the run?

Not a huge difference, if any. AFNI has a program (findmindiff, I think) that identifies the image in a particular series that has the least difference from all the other images, which would be the ideal one to use. In practice, though, there's probably no significant difference between using that and simply realigning to the first image of the run, unless you have very large (10mm+) movement over the course of the run, in which case the session is probably of questionable use as well...

9. When is realigning a bad idea?

The trouble with the least-squares algorithm that realignment programs use is that it's easily fooled into thinking differences in intensity between images are all due to motion. If those differences are due to something else - task-related signal, or sudden global intensity changes - the realignment procedure can be fooled and come up with a bad realignment. If the realignment is particularly bad, it can completely obscure your signal, or (arguably worse) generate false activations! This is most pressing in the case of task-correlated motion (see below for discussion), but if you have significant global intensity shifts during your session that aren't motion-related, your realignment will probably introduce - rather than remove - error into your experiment. There are other realignment methods you can use to get around this, but they're slow. See Friere & Mangin on RealignmentPapers, and the CoregistrationFaq page.

10. What can I do about task-correlated motion? What's the problem with it?

See Bullmore et. al, Field et. al, and Friere & Mangin in RealignmentPapers for more details about this issue. The basic problem stems from the fact that head motion doesn't just rotate and shift the head in an intensity-invariant fashion. Head motion actually changes the image intensities, due to inhomogeneity in the magnetic field, changes in susceptibility, spin history effects, etc. If your subject's head motions are highly correlated with your task onsets or offsets, it can be impossible to tell how much of a given intensity change is due to head motion and how much is due to actual brain activation. The effect is that task-correlated motion can induce significant false activations in your data. Including your motion parameters in your design matrix in this case, to try and account for these intensity changes, will hurt you the other way - you'll end up removing task-correlated signal as well as motion and miss real activations.

The extent of the problem can be significant. Field et. al, using a physical phantom (which doesn't have brain activations) were able to generate realistic-looking clusters of "activation," sometimes of 100+ voxels, with head movements of less than 1mm, simply by making the phantom movements increasingly correlated with their task design. Bullmore et. al point out that patient groups frequently have significant differences in how much task-correlated motion they have relative to normals, which can significantly bias between-group comparisons of activation.

Even worse, the least-squares algorithm commonly used to realign functional images is biased by brain activations, because it assumes intensity changes are due to motion and attempts to correct for them. As Friere & Mangin point out, even if there's no motion at all, the presence of significant activations could cause the realignment to report motion parameters that have task-correlated motion!

So what can you do? First and foremost, you should always evaluate how correlated your subjects' motion is with your task - the parameters themselves and linear combinations of them. (An F-test can do this - we'll have a script available for this in the lab shortly.) The correlation of your parameters with your task is hugely more important than the size of your motion in generating false activations. Field demonstrated false activation clusters with correlations above r = 0.52. If your subject exhibits very high correlation, there's not much you can do - they're probably not usable, or at least their results should be taken with a grain of salt. There are some techniques (see Birn et. al, below) that may help distinguish activations from real motion, but they're not perfect...
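One simple way to quantify that correlation is to regress your task regressor on the six motion parameters and look at the R² of the fit (the F-test mentioned above is a significance test of this same fit, and the linear combinations are handled automatically). A sketch with simulated data - the scan count, boxcar design, and amount of task-correlated motion are all assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated session: a boxcar task regressor and six motion parameters,
# one of which (x-translation) partly tracks the task.
n_scans = 120
task = np.tile([1.0] * 10 + [0.0] * 10, 6)          # boxcar task regressor
motion = rng.normal(0, 0.1, (n_scans, 6))
motion[:, 0] += 0.3 * task                          # task-correlated motion

# Fit task ~ motion params + intercept; R^2 measures how much of the
# task regressor the motion parameters can explain.
X = np.column_stack([motion, np.ones(n_scans)])
beta, *_ = np.linalg.lstsq(X, task, rcond=None)
resid = task - X @ beta
r2 = 1 - resid.var() / task.var()
print(f"R^2 = {r2:.2f}")                            # high => worrisome
```

Given Field's finding of false activations above r = 0.52 (R² around 0.27), an R² in that neighborhood or higher is a red flag for the session.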

Bullmore et. al, below, report some ways to account for task-correlated motion that may be useful.

Even without any task-correlated motion, though, you should be aware your motion parameters may be biased, as above, towards reporting a correlation. This is not usually a problem with relatively small activations; it may be bigger with very large signal changes. You can avoid the problem entirely by using a different realignment algorithm - based on mutual information, like the algorithms here (CoregistrationFaq) - but these are impossibly slow, and not practically usable.

Among the usable algorithms, Morgan et. al reported SPM99 was the best at avoiding false-positive voxels... Keep an eye out for more robust algorithms coming in the future, though... And you may want to try and use one of the prospective motion correction algorithms, as described in Ward et. al. at RealignmentPapers.


Realignment HOWTO

How-Tos - Realignment

How do I...

Do realignment in SPM

Realignment works in two stages. First, the first files from each session are realigned to the first file of the first session. Second, within each session, the second, third, etc... (2..n) images are realigned to the first image of the session. Thus, after realignment, all files are effectively realigned to the first file of the first session.

The process produces text files with the estimated realignment (or motion) parameters for each session. These are the rp_V001.txt (or realignment_params_V001.txt) stored in each session's directory. They contain 6 columns and one row for each V-file.

The columns are the estimated translations in mm ("right", "forward", "up") and the estimated rotations in rad ("pitch", "roll", "yaw") that are needed to shift each V-file.

These text files can be used later at the statistics stages to enter the estimated motion parameters as user-specified regressors in the design matrix. If you choose 'all images' in 'Create What?', realignment will create r*.img files. The process usually takes several hours, depending on the total number of V-files and the computer you're using.

Number of subjects: You'd most likely enter 1 here.

Num sessions for subject 1: Enter the number of sessions for the entire experiment.

scans for subj 1, sess1: Select SCAN1/V*.img, then click DONE.

scans for subj 1, sess2: Select SCAN2/V*.img, then click DONE.

Repeat the same until all sessions (and subjects) are done.

Select the default, coregister & reslice.

  • Selecting coregister only will realign by creating .mat files containing realignment transformations to be applied to the corresponding V-files (i.e. no new image files will be produced -- the images will not be resliced).
  • Selecting reslice only will produce new rV*.img files. The V*.img files you import will be transformed according to their corresponding V*.mat files (given that they exist) and the resulting images will be written out as rV*.img files with the usual corresponding rV*.hdr files. No rV*.mat files will be created.
  • Selecting coregister & reslice will both realign the selected files (creating the *.mat files) and reslice them. The files produced will depend on what is entered in response to the "Create what?" prompt below.

Select the default, Sinc Interpolation.

Select mean image only.

  • All images (1..n) will produce as many rV*.img files as V*.img files are selected. If normalization will not be performed, the images should probably be resliced at this stage.
  • Images 2..n will produce a new rV*.img for each V*.img file loaded, except the first V*.img from the first session. Since every file in the sequence is being aligned to this first file, strictly speaking, it does not need to be transformed or resliced. For practical reasons, however, it is better if all files have the same prefix (which won't be the case if you choose this option).
  • All images + mean image will produce as many rV*.img files as V*.img files are selected, and in addition, a new mean_V001.img file will be produced in the first session's directory. The mean_V001.img file is the mean image of all files selected for realignment and it can be used with subsequent coregistration and normalization steps. Although the mean_V001.img is stored in the directory of the first session, it is actually the mean of all of the sessions together. The mean image is created after realignment and has the realignment transformations already applied to it.
  • Mean image only will produce only one new mean_V001.img file, without reslicing the selected V*.img files. If normalization will be performed next, there is no need to reslice the V*.img files at this stage. The transformation parameters saved in each V*.mat file will be applied to the images during the normalization stage. This way, the images will be resliced only once. In general, unnecessary reslicing should be avoided, because the images lose some spatial resolution every time it's performed.

Adjust sampling errors? No. (See RealignmentFaq for why.)

Realignment will produce a chart of the corrections made and save it to a file.

Include my motion parameters in my design matrix

SPM: You'll include your motion parameters as user-specified regressors in your design matrix (see BasicStatisticalModelingFaq for more on user-specified regressors). In order to include them, you'll need to have already run realignment, and you'll need to know where your realignment parameter files are (rp_V001.txt, or realignment_parameters_V001.txt). Usually they can be located in each session's functional image directory - one parameter file for each session.

Begin setting up your design matrix as usual (see BasicStatisticalModelingHowTos for step-by-step instructions). When asked if your conditions are replicated between sessions, say "no."

You'll be asked about your onset vectors and parametric modulations for each condition. After you've entered in your conditions for the first session, you'll be asked how many user-specified regressors you want. Enter "6."

When you're prompted to enter the regressor, type spm_load. This will bring up a file menu. Choose the realignment parameters file for the first session.

You'll then be asked to enter the names for each regressor. You can leave them as default, but if you'd like to label them in the design matrix, you should enter, in order, "x," "y," "z," "pitch," "roll," and "yaw."

Repeat the same steps for each session in your experiment. Your design matrix should then contain the motion parameters as separate columns in your design matrix. You should leave them out of most contrasts, but you can make contrasts with them to see motion-related activity in your experiment.

If you want to enter some derivation of your motion parameters - like the sine or cosine, or temporal derivative of them - you can embed the spm_load command into a Matlab function, like sin or abs. In this case, instead of typing simply spm_load at the prompt, you'll enter sin(spm_load), for example. If the derivation you want isn't a simple Matlab function which operates on every element of a matrix, you'll need to construct it in the Matlab workspace before you start entering your model, and refer to it by variable name instead.
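Outside of SPM and Matlab, the same derived regressors are easy to build by hand. A sketch in Python - the params array here stands in for the contents of an rp_V001.txt file (one row per scan, six columns):

```python
import numpy as np

# Hypothetical motion parameters: x/y/z translations in mm, then
# pitch/roll/yaw rotations in radians, one row per scan.
params = np.array([[0.0, 0.1, 0.0, 0.001, 0.000, 0.002],
                   [0.2, 0.1, 0.1, 0.002, 0.001, 0.002],
                   [0.3, 0.2, 0.1, 0.002, 0.001, 0.003]])

sines = np.sin(params)                        # like sin(spm_load) in SPM
deriv = np.vstack([np.zeros(6),               # temporal derivative, first
                   np.diff(params, axis=0)])  # scan padded with zeros

# Extra design-matrix columns: raw params, their sines, their derivatives.
design_extra = np.hstack([params, sines, deriv])
print(design_extra.shape)   # (3, 18)
```

These 18 columns would then be appended to the design matrix alongside your condition regressors, just as the raw parameters are in the SPM steps above.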


Realignment Links

Links - Realignment

E-mail thread (1999) between Field and Ashburner about entering motion parameters into your design matrix (click 'Next in Topic' once you're there to follow the thread along) Thread

E-mail thread (2000) between Flaisch, Henson, and others, more about motion parameters in the design (click 'Next in Topic' once you're there to follow the thread along) Thread

Summary: For both of these: Including your motion parameters in your design matrix is like regressing out motion-correlated signal, which can reduce your false positives at the expense of reducing your true positives. Values derived from the parameters (sines, squares, etc.) can also be useful.

Bottom line: Including these depends heavily on your own study - how much motion, how big your signal is. Some objective tests (like Skudlarski et. al PDF) have shown, in general, that it doesn't make a huge difference either way. Probably worth testing on your own studies...

E-mail thread (2003) from Jesper Andersson, who wrote the unwarping code, and others, about why unwarping is useful and how it differs from including your motion parameters in your matrix. (click 'Next in Topic' once you're there to follow the thread along) Thread

Summary: Good overview of unwarping from the guy that wrote it.

Bottom line: If you use EPI, it can be useful. Unwarping is a more highly targeted version of including your motion parameters in your design matrix - instead of taking out ALL motion-correlated variance (including real activations), it only knocks out motion-by-susceptibility artifacts, and those hopefully account for a good chunk of your motion artifact, particularly in high-susceptibility regions.


Realignment Papers

Useful Papers - Realignment


Ardekani et. al (2001), "A quantitative comparison of motion detection algorithms," Magnetic Resonance Imaging 19, 959-963 PDF

Summary: Just what it says - tests SPM99, AFNI98, AIR, and TRU (pyramid) against each other on a data set generated from real data but with known misalignments. Evaluates how close each algorithm comes to correctness and how fast each runs.

Bottom line: SPM99 and AFNI98 outperform AIR and TRU by a wide margin, particularly with a big misalignment. AFNI98 is almost as good as SPM99 (better at some SNRs) and is wayyyy faster (we needed an experiment to tell us this?).

Bullmore et al. (1999), "Methods for diagnosis and treatment of stimulus-correlated motion in generic brain activation studies using fMRI," Human Brain Mapping 7, 38-48 PDF

Summary: Introduces the issue of task-correlated motion, demonstrates a between-group study that is severely biased by it, and introduces some methods to account for the degree of correlation and, in some small ways, correct for it.

Bottom line: Task-correlated motion can severely bias between-group studies, but you can compensate to some degree for it if you can measure the degree of the correlation.

E-mail thread (1999) between Field and Ashburner about entering motion parameters into your design matrix (click 'Next in Topic' once you're there to follow the thread along) Thread

E-mail thread (2000) between Flaisch, Henson, and others, more about motion parameters in the design (click 'Next in Topic' once you're there to follow the thread along) Thread

Summary: For both of these: Including your motion parameters in your design matrix is like regressing out motion-correlated signal, which can reduce your false positives at the expense of reducing your true positives. Values derived from the parameters (sines, squares, etc.) can also be useful.

Bottom line: Including these depends heavily on your own study - how much motion, how big your signal is. Some objective tests (like Skudlarski et al., PDF) have shown, in general, that it doesn't make a huge difference either way. Probably worth testing on your own studies...
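
The "derived values" idea from these threads can be sketched in a few lines. This is a toy illustration (not SPM code): given six motion-parameter time series, build extra design-matrix columns from the raw parameters, their squares, and their sines.

```python
import math

def expand_motion_params(params):
    """Given 6 motion-parameter time series (lists of floats), return
    design-matrix columns: the raw parameters, their squares, and their
    sines - the kinds of derived values the threads above mention."""
    cols = []
    for p in params:
        cols.append(list(p))                   # raw parameter
        cols.append([v * v for v in p])        # squared
        cols.append([math.sin(v) for v in p])  # sine
    return cols

# Toy data: 6 parameters, 4 timepoints each
params = [[0.1 * t * (i + 1) for t in range(4)] for i in range(6)]
cols = expand_motion_params(params)
print(len(cols))  # 18 columns: 6 raw + 6 squared + 6 sine
```

Whether you then enter all 18 columns, or just the 6 raw ones, is exactly the study-specific tradeoff the bottom line describes.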


Field et al. (2000), "False cerebral activation on BOLD functional MR images: study of low-amplitude motion weakly correlated to stimulus," American Journal of Neuroradiology 21, 1388-1396 PDF

Summary: Uses a computer-controlled physical phantom to see if highly task-correlated but small motions can induce false activations in data.

Bottom line: Yes, in a big way. Even with submillimeter motions, head motion that correlated better than r = 0.52 or so routinely generated false activations - sometimes very realistic-looking and significant false activations. Evaluate the task correlation of your motion parameters!
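
"Evaluate the task correlation of your motion parameters" just means computing a Pearson correlation between each motion trace and your task regressor. A minimal sketch with made-up illustration data (the 0.52 danger zone is the rough threshold from Field et al. above):

```python
def pearson_r(x, y):
    """Plain Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

task = ([0] * 10 + [1] * 10) * 4  # boxcar regressor: 10 scans off, 10 on
# Hypothetical motion trace that partly follows the task, plus a small
# unrelated drift term - purely synthetic numbers for illustration
motion = [0.3 * t + 0.02 * (i % 7) for i, t in enumerate(task)]

r = pearson_r(motion, task)
flagged = abs(r) > 0.52  # past the correlation level where Field et al. saw false activations
print(round(r, 2), flagged)
```

In practice you'd run this check on each of the six columns of your realignment-parameter file.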

Ward et al. (2001), "Prospective multiaxial motion correction for fMRI," Magnetic Resonance in Medicine 43, 459-469 PDF

Summary: Demonstrates the effectiveness of a prospective (i.e., during image acquisition) motion correction algorithm that handles 3D correction and removes worries about intensity changes or realignment validity.

Bottom line: This is one direction realignment research is going; early testing showed their algorithm effectively compensated for motion of up to 10mm and up to 10 degrees, but at the cost of 320 msec per TR.

Grootonk et al. (2000), "Characterization and correction of interpolation effects in the realignment of fMRI time series," NeuroImage 11, 49-57 PDF

Summary: Argues that residual motion-correlated intensity changes after realignment are largely the result of interpolation errors in the realignment, and that including the sines and cosines of your motion parameters in your design matrix can account for these.

Bottom line: The algorithm handles interpolation errors very well, but is still subject to the same concerns about including motion parameters in your design matrix discussed above...

Freire & Mangin (2001), "Motion correction algorithms may create spurious brain activations in the absence of subject motion," NeuroImage 14, 709-722 PDF

Summary: Introduces the problem with least-squares algorithms, and demonstrates how they can induce false activations with both simulated and real data. Also demonstrates that non-least-squares methods don't all suffer from the problem.

Bottom line: Motion parameters can be biased by activations with large signal changes; other algorithms like mutual information don't suffer from that problem.

Morgan et al. (2001), "Comparison of functional MRI image realignment tools using a computer-generated phantom," Magnetic Resonance in Medicine 46, 510-514 PDF

Summary: Similar to Ardekani et al., but explicitly compares tools in terms of activation results, not just correctness of algorithm. Uses computer-generated data to compare SPM99, AFNI98, SPM96, AFNI96, and two types of AIR.

Bottom line: Most of the algorithms perform about the same for true-positive data (besides SPM96, which suffers), but SPM99 is better than the others at removing false positives.


E-mail thread (2003) from Jesper Andersson, who wrote the unwarping code, and others, about why unwarping is useful and how it differs from including your motion parameters in your matrix. (click 'Next in Topic' once you're there to follow the thread along) Thread

Summary: Good overview of unwarping from the guy that wrote it.

Bottom line: If you use EPI, it can be useful. Unwarping is a more highly targeted version of including your motion parameters in your design matrix - instead of taking out ALL motion-correlated variance (including real activations), it only knocks out motion-by-susceptibility artifacts, and those hopefully account for a good chunk of your motion artifact, particularly in high-susceptibility regions.

Andersson et al. (2001), "Modeling geometric distortions in EPI time series," NeuroImage 13, 903-919 PDF

Summary: Haven't read it yet meself, but this lays out the mathematical background for unwarping - describes the model of susceptibility-by-motion interactions that SPM now uses to calculate the deformation fields and remove their effects. They describe a further paper, but I don't know of it...

Birn et al. (1999), "Event-related fMRI of tasks involving brief motion," Human Brain Mapping 7, 106-115 PDF

Summary: Several tasks of potential psychological interest - talking, swallowing, tongue movement - involve head motion that introduces significant artifacts into fMRI images, making it difficult to distinguish task-related activation in these trials from artifactual activation. But because the temporal profiles of the two sources are different, it is possible in principle to separate the two in event-related designs, which is what Birn et al. attempt here, with some success.



Frequently Asked Questions - SPM

This set of pages is intended to answer as many common questions about SPM's software quirks as humanly possible. But these pages are not intended to collect everything about SPM - rather, they're intended as a catch-all for SPM questions that don't fall into a theoretical category. If, for example, you have a question about how to do some normalization thing in SPM, check NormalizationHowTos before you tack it onto these pages. You can get a complete listing of the topics that already have FAQs and How-tos at CategoryFaq and CategoryHowTos, respectively.

But if your question doesn't seem to fit any of the categories about analyses already up there... here's the place for you.

Many SPM questions tend to be either of the form, "How do I do x in SPM?" or "My SPM crashed for some x reason - what's the deal?" SpmHowTos and CommonSpmErrors, respectively, are good places for those questions. This FAQ page is intended to be more about general SPM questions - documentation, how to find out more, etc. Good rants are always, of course, welcome.

1. What is SPM? What does it do, exactly?

SPM stands for Statistical Parametric Mapping. And the question of what it's doing can be answered at NutshellSpm. (That page will be edited down shortly and a summary stuck here as well. But it's late right now.)

2. Where can I find some documentation for SPM99?

Well, here, hopefully. :) But if you mean official-type documentation...

The basic, which-button-do-I-push, manual is here:

It doesn't have a lot of info about what exactly SPM is doing, though - just how to make it run.

For help on a particular function in SPM99, the "help" button is often very useful; many of the functions have useful comments at the beginning of their code that are displayed by the "help" button.

You can also check out the archives of the SPM e-mail list (see SpmLinks).

3. Where can I find documentation for SPM2?

Unfortunately, there isn't an "official" manual yet for SPM2. The SPM99 manual above may be somewhat helpful, especially for spatial preprocessing questions; those functions haven't changed a ton between SPM99 and SPM2.

Beyond that, the SPM2 help is a good starting point - many of the functions have helpful comments that will be displayed with its built-in help function. You can access it with the "help" button in the SPM main interface. You can also search the SPM e-mail list archives (see SpmLinks) for a particular question.

4. What's new in SPM2?

Here's a summary of the changes they've made, although some of it's pretty dense:

5. In the SPM data structures, why is the design matrix called xX.X? Why is the filter called K? Why is there a field called SPM.xY.VY? Why are the stimulus functions put in a variable called SPM.Sess.U.u? What does that even mean?

No one knows. Not sure anyone actually writing the software even knows. Possibly a bizarre plot to drive people using SPM insane.

6. What would possess a group writing software to release a highly-anticipated new version of their software and a) make it totally incompatible with any earlier version of their software, to the point of making it actually dangerous to the truth of your results to mix versions and b) change variable and output names apparently at random from one version to the next with no explanation as to why? Why? Why would someone do such a thing?

See answer to question 5.


Scanning FAQ

Frequently Asked Questions - fMRI Scanning

This section is intended to address design-related questions that focus primarily on technical aspects about the scanner - things like TR, pulse sequence, slice thickness, etc. Obviously, setting your scanner parameters is mixed in heavily with your experimental design, so be sure to check out some other design-related pages:

ScanningPapers has some nice handouts from Gary Glover and Philippe, summarizing some of these articles and (very informally and clearly) addressing questions about TR length, signal-to-noise tradeoffs, what k-space is exactly, etc. Definitely check those out.

1. What pulse sequence should I use (EPI or spiral)? What are the pros and cons of each? What does each of them get you?

  • EPI: More widely used, and hence supported by all fMRI analysis programs. Some programs (FSL, or SPM's unwarping module, for example) do not support spiral data. Can be subject to less drop-out in some regions than spiral-in or spiral-out data alone. Can be easier to figure out what the slice ordering is.
  • Spiral: Properly weighted and combined, spiral in-out shows significantly less signal drop-out and shows significantly greater activations in many areas of the brain, including ventromedial PFC, medial temporal lobe, etc. Effect is even more pronounced at higher field strengths (see Preston et al. on ScanningPapers). However, it is less widely supported, and Gary's trademark spiral i-o sequence may not even be physically possible on some other institutions' equipment.

2. What should your TR be? What are the tradeoffs, and what's the best tradeoff of coverage vs. speed for different types of analysis?

Bottom line: TR should be as short as possible, given how many slices you want to cover and the limits of your task. Gary's handout and monograph on ScanningPapers speak best to this issue, and are good quick reads. Decreasing your TR decreases your signal-to-noise ratio (SNR) in any one functional image, but because you have more images to work with, your overall SNR increases with decreasing TR. Your TR, however, is limited by how many slices you want to take. On the 3T scanner here at Stanford, using spiral in-out, each slice takes approximately 65 msec (TE = 30 msec), so you can get 15 full slices in 1 second (on the 1.5T, slices take about 75 msec, so you can get 13 full slices in 1 sec.). Your tradeoff is that with fewer slices, you have to either accept less coverage of the brain, or thicker slices, which will have poorer resolution in the z direction.
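
The slice-count arithmetic above is simple enough to check directly, using the approximate per-slice times quoted for the Stanford scanners:

```python
# Approximate per-slice acquisition times (spiral in-out) quoted above
slice_time_ms = {"3T": 65, "1.5T": 75}

# How many full slices fit in one second at each field strength
slices_per_second = {field: 1000 // t for field, t in slice_time_ms.items()}
print(slices_per_second)  # {'3T': 15, '1.5T': 13}
```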

For certain experiments, then - ones focused on primary sensory or motor cortices, where you don't care about the rest of the brain - you can buy yourself shorter TRs by decreasing your number of slices, or you can increase your number of slices and make them smaller, while keeping your TR constant. Assuming you need full coverage of the brain, you can only decrease your TR by making your slices thicker, which you should do as much as possible within the constraints of your desired resolution.

Alternatively, in experimental designs where you're not particularly focused on timecourse information and where you already have good statistical power - namely, block-design experiments - you may want to get better resolution by increasing your number of slices and hence your TR. In event-related experiments, having a short TR becomes even more important, due to the relative lack of experimental power in such designs and the relative importance of timecourse information.

3. What should your slice thickness be?

As thick as you think you can get away with. Increasing your slice thickness allows you to decrease your TR and maintain the same coverage, which is desirable as you get better SNR with decreasing TR. Alternatively, if you need good resolution in all dimensions, you can shrink your slice thickness at the expense of either brain coverage or having a longer TR.

4. What should your slice resolution / voxel size be?

64 x 64 is standard around here for full-brain coverage. With experiments focusing on smaller areas - primary motor and/or sensory cortices - something else (like 128 x 128) may be useful to get better resolution in a smaller area.

This differs from what size you interpolate your voxels to in normalization, which is covered in NormalizationFaq...

5. Should you acquire axially/coronally/something else? How come?

Big issue here, as I understand it, is that your slices are often thicker than your in-slice voxels, and hence your resolution is often poorest in the direction perpendicular to your slices. (Hence, if you acquire axially, your inferior-superior or z-direction resolution may not be great.) If you have a particular structure of interest, depending on its orientation, you may want to arrange your slices so as to get good resolution in the direction necessary to nail down that structure. Anyone else have any comments on this one?

6. BOLD vs. perfusion: what are pros and cons of each? What sorts of experiments would you use perfusion for?

Perfusion imaging - in which arterial blood is magnetically 'labeled' with an RF pulse, and then tracked as it moves through the brain - has two main advantages we discussed, one of which is thoroughly discussed in the Aguirre article in ScanningPapers. That advantage is the relatively different noise profile present within the perfusion signal. Unlike BOLD, perfusion noise doesn't have very much autocorrelation, which isn't by itself anything special, but means that perfusion contains much, much less noise relative to BOLD at very low experimental frequencies. There is more noise overall in perfusion imaging, though, so in general SNR is better for BOLD. But for experiments with very low task-switching frequency - say, blocks of 60s or more, even up to many minutes or hours - BOLD is almost useless, due to the preponderance of low-frequency noise, whereas the perfusion signal is unchanged. This means that with block lengths of longer than a minute, perfusion imaging is probably a better way to go, and experiments which previously weren't possible - block lengths of several minutes, or task switching taking place over several days - might be designed with perfusion.
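
The autocorrelation point can be illustrated with a toy simulation - synthetic noise, not real scanner data. An AR(1) series (a stand-in for autocorrelated BOLD-like noise) carries far more power at very low frequencies than white noise (a stand-in for perfusion-like noise), which is exactly why very slow designs suffer in BOLD:

```python
import random

random.seed(0)
n, block = 20000, 100

# White noise: no autocorrelation (perfusion-like, for illustration)
white = [random.gauss(0, 1) for _ in range(n)]

# AR(1) noise with phi = 0.9: heavily autocorrelated (BOLD-like, for illustration)
ar = [0.0]
for _ in range(n - 1):
    ar.append(0.9 * ar[-1] + random.gauss(0, 1))

def lowfreq_power(x, b=block):
    """Variance of b-sample block means ~ power at frequencies below 1/b."""
    means = [sum(x[i:i + b]) / b for i in range(0, len(x), b)]
    mu = sum(means) / len(means)
    return sum((m - mu) ** 2 for m in means) / len(means)

ratio = lowfreq_power(ar) / lowfreq_power(white)
print(ratio > 10)  # True: the autocorrelated series dominates at low frequencies
```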

Another feature of the noise in perfusion imaging is that it appears to be more reliable across subjects. While SNR within a given subject is higher for BOLD, group SNR appears (with limited data in Aguirre on ScanningPapers) to be higher in perfusion imaging. More research is needed on this subject, but this relative SNR advantage may be useful for experiments with small numbers of subjects, as across-subject variability is always the largest noise source for BOLD experiments, often by a huge factor.

The other primary advantage of perfusion relative to BOLD imaging is that the perfusion signal is an absolute number, rather than a contrast. Each voxel is given a physiologically intelligible value - amount of cerebral blood flow - which means that it can be especially useful for comparing populations. Comparing the results of a particular contrast in depressed vs. normal subjects might not yield any results, for example, but overall blood flow might just be lower in the absolute in particular regions for depressed subjects relative to normal subjects - which would be very interesting. The ability to compare baselines in perfusion is a strong case for using it in particular types of experiments where baseline information may be interesting.


Scanning HOWTO

How-Tos - fMRI Scanning

How do I...

Use the spiral in-out pulse sequence?

The best way to do this is to work off of an existing protocol that uses spiral in/out. When viewing a functional series at the scanner, the "PSD name" field under the "imaging parameters" heading should show "sprlio". See the section on "How do I set up a scanning protocol at the Lucas Center?" (coming soon) for more info.

Figure out what TR I need to use?

It depends on the number of slices you want (i.e. how much coverage of the brain you'd like). In fMRI, there is a trade-off between number of slices and the TR. This is because the time it takes to acquire a slice is relatively fixed. As of 4/2/04, sprlio (at 22 cm FOV) acquires a slice at 66.2 ms. This means that you can acquire 15 slices in 1 sec (15 slices * 66.2ms/slice), but 15 slices would most likely not cover the entire brain. It's more likely that 23 slices will (1.5 sec TR). See Gary Glover's time/slice specifications for more: info
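
The trade-off above as arithmetic: with sprlio at 22 cm FOV acquiring a slice every 66.2 ms, the minimum TR for a given slice count is just n_slices * slice_time.

```python
SLICE_MS = 66.2  # per-slice acquisition time quoted above (sprlio, 22 cm FOV, as of 4/2/04)

def min_tr_seconds(n_slices, slice_ms=SLICE_MS):
    """Shortest TR that fits n_slices at the given per-slice time."""
    return n_slices * slice_ms / 1000.0

print(round(min_tr_seconds(15), 2))  # 0.99 - about 15 slices per second
print(round(min_tr_seconds(23), 2))  # 1.52 - why 23 slices needs a ~1.5 s TR
```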

Send my data files from scanner to the Gabrieli Lab/BIAC servers

When collecting new data, you need to transfer anatomical and functional data to your data directory on the Gablab/BIAC servers. A UNIX program on the Gablab/BIAC servers, called "do_fmri_lite", conveniently accomplishes a number of steps to help you in this process. First, it creates the relevant directories for a new subject in your data directory (i.e. anatomy/ raw/ behave/ functional/ etc/), then it retrieves anatomical files from the Lucas Center (i.e. it runs brain_script automatically and copies over the resulting .tar file). At the moment (June 4, 2004), do_fmri_lite does not have the full functionality to copy over the functional files (the P* and E* files). To transfer these files, you will need to use ftp from the scanner console to the Gablab/BIAC servers and put the files into the subject's raw/ directory. Incidentally, makevols_batch (see below) assumes that you have the functional/ and raw/ directories in place and that your E* and P* files are in raw/.

Copy over the anatomical data

  • If you don't use "do_fmri_lite" to transfer the anatomical files (see above), you'll have to use the following procedure to transfer anatomical files:

1. login to ""

2. type "brain_script" to get info on usage. example: "brain_script -e <exam-number> -d <directory-name> -3"

3. cd /temp/gabrieli

4. you'll find a tar file in this directory called "<directory-name>.tar"

5. copy this file to the gablab/BIAC servers using secure copy, "scp". example

scp <directory-name>.tar user@sperry:/my/data/directory/subject/anatomy/

6. when logged into your gablab/BIAC account,

cd /my/data/directory/subject/anatomy/
tar xvf <directory-name>.tar
mv <directory-name>/anatomy/* .

Reconstruct my functional volumes?

1. What files do I need before reconstructing raw functional data?
Short answer: P*.mag and E-file.
Long answer: The scanner produces 4 files after each functional run (P-file, P*.mag file, E-file, and P*.hdr file). Of these, you only need the P*.mag and E-files for reconstruction. The P*.mag (e.g., P12800.7.mag) is the auto-reconstructed data that is produced automatically at the scanner. It is a single file that contains information about all of the volumes in the functional run. The E-file is a text file containing information about the run (e.g., how many images were collected, the TR, etc.). An example of an E-file is the following: E09797S004P12800.7. Note that the "E" refers to the exam number which follows (E09797), the "S" refers to the series number (S004), and the "P" refers to the P-file number (P12800.7). (The P-file (e.g., P12800.7) is the raw data that comes out of the scanner; the P*.hdr file (e.g., P12800.7.hdr) contains the same info as the E-file, but is not easily readable - you won't do much with this and can easily delete it without regret.)
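
The E-file naming convention described above is regular enough to pull apart programmatically. A small sketch based on the example name in the text (illustrative, not a lab tool):

```python
import re

# E-file names look like E<exam>S<series><P-file>, e.g. E09797S004P12800.7
EFILE_RE = re.compile(r"^E(\d+)S(\d+)(P[\d.]+)$")

def parse_efile(name):
    """Split an E-file name into its exam, series, and P-file parts."""
    m = EFILE_RE.match(name)
    if m is None:
        raise ValueError("not an E-file name: %s" % name)
    exam, series, pfile = m.groups()
    return {"exam": exam, "series": series, "pfile": pfile}

print(parse_efile("E09797S004P12800.7"))
# {'exam': '09797', 'series': '004', 'pfile': 'P12800.7'}
```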

2. What programs do I run to reconstruct the data?
"makevols" will produce individual analyze images (i.e., V001.img plus V001.hdr, etc.) for each run. Typing in "makevols" alone at the unix prompt will give you some instructions on how to use it. You must me in the same directory as your P*.mag files for the program to work. One common usage is: "makevols <E-file> <scan-name>", where <E-file> corresponds the the P*.mag file of interest and <scan-name> is the prefix you'd like to add to your images (i.e., "scan1" would produce images named "scan1.V001.img"). So, one example of a call to "makevols" might be: "makevols E09797S004P12800.7 scan1".

*"I have 10 functional runs (i.e. 10 P.mag files) and running makevols ten times per subject is so tedious! Can I run it en masse in batch form?"*

YES, you can run "makevols_batch" to run makevols for several scans at once. Typing the name at the command line will give you instructions on how to use it. Briefly, the program runs makevols for each scan and moves the resulting volume images to their respective scan directory (within the functionals directory). It also gives an option to move the first specified number of images into a subdirectory in each scan dir (this is useful for when the NEXTRA in the scanning protocol is set to 0, meaning that the scanner has saved the first few images rather than tossing them out). You'll need to make a text file containing two columns, one containing Efiles, the other containing corresponding scan names. If you're using makevols individually for each scan, you'll appreciate the time this saves you.
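
A sketch of what the two-column text file might look like and how each line maps to a makevols call. The file contents and second E-file name here are invented for illustration; makevols_batch itself is the lab's own script and does more (moving images into scan directories, etc.):

```python
# Hypothetical contents of the two-column file: E-file, then scan name
scans_txt = """\
E09797S004P12800.7 scan1
E09797S005P12801.7 scan2
"""

# Each line becomes one makevols invocation
commands = []
for line in scans_txt.splitlines():
    if not line.strip():
        continue
    efile, scan = line.split()
    commands.append("makevols %s %s" % (efile, scan))

print(commands[0])  # makevols E09797S004P12800.7 scan1
```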

3. What do I do if I don't have an "E-file" (i.e. I forgot to transfer it from the scanner or I accidentally deleted it)?

No worries, you can use the program writeihdr at the UNIX prompt to generate the E-file.
Simply type: writeihdr <Pfile> and you'll get a text file with the header info.

Reconstruct my anatomical volumes?

The program, "" is a perl script that will fulfill most of your anatomical reconstruction needs. It simply takes a collection of "I-files" (the raw image data in which each I file represents a slice) and converts them to a single ANALYZE format volume (e.g., V001.img and corresponsing V001.hdr).

Typing, "" at the UNIX prompt will give you usage info:

Usage: -f <nframes> -s <nslices>

-i <inplane resolution> -t <thruplane resolution>
-c <condition 1> -c <condition 2> ...

Optional arguments:

-x <byte-swap data>
-nodelete <don't delete I files>
-cuthdr <cut GE headers>
-sag <saggital orientation>
-cor <coronal orientation>
-z <use if slice 1 is inferior (axial), left (saggital), or anterior (coronal)>
-64 <for 64x64 images>
-256 <for 256x256 images>
-d <debug option, no execution>
changes files in directories named after -c from I.x format used
in IDL to volume format used in SPM, and byte-swaps files for use
on Linux machine (e.g., cajal) (if -x option is used).

Here are some examples:

TO MAKE HI-RES ANATOMY IMAGE: (S2_SAG): -f 1 -s 124 -i 0.9375 -t 1.5 -c s2:sag -x -nodelete -cuthdr -256 -z -sag

TO MAKE INPLANE ANATOMY IMAGE: (S3_INPLANE): -f 1 -s 23 -i .9375 -t 5.0 -c s3:inplane -x -nodelete -cuthdr -256 -z

NOTE that the "inplane resolution" (the -i option) is the field of view divided by the image matrix size (i.e. 240/256 in the above examples)

Do perfusion imaging?


Scanning Papers

Useful Papers - fMRI Scanning

Also check out DesignPapers, JitterPapers, FmriPhysicsPapers and PhysiologyPapers...


Aguirre et al. (2002), "Experimental design and the relative sensitivity of BOLD and perfusion fMRI," NeuroImage 15, 488-500 PDF

Summary: Aguirre et al. experimentally examine the noise profile and power profile of perfusion imaging, demonstrating that perfusion should show increased power relative to BOLD when task-switching frequencies go below blocks of around 60 sec, as well as showing the relatively greater across-subject noise for BOLD, and the lack of autocorrelation in perfusion noise.

Bottom line: Try using perfusion imaging for extremely long block-length experiments, or possibly for any experiment with a group analysis.

Glover & Law (2001), "Spiral in/out BOLD fMRI for increased SNR and reduced susceptibility artifacts," Magnetic Resonance in Medicine 46, 515-522 PDF

Summary: Gary describes his trademark spiral in-out sequence, a combination of spiral-in and spiral-out images weighted in a variety of ways, and shows that its use significantly increases SNR over traditional spiral methods and greatly reduces drop-out due to magnetic susceptibility.

Bottom line: If your institution supports it, spiral in-out beats traditional methods senseless for reducing drop-out due to susceptibility and should be used.


Preston et al. (2004), "Comparison of spiral-in/out and spiral-out BOLD fMRI at 1.5T and 3T," NeuroImage 21, 291-301 PDF

Summary: Preston et al. compare a set of tasks known to activate susceptibility-heavy areas using spiral in-out imaging at both 1.5T and 3T, showing that where traditional spiral methods can actually increase dropout at higher field strengths, spiral in-out gets better activation volumes and less dropout at 3T than at 1.5T, confirming that spiral in-out is better for a variety of real experimental tasks than traditional spiral.

Bottom line: No, really, we mean it. You should definitely use spiral in-out, even if you have a giant magnet.

Yang et al. (2000), "A CBF-based event-related brain activation paradigm: characterization of impulse-response function and comparison to BOLD," NeuroImage 12, 287-297 PDF

Summary: Yang et al. describe ways to substantially improve the effective TR of perfusion imaging (although some schemes don't work with all experimental designs), and use event-related paradigms to describe the shape of the impulse response function with perfusion, which is similar to that in BOLD, but precedes it in both rising and falling. The perfusion IRF shows similar nonlinearities to the BOLD IRF.

Bottom line: Perfusion can be effectively used in an event-related setting, and the hemodynamic response in perfusion precedes that in BOLD on both onset and offset.


Glover (1999), "On signal to noise ratio tradeoffs in fMRI," monograph. PDF

Summary: Gary discusses various tradeoffs about SNR, such as length of TR and number of interleaves, and walks through the consequences of changing those parameters both in mathematical models and in single-subject data.

Bottom line: Use the smallest TR possible; the decrease in SNR in any individual image is more than offset by the increase in image number. Fewer interleaves are better, as is higher field strength.

Glover handout: Summarizes the above monograph, with a little bit extra about slice thickness. Handout

Philippe handout: A nice primer on k-space and perfusion, and some good page-long summaries on the primary articles. Handout

Constable & Spencer (2001), "Repetition time in echo planar functional MRI," Magnetic Resonance in Medicine 46, 748-755. PDF

Summary: Authors attempt to determine, from theoretical and empirical footings, the optimal TR for a given experimental paradigm, and come to the conclusion that shorter TRs (1-1.5 sec) vs. longer (3-4 sec) offer greater statistical power, at the expense of some other tradeoffs.


Segmentation FAQ

Frequently Asked Questions - Segmentation

1. What is segmentation?

Segmentation is the process by which you separate your brain pictures into different tissue types. You give the segmentation program a brain image, and it classifies every voxel by tissue type - grey matter, white matter, CSF, skull, etc. Some segmentation algorithms operate on a probabilistic basis rather than a "hard" classification (so one voxel might be 60% likely to be grey, 10% likely to be white, etc.). Some segmentation algorithms go further than tissue type, and classify individual anatomical regions as well (see SegmentationPapers). Segmentation algorithms often give back output images, consisting of all the grey voxels in the brain, for example.

A subset of segmentation algorithms focus only on the problem of separating brain from skull tissue; these are often called "skull-stripping" or "brain-extraction" algorithms. The problem of classifying brain from skull is slightly easier than classifying different brain tissue types, but many of the same problems are faced, so we lump them in together with general segmentation algorithms.

2. Why should you segment?

Lotta reasons. Might be you're interested in the details of the segmentation - how much gray matter is in a particular region, how much white matter, etc. A lot of those analyses fall under the label of voxel-based morphometry (VBM), discussed below in the Ashburner & Friston paper. Alternatively, you might be interested in masking your analysis with one of the segments and only examining activated voxels that are in gray matter in a particular region. You might want to segment only to increase the accuracy of another preprocessing step - you might care that your normalization, say, is especially good in gray matter while you don't care as much about its accuracy in white matter. Simply extracting the brain has even more utility; some analysis programs or preprocessing steps require you to strip skull tissue off the brain before using them. You might simply want to create an analysis mask of all the brain voxels and ignore the other ones.

All of those issues would require you to identify which voxels of your image (almost always anatomical) are gray matter, which are white, and which are CSF or skull or other stuff (or at least which are brain and which are not). You can do this by hand, but it's an arduous and hugely time-consuming process, infeasible for large groups of subjects. Several automated methods are available, though, to do it. Generally, the algorithms take some input image and produce labels for every voxel, assigning them to one of the categories above, or sometimes anatomical labels as well (see below). Alternatively, some algorithms exist that do a "soft" classification and assign each voxel a certain probability of being a particular tissue type. Which you use will depend on exactly what your goals for segmentation are.

Finally, segmentation algorithms are increasingly being used not only to separate tissue types but to automate the production of individualized anatomical ROIs. Would you like to hand-draw your caudate or thalamus, say, but figure it'll take too long or be too hard? Automated segmentation algorithms could be used to simplify the process.

3. What are the problems I might face with segmentation?

Segmentation algorithms face two big issues: intensity overlap and partial voluming. Intensity overlap refers to the fact that the intensity distributions for different tissue types aren't completely separate - they have significant overlap, such that a bright voxel might be a particularly bright gray matter voxel or, just as easily, a particularly bright white matter voxel. Because all segmentation algorithms have to work with is the intensity value at each voxel (and those around it), this poses a problem for hard classifications. As well, inhomogeneities in the magnetic field, susceptibility-induced magnetic changes or head motion during acquisition can all produce gradual shadings of light or dark in images that can make the different tissue types even harder to distinguish - a particular brightness level might be gray matter at the front of the head, but white matter at the back of the head. One way to address these problems is to take spatial location into account; at the simplest level, voxels can always be assigned a high probability of being the same tissue type as the voxels around them (spatial coherence), or one can use a more sophisticated method like incorporating a full prior probability atlas (see Fischl et al. and Marroquin et al., below).

Partial voluming refers to the fact that even high-res MRI has a limited spatial resolution, and a given voxel might include signal from several different tissue types to varying degrees. This is particularly important along tissue-type borders, where if an algorithm is biased towards one tissue type or another, estimates of one tissue type's volume within an area can be significantly inflated or deflated from reality. One way to address this problem is with "soft" classification - instead of semi-arbitrarily assigning voxels to definite tissue types, one can assign voxels probabilities of being in a tissue type, and take that confidence level into account when deciding tissue volume, etc.
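
The "soft" classification idea is easy to sketch. Below is a minimal Python illustration, assuming Gaussian intensity likelihoods and flat priors; the tissue classes, means, and standard deviations are made-up illustration values, not parameters from any real segmentation package:

```python
import numpy as np

# Hypothetical T1 intensity models for three tissue classes
# (means/stds are made-up illustration values, not real scanner units).
classes = {"csf": (30.0, 10.0), "gray": (70.0, 12.0), "white": (110.0, 10.0)}

def soft_classify(intensity):
    """Return P(class | intensity) for each tissue class, assuming
    Gaussian intensity likelihoods and flat priors."""
    likelihoods = {
        name: np.exp(-0.5 * ((intensity - mu) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))
        for name, (mu, sd) in classes.items()
    }
    total = sum(likelihoods.values())
    return {name: lk / total for name, lk in likelihoods.items()}

# A voxel midway between the gray and white means is genuinely ambiguous:
probs = soft_classify(90.0)
```

In this toy model a voxel at intensity 90 gets substantial probability of being either gray or white matter - exactly the intensity-overlap and partial-volume ambiguity described above - and a downstream volume estimate can weight it accordingly instead of forcing a hard call.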

4. How are coregistration and segmentation related?

Fischl et al. (SegmentationPapers) make the point that the two processes are two sides of the same coin - each one can help solve the other. With a perfect coregistration algorithm, you could be maximally confident that you could line up a huge number of brains and create a perfect probability atlas - allowing you the best possible prior probabilities with which to do your segmentation. In order to do a good segmentation, then, you need a good coregistration. But if you had a perfect segmentation, you could vastly improve your coregistration algorithm, because you could coregister each tissue type separately and greatly improve the sharpness of the edges of your image, which increases mutual information.

Fortunately, MI thus far appears to do a pretty good job with coregistration even in unsegmented images, breaking us out of a chicken-and-egg loop. But future research on each of these processes will probably include, to a greater and greater extent, the other process as well.

Check out CoregistrationFaq for more info on coregistration...


Segmentation HOWTO

How-Tos - Segmentation

How do I...

Do segmentation in SPM?

The below instructions are for SPM2. SPM99 is almost identical, with a couple of small differences: you'll specify the number of subjects explicitly to start, and you'll be explicitly asked about inhomogeneity correction. SPM2 specifies that information in the segmentation defaults, and the options chosen here are roughly the defaults in SPM2.

Hit "Segment" in SPM's main interface.

Select images, subj 1? Choose your anatomical image - V001.img. Hit done.

Select images, subj 2? Hit done (or choose other subjects if you want to do them all at once).

Already spatially normalized? "No" if you're segmenting the original anatomy; "Yes" if you've chosen the wV001.img file (or nV001.img in SPM99).

Modality? Generally T1 MR. Choose T2 MR if you have a T2-weighted anatomical (if gray matter looks bright and white matter looks dark, for example). If you've chosen a functional image - echo-planar or spiral imaging both - choose EPI MR.

(SPM99 only) Attempt to correct intensity inhomogeneities? Choose "Lots of inhomogeneity correction." Inhomogeneity correction (or bias correction) attempts to remove any gradual intensity changes in the image due to scanner effects - say, the middle appearing much brighter than the edges. This is a big problem for segmentation algorithms, and SPM's bias correction algorithm does a good job of accounting for it. If you notice very strange effects in your segmented images (like gray matter fading in and out across the image irregularly, for example), you can try lessening the inhomogeneity correction or turning it off. But generally it's a good idea to turn it on full-bore, which is the default in SPM2.

(SPM99 only) Write inhomogeneity-corrected image? Choose "Write corrected image." This will actually write the corrected image to disk, which is nice to have around.

The image will now segment, and you'll get at least three images out as output: V001_seg1.img, V001_seg2.img, V001_seg3.img. The first is the gray-matter-only image, the second is the white-matter-only image and the third is the CSF-only image. If you chose to write the bias-corrected image (in SPM99) or if you're using SPM2 (which writes it by default), you'll get a fourth image, mV001.img (or corr_V001.img in SPM99), which is the original image, corrected for any nonlinear biases in its intensity, so it should look much more even and unshaded. Use "Check Reg" to compare the two if you like. The segmented images can now be used in whatever way your little heart desires, so long as it's within the law.

Do skull-stripping (or brain extraction)?

SPM99 had a specific function for doing brain extraction that actually wrote an image file out. But then, in SPM2, they folded a good deal of the code in that function into the segmentation function, but didn't leave any option to write an all-brain mask image. The nice thing, though, is that what made SPM99's skull-stripping necessary was that the gray matter image often left a lot of skull on the image. That's no longer true (supposedly) in SPM2, so you shouldn't have to do skull-stripping there. But you still might want to have an all-brain mask image, or maybe your gray matter mask still has skull on it after all. For whatever reason, you might find yourself still wanting to use the skull-stripper in SPM.

So you have two separate things you can do in SPM. One is to use spm_xbrain, the SPM99 facility. You do this if a) you're using SPM99 and want a skull-stripped gray matter image and/or skull-stripped all-brain mask, or b) you're using SPM2 and you aren't happy with how much skull is left in your gray matter mask.

The second possibility is only for those using SPM2. If you've segmented with SPM2 and you want an all-brain mask, you can simply combine the gray and white matter masks into one. You can't do this in SPM99, because there's usually skull left on the gray matter mask, and the skull-stripping function automatically writes a combined mask for you anyway. In SPM2, you have to do the combination manually, but you don't have to strip the skull.

Instructions for both are here:

Option 1: Use spm_xbrain, the SPM99 facility. You can use this even if you've been using SPM2 for the other stages of your analysis; you won't need any SPM99-specific files, and it won't do anything tricky to the images. To go this route:

Start SPM99 and hit "Render" in the SPM main interface. Choose "Xtract Brain" in the drop-down menu.

Select gray and white matter images: Select your gray and white matter images - this is V001_seg1.img and V001_seg2.img in your anatomy/inplane directory.

Save what? Save Extracted Brain. The other options are for doing 3-D rendering of your brain if you like.

The rendering will combine the segmented images and erode the skull, to produce a brain_V001.img file in the same directory. If you're feeling extra crazy, you can intersect the brain mask with the gray matter image to skull-strip your gray-matter mask, as follows:

Hit "ImCalc" in the main SPM interface.

Select images to work on : Select inplane/V001_seg1.img, inplane/brain_V001.img, DONE.

Output filename : Type V001_seg1_noskull.

Evaluated function : Select i1.*i2. This will multiply, voxel-by-voxel, the value of the gray matter image with the value of the skull-stripped brain, effectively producing an intersection of the two (since any zero in one will zero out the voxel in the final image).

This will produce a new file, inplane/V001_seg1_noskull.img, in anatomy/inplane. Display the image to check how it looks. You can also compare it to inplane/V001_seg1.img using the "Check Reg" button.

Option 2: Combine the gray and white matter masks into an all-brain mask. This is essentially what is done, by the way, by the specmask script (see GablabScripts). specmask also smooths the brain to make extra sure it's got the whole brain; we won't do that, to avoid getting any skull.

Hit "ImCalc" in the main SPM interface.

Select images to work on : Select inplane/V001_seg1.img, inplane/V001_seg2.img, DONE.

Output filename: Type union_V001.img.

Evaluated function: Type i1 + i2. This will add, voxel-by-voxel, the value of the gray matter image with the value of the white matter image, effectively producing the union of the two.

In order to turn this image into a mask image, though, we need to make it binary-valued. So hit "ImCalc" again, and choose union_V001.img as the image to work on.

Output filename: Type brain_V001.img.

Evaluated function: Type i1 > 0.5. This will replace the value of every voxel whose value is greater than 0.5 with 1, and every other voxel with 0. The resulting image file, brain_V001.img, should be a fairly tight mask of the brain-only voxels in your anatomical file. Use "Check Reg" to compare its outlines with the outlines of the brain in your anatomical to make sure.
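
For the curious, here's what those ImCalc expressions compute, sketched with numpy on toy arrays; the random volumes stand in for the real segmented images, which you'd load from disk with an imaging library:

```python
import numpy as np

# Toy 3-D volumes standing in for V001_seg1.img (gray) and V001_seg2.img (white).
rng = np.random.default_rng(0)
gray = rng.random((4, 4, 4))   # soft gray-matter probabilities
white = rng.random((4, 4, 4))  # soft white-matter probabilities

# ImCalc "i1 + i2": voxelwise union of the two tissue images.
union = gray + white

# ImCalc "i1 > 0.5": binarize the union into an all-brain mask.
brain_mask = (union > 0.5).astype(np.uint8)

# ImCalc "i1 .* i2" (Option 1's intersection): zero out everything
# outside the mask, e.g. any leftover skull.
gray_noskull = gray * brain_mask
```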


Segmentation Papers

Useful Papers - Segmentation


Fischl et al. (2002), "Whole brain segmentation: Automated labeling of neuroanatomical structures in the human brain," Neuron 33, 341-355 PDF

Summary: Summary of some of the problems facing intensity-based segmentation and presentation of an algorithm that segments whole brains into various anatomical structures in completely automated fashion. The algorithm uses not just intensity information but a good deal of atlas information about the prior probability of anatomical structure locations. Fischl et al. show it has very good correlation with labeling by hand, both in normals and in patients with mild Alzheimer's. The software is commercial, and we may or may not be able to lay our hands on it here...

Bottom line: Intensity-based segmentation is hard, and anatomical labeling is harder still. If you can compile a good atlas of tissue types and anatomical locations, though, you can use prior probability information to do a pretty good automated job of segmentation.


Ashburner & Friston (1997), "Multimodal image coregistration and partitioning - a unified framework," NeuroImage 6, 209-217 PDF

Summary: The original paper defining the old SPM (pre-SPM2) way of doing coregistration. The authors suggest defining within-modality templates that are already coregistered and using least-squares methods to coregister the experimental images to those templates. Using segmentation during the coregistration can help improve the success and accuracy of that registration.

Bottom line: A bit obsolete these days; SPM2 has moved to MI coregistration, which is simpler and shows better success rates.

Marroquin et al. (2002), "An accurate and efficient bayesian method for automatic segmentation of brain MRI," IEEE Transactions on Medical Imaging 21(8), 934-945 PDF

Summary: Technical paper demonstrating another Bayesian-based probability atlas for doing automatic brain segmentation (Fischl et al. above use another). Some of the important problems in segmentation are surveyed in mathematical detail - things like partial voluming, etc. Different aspects of the algorithm - the values of the priors, the assumed spatial coherence, the noise, etc. - are varied to test their effects on the algorithm.

Bottom line: Bayesian parameter estimation can do a good job handling segmentation - with the right atlas. Far and away the biggest effect on the success of these atlases is the value of the priors, so a good atlas is essential...

Ashburner & Friston (2000), "Voxel-based morphometry — the methods," NeuroImage 11, 805-821 PDF

Summary: Didn't get to these next two, but this paper summarizes the whole analysis path in doing detailed analyses of structural images, say for a between-group study of anatomy. A number of the big problems in the path - inhomogeneity, segmentation, etc. - are discussed.


Yushkevich et al. (2006), "User-guided 3D active contour segmentation of anatomical structures: Significantly improved efficiency and reliability," NeuroImage 31, 1116-1128 DOI

Summary: Presents ITK-SNAP, a free, open-source tool for user-guided 3-D segmentation, and demonstrates that SNAP segmentations of the caudate agree highly with manual segmentations by trained raters, and are even more reproducible (in overlap terms) than manual segmentations. SNAP segmentation also went significantly faster and required a hell of a lot less training.

Bottom line: ... It's almost... like a real piece of software... written by software engineers or something? <gasp> Honestly, this seems like really a pretty good tool for making individualized anatomical ROIs quickly and reproducibly.


Slice Timing FAQ

Frequently Asked Questions - Slice Timing Correction

1. What is slice timing correction? What's the point?

In multi-shot EPI (or spiral methods, which mimic them on this front), slices of your functional images are acquired throughout your TR. You therefore sample the BOLD signal at different layers of the brain at different time points. But you'd really like to have the signal for the whole brain from the same time point. If a given region spanning two slices activates all at once, for example, you want to see what the signal looks like from the whole region at once; without correcting for slice timing, you might think the part of the region sampled later was more active than the part sampled earlier, when in fact you just sampled the later part at a point closer to the peak of its HRF.

What slice-timing correction does is, for each voxel, examine the timecourse and shift it by a small amount, interpolating between the points you ACTUALLY sampled to give you back the timecourse you WOULD have gotten had you sampled every voxel at exactly the same time. That way you can make the assumption, in your modeling, that every point in a given functional image is the actual signal from the same point in time.

2. How does it work?

The standard algorithm for slice timing correction uses sinc interpolation between time points, which is accomplished by a Fourier transform of the signal at each voxel. The Fourier transform renders any signal as the sum of some collection of scaled and phase-shifted sine waves; once you have the signal in that form, you can simply shift all the sines on a given slice of the brain forward or backward by the appropriate amount to get the appropriate interpolation. There are a couple of pitfalls to this technique, mainly around the beginning and end of your run, highlighted in Calhoun et al. below, but these have largely been accounted for in the currently available modules for slice timing correction in the major programs.
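
For illustration, the phase-shift trick looks like this in numpy. This is a bare-bones sketch, not the code any package actually ships, and it ignores the start/end-of-run pitfalls just mentioned:

```python
import numpy as np

def slice_time_shift(timecourse, shift_trs):
    """Shift a voxel's timecourse by a fraction of a TR via the Fourier
    phase-shift theorem: delaying a signal by s multiplies each frequency
    component f by exp(-2*pi*i*f*s). This is sinc interpolation in disguise."""
    freqs = np.fft.fftfreq(len(timecourse))   # frequencies in cycles per TR
    spectrum = np.fft.fft(timecourse)
    shifted = np.fft.ifft(spectrum * np.exp(-2j * np.pi * freqs * shift_trs))
    return shifted.real

# Delay a pure sine by half a TR; for a periodic, band-limited signal the
# result matches sampling the sine half a TR earlier in its cycle.
t = np.arange(64)
signal = np.sin(2 * np.pi * 3 * t / 64)
shifted = slice_time_shift(signal, 0.5)
```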

3. Are there different methods or alternatives and how well do they work?

One alternative to doing slice-timing correction, detailed below in Henson et al., is simply to model your data with a basis set that accounts for variability in HRF onset times - i.e., including regressors in your model that convolve your data with both a canonical HRF and with its first temporal derivative, which is accomplished with the 'hrf + temporal derivative' option in SPM. In terms of detecting sheer activation, this seems to be effective, despite the loss of some degrees of freedom in your model; however, your efficiency in estimating your HRF is very significantly reduced by this method, so if you're interested in early vs. late comparisons or timecourse data, this method isn't particularly useful.

Another option might be to include slice-specific regressors in your model, but I don't know of any program that currently implements this option, or any papers that report on it...

4. When should you use it?

Slice timing correction is primarily important in event-related experiments, and especially if you're interested in doing any kind of timecourse analysis, or any type of 'early-onset vs. late-onset' comparison. In event-related experiments it's very important: Henson et al. show that, without slice timing correction, aligning your model's timescale to the top or bottom slice can result in completely missing large clusters on the slice opposite the reference slice. This problem is magnified if you're doing interleaved EPI; any sequence that places adjacent slices at distant temporal points will be especially affected by this issue. Any event-related experiment should probably use it.

5. When is it a bad idea?

It's never that bad an idea, but because the most your signal could be distorted by is one TR, this type of correction isn't as important in block designs. Blocks last for many TRs, and figuring out what's happening at any single TR is not generally a priority; although the interpolation errors introduced by slice timing correction are generally small, there's not much point in introducing them if they're not needed. But if you're interested in doing any sort of timecourse analysis (or if you're using interleaved EPI), it's probably worthwhile.

6. How do you know if it’s working?

Henson et al. and both Van de Moortele papers below have images of slice-time-corrected vs. uncorrected data, and they demonstrate signatures you might look for in your own; the main characteristic is the absence of significant differences between adjacent slices. I hope to post some pictures of the SPM sample data here in the next couple of weeks, analyzed with and without slice timing correction, to explore this in a more obvious way.

7. At what point in the processing stream should you use it?

This is the great open question about slice timing, and it's not super-answerable. Both SPM and AFNI recommend you do it before doing realignment/motion correction, but it's not entirely clear why. The issue is this:

  • If you do slice timing correction before realignment, you might look down your non-realigned timecourse for a given voxel on the border of gray matter and CSF, say, and see one TR where the head moved and the voxel sampled from CSF instead of gray. This would result in an interpolation error for that voxel, as the algorithm would interpolate part of that big, artifactual signal into the neighboring timepoints.
  • On the other hand, if you do realignment before slice timing correction, you might shift a voxel or a set of voxels onto a different slice, and then you'd apply the wrong amount of slice timing correction to them when you corrected - you'd be shifting the signal as if it had come from slice 20, say, when it actually came from slice 19, and shouldn't be shifted as much.

    There's no way to avoid all the error (short of doing a four-dimensional realignment process combining spatial and temporal correction - possibly coming soon), but I believe the current thinking is that doing slice timing first minimizes your possible error. The set of voxels subject to such an interpolation error is small, and the interpolation into another TR will also be small and will only affect a few TRs in the timecourse. By contrast, if one realigns first, many voxels in a slice could be affected at once, and their whole timecourses will be affected. I think that's why it makes sense to do slice timing first. That said, here are some articles from the SPM e-mail list that comment helpfully on this subject both ways, and there are even more if you do a search for "slice timing AND before" in the archives of the list.

  • Thread from Rik Henson
  • Argument from Geoff Aguirre
  • Response to Aguirre from Ashburner

8. How should you choose your reference slice?

You can choose to temporally align your slices to any slice you've taken, but keep in mind that the further away from the reference slice a given slice is, the more it's being interpolated. Any interpolation generates some small error, so the further away the slice, the more error there will be. For this reason, many people recommend using the middle slice of the brain as a reference, minimizing the possible distance away from the reference for any slice in the brain. If you have a structure you're interested in a priori, though - hippocampus, say - it may be wise to choose a slice close to that structure, to minimize what small interpolation errors may crop up.

9. Is there some systematic bias for slices far away from your reference slice, because they're always being sampled at a different point in their HRF than your reference slice is?

That's basically the issue of interpolation error - the further away from your reference slice you are, the more error you're going to have in your interpolation - because your look at the "right" timepoint is a lot blurrier. If you never sample the slice at the top of the head at the peak of the HRF, the interpolation can't be perfect there if you're interpolating to a time when the HRF should be peaking - but hopefully you have enough information about your HRF in your experiment to get a good estimation from other voxels. It's another argument for choosing the middle slice in your image - you want to get as much brain as possible in an area of low interpolation error (close to the reference slice).

10. How can you be sure you're not introducing more noise with interpolation errors than you're taking out with the correction?

Pretty good question. I don't know enough about signal processing and interpolation to say exactly how big the interpolation errors are, but the empirical studies below seem to show a significant benefit in detection by doing correction without adding much noise or many false positive voxels. Anyone have any other comments about this?


Slice Timing HOWTO

How-Tos - Slice Timing Correction

How do I...

Do slice timing correction in SPM?

This should be done before any other preprocessing steps (coregistration, normalization, realignment, etc.). Slice timing will produce aV*.img files.

Number of subjects/sessions: You'd typically enter 1 here. However:

  1. If you have multiple sessions, enter the number of sessions.
  2. If you have multiple subjects, enter the number of subjects multiplied by the number of sessions to get the total number of sessions you need slice-time corrected. (For example, you'd enter 12 if you had 3 subjects, 4 sessions per subject).

Select all SCAN1/V* images for your first session, then all SCAN2/V* images, etc., then repeat for each additional subject.

Select ascending if the way you prescribed and collected slices was starting from the bottom of the brain and going to the top (inferior to superior). Select descending if you prescribed and collected top-down. For any spiral imaging, select ascending (see Why ascending in spiral?).

Selecting user specified or interleaved would only be necessary if the slices were not collected sequentially (in space, that is). See the how-to for specifying a sequence in this case: Do slice timing correction with an interleaved slice sequence?

Reference slice (1=bottom): Enter the slice you want to consider as a reference point. All other slices will be corrected to what they would have been if they were acquired when the reference slice was acquired. The default is the middle slice (although, please make sure the default value given is indeed the middle slice for the number of slices you have). The default is generally fine, but make sure you write down what the value was! You should adjust your defaults for it down the road. Also read the SliceTimingFaq for thoughts on choosing your reference slice. If you have a structure of a priori interest (e.g., the hippocampus or something), it may make sense to choose a different slice than the default.

NB: The number you enter here is selected regardless of the acquisition sequences specified in 'Select sequence type' above. To decide what number you should enter here, disregard how your sequence was collected, and imagine a brain with slices numbered sequentially from bottom to top : 1 2 3 4 5 6 ... up to your #_of_slices. Now select the number corresponding to the slice you would like to use as reference. The number you have now is correct if you are using the standard spm99 release.

Interscan interval (TR) {sec}: Enter the TR in seconds.

Acquisition time (TA) {sec}: Enter the time between the beginning of acquisition of the first slice and the beginning of acquisition of the last slice of one scan. Typically (i.e., for continuous acquisition protocols), TA = (TR/#_slices)*(#_slices - 1). The default value is calculated according to this formula.

For some specific acquisition protocols however (for instance clustered acquisition), the gap between acquiring the last slice of one scan and the first slice of the next scan may be considerably longer than the gap between acquiring sequential slices within the same scan. In such cases, you should calculate and enter a TA different than the default value.
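
As a quick sanity check on the formula, with hypothetical values (TR = 2 s, 24 slices):

```python
# TA for a continuous acquisition: time from the start of the first slice's
# acquisition to the start of the last slice's (hypothetical TR and slice count).
tr = 2.0
n_slices = 24
ta = (tr / n_slices) * (n_slices - 1)
# Equivalently TA = TR - TR/n_slices: one slice-acquisition gap short of a full TR.
```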

Why ascending in spiral?

Note that this may be specific to spiral sequences acquired at the Lucas center.

From a question posed to G. Glover:
Q: In all of our protocols we specify our slice thickness (ex: 5mm) and our spacing (ex: interleave) in our scanning range. With the spiral-in-out functional scans, our question is whether our slices are acquired in a sequential (1,2,3...) or in an interleaved (1,3,5...2,4,6...) fashion from the seed slice. While under "Spacing" in the Scanning Range we specify "interleaved", we are not sure how this relates to the acquisition order if at all.

A: A lovely question. In my sequences the acquisition order is always sequential, whether you have skip 0, intlvd, or a number. So for time slice correction you need simply use a sequential schedule, where slice 1 is acquired TE + 10 ms after the start of TR, and successive slices are acquired in spatially contiguous order TR/nslices apart (unless its clustered).
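
Glover's answer translates into a simple schedule. The sketch below just encodes that sentence (a TE + 10 ms offset for slice 1, then TR/nslices spacing) with made-up TR and TE values; verify the details against your own protocol:

```python
def spiral_slice_times(tr, te, n_slices):
    """Acquisition onset (in seconds, from the start of the TR) of each slice
    for the sequential spiral schedule described above: slice 1 starts at
    TE + 10 ms, and successive, spatially contiguous slices follow TR/n_slices
    apart. (A sketch of the description, not vendor code.)"""
    offset = te + 0.010
    gap = tr / n_slices
    return [offset + i * gap for i in range(n_slices)]

# Hypothetical 2 s TR, 30 ms TE, 4 slices:
times = spiral_slice_times(tr=2.0, te=0.030, n_slices=4)
```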

Possibly Too Much Information (TMI):
The reason for interleaving in normal sequences is to try to mitigate as much as possible the influence of imperfect slice selection profiles, i.e. "slice bleed". If you have contiguous slice spacing (skip 0) but the slice oozes into the adjacent ones, then spin saturation effects reduce the net SNR. This effect can be reduced by acquiring the slices as far apart in time as possible, e.g. 1-3-5... 2-4-6.... The trouble is the interleave factor depends on the number of slices, so sometimes it could be 1-4-7... 2-5-8.... I simply avoid this by the use of a very high performance RF pulse so that slice bleed isn't a problem, and then I can use the simpler sequential order for your preprocessing pleasure.

Do slice timing correction with an interleaved slice sequence?

Just a stub - no info yet...


Slice Timing Papers

Useful Papers - Slice Timing Correction


Henson et al. (1999), "The slice-timing problem in event-related fMRI," NeuroImage 9, S125 PDF

Summary: Describes the slice-timing problem for event-related experiments, including very nice and dramatic pictures of data with the correction and without it. Describes both interpolation and temporal-derivative ways of solving the problem.

Bottom line: The basic 1-page slice-timing paper: shows correction is needed and that interpolation works well to correct it.

Van de Moortele et al. (1999), "Latencies in fMRI time-series: effect of slice acquisition order and perception," Nuclear Magnetic Resonance in Biomedicine 10, 230-236 PDF

Summary: Demonstrates slice-timing problem in block experiment, and shows very regular effect spread within single clusters. Demonstrates effectiveness of interpolation even on activations within clusters.

Bottom line: Shows that slice-timing can be useful even in block experiments; shows empirically slice-timing problem happens within clusters.


Van de Moortele et al. (1998), "Slice-dependent time shift efficiently corrected by interpolation in multi-slice EPI fMRI series," NeuroImage 7:4 (Supplement), S607 PDF

Summary: Early paper suggesting slice-timing correction algorithm now used by all major programs; uses it in early/late onset contrast.

Bottom line: Yup, interpolation correction works; if you want timecourse information, you should do it.

Calhoun et al. (2000), "Improved fMRI slice timing correction: interpolation errors and wrap-around effects," Proceedings, ISMRM, 9th Annual Meeting, Denver, CO, 810 PDF

Summary: Highlights pitfalls of interpolation correction - describes the shape and profile of interpolation errors and talks about improper interpolation at beginning and end of timecourse.

Bottom line: Consider low-pass smoothing your data if interpolation errors are a big deal (but low-pass smoothing probably removes more signal from your data than you'll get back from the error correction), and use an up-to-date program to do slice-time correction.


Smoothing FAQ

Frequently Asked Questions - Smoothing

1. What is smoothing?

"Smoothing" is generally used to describe spatial smoothing in neuroimaging, and that's a nice euphamism for "blurring." Spatial smoothing consists of applying a small blurring kernel across your image, to average part of the intensities from neighboring voxels together. The effect is to blur the image somewhat and make it smoother - softening the hard edges, lowering the overall spatial frequency, and hopefully improving your signal-to-noise ratio.

2. What's the point of smoothing?

Improving your signal-to-noise ratio. That's it, in a nutshell. This happens on a couple of levels: the single-subject and the group.

At the single-subject level: fMRI data has a lot of noise in it, but studies have shown that the spatial noise is mostly Gaussian - it's essentially random, essentially independent from voxel to voxel, and roughly centered around zero. If that's true, then if we average our intensity across several voxels, our noise will tend to average to zero, whereas our signal (which is some non-zero number) will tend to average to something non-zero, and presto! We've decreased our noise while not decreasing our signal, and our SNR is better. (Desmond & Glover (DesignPapers) demonstrate this effect with real data.)
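
You can check the averaging argument numerically in a couple of lines: averaging nine independent, zero-mean noise values shrinks the noise standard deviation by about a factor of three (the square root of 9), while a constant signal would pass through untouched.

```python
import numpy as np

# 100,000 samples of 9 "neighboring voxels" of unit-variance Gaussian noise.
rng = np.random.default_rng(42)
noise = rng.normal(0.0, 1.0, size=(100_000, 9))

# Averaging across the 9 voxels: noise std drops toward 1/sqrt(9).
averaged = noise.mean(axis=1)
```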

Matthew Brett has a nice discussion and several illustrations of this on the Cambridge Imagers page:

At the group level: Anatomy is highly variable between individuals, and so is exact functional placement within that anatomy. Even with normalized data, there'll be a good chunk of variability between subjects as to where a given functional cluster might be. Smoothing will blur those clusters and thus maximize the overlap between subjects for a given cluster, which increases our odds of detecting that functional cluster at the group level and thus increases our sensitivity.

Finally, a slight technical note for SPM: Gaussian field theory, by which SPM does p-corrections, is based on how smooth your data is - the more spatial correlation in the data, the better your corrected p-values will look, because there are fewer degrees of freedom in the data. So in SPM, smoothing will give you a direct bump in p-values - but this is not a "real" increase in sensitivity as such.

3. When should you smooth? When should you not?

Smoothing is a good idea if:

  • You're not particularly concerned with voxel-by-voxel resolution.
  • You're not particularly concerned with finding small (less than a handful of voxels) clusters.
  • You want (or need) to improve your signal-to-noise ratio.
  • You're averaging results over a group, in a brain region where functional anatomy and organization isn't precisely known.
  • You're using SPM, and you want to use p-values corrected with Gaussian field theory (as opposed to FDR).

Smoothing's not a good idea if:

  • You need voxel-by-voxel resolution.
  • You believe your activations of interest will only be a few voxels large.
  • You're confident your task will generate large amounts of signal relative to noise.
  • You're working primarily with single-subject results.
  • You're mainly interested in getting region-of-interest data from very specific structures that you've drawn with high resolution on single subjects.

4. At what point in your analysis stream should you smooth?

The first point at which it's obvious to smooth is as the last spatial preprocessing step for your raw images; smoothing before then will only reduce the accuracy of the earlier preprocessing (normalization, realignment, etc.) - those programs that need smooth images do their own smoothing in memory as part of the calculation, and don't save the smoothed versions. One could also avoid smoothing the raw images entirely and instead smooth the beta and/or contrast images. In terms of efficiency, there's not much difference - smoothing even hundreds of raw images is a very fast process. So the question is one of performance - which is better for your sensitivity?

Skudlarski et. al (SmoothingPapers) evaluated this for single-subject data and found almost no difference between the two methods. They did find that multifiltering (see below) had greater benefits when the smoothing was done on the raw images as opposed to the statistical maps. Certainly if you want to use p-values corrected with Gaussian field theory (a la SPM), you need to smooth before estimating your results. It's a bit of a toss-up, though...

5. How do you determine the size of your kernel? Based on your resolution? Or structure size?

A little of both, it seems. The matched filter theorem, from the signal processing field, tells us that if we're trying to recover a signal (like an activation) in noisy data (like fMRI), we can best do it by smoothing our data with a kernel that's about the same size as our activation.

Trouble is, though, most of us don't know how big our activations are going to be before we run our experiment. Even if you have a particular structure of interest (say, the hippocampus), you may not get activation over the whole region - only a part.

Given that ambiguity, Skudlarski et. al introduce a method called multifiltering, in which you calculate results once from smoothed images, and then a second set of results from unsmoothed images. Finally, you average together the beta/con images from both sets of results to create a final set of results. The idea is that the smoothed set of results preferentially highlights larger activations, while the unsmoothed set of results preserves small activations, and the final set has some of the advantages of both. Their evaluations showed multifiltering didn't detect larger activations (clusters with radii of 3-4 voxels or greater) as well as purely smoothed results (as you might predict), but that over several cluster sizes, multifiltering outperformed traditional smoothing techniques. Its use in your experiment depends on how important you consider detecting activations of small size (less than about a 3-voxel radius).

Overall, Skudlarski et. al found that over several cluster sizes, a kernel size of 1-2 voxels (3-6 mm, in their case) was most sensitive in general.

A good rule of thumb is to avoid using a kernel that's significantly larger than any structure you have a particular a priori interest in, and carefully consider what your particular comfort level is with smaller activations. A 2-voxel-radius cluster is around 30 voxels and change (and multifiltering would be more sensitive to that size); a 3-voxel-radius cluster is 110 voxels or so (if I'm doing my math right). 6mm is a good place to start. If you're particularly interested in smaller activations, 2-4mm might be better. If you know you won't care about small activations and really will only look at large clusters, 8-10mm is a good range.
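To double-check the arithmetic in that rule of thumb, here's a small pure-Python sketch (the helper names are my own, not from any neuroimaging package) that converts a kernel's FWHM to its Gaussian standard deviation and approximates the voxel count of a spherical cluster of a given radius:

```python
import math

def fwhm_to_sigma(fwhm_mm):
    """Convert a Gaussian kernel's FWHM to its standard deviation.
    FWHM = 2 * sqrt(2 * ln 2) * sigma (about 2.3548 * sigma)."""
    return fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))

def sphere_voxels(radius_vox):
    """Approximate voxel count of a spherical cluster: (4/3) * pi * r^3."""
    return (4.0 / 3.0) * math.pi * radius_vox ** 3

print(round(sphere_voxels(2)))       # 34 - "30 voxels and change"
print(round(sphere_voxels(3)))       # 113 - "110 voxels or so"
print(round(fwhm_to_sigma(6.0), 2))  # 2.55 - a 6mm FWHM kernel's sigma in mm
```

So the math above does check out: a 2-voxel-radius cluster is about 34 voxels, and a 3-voxel-radius cluster is about 113.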

6. Should you use a different kernel for different parts of the brain?

It's an interesting question. Hopfinger et. al find that a 6mm kernel works best for the data they examine in the cortex, but a larger kernel (10mm) works best in subcortical regions. This might be counterintuitive, considering the subcortical structures they examine are smaller in general than large cortical activations - but they unfortunately don't include information about the size of their activation clusters, so the results are difficult to interpret. You might think a smaller kernel in subcortical regions would be better, due to the smaller size of the structures.

Trouble is, figuring out exactly which parts of the brain to use a different size of kernel on presupposes a lot of information - about activation size, about shape of HRF in one region vs. another - that pretty much doesn't exist for most experimental set-ups or subjects. I would tend to suggest that varying the size of the kernel for different regions is probably more trouble than it's worth at this point, but that may change as more studies come out about HRFs in different regions and individualized effects of smoothing. See Kiebel and Friston (SmoothingPapers), though, for some advanced work on changing the shape of the kernel in different regions...

7. What does it actually do to your activation data?

About what you'd expect - preferentially brings out larger activations. Check out White et. al (SmoothingPapers) for some detailed illustrations. We hope to have some empirical results and maybe some pictures up here in the next few weeks...

8. What does it do to ROI data?

Great question, and not one I've got a good answer for at the moment. One big part of the answer will depend on the ratio of your smoothing kernel size to your ROI size. Presumably, assuming your kernel size is smaller than your ROI, it may help improve SNR in your ROI, but if the kernel and ROI are similar sizes, smoothing may also blur the signal such that your structure contains less activation. With any luck, we can do a little empirical testing on this question and have some results up here in the future...

9. What is Smoove-ing?

Talk to this guy:


Smoothing HOWTO

How-Tos - Smoothing

How do I...

Smooth in SPM?

Easiest thing there is to do in SPM, actually. Which is nice.

Fire up SPM. Hit "Smooth" in the main interface.

smoothing (FWHM in mm)? Enter in your kernel size. See SmoothingFaq, #5 if you don't know what it should be.

Select scans: Select all of your functional images for the subject. If you like, you can select all of the functional images for all of your subjects all at once. Smoothing doesn't look at how many scans you have or care about whether they're all from the same subject or session; smoothing is done on a totally image-by-image basis, with no interaction whatsoever between scans, and it's done identically for all sessions and subjects. So you can choose as many or as few functional scans as you like.

Smoothing is a relatively fast process, taking a fraction of a second per image. It produces s*.img files as output, in the same directory as the original files.

Figure out what my smoothing kernel should be?

See SmoothingFaq, question #5.

Figure out what my smoothing kernel was?

In SPM, at least, this isn't too tricky. The smoothing algorithm leaves a description in the header file of the image which you can view with SPM's "Display" function. So fire up SPM and hit "Display." Choose one of your s*.img files - even if you've done more processing on the images after smoothing, the s*.img is the one that's most likely to have the smoothing message. In the right-hand panel, below the line that reads "Intensity" and above the line that reads "Vox size," there should be a message that looks something like "spm - 3D normalized - conv(6,6,6)." Non-normalized data won't have that bit about normalization, but the message should end in "conv(something, something, something)." That "something" is your smoothing kernel, in mm. The example above had a 6mm smoothing kernel used on it.

Do something tricky, like multifiltering?

Well, "something tricky" is a little generic. But if you actually want to do multifiltering, here's a sketch of the analysis path:

Do your preprocessing up through smoothing as usual. Before you smooth, make a separate copy of your whole functional image directory. Label one "functional_smoothed." Smooth the images in the functional_smoothed directory as above. Create two separate results directories, "results" and "results_smoothed."

Create an identical design matrix and model in each results directory - one specifying the non-smoothed functionals, one specifying the smoothed functionals. Estimate both models.

Finally, create a new results directory, called "results_multifiltered." Copy the entire contents of one of the other results directories into this one (doesn't matter which one is the source), then delete all the beta*.img files from the multifiltered results directory.

Cd into the multifiltered results directory and start SPM. Hit "ImCalc". Choose the beta_0001.img from the results directory and the beta_0001.img from the results_smoothed directory. As output filename, put beta_0001.img - it's crucial the filename remains the same. For evaluated function, type (i1 + i2) ./ 2. This will create a new beta_0001.img in the results_multifiltered directory that is the average of the smoothed and unsmoothed results. Repeat these steps for each beta image. (You could also perform the averaging on the contrast images, which would be fewer images... but then every time you made a new contrast, you would have to make it first for the regular results and smoothed results, then repeat the averaging step. If you do this, you need to average both the con_*.img files and the spmT_*.img files.)
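The averaging step itself is trivial. Here it is sketched on plain Python lists standing in for voxel values (the toy numbers are made up; ImCalc does the same (i1 + i2) ./ 2 operation on the actual image volumes):

```python
def multifilter_average(beta_unsmoothed, beta_smoothed):
    """Voxelwise average of unsmoothed and smoothed beta values -
    the (i1 + i2) ./ 2 step done by ImCalc, on plain Python lists."""
    return [(u + s) / 2.0 for u, s in zip(beta_unsmoothed, beta_smoothed)]

# Toy voxel values standing in for beta_0001.img from each results directory:
print(multifilter_average([1.0, 4.0, 0.0], [3.0, 2.0, 0.0]))  # [2.0, 3.0, 0.0]
```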

Now hit "Results" and go to the SPM.mat in the multifiltered directory. It should see the beta images in its current directory fine (the filenames are stored without paths for beta* and con* images), and you should be able to make contrasts that are multifiltered. Compare them to the same contrast you make in the results and results_smoothed directories, and see how they stack up!

Smooth my contrast images (or other non-functional images, or ROIs)?

Same way you smooth your functional images. Go to SPM, hit "Smooth" on the main interface, enter in a kernel value, and choose the images you like. The smoothing algorithm will produce s*.img files in the same directory, just as with functionals.

Figuring out kernels for non-functional images isn't too tricky; just use the same kernel you used on your functionals. If you didn't smooth your functionals, use the kernel you would have used or that seems appropriate for the size of activation you're hoping to detect. See SmoothingFaq, question #5 for help.


Smoothing Papers

Useful Papers - Smoothing


Hopfinger et. al (2000), "A study of analysis parameters that influence the sensitivity of event-related fMRI analyses," NeuroImage 11, 326-333 PDF

Summary: A group including SPM authors tests out different settings for their analysis and looks at how activation Z-scores vary as a function of spatial smoothing, temporal smoothing, different HRFs, and resampled voxel size. The authors divide the effects of spatial smoothing into two points: 1) suppressing high-frequency spatial noise, 2) directly improving p-values by introducing spatial correlation (which affects corrected p-values only).

Bottom line: In the cortex, 6mm smoothing gave the best results - but that was the smallest kernel they used. In subcortical structures, 10mm spatial smoothing gives the best results, but no info about spatial extent of activations is given, so the results are difficult to interpret.

Skudlarski et. al (1999), "ROC analysis of statistical methods used in functional MRI: individual subjects," NeuroImage 9, 311-329 PDF

Summary: We saw this paper way back in week 1, and we'll see it again. Skudlarski et. al use the receiver operating characteristic (ROC) to measure the sensitivity of their analysis, using fake activations in real single-subject data, as a function of various parameters including spatial smoothing. They highlight the relationship of cluster filtering to smoothing, and introduce a couple new smoothing schemes, like multifiltering (combining smoothed activations with unsmoothed).

Bottom line: Smoothing at the right kernel size for the activation works well, but with unknown activation sizes, a kernel size of 1-2 voxels worked best overall (although this includes quite small activations). Multifiltering preserved small activations well. Cluster filtering didn't improve sensitivity above spatial smoothing at all.


LaConte et. al (2003), "The evaluation of preprocessing choices in single-subject BOLD fMRI using NPAIRS performance metrics," NeuroImage 18, 10-27 PDF

Summary: Similar study to those above, but using a different (and much more complicated) performance stat, based on treating all the steps in analysis like parameters to be estimated and getting estimates of reproducibility from each iteration.

Bottom line: Smoothing is good... I think. Honestly, I had a tough time making head or tail of the graphs in this one.

Kiebel & Friston (2002), "Anatomically informed basis functions in multisubject studies," Human Brain Mapping 16, 36-46 PDF

Summary: Kiebel & Friston describe a method of smoothing in which smoothing is done only within the cortical sheet - essentially, making the data smoother where you want it, without spreading signal out into areas you're not interested in. They extend this work to multisubject studies and show that it can increase sensitivity relative to standard smoothing approaches.

Bottom line: A nice look at where smoothing might be going and integrating with segmentation, etc.

Zarahn et. al (1997), "Empirical analyses of BOLD fMRI statistics I: Spatially unsmoothed data collected under null-hypothesis conditions," NeuroImage 5, 179-197 PDF

Summary: One of the original empirical papers examining things like true noise distribution in fMRI; the authors looked at things like spatial and temporal coherence in the data and noise profiles from real subjects at rest and from phantom data.

Bottom line: Spatial noise was mostly just noise, but did contain some coherence, which was far greater at lower temporal frequencies - one of the big sources of noise preventing low-temporal-frequency experiments in fMRI from being super effective.


White et. al (2001), "Anatomic and functional variability: the effects of filter size in group fMRI data analysis," NeuroImage 13, 577-588 PDF

Summary: A more focused study looking at smoothing filter sizes, testing out sizes between 0 and 18 mm. A nice look at exactly what happens to your activation clusters as you gradually increase or decrease filter size, including enlargement of activations and merging of apparently separate clusters...

Also see Desmond & Glover (2002) in DesignPapers, which shows that spatial smoothing at 5 mm reduced their within-subject variability estimates substantially.


Temporal Filtering FAQ

Frequently Asked Questions - Temporal Filtering

1. Why do filtering? What’s it buy you?

Filtering in time and/or space is a long-established method in any signal detection process to help "clean up" your signal. The idea is if your signal and noise are present at separable frequencies in the data, you can attenuate the noise frequencies and thus increase your signal to noise ratio.

One obvious way you might do this is by knocking out frequencies you know are too low to correspond to the signal you want - in other words, if you have an idea of how fast your signal might be oscillating, you can knock out noise that is oscillating much slower than that. In fMRI, noise like this can have a number of sources - slow "scanner drifts," where the mean of the data drifts up or down gradually over the course of the session, or physiological influences like changes in basal metabolism, or a number of other sources. This type of filtering is called "high-pass filtering," because we remove the very low frequencies and "pass through" the high frequencies. Doing this in the spatial domain would correspond to highlighting the edges of your image (preserving high-frequency information); in the temporal domain, it corresponds to "straightening out" any large bends or drifts in your timecourse. Removing linear drifts from a timecourse is the simplest possible high-pass filter.
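That simplest case - removing a linear drift - can be sketched in a few lines of pure Python (the helper name is mine; real packages do this as part of model estimation):

```python
def remove_linear_drift(timecourse):
    """Simplest possible high-pass filter: fit a straight line to the
    timecourse by least squares and subtract it out."""
    n = len(timecourse)
    t = list(range(n))
    t_mean = sum(t) / n
    y_mean = sum(timecourse) / n
    slope = (sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, timecourse))
             / sum((ti - t_mean) ** 2 for ti in t))
    intercept = y_mean - slope * t_mean
    return [yi - (slope * ti + intercept) for ti, yi in zip(t, timecourse)]

# A timecourse that is nothing but upward scanner drift is flattened
# to (numerically) zero residuals:
drifting = [100 + 0.5 * i for i in range(10)]
filtered = remove_linear_drift(drifting)
```

Any signal oscillating faster than the drift would survive this operation untouched, which is the whole point of "passing through" the high frequencies.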

Another obvious way you might do this would be the opposite - knock out the frequencies you know are too high to correspond to your signal. This removes noise that is oscillating much faster than your signal from the data. This type of filtering is called "low-pass filtering," because we remove the very high frequencies and "pass through" the low frequencies. Doing this in the spatial domain is simply spatial smoothing (see SmoothingFaq); in the temporal domain, it corresponds to temporal smoothing. Low-pass filtering is much more controversial than high-pass filtering, a controversy explored by the papers in TemporalFilteringPapers.

Finally, you could apply combinations of these filters to try and restrict the signal you detect to a specific band of frequencies, preserving only oscillations faster than a certain speed and slower than a certain speed. This is called "band-pass filtering," because we "pass through" a band of frequencies and filter everything else out, and is usually implemented in neuroimaging as simply doing both high-pass and low-pass filtering separately.

In all of these cases, the goal of temporal filtering is the same: to apply our knowledge about what the BOLD signal "should" look like in the temporal domain in order to remove noise that is very unlikely to be part of the signal. This buys us better SNR, and a much better chance of detecting real activations and rejecting false ones.

2. What actually happens to my signal when I filter it? How about the design matrix?

Filtering is a pretty standard mathematical operation, so all the major neuroimaging programs essentially do it the same way. We'll use high-pass as an example, as low-pass is no longer standard in most neuroimaging programs. At some point before model estimation, the program will ask the user to specify a cutoff parameter in Hz or seconds for the filter. If specified in seconds, this cutoff is taken to mean the period of interest of the experiment; frequencies that repeat over a timescale longer than the specified cutoff parameter are removed. Once the design matrix is constructed but before model estimation has begun, the program will filter each voxel's timecourse (the filter is generally based on some discrete cosine matrix) before submitting it to the model estimation - usually a very quick process. A graphical representation of the timecourse would show a "straightening out" of the signal timecourse - oftentimes timecourses will have gradual linear drifts or quadratic drifts, or even higher frequency but still gradual bends, which are all flattened away after the filtering.

Other, older methods for high-pass filtering simply included a set of increasing-frequency cosines in the design matrix (see Holmes et. al below), allowing them to "soak up" low-frequency variance, but this is generally not done explicitly any more.
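The cosine set underlying both approaches can be sketched as follows. This is a pure-Python construction following the spm_dctmtx-style recipe as I understand it (the function name and the exact count formula are my assumptions - check your package's documentation before relying on them):

```python
import math

def dct_highpass_basis(n_scans, tr, cutoff_s):
    """Build the low-frequency discrete-cosine regressors that an
    SPM-style high-pass filter removes: one cosine for every
    frequency slower than 1/cutoff_s. Returns a list of columns,
    each of length n_scans."""
    n_basis = int(math.floor(2.0 * (n_scans * tr) / cutoff_s)) + 1
    basis = []
    for k in range(1, n_basis):
        col = [math.sqrt(2.0 / n_scans)
               * math.cos(math.pi * (2 * n + 1) * k / (2.0 * n_scans))
               for n in range(n_scans)]
        basis.append(col)
    return basis

# 210 scans at TR = 2 s with a 128 s cutoff:
regressors = dct_highpass_basis(210, 2.0, 128.0)
print(len(regressors))  # 6 drift regressors with these settings
```

Each column is a unit-norm cosine of gradually increasing frequency, exactly the kind of regressor set Holmes et. al put directly into the design matrix.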

Low-pass filtering proceeds much the same way, but the design matrix is also usually filtered to smooth out any high frequencies present in it, as the signal to be detected will no longer have them. Low-pass filters are less likely to be specified merely with a lower-bound period-of-interest cutoff; oftentimes low-pass filters are constructed deliberately to have the same shape as a canonical HRF, to help highlight signal with that shape (as per the matched-filter theorem).

3. What’s good about high-pass filtering? Bad?

High-pass filtering is relatively uncontroversial, and is generally accepted as a good idea for neuroimaging data. One big reason for this is that the noise in fMRI isn't white - it's disproportionately present in the low frequencies. There are several sources for this noise (see PhysiologyFaq and BasicStatisticalModelingFaq for discussions of some of them), and they're expressed in the timecourses sometimes as linear or higher-order drifts in the mean of the data, sometimes as slightly faster but still gradual oscillations (or both). What's good about high-pass filtering is that it's a straightforward operation that can attenuate that noise to a great degree. A number of the papers below study the efficacy of preprocessing steps, and generally it's found to significantly enhance one's ability to detect true activations.

The one downside of high-pass filtering is that it can sometimes be tricky to select exactly what one's period of interest is. If you only have a single trial type with some inter-trial interval, then your period of interest is obvious - the time from one trial's beginning to the next - but what if you have three or four? Or more than that? Is it still the time from one trial to the next? Or the time from one trial to the next trial of that same type? Or what? Skudlarski et. al (TemporalFilteringPapers) point out that a badly chosen cutoff period can be significantly worse than the simplest possible temporal filtering, which would just be removing any linear drift from the data. If you try and detect an effect whose frequency is lower than your cutoff, the filter will probably knock it completely out, along with the noise. On the other hand, there's enough noise at low frequencies to almost guarantee that you wouldn't be able to detect most very slow effects anyway. Perfusion imaging does not suffer from this problem, one of its benefits - the noise spectrum for perfusion imaging appears to be quite flat.

4. What’s good about low-pass filtering? Bad?

Low-pass filtering is much more controversial in fMRI, and even in the face of mounting empirical evidence that it wasn't doing much good, the SPM group long offered some substantial and reasonable arguments in favor of it. The two big reasons offered in favor of low-pass filtering broke down as:

  1. The matched-filter theorem suggests filtering our timecourse with a filter shaped like an HRF should enhance signals of that shape relative to the noise, and
  2. We need to modify our general linear model to account for all the autocorrelation in fMRI noise; one way of doing that is by conditioning our data with a low-pass filter - essentially 'coloring' the noise spectrum, or introducing our own autocorrelations - and assuming that our introduced autocorrelation 'swamps' the existing autocorrelations, so that they can be ignored. (See BasicStatisticalModelingFaq for more on this.) This was a way of getting around early ignorance about the shape of the noise spectrum in fMRI and avoiding the computational burden of approximating the autocorrelation function for each model. Even as those burdens began to be overcome, Friston et. al (TemporalFilteringPapers) pointed out potential problems with pre-whitening the data as opposed to low-pass filtering, relating to potential biases of the analysis.

However, the mounting evidence demonstrating the failure of low-pass filtering, as well as advances in computation speed enabling better ways of dealing with autocorrelation, seem to have won the day. In practice, low-pass filtering seems to have the effect of greatly reducing one's sensitivity to detecting true activations without significantly enhancing the ability to reject false ones (see Skudlarski et. al, Della-Maggiore et. al on TemporalFilteringPapers). The problem with low-pass filtering seems to be that because noise is not independent from timepoint to timepoint in fMRI, 'smoothing' the timecourse doesn't suppress the noise but can, in fact, enhance it relative to the signal - it amplifies the worst of the noise and smooths the peaks of the signal out. Simulations with white noise show significant benefits from low-pass filtering, but with real, correlated fMRI noise, the filtering becomes counterproductive. Due to these results and a better sense now of how to correctly pre-whiten the timeseries noise, low-pass filtering is now no longer available in SPM2, nor is it allowed by standard methods in AFNI or BrainVoyager.

5. How do you set your cutoff parameter?

Weeeeelll... this is one of those many messy little questions in fMRI that has been kind of arbitrarily hand-waved away, because there's not a good, simple answer for it. You'd like to filter out as much noise as possible - particularly in the nasty part of the noise power spectrum where the noise power increases abruptly - without removing any important signal at all. But this can be a little trickier than it sounds. Roughly, a good rule of thumb might be to take the 'fundamental frequency' of your experiment - the time between one trial start and the next - and double or triple it, to make sure you don't filter out anything closer to your fundamental frequency.

SPM99 (and earlier) had a formula built in that would try and calculate this number. But if you had a lot of trial types, and some types weren't repeated for very long periods of time, you'd often get filter sizes that were way too long (letting in too much noise). So in SPM2 they scrapped the formula and now issue a default filter size of 128 seconds for everybody, which isn't really any better of a solution.

In general, default sizes of 100 or 128 seconds are pretty standard for most trial lengths (say, 8-45 seconds). If you have particularly short trials (less than 10 seconds) you could probably go shorter, maybe more like 60 or 48 seconds. But this is a pretty arbitrary part of the process. The upside is that it's hard to criticize an exact number that's in the right ballpark, so you probably won't get a paper rejected purely because your filter size was all wrong.
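The rule of thumb from above reduces to a one-liner. A sketch (the function name and default factor are my own; the numbers in the usage lines just restate the guidance in this section):

```python
def suggested_cutoff(trial_spacing_s, factor=2.0):
    """Rule-of-thumb high-pass cutoff: double (or triple) the time
    between one trial's start and the next, so the filter stays well
    clear of the experiment's fundamental frequency."""
    return factor * trial_spacing_s

print(suggested_cutoff(45.0))       # 90.0 s - near the common 100 s default
print(suggested_cutoff(45.0, 3.0))  # 135.0 s - a more conservative choice
```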


Temporal Filtering Papers

Useful Papers - Temporal Filtering


Friston et. al (2000), "To smooth or not to smooth?: bias and efficiency in fMRI time-series analysis," NeuroImage 12, 196-208 PDF

Summary: The major last defense of low-pass filtering (aka temporal smoothing) mounted by the SPM group before abandoning it for a different mathematical tack in estimation, more amenable to pre-whitening. The case made is that pre-whitening, while more efficient at removing variance from the timecourses, can be more sensitive to bias due to errors in estimating the autocorrelation, and so band-pass filtering has its place.

Bottom line: Improvements in autocorrelation estimation and robustness of pre-whitening have rendered this paper basically obsolete, but it's a good look at the very detailed issues surrounding the apparently simple question of temporal filtering.

Purdon & Weisskoff (1998), "Effect of temporal autocorrelation due to physiological noise and stimulus paradigm on voxel-level false-positive rates in fMRI," Human Brain Mapping 6, 239-249 PDF

Summary: Clear and relatively concise look at exactly how noise autocorrelation can bias inferences in both directions, with specific references to existing models like the extended GLM of Friston et. al and the AR(1) model of Bullmore et. al. Proposes an AR + white noise model to model temporal autocorrelation (which ends up being largely adopted in SPM2).

Bottom line: One of the clearest and most intuitive overviews of what the issue is with temporal autocorrelation and modeling it.


Skudlarski et. al (1999), "ROC analysis of statistical methods used in functional MRI: individual subjects," NeuroImage 9, 311-329 PDF

Summary: Once again, Skudlarski et. al's look at single-subject simulated data, analyzing various preprocessing choices with receiver operating characteristic (ROC) curves, measuring both types of error at a given threshold. They show convincingly that low-pass filtering with real fMRI noise hurts sensitivity, while high-pass filtering can help - although only a precise cutoff threshold choice improved high-pass filtering above simple quadratic and linear trend removal.

Bottom line: Linear and quadratic trend removals are a must. Above that, if you can get in the ballpark of a good cutoff threshold, a full high-pass filter is good, too. Low-pass filtering's nothing but trouble.

Della-Maggiore et. al (2002), "An empirical comparison of SPM preprocessing parameters to the analysis of fMRI data," NeuroImage 17, 19-28 PDF

Summary: Another empirical study, examining various preprocessing choices on simulated data with the twin measures of power and false-positive rate. They find high-pass filtering to be useful with a fixed ISI (but not particularly for variable ISIs). They find both low-pass filtering and pre-whitening to decrease power but only pre-whitening to be especially good at protecting against false-positive increases.

Bottom line: High-pass is good for fixed-ISI studies but maybe useless in variable-ISI studies; low-pass filtering tends to decrease power without protecting against inferential bias particularly well.

Holmes et. al (1997), "Statistical modelling of low-frequency confounds in fMRI", NeuroImage 5, S480 (PDF not available yet - see Jeff for paper copies)

Summary: An oldie but a goodie, Holmes et. al shows the first published example of a design matrix including the high-pass filter that becomes a staple of later versions of SPM. The confounds are modeled in basic but effective fashion - a set of cosines of gradually increasing frequency, up to a certain cutoff point. They demonstrate that these regressors account for a good chunk of noise variance.

Bottom line: High-pass filtering works to absorb noise; modeling with cosines is effective and computationally tractable.


To 3D

  • Purpose: This command allows the user to take your raw images and convert them to AFNI format. Afni does, however, now accept Analyze format, but I think overall there are some advantages to converting to the AFNI .BRIK & .HEAD format.
  • Usage: Below is an example of how to use this command. My lab has always taken the Pfile.7.mag and converted that to Analyze format first. So, the example below will show you how to go from Analyze format to AFNI format. If you'd like to learn how to convert to afni directly from the P.mag file, or if you have a completely different file structure, then I suggest you do one of three things: 1) go to the Afni website, 2) type "to3d -help" into a linux command prompt that has Afni installed, or 3) talk to Philippe Goldin, who is the ultimate AFNI guru. I will give an explanation of my example below.

Functional Dataset Example

           to3d -prefix myfxnls -orient RPI -2swap -time:zt 12 210 2000 seq+z *hdr

Anatomical Dataset Example

           to3d -prefix anatomy -orient RPI seq+z anatomy*.hdr
  • The subcommand "-prefix" allows you to specify the output name for the files whether they be for an anatomical or functional dataset. It also tells to3d that you want to use the batch method of converting to afni, so a gui window will not open up. If you want the gui to open up, just leave out this option as well as the "-orient".
  • The "-orient RPI" option specifies the orientation of your 3D volumes. The code must be 3 letters, one from each of the pairs {R,L} {A,P} {I,S}. The first letter gives the orientation of the x-axis, the second the orientation of the y-axis, the third the z-axis:
      R = right-to-left              L = left-to-right
      A = anterior-to-posterior      P = posterior-to-anterior
      I = inferior-to-superior       S = superior-to-inferior

As already mentioned, with the "to3d" command you also have the option of having a GUI open up so you don't have to guess what direction your 3D volumes are in. If that is the case, just leave out the "-orient RPI" option and the "-prefix" option and the GUI will open automatically. Once you click on view images, you will want to go to the upper left corner of the window and select the orientation codes. At the bottom right corner you can put in your prefix name for the dataset.

  • For functional datasets, the "-time:zt 12 210 2000 seq+z" is used to specify the parameters of your scan. "-time:zt" specifies how your slices were collected in the time domain. Specifically, this subcommand tells "to3d" that your slices are input in the order z-axis first, then t-axis. The numbers "12 210 2000" are still a part of the -time:zt subcommand and refer to the number of slices (12), the number of timepoints collected (210), and your TR in milliseconds (2000). "seq+z" indicates that your slices were taken sequentially in the plus direction.
  • The "*hdr" refers to all of the analyze format images you collected for one particular run whether that run was for anatomical images or functional images.
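As a quick sanity check before running to3d, you can confirm that the three -time:zt numbers describe the scan you think you ran. A small sketch (the function name and output fields are my own):

```python
def scan_summary(n_slices, n_reps, tr_ms):
    """Sanity-check the three numbers handed to to3d's -time:zt option
    (number of slices, number of timepoints, TR in milliseconds)."""
    return {
        "slices_per_volume": n_slices,
        "volumes": n_reps,
        "duration_s": n_reps * tr_ms / 1000.0,
    }

# The functional example above used: -time:zt 12 210 2000
print(scan_summary(12, 210, 2000))
# {'slices_per_volume': 12, 'volumes': 210, 'duration_s': 420.0}
```

If the computed duration doesn't match how long your run actually lasted (here, 420 s = 7 minutes), one of the three numbers is probably wrong.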

fMRI Physics Links

Links - fMRI Physics

A couple nice links passed on from Philippe: