Nutshell SPM

SPM is a software package designed to analyze brain imaging data from PET or fMRI and output a variety of statistical and numerical measures that tell you, the researcher, what parts of your subjects' brains were signficantly "activated" by different conditions of your experiment. There are a couple of phases of analyzing data with SPM: spatial preprocessing, model estimation, and results exploration. This page aims to give you a nutshell explanation of what's actually happening in each of those phases (particularly model estimation) and what some of the files floating around your results directories are for. Links to pages with more detail about each aspect of the analysis are down at the bottom of this page.

Spatial preprocessing is conceptually the most straightforward part of SPM analysis. During this phase, you can align your images with each other, warp them (normalize) so that each subject's anatomy is roughly the same shape, correct them for differences in slice time acquisition, and smooth them spatially.

These steps are used for a couple of reasons. Registration and normalization aim to line images from a single subject up (since subjects' heads move slightly during the experiment) and normalization aims to stretch and squeeze the shape of the images so that their anatomy roughly matches a standard template; both of these aim to make localizing your activations easier and more meaningful, by making individual voxels' locations in a given image file match up in a standard way to a particular anatomical location. Slice timing correction and smoothing both enable SPM to make certain assumptions about the data images - that each whole image occurred at a particular point in time (as opposed to slices being taken over the course of an image acquisition, or TR), and that noise in an image is distributed in a relatively random and independent fashion (as opposed to being localized).

Model estimation is the heart of the SPM program, and it's also the most conceptually complex. What you as a researcher want to know from your data is essentially: what (if any) parts of my subject's brain were brighter during one part of my experiment relative to another part? (For details on why "brighter" is a measure of activation, check out our ?FmriFaq page). Another way of putting this question might be this: You as a researcher have a hypothesis or model of what happened in an experiment; you have a list of different conditions and when each of them took place, and your model of the person's brain is that there was some kind of reaction in the brain for every stimulus that happened. How good a fit, then, does your hypothetical model provide to the actual MRI data you saw from the person's brain? Specifically, are there particular locations in the brain where your model was a very good fit, and others where it wasn't a good fit? The main work SPM does is to try and find those locations, because locations where your hypothesis proves to be a good fit can be described as "responding" somehow to the conditions in your experiment.

When SPM estimates a model, what it's doing is essentially a huge multiple regression at each voxel of your subject's brain, to see how well the data across the experiment fits your hypothesis, which you describe to SPM as a design matrix. When you tell SPM what your conditions are, what your onset vectors are, etc., it sets up this matrix as a guess at what contribution each condition might make to every image in your experiment. As part of that guess, it automatically convolves the effects of the hemodynamic response function with your stimulus vectors, as well as doing some temporal filtering to make sure it ignores changes in the data that aren't relevant to your conditions (see ?FilteringFaq for more on this).

Once the design matrix has been set up, SPM walks through each voxel in the brain, and does a multiple regression on the data at that point that estimates how much of a contribution every condition in your experiment made to the data and how much error was left over after all the conditions you specified are taken into account. The fit of this regression line is important - how much error is left at the voxel tells you how good your model was - but for most purposes, the actual slope of the regression line is more important. A large positive or negative partial regression slope for a given condition tells us that that condition had a large influence in determining the data at that voxel. This slope, called the beta weight or parameter weight for that condition, is saved by SPM in the beta_* images - one for each column of the design matrix, where each voxel gives the beta weight for that condition at that point.

After the model estimation is complete, you now have a set of data telling you how big an effect each condition of your experiment had at each voxel. By itself, this information may not be useful - in most fMRI experiments, any condition by itself accounts for a tiny portion of the variance. What is useful to know, though, is if one condition made a significantly greater contribution than another condition did. This is where results analysis, and specifically contrast analysis, comes in. When you evaluate your results, SPM asks you to specify a contrast in terms of weights for each conditions. If you have only two conditions in your experiment, assuming your design matrix was (A B), then a contrast vector of (1 -1) would tell SPM you wanted to see at which voxels A had a signficantly larger contribution to brain activity than B did. SPM takes this contrast vector and literally uses it to make a weighted sum of the beta images it's just created; this new image, created by giving each beta image the weight you specified and adding them together, is a con_* image. SPM then looks across the con image at the distribution of weighted parameter values, and combines them with its estimate of the leftover variance from the model estimation, and assigns every voxel a T-statistic (creating an spm_T* image). When you ask SPM to only show you the voxels that are significantly active at a certain p-threshold, it looks at that T-stat image and finds only the voxels whose t-statistics are so large as to fit above that probability threshold - voxels where their weighted parameter values were so large as to be statistically unlikely at your specified level of significance.

Those voxels are where brain activity in your experiment was heavily influenced by one condition in your experiment more than another condition, so heavily influenced as to make it unlikely that activity was just noise. Those voxels were brighter / more intense in one condition of your experiment than they were in another with great reliability, and so they're considered active.

For more detail about a particular aspect of spatial preprocessing, check out the individual FAQ pages at:

For more detail about a particular aspect of model estimation or results analysis, check out some FAQ pages at:

  • ?FmriFaq
  • ?FilteringFaq