Temporal Filtering FAQ

Frequently Asked Questions - Temporal Filtering

1. Why do filtering? What’s it buy you?

Filtering in time and/or space is a long-established method in any signal detection process to help "clean up" your signal. The idea is if your signal and noise are present at separable frequencies in the data, you can attenuate the noise frequencies and thus increase your signal to noise ratio.

One obvious way you might do this is by knocking out frequencies you know are too low to correspond to the signal you want - in other words, if you have an idea of how fast your signal might be oscillating, you can knock out noise that is oscillating much slower than that. In fMRI, noise like this can have a number of courses - slow "scanner drifts," where the mean of the data drifts up or down gradually over the course of the session, or physiological influences like changes in basal metabolism, or a number of other sources. This type of filtering is called "high-pass filtering," because we remove the very low frequencies and "pass through" the high frequencies. Doing this in the spatial domain would correspond to highlighting the edges of your image (preserving high-frequency information); in the temporal domain, it corresponds to "straightening out" any large bends or drifts in your timecourse. Removing linear drifts from a timecourse is the simplest possible high-pass filter.

Another obvious way you might do this would be the opposite - knock out the frequencies you know are too high to correspond to your signal. This removes noise that is oscillating much faster than your signal from the data. This type of filtering is called "low-pass filtering," because we remove the very high frequencies and "pass through" the low frequencies. Doing this in the spatial domain is simply spatial smoothing (see SmoothingFaq); in the temporal domain, it corresponds to temporal smoothing. Low-pass filtering is much more controversial than high-pass filtering, a controversy explored by the papers in TemporalFilteringPapers.

Finally, you could apply combinations of these filters to try and restrict the signal you detect to a specific band of frequencies, preserving only oscillations faster than a certain speed and slower than a certain speed. This is called "band-pass filtering," because we "pass through" a band of frequencies and filter everything else out, and is usually implemented in neuroimaging as simply doing both high-pass and low-pass filtering separately.

In all of these cases, the goal of temporal filtering is the same: to apply our knowledge about what the BOLD signal "should" look like in the temporal domain in order to remove noise that is very unlikely to be part of the signal. This buys us better SNR, and a much better chance of detecting real activations and rejecting false ones.

2. What actually happens to my signal when I filter it? How about the design matrix?

Filtering is a pretty standard mathematical operation, so all the major neuroimaging programs essentially do it the same way. We'll use high-pass as an example, as low-pass is no longer standard in most neuroimaging programs. At some point before model estimation, the program will ask the user to specify a cutoff parameter in Hz or seconds for the filter. If specified in seconds, this cutoff is taken to mean the period of interest of the experiment; frequencies that repeat over a timescale longer than the specified cutoff parameter are removed. Once the design matrix is constructed but before model estimation has begun, the program will filter each voxel's timecourse (the filter is generally based on some discrete cosine matrix) before submitting it to the model estimation - usually a very quick process. A graphical representation of the timecourse would show a "straightening out" of the signal timecourse - oftentime timecourses will have gradual linear drifts or quadratic drifts, or even higher frequency but still gradual bends, which are all flattened away after the filtering.

Other, older methods for high-pass filtering simply included a set of increasing-frequency cosines in the design matrix (see Holmes et. al below), allowing them to "soak up" low-frequency variance, but this is generally not done explicitly any more.

Low-pass filtering proceeds much the same way, but the design matrix is also usually filtered to smooth out any high frequencies present in it, as the signal to be detected will no longer have them. Low-pass filters are less likely to be specified merely with a lower-bound period-of-interest cutoff; oftentimes low-pass filters are constructed deliberately to have the same shape as a canonical HRF, to help highlight signal with that shape (as per the matched-filter theorem).

3. What’s good about high-pass filtering? Bad?

High-pass filtering is relatively uncontroversial, and is generally accepted as a good idea for neuroimaging data. One big reason for this is that the noise in fMRI isn't white - it's disproportionately present in the low frequencies. There are several sources for this noise (see PhysiologyFaq and BasicStatisticalModelingFaq for discussions of some of them), and they're expressed in the timecourses sometimes as linear or higher-order drifts in the mean of the data, sometimes as slightly faster but still gradual oscillations (or both). What's good about high-pass filtering is that it's a straightforward operation that can attenuate that noise to a great degree. A number of the papers below study the efficacy of preprocessing steps, and generally it's found to significantly enhance one's ability to detect true activations.

The one downside of high-pass filtering is that it can sometimes be tricky to select exactly what one's period of interest is. If you only have a single trial type with some inter-trial interval, then your period of interest of obvious - the time from one trial's beginning to the next - but what if you have three or four? Or more than that? Is it still the time from one trial to the next? Or the time from one trial to the next trial of that same type? Or what? Skudlarski et. al (TemporalFilteringPapers) point out that a badly chosen cutoff period can be significantly worse than the simplest possible temporal filtering, which would just be removing any linear drift from the data. If you try and detect an effect whose frequency is lower than your cutoff, the filter will probably knock it completely out, along with the noise. On the other hand, there's enough noise at low frequencies to almost guarantee that you wouldn't be able to detect most very slow anyways. Perfusion imaging does not suffer from this problem, one of its benefits - the noise spectrum for perfusion imaging appears to be quite flat.

4. What’s good about low-pass filtering? Bad?

Low-pass filtering is much more controversial in MRI, and even in the face of mounting empirical evidence that it wasn't doing much good, the SPM group long offered some substantial and reasonable arguments in favor of it. The two big reasons offered in favor of low-pass filtering broke down as:

  1. The matched-filter theorem suggests filtering our timecourse with a filter shaped like an HRF should enhance signals of that shape relative to the noise, and
  2. We need to modify our general linear model to account for all the autocorrelation in fMRI noise; one way of doing that is by conditioning our data with a low-pass filter - essentially 'coloring' the noise spectrum, or introducing our own autocorrelations - and assuming that our introduced autocorrelation 'swamps' the existing autocorrelations, so that they can be ignored. (See BasicStatisticalModelingFaq for more on this.) This was a way of getting around early ignorance about the shape of the noise spectrum in fMRI and avoiding the computational burden of approximating the autocorrelation function for each model. Even as those burdens began to be overcome, Friston et. al (TemporalFilteringPapers) pointed out potential problems with pre-whitening the data as opposed to low-pass filtering, relating to potential biases of the analysis.

However, the mounting evidence demonstrating the failure of low-pass filtering, as well as advances in computation speed enabling better ways of dealing with autocorrelation, seem to have won the day. In practice, low-pass filtering seems to have the effect of greatly reducing one's sensitivity to detecting true activations without significantly enhancing the ability to reject false ones (see Skudlarksi et. al, Della-Maggiore et. al on TemporalFilteringPapers). The problem with low-pass filtering seems to be that because noise is not independent from timepoint to timepoint in fMRI, 'smoothing' the timecourse doesn't suppress the noise but can, in fact, enhance it relative to the signal - it amplifies the worst of the noise and smooths the peaks of the signal out. Simulations with white noise show significant benefits from low-pass filtering, but with real, correlated fMRI noise, the filtering because counter-effective. Due to these results and a better sense now of how to correctly pre-whiten the timeseries noise, low-pass filtering is now no longer available in SPM2, nor is it allowed by standard methods in AFNI or BrainVoyager.

5. How do you set your cutoff parameter?

Weeeeelll... this is one of those many messy little questions in fMRI that has been kind of arbitrarily hand-waved away, because there's not a good, simple answer for it. You'd to like to filter out as much noise as possible - particularly in the nasty part of the noise power spectrum where the noise power increases abruptly - without removing any important signal at all. But this can be a little trickier than it sounds. Roughly, a good rule of thumb might be to take the 'fundamental frequency' of your experiment - the time between one trial start and the next - and double or triple it, to make sure you don't filter out anything closer to your fundamental frequency.

SPM99 (and earlier) had a formula built in that would try and calculate this number. But if you had a lot of trial types, and some types weren't repeated for very long periods of time, you'd often get filter sizes that were way too long (letting in too much noise). So in SPM2 they scrapped the formula and now issue a default filter size of 128 seconds for everybody, which isn't really any better of a solution.

In general, default sizes of 100 or 128 seconds are pretty standard for most trial lengths (say, 8-45 seconds). If you have particularly short trials (less than 10 seconds) you could probably go shorter, maybe more like 60 or 48 seconds. But this is a pretty arbitrary part of the process. The upside is that it's hard to criticize an exact number that's in the right ballpark, so you probably won't get a paper rejected purely because your filter size was all wrong.