RapidPT
References
RapidPT is based on the work presented on this paper:
Accelerating Permutation Testing in Neuroimaging through Subspace Tracking: A new plugin for SnPM.
F. GutierrezBarragan, V.K. Ithapu, C. Hinrichs, C. Maumet, S.C. Johnson, T.E. Nichols, V. Singh.
Under Review
Table of Contents
Overview
Multiple hypothesis testing is a problem in neuroimaging studies. Permutation testing is a nonparametric method for estimating a threshold that can identify what brain regions that display statistically significant differences or activity. The computational burden of this method, however, for low thresholds and large datasets can be prohibitive.
RapidPT is a MATLAB toolbox for fast, reliable, hardware independent, permutation testing.
1. Fast: RapidPT has shown speedups ranging from 301000x faster than simple permutation testing implementations (left), and 240x faster than SnPM (right), a state of the art nonparametric testing toolbox in neuroimaging.
2. Reliable: RapidPT has been validated against SnPM and a simple permutation testing implementation. Three validation measurements were used: the KLDivergence between max null distributions, the corrected pvalues, and the resampling risk. Hundreds of validation runs have been done with various neuroimaging datasets composed from 50 up to 400 subjects. For more information on the performance of RapidPT refer to the reference paper.
3. Hardware Independent: It has been shown that with powerful enough hardware (highend GPUs or a cluster) and an efficient implementation, the permutation testing procedure can be spedup by orders of magnitude. These implementations rely on expensive hardware. RapidPT, however, takes advantage of the structure of the problem to speedup the algorithm, allowing it to be efficient even in regular workstations. Furthermore, the toolbox is able to leverage multicore environments when available.
A thorough analysis of the scenarios were RapidPT performs best is done in the paper.
Use cases
RapidPT can be used for the nonparametric statistical analysis of neuroimaging data. The permutation testing procedure modeled by RapidPT is a nonparametric combination of twosample ttest. Two sample ttest are typically used to determine if two population means are equal. In neuroimaging this procedure could be used in scenarios such as placebocontrol clinical trials or activation studies.
Setup
Simply clone the repository and add the path of the repository inside your program to be able to call the functions.
addpath('PATH_WHERE_YOU_CLONED_THE_REPOSITORY');
%% YOUR MATLAB PROGRAM GOES HERE.
If you don’t want to have the addpath
line in every program you make, you can have it in your startup.m
file for you MATLAB setup.
Usage
RapidPT only offer a function that performs Permutation Testing. It has no GUI, pre or post processing modules. We have prepared a plugin for SnPM that allows the user to take advantage of SnPM’s GUI, pre and post processing capabilities. Please refer to the Usage Within SnPM section.
There are two ways to use the core of RapidPT, either by calling the wrapper function TwoSampleRapidPT.m
or directly calling the core function RapidPT.m
. TwoSampleRapidPT
assigns some default inputs that have been extensively tested that produce an accurate recovery of the max null distribution and then calls RapidPT
. On the other hand if you call RapidPT
directly you will have to assign these parameters. Let’s first go through Example_TwoSampleRapidPT.m
:
Example_TwoSampleRapidPT.m

First add the path to where you cloned/downloaded the
RapidPT
repository, and also load the data you will be working with. The data matrix needs to be anNxV
, whereN
is the total number of subjects and V is the number of voxel statistics per subject.RapidPTLibraryPath = '.'; addpath(RapidPTLibraryPath); dataPath = 'PATH TO DATA'; load(dataPath);

Set number of permutations and the number of subjects in either group 1 or 2 (it does not matter which one you specify).
numPermutations = 5000; nGroup1 = 25; % You should what is the size of one of your groups prior.

Set
write
. If set to 0, outputs will only contain the constructed maximum null distribution. If set to 1, the outputs struct will contain the basis matrix,U
, and coefficient matrixW
.U*W
recover the permutation matrix. For an in depth explanation see the references. 
Call
TwoSampleRapidPT.m
.outputs
is a struct containingoutputs.MaxT
,outputs.U
, andoutputs.W
.timings
is a struct containing timing information of different part ofRapidPT
as well as the total timing.[outputs, timings] = TwoSampleRapidPT(Data, numPermutations, nGroup1, write, RapidPTLibraryPath);

Optionally save
outputs
andtimings
. 
Get the tthreshold estimate from the recovered maximum null distribution.
alpha_threshold = 1; % 1 percent t_threshold = GetTThresh(outputs.MaxT, alpha_threshold);

If you have the labels of the data available, you can calculate the twosample ttest and see compare the result of each voxel to the tthreshold and see which voxels exhibit statistically significant activity.
Example_RapidPT.m
Take a look at the header comments of RapidPT.m
and the comments in Example_RapidPT.m
to see how to directly call RapidPT.m
. It is recommended to use TwoSampleRapidPT.m
in order to avoid hyperparameter tuning.
Usage within SnPM
Prerequistes

SPM12  In order to be able to use RapidPT within SPM/SnPM you will need to have SPM12 setup (obviously). For an overview of how to install SPM please refer to their wiki. If you have spm setup, running
spm
in the MATLAB command line should launch a GUI such as the one shown in the section snpm usage. 
NiFTI  You will also need the NiFTI toolset. Make sure the NiFTI toolset path is added before you run SnPM.
addpath('NiFTI toolset path')
. 
Git (recommended)  The setup below uses git to clone the repositories. Instead of cloning them you can also download the zip files from the links given throughout the setup instructions.
SnPM + RapidPT Setup
Currently to use RapidPT within SnPM you will have to download the development version. To do this, go to wherever your SPM installation/folder is (mine is under my MATLAB folder) and do the following commands:
cd WHEREVER YOUR SPM DOWNLOAD IS
cd spm12/toolbox/
git clone https://github.com/nicholst/SnPMdevel
cd SnPMdevel/
SnPM has a flag that determine when RapidPT is used. Depending on the value of the flag SnPMdefs.RapidPT
RapidPT is:
 Always Used:
SnPMdefs.RapidPT = 2
.  Sometimes Used:
SnPMdefs.RapidPT = 1
. Used whennPerm >= 10000
.  Never Used:
SnPMdefs.RapidPT = 0
.
This flag should be set in snpm_defaults.m
on line 61. It is by default set to 0.
SnPM Usage
This would be a good time to read the important notes below.
Now that you have setup RapidPT within SnPM, SnPM will work very similar to before. Launch SPM,
spm
Now follow these steps:
 Go to SPM Menu window.
 Click on Batch and go to the batch window that just opened.
SPM GUI  SPM Batch 

 On the navigation bar click on SPM, then
tools/SnPM/Specify/2 Groups Two Sample T test; 1 scan per subject
.  Here you will be able to specifiy a folder where you want your outputs to be (
Analysis Directory
), your input data (.nii images of group1 and group2), and also the number of permutations you want to do.  Click the green run button. This creates an SnPM config file in the path where you want your outputs to be. This step should take a few seconds only.
 Go to SPM navigation bar again, then
Tools/SnPM/Compute
 Set the SnPM cfg file to the one you just made by clicking on the run button.
 Click the green run button again, and now SnPM will run with RapidPT.
 Once you are done, go to the directory that you selected as your
Analysis Directory
and look at the outputs.
Postprocessing and Outputs
Once you are done, inside your analysis
directory you will find a folder called outputs
. This folder contains the results. If you follow the results section of the (SnPM tutorial)[http://www2.warwick.ac.uk/fac/sci/statistics/staff/academicresearch/nichols/software/snpm/man/ex] you can see an example of the postprocessing capabilities of SnPM. The most important output files in the analysis directory will be:
SnPM.mat
: Refer to snpm_cp.m for a thorough explanations of the contents of this output file. But this file contains one of the objects of interest the variableMaxT
, which is an nPerm x 2 matrix of [max;min] tstatistic i.e it contains the Max null distribution.SnPMt.mat
: This is the resulting test statistic calculated using the original labels of the data.XYZ.mat
: This matrix has the x, y, z coordinates associated to each voxel.params.mat
: This structure contains the following parameters of the permutation testing run: nPerm (number of permutations), N (number of subjects), V (number of voxels after preprocessing), xdim, ydim, zdim.timings.mat
: Contains some timing from rapidpt.SnPMucp.mat
: Contains a 1 x NumVoxels matrix of the nonparametric P values of the statistic of interest supplied for all voxels at locations XYZ.
The following plot is a histogram of the Maxnull distribution in MaxT and a tthreshold associated to an alpha=0.05.
Improtant Notes (PLEASE READ BEFORE USING):

RapidPT is only available for Two Sample ttest right now because it is the procedure that has been extensively validated and benchmarked. Regular SnPM should run if you try running SnPM with any other tests.

For a thorough analysis of the ideas and the RapidPT algorithm please refer to the references
Code Organization
RapidPT
RapidPT.m
This is the core of RapidPT. This is where the main algorithm and math ideas described in the NIPS paper happen.
TwoSampleRapidPT.m
This is a wrapper of the core. This function will assign most of the hyperparameters that can be given to RapidPT.m
for you. The hyperparameters chosen have been extensively tested, and some of them are derived from the data dimensions and number of permutations chosen.
Example_TwoSampleRapidPT.m
This is an example script that uses TwoSampleRapidPT.m
wrapper program.
Example_RapidPT.m
This is an example script that directly uses RapidPT.m
. You will note that a lot more hyperparameters need to be passed to RapidPT.m
compared to TwoSampleRapidPT.m
.
TwoSampleGetLabelsMatrices.m
This is a function that given: numPermutations
(Number of Permutations to be done), N
(total number of subjects), nGroup1
(The number of subjects in group1), returns the labels for each subject in each group that will be used at each permutation.
include/
The include/
folder contains the library grasta. This library contains the online matrix completion method we use to accelerate permutation testing. For more information about grasta you can refer to the project website.
outputs/ & timings/
These are the default directories used to output the resulting maxnull distribution, and timings of different sections of the algorithm. If the flag inputs.write
is set to 1
the lowrank basis, U
, and the coefficient matrix, W
, that can recover the permutation matrix will also be saved.
util/
This directory contains various utility functions used by RapidPT for input validation and postprocessing. Separating these functions from the main code makes TwoSampleRapidPT.m
more concise.
Related Work
The main theorem motivating this work was presented in this conference paper:
C. Hinrichs, V. K. Ithapu, Q. Sun, S. C. Johnson, V. Singh.
Speeding up Permutation Testing in Neuroimaging
Advances in Neural Information Processing Systems (NIPS), 2013