Pipeline Cluster Configuration
Below are configuration options that may require customization if a study is particularly long or complex. Please note that the run time and memory allocations of all of the pipeline processes were optimized through significant empirical study. As such, modifications to these parameters may dramatically increase queuing time.
Some options have been left out because modifying them is too complex or because they are too integral to the pipeline to be changed. If you still wish to change an option not found on this page, please either contact us or file an issue.
job_name & mfile_name
These fields support string swapping with batch_hfn and batch_dfn. See Batch Context for instructions on string swapping.
Passthrough Options
Options in this field are arguments to be passed directly to either srun or sbatch without abstraction.
Memory & Time
Memory and time (time_limit) are fields that support expressions to scale the number of minutes/hours or megabytes/gigabytes requested for a given job based on the number of samples and number of channels in a data file.
Expressions contain the variables s and c for samples and channels respectively. The expressions must be valid Matlab expressions that result in a scalar, and they must also end in a scaling factor.
For memory the expressions must end in either
- m for megabytes
- g for gigabytes
For time_limit the expressions must end in either
- s for seconds
- m for minutes
- h for hours
Example:
3 + c* 0.1g
The above expression will request 3 gigabytes of memory plus 0.1 gigabytes for each channel in the data.
Please note that this is too much memory and is written for example purposes only.
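To make the suffix rule concrete, the sketch below shows one way such an expression could be split into a numeric part and a unit. It is an illustration only, not the pipeline's own parser: the helper name evaluate_expression and the sample and channel counts are hypothetical, and it only handles simple arithmetic that reads the same in Matlab and Python.

```python
# Hypothetical helper, not part of the pipeline: it mimics how a memory or
# time_limit expression could be split into a numeric part and a unit suffix.
MEMORY_UNITS = {"m": "megabytes", "g": "gigabytes"}
TIME_UNITS = {"s": "seconds", "m": "minutes", "h": "hours"}


def evaluate_expression(expression, samples, channels, units):
    """Return (value, unit_name) for an expression such as '3 + c* 0.1g'."""
    text = expression.strip()
    suffix = text[-1]  # the final character is the scaling factor
    if suffix not in units:
        raise ValueError(f"expression must end in one of {sorted(units)}")
    body = text[:-1]
    # s and c are the sample and channel counts; builtins are disabled so
    # that only plain arithmetic on these two variables is evaluated.
    value = eval(body, {"__builtins__": {}}, {"s": samples, "c": channels})
    return float(value), units[suffix]


# Assumed data file: 500000 samples and 128 channels.
memory = evaluate_expression("3 + c* 0.1g", 500000, 128, MEMORY_UNITS)
print(memory)
```

With 128 channels, the example expression works out to roughly 15.8 gigabytes, which shows how quickly a per-channel term grows on high-density recordings.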
num_processors
The number of tasks (processes) to request. This sometimes matches the number of processors.
For some jobs the number of processors can be increased. Do not do this for any of the Octave jobs, as they will not benefit from more processors and the change will result in longer queue times and wasted CPU hours. For Amica jobs, additional processors will improve runtime performance. However, be aware that more processors will result in longer queue times, so it is critical to balance the number of processors against the total job time (queue time + runtime). Parallel jobs that communicate between tasks have overhead associated with them, so in most cases there will be diminishing returns as you increase the number of processors on a parallel job.
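The diminishing returns described above can be illustrated with Amdahl's law, a standard simplified model of parallel speedup. This is a generic model, not a measurement of Amica or of this pipeline, and the parallel fraction of 0.9 is an assumed value chosen only for illustration.

```python
def amdahl_speedup(parallel_fraction, processors):
    """Speedup predicted by Amdahl's law for a given number of processors."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


def parallel_efficiency(parallel_fraction, processors):
    """Speedup per processor; 1.0 would mean perfect scaling."""
    return amdahl_speedup(parallel_fraction, processors) / processors


# Assumption for illustration: 90% of the job's work can run in parallel.
PARALLEL_FRACTION = 0.9

for n in (1, 2, 4, 8, 16):
    speedup = amdahl_speedup(PARALLEL_FRACTION, n)
    efficiency = parallel_efficiency(PARALLEL_FRACTION, n)
    print(f"{n:2d} processors: speedup {speedup:.2f}, efficiency {efficiency:.2f}")
```

Under this assumption each added processor contributes less than the previous one, and the speedup can never exceed 10 no matter how many processors are requested. Since larger requests also tend to queue longer, the processor count that minimizes total job time is often well below the maximum available.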