Job Resource Limits & Prioritization
Resource Limits
Time Limits
Set a realistic time limit for your job. The limit should be a little longer than you expect each Slurm job step to run; once the specified time limit is reached, the job and any running steps will be killed.
If you set a very long time limit, your job may wait in the queue longer than necessary. Jobs with shorter time limits (roughly 10 minutes to 2 hours) can be backfilled into scheduling gaps as resources become available, so they start sooner.
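For example, a minimal batch script that requests a modest time limit might look like the sketch below; the 30-minute value, job name, and command are placeholders, not site recommendations.

```bash
#!/bin/bash
#SBATCH --time=00:30:00       # wall-clock limit: a little longer than the expected runtime
#SBATCH --job-name=example    # placeholder job name

srun ./my_program             # placeholder for your own command
```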
If your job steps take a long time to run, consider refactoring your code to break your workload into smaller job steps. For example, split a "for" loop into individual job steps, or chain long serial jobs with the Slurm dependency flag (see the sketch below). Conversely, if your job steps run for less than 10 minutes, the scheduler's overhead in starting each step can make your job take longer overall; in that case, consider adding a for loop to run multiple iterations serially within a single job step.
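As a sketch of the dependency approach, a long serial workflow can be split into separate jobs chained with `--dependency`; the stage script names and one-hour limits below are hypothetical.

```bash
# Submit the first stage and capture its job id (--parsable prints only the id).
jobid=$(sbatch --parsable --time=01:00:00 stage1.sh)

# Start the second stage only after the first completes successfully.
sbatch --time=01:00:00 --dependency=afterok:${jobid} stage2.sh
```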
Memory Allocations
Set realistic memory allocations. If a job step exceeds your memory allocation it will be automatically killed, so be sure to allow some overhead. If an unnecessarily large allocation is set, the job may wait in the queue longer than necessary, and it will tie up resources that are never actually used by the job.
Unfortunately, there is no good way to determine ahead of time how much memory your job will consume, so some trial and error is necessary. Try running a single iteration of your job, then use sacct-detail to see how much memory was used.
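The allocation itself is typically set with sbatch's `--mem` option. To see what a trial run actually used, you can query Slurm's accounting data; the sketch below uses the standard sacct command (the site's sacct-detail tool may format this differently), with a placeholder job id.

```bash
# Peak resident memory (MaxRSS) and requested memory (ReqMem) for each step
# of job 123456; replace the id with your own.
sacct -j 123456 --format=JobID,JobName,Elapsed,ReqMem,MaxRSS,State
```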
Job Prioritization
The cluster queue is designed with two goals:
- Ensure that everyone has equal access to resources
- Minimize idle resources
These goals are inherently in conflict. Consider these scenarios:
- To ensure equal access, we could impose limits on the resources allocated to each user, for example limiting each job to a fixed percentage of the cluster's CPU cores. However, this method would result in idle resources when the cluster is underutilized, unnecessarily prolonging job runtimes.
- On the other hand, if no limits are imposed on resource allocation, then the first person to run a job when the cluster is idle could monopolize cluster usage for an extended period.
To balance these competing objectives, we try to ensure equal access to cluster resources over a long period of time (i.e. weeks, rather than hours or days). This is achieved by using several factors to determine job priority (a sketch for inspecting them on your own jobs follows this list):
- Queue Wait Time: Jobs that have been waiting in the queue longer have higher priority.
- Job Size: Smaller jobs have slightly higher priority.
- Recent Cluster Usage: Resource consumption is tracked over time for each user. As a user consumes more resources, their job priority decreases. Conversely, when a user's account is idle, their priority gradually increases.
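As a sketch, Slurm's sprio and sshare commands can show how these factors apply to your own jobs and account, assuming they are enabled on this cluster:

```bash
# Per-factor priority breakdown (age, fair-share, job size, ...) for your pending jobs.
sprio -u $USER -l

# Your account's recent usage and resulting fair-share value.
sshare -u $USER
```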
In addition, our Partitions are designed to reserve some capacity for jobs that consume less than 30GB of memory, or that run for less than 12 hours in the short partition.
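For example, a job that fits within those thresholds might be submitted like the sketch below; the partition name comes from this page, while the time, memory, and command are placeholders.

```bash
#!/bin/bash
#SBATCH --partition=short     # short partition described above
#SBATCH --time=06:00:00       # under the 12-hour threshold
#SBATCH --mem=16G             # under the 30GB threshold

srun ./my_analysis            # placeholder for your own command
```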