How to set the --ntasks or --ntasks-per-node argument in Slurm for multi-node distributed training?

I want to run the official video classification script here.

I am not sure how to correctly set the SBATCH arguments --ntasks or --ntasks-per-node when I want to run this script on 2 nodes with 8 V100 GPUs each under Slurm.

Should it be --ntasks 16, or --ntasks-per-node 8? Any advice? Thanks.
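For context, here is a sketch of the batch script I have in mind, assuming one task per GPU (so 8 tasks per node across 2 nodes gives 16 tasks total). The partition name, conda environment, and script path are placeholders:

```shell
#!/bin/bash
#SBATCH --job-name=video-cls
#SBATCH --nodes=2                 # two machines
#SBATCH --ntasks-per-node=8       # one task per GPU on each node
#SBATCH --gres=gpu:8              # request all 8 V100s per node
#SBATCH --cpus-per-task=4         # CPU workers for each data loader
#SBATCH --partition=gpu           # placeholder: cluster-specific partition name

# srun launches ntasks-per-node * nodes = 16 processes in total;
# each process reads SLURM_PROCID / SLURM_NTASKS to set up
# torch.distributed (rank and world size).
srun python train.py --batch-size 32
```

My understanding is that --ntasks-per-node 8 together with --nodes 2 is equivalent to --ntasks 16, but the per-node form makes the one-process-per-GPU layout explicit. Is that the recommended way?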