NLP torchtext multiple CUDA devices - the best strategy for a large imbalanced dataset


I am working on a multiclass classification problem with a large imbalanced dataset. Over the next few weeks, I will have the opportunity to use 4 CUDA devices. The structure of the dataset is simple: TEXT, LABEL.
What is the best strategy for using torchtext in such a situation? Unfortunately, I could not find an example of processing on multiple CUDA devices with a dataset like this (large and imbalanced, about 20 GB).

Thanks in advance to everyone for any help and links to working examples.

You should be able to use DistributedDataParallel to utilize all GPUs in your model training. I’m not aware of any torchtext-specific limitations for this.
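A minimal sketch of what wrapping a model in DistributedDataParallel looks like. To keep it self-contained it runs a single CPU process with the gloo backend; the linear model, the port, and the tensor sizes are placeholders, not from this thread. With 4 GPUs you would launch one process per device (e.g. via torchrun) and use the nccl backend instead.

```python
import os
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

# Single-process setup so the wrapping mechanics can be shown on CPU.
# With 4 GPUs: spawn 4 processes (e.g. torchrun --nproc_per_node=4) and
# use init_process_group("nccl", ...) with one device per process.
os.environ.setdefault("MASTER_ADDR", "localhost")
os.environ.setdefault("MASTER_PORT", "29500")
dist.init_process_group("gloo", rank=0, world_size=1)

# Hypothetical stand-in for a text classifier: 100-dim features -> 5 classes.
model = nn.Linear(100, 5)
ddp_model = DDP(model)

# Forward pass on a dummy batch of 8 samples; gradients are synchronized
# across processes automatically during backward().
out = ddp_model(torch.randn(8, 100))

dist.destroy_process_group()
```

In the multi-GPU case each process would also give its DataLoader a DistributedSampler so that every rank sees a different shard of the dataset.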

@ptrblck Thank you for your reply.
I was trying to find an example of multiclass classification on multiple CUDA devices, but without success. Do you know of any such examples?

Kind regards

You could start by creating a training script for the multi-class classification on a single device and then add the distributed training on top of it, e.g. using this tutorial.
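A single-device starting point might look like the sketch below. It uses a WeightedRandomSampler to oversample rare classes, which is one common way to handle the imbalance; the toy tensors, layer sizes, and class frequencies are made up for illustration (in the real script the features would come from the tokenized TEXT field and the targets from LABEL).

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset, WeightedRandomSampler

# Toy stand-in data: 1000 samples, 80-dim features, 5 imbalanced classes.
torch.manual_seed(0)
features = torch.randn(1000, 80)
class_probs = torch.tensor([0.70, 0.15, 0.08, 0.05, 0.02])
labels = torch.multinomial(class_probs, 1000, replacement=True)

# Oversample rare classes: each sample is weighted by its inverse class frequency.
class_counts = torch.bincount(labels, minlength=5).float()
sample_weights = 1.0 / class_counts[labels]
sampler = WeightedRandomSampler(sample_weights,
                                num_samples=len(labels),
                                replacement=True)

loader = DataLoader(TensorDataset(features, labels),
                    batch_size=64, sampler=sampler)

# Placeholder classifier; swap in your embedding + RNN/transformer model.
model = nn.Sequential(nn.Linear(80, 64), nn.ReLU(), nn.Linear(64, 5))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

model.train()
for x, y in loader:          # one epoch over the resampled data
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
```

Once this trains correctly on one device, the linked tutorial shows how to add the process-group setup, wrap the model in DistributedDataParallel, and shard the data across ranks.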