Hi, I am using the nn.DataParallel module to run my Pegasus model on multiple GPUs.
The problem I am facing is that when DataParallel splits a batch across multiple GPUs, the paraphrases produced by PegasusForConditionalGeneration on each GPU do not come back with equal sequence lengths, since the output length depends on the input.
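For context, this is roughly the setup I am running (simplified; the checkpoint name, num_beams, and max_length values here are just examples, not my exact settings):

```python
import torch.nn as nn
from transformers import PegasusForConditionalGeneration, PegasusTokenizer

class GenerateWrapper(nn.Module):
    """Wraps generate() in forward() so nn.DataParallel can scatter inputs
    to each replica and gather the generated ids back."""
    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, input_ids, attention_mask):
        # Each replica generates independently, so the sequence length of its
        # output depends on its own slice of the batch.
        return self.model.generate(input_ids=input_ids,
                                   attention_mask=attention_mask,
                                   num_beams=4, max_length=60)

model_name = "tuner007/pegasus_paraphrase"  # example checkpoint
tokenizer = PegasusTokenizer.from_pretrained(model_name)
model = PegasusForConditionalGeneration.from_pretrained(model_name).cuda()
dp_model = nn.DataParallel(GenerateWrapper(model))

sentences = ["PyTorch makes multi-GPU inference convenient.",
             "The weather was unexpectedly pleasant today."] * 8
batch = tokenizer(sentences, padding=True, truncation=True,
                  return_tensors="pt").to("cuda")
out = dp_model(batch["input_ids"], batch["attention_mask"])  # fails in gather
```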
I cannot force the model to produce fixed-length output (by truncating longer sequences and padding shorter ones), because I do not want to cut off longer paraphrases or unnecessarily pad everything to a large fixed length.
For example, GPU:0 produces an output of length 48 while GPU:1 produces one of length 26, so the gather step fails with the error below. Is there a way to solve this problem?
/torch/nn/parallel/comm.py", line 235, in gather return torch._C._gather(tensors, dim, destination) RuntimeError: Input tensor at index 1 has invalid shape [15, 48], but expected [15, 26]