I need to run some operations on tensors, such as torch.bmm, torch.eig and torch.ger. For some reason, if I move the tensors to the CPU and do the calculations there, they run faster than on the GPU. Is this expected?
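There are a number of cases where running on the CPU is faster than running on the GPU. A common one is when the inputs are small: the fixed cost of launching GPU kernels and moving data between host and device can then outweigh the time spent on the computation itself. How large are your inputs?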
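One way to check this on your own machine is to time the same operation at a small and a large size on each device. The snippet below is only a rough sketch: `time_matmul` is a hypothetical helper, the sizes 64 and 1024 are arbitrary, and it times `torch.mm` rather than your actual operations. GPU kernels run asynchronously, so it calls `torch.cuda.synchronize()` before reading the clock.

```python
import time

import torch


def time_matmul(n, device, repeats=10):
    """Average seconds per (n x n) @ (n x n) product on the given device."""
    x = torch.randn(n, n, device=device)
    y = torch.randn(n, n, device=device)

    torch.mm(x, y)  # warm-up so one-time setup cost is not measured
    if device.type == "cuda":
        torch.cuda.synchronize()

    start = time.perf_counter()
    for _ in range(repeats):
        torch.mm(x, y)
    if device.type == "cuda":
        # Wait for queued kernels to finish before stopping the clock.
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / repeats


devices = [torch.device("cpu")]
if torch.cuda.is_available():
    devices.append(torch.device("cuda"))

for n in (64, 1024):  # arbitrary "small" and "large" sizes
    for device in devices:
        seconds = time_matmul(n, device)
        print(f"n={n:5d} device={device.type:4s} {seconds * 1e3:.3f} ms")
```

If a CUDA device is present, comparing the two rows per size shows where the crossover sits on your hardware. The numbers depend heavily on the CPU, GPU and PyTorch build.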
I am running these operations with a batch size of 100 on tensors of size (784, 300), (300, 100) and (100, 10). Mostly, though, I work on the average over the batch: I use torch.mean to average the tensors over the batch dimension and then apply torch.bmm, torch.ger and torch.eig to the resulting tensors of size (784, 300), (300, 100) and (100, 10).
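For reference, here is a minimal sketch of the averaging step as I understand it from the description above. The random inputs, the variable names and the particular product being computed are assumptions for illustration only, not the actual code. Note that torch.bmm expects 3-D inputs, so the averaged 2-D matrices need a leading batch dimension of size 1 (or torch.mm / `@` can be used instead).

```python
import torch

batch = 100

# Hypothetical per-sample tensors with the shapes mentioned above.
a = torch.randn(batch, 784, 300)
b = torch.randn(batch, 300, 100)
c = torch.randn(batch, 100, 10)

# Average over the batch dimension; each result is a single 2-D matrix.
a_mean = a.mean(dim=0)  # (784, 300)
b_mean = b.mean(dim=0)  # (300, 100)
c_mean = c.mean(dim=0)  # (100, 10)

# torch.bmm requires 3-D tensors, so add a batch dimension of size 1.
ab = torch.bmm(a_mean.unsqueeze(0), b_mean.unsqueeze(0)).squeeze(0)  # (784, 100)

# The same kind of product written with the 2-D matmul operator.
abc = ab @ c_mean  # (784, 10)

print(a_mean.shape, ab.shape, abc.shape)
```

After averaging, each operation works on a single matrix with at most a few hundred rows and columns, which is the small-input situation where the CPU can come out ahead.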