Why is torch.sparse.cuda() much faster than torch.cuda()?

The op torch.sparse.cuda() is much faster than torch.cuda(). Why?


I don’t think torch.cuda() is a function? Could you be clearer about what you are comparing, and how?

Thanks for your reply.



b.cuda() is much faster than a.cuda() even though a.size() is roughly equal to b.size().


On October 28, 2019 at 22:08, Alban D via PyTorch Forums noreply@discuss.pytorch.org wrote:


How many non-zero elements are in the sparse Tensor? The whole point of a sparse tensor is to store only the non-zero values, so there is potentially far less data to transfer to the GPU.
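To make this concrete, here is a minimal sketch that compares how many bytes a dense tensor and a sparse COO tensor of the same size actually hold. The names a and b follow the post above, but the shape and the three non-zero entries are made up for illustration; the real tensors from the question are not shown in this thread.

```python
import torch

# Hypothetical data: a 1000 x 1000 matrix with only 3 non-zero entries.
indices = torch.tensor([[0, 10, 999],    # row ids of the non-zero entries
                        [5, 20, 999]])   # column ids of the non-zero entries
values = torch.tensor([1.0, 2.0, 3.0])

# b is the sparse tensor, a is the dense tensor holding the same values.
b = torch.sparse_coo_tensor(indices, values, size=(1000, 1000)).coalesce()
a = b.to_dense()

def nbytes(t):
    """Bytes of element storage held by a dense (strided) tensor."""
    return t.numel() * t.element_size()

dense_bytes = nbytes(a)
# A COO tensor stores an index tensor plus a values tensor, nothing else.
sparse_bytes = nbytes(b.indices()) + nbytes(b.values())

print("same size:", a.size() == b.size())
print("dense bytes:", dense_bytes)
print("sparse bytes:", sparse_bytes)
```

Both tensors report the same size(), but the sparse one carries only its indices and values, so a copy to the GPU has much less to move.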

Hi, I have another question. When I call cuda() on a sparse tensor in different models, the same tensor takes a different amount of time (one model is based on nn.Module, the other is a C++ CUDA extension I wrote). What can affect the time?

I found the reason: cuda() is asynchronous.
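For anyone who hits the same thing: GPU work is queued asynchronously, so a plain timer around a copy can also include time spent waiting for earlier work that is still running, and that earlier work differs from model to model. A minimal sketch of a fairer measurement is below. The helper name timed_to_device is made up for this example, and it falls back to the CPU when no GPU is available so that it still runs.

```python
import time
import torch

def timed_to_device(t, device):
    """Copy t to device and return (copy, elapsed seconds).

    Hypothetical helper: it synchronizes before and after the copy so that
    only the copy itself is timed.
    """
    use_cuda = torch.device(device).type == "cuda"
    if use_cuda:
        torch.cuda.synchronize()  # let previously queued GPU work finish first
    start = time.perf_counter()
    out = t.to(device)
    if use_cuda:
        torch.cuda.synchronize()  # wait until the copy has really completed
    return out, time.perf_counter() - start

# Fall back to the CPU so the sketch runs on machines without a GPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
a = torch.zeros(1000, 1000)
moved, seconds = timed_to_device(a, device)
print(f"copy to {device} took {seconds:.6f} s")
```

Without the first synchronize(), the timer may charge the copy for kernels launched earlier by the surrounding model, which could explain why the same tensor appears to take different times in different programs.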

I have a kernel that needs one array of floats (for the input) and one array of ints (for the labels). How do I do that?


I’m not sure what the problem is. Just pass these as arguments? There is no restriction about types.
Do you have a code sample that shows what you’re trying to do?