From RAM to GRAM

Dear friends,
I am using PyTorch mainly for linear algebra tasks, and I have a question about transferring slices of a tensor from RAM to GPU memory. For example:

flag = 2
for i in range(40):
    if flag == 2:    # first iteration: just load the first chunk
        tmp1[:] = A[i*1000:(i+1)*1000, :]
        flag = 1
    elif flag == 1:  # accumulate tmp1, load the next chunk into tmp2
        c = c + tmp1
        tmp2[:] = A[i*1000:(i+1)*1000, :]
        flag = 0
    elif flag == 0:  # accumulate tmp2, load the next chunk into tmp1
        c = c + tmp2
        tmp1[:] = A[i*1000:(i+1)*1000, :]
        flag = 1

The A tensor is very big and lives in RAM; the tmp1, tmp2, and c tensors are in GPU memory. I want to load parts of A to the GPU and do some calculations with them; as you can see, I want to split the calculation part from the loading part. The question is: does this happen asynchronously automatically, or do I need additional configuration?

https://pytorch.org/tutorials/intermediate/pinmem_nonblock.html#a-guide-on-good-usage-of-non-blocking-and-pin-memory-in-pytorch

This tutorial would interest you.
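In short: no, not automatically. A copy from pageable CPU memory synchronizes, and even a `non_blocking=True` copy on the default stream is still ordered with respect to your compute kernels. To actually overlap transfer and compute you need pinned host memory, `non_blocking=True`, and a separate copy stream. Here is a minimal sketch of the double-buffering pattern from your question under those assumptions; the sizes and names are illustrative, and it falls back to plain synchronous copies when no GPU is present:

```python
import torch

# Double-buffered host-to-device transfer overlapped with compute.
# Names (A, bufs, c) follow the question; sizes are illustrative.
n_chunks, rows, cols = 8, 1000, 64
A = torch.randn(n_chunks * rows, cols)

use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")

if use_cuda:
    # Pinned (page-locked) memory is what makes non_blocking=True
    # copies truly asynchronous; from pageable memory they synchronize.
    A = A.pin_memory()

bufs = [torch.empty(rows, cols, device=device) for _ in range(2)]
c = torch.zeros(rows, cols, device=device)

# Copies go on their own stream so they can overlap with compute on the
# default stream; events order buffer reuse between the two streams.
copy_stream = torch.cuda.Stream() if use_cuda else None
ready = [torch.cuda.Event() for _ in range(2)] if use_cuda else None  # copy done
freed = [torch.cuda.Event() for _ in range(2)] if use_cuda else None  # compute done

def load(i):
    """Issue the (asynchronous) copy of chunk i into its staging buffer."""
    src = A[i * rows:(i + 1) * rows]
    if use_cuda:
        with torch.cuda.stream(copy_stream):
            # Don't overwrite the buffer until compute is done reading it.
            copy_stream.wait_event(freed[i % 2])
            bufs[i % 2].copy_(src, non_blocking=True)
            ready[i % 2].record(copy_stream)
    else:
        bufs[i % 2].copy_(src)

load(0)  # prefetch the first chunk
for i in range(n_chunks):
    if i + 1 < n_chunks:
        load(i + 1)                  # start the next transfer...
    if use_cuda:
        torch.cuda.current_stream().wait_event(ready[i % 2])
    c += bufs[i % 2]                 # ...while accumulating the current chunk
    if use_cuda:
        freed[i % 2].record()

# Verify against a plain reduction on the host.
expected = A.reshape(n_chunks, rows, cols).sum(dim=0)
print(torch.allclose(c.cpu(), expected.cpu(), atol=1e-3))  # prints True
```

The `freed`/`ready` events are the part people usually forget: without them the copy stream can overwrite a buffer the compute stream is still reading. The tutorial above covers the pin-memory and non-blocking details in depth.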
