How can I implement an environment that runs purely on the GPU?

ShawnGuo · February 27, 2019, 2:02pm

I was wondering how I can implement an environment purely on the GPU. Say, if all the variables in the environment class are torch.Tensor objects, will they stay on the GPU at run time?

Take the following environment for example:

import torch

class ENV_GPU(object):
    def __init__(self, a=3):
        self.num = torch.zeros((a,), dtype=torch.int8)

    def step(self, action):
        self.num[action] += 1

Kushaj · February 27, 2019, 2:13pm

ENV_GPU().to(torch.device('cuda'))

It will move your ENV_GPU object to the GPU.

ShawnGuo · March 2, 2019, 1:20pm

Thanks for your answer. However, since ENV_GPU has no method called “to”, an AttributeError is raised.

ptrblck · March 2, 2019, 1:39pm

.to() is implemented in nn.Module.
If you derive your class from nn.Module and define step as forward, it should work:

import torch
import torch.nn as nn

class ENV_GPU(nn.Module):
    def __init__(self, a=3):
        super(ENV_GPU, self).__init__()
        num = torch.zeros((a,), dtype=torch.int8)
        # Registering num as a buffer lets .to() move it with the module
        self.register_buffer('num', num)

    def forward(self, action):
        self.num[action] += 1

model = ENV_GPU()
model.to('cuda')
model(0)
print(model.num)
> tensor([1, 0, 0], device='cuda:0', dtype=torch.int8)

ShawnGuo · March 2, 2019, 1:48pm

Thanks for your answer. May I ask what self.register_buffer('num', num) is for?

ptrblck · March 2, 2019, 1:52pm

The attributes of a module will be moved to the device if they are registered as buffers (i.e. they don’t need gradients) or as an nn.Parameter (i.e. they should be updated).
Since you defined num as torch.int8, an nn.Parameter won’t work, as only floating point tensors can require gradients.
If you just assign num as a plain attribute, self.num = torch.zeros((a,), dtype=torch.int8), it won’t be moved to the device.
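
The difference can be checked directly. The following is a minimal sketch rather than anything from the posts above: DemoEnv and the attribute name plain are made up for illustration, and the script falls back to the CPU when no GPU is available.

```python
import torch
import torch.nn as nn


class DemoEnv(nn.Module):  # hypothetical class, for illustration only
    def __init__(self, a=3):
        super().__init__()
        # Registered buffer: follows the module in .to() and is saved in state_dict.
        self.register_buffer('num', torch.zeros((a,), dtype=torch.int8))
        # Plain tensor attribute: the module does not track it.
        self.plain = torch.zeros((a,), dtype=torch.int8)


device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
env = DemoEnv().to(device)

buffer_names = [name for name, _ in env.named_buffers()]
state_keys = list(env.state_dict().keys())
print(buffer_names)      # only the registered buffer is listed
print(env.num.device)    # moved together with the module
print(env.plain.device)  # stays on the CPU, even after .to('cuda')

# An integer tensor cannot be wrapped in nn.Parameter with the default
# requires_grad=True, because integer tensors cannot require gradients.
try:
    nn.Parameter(torch.zeros((3,), dtype=torch.int8))
    int_parameter_ok = True
except RuntimeError:
    int_parameter_ok = False
print(int_parameter_ok)
```

On a machine with a GPU, the two device printouts should differ, which is the behavior described above.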

ShawnGuo · March 2, 2019, 1:54pm

OK, I see. Thanks very much for your help.

ShawnGuo · March 2, 2019, 1:58pm

By the way, what if I need to calculate some intermediate results during every step? How can I make sure that no data is moved between the CPU and the GPU? Take the following for example:

    def forward(self, action):
        self.knapsack_num += self.action2shift[action]
        self.knapsack_num = torch.clamp(self.knapsack_num, max=self.knapsack_max)
        reward = torch.zeros((1,), dtype=torch.float)
        if action == self.num_food_types:
            if torch.equal(self.expected_num, self.knapsack_num + self.warehouse_num):
                reward = 100 * torch.ones((1,), dtype=torch.float)
            else:
                reward = -100 * torch.ones((1,), dtype=torch.float)
            return self.knapsack_num, reward, True
        else:
            return self.knapsack_num, reward, False

ptrblck · March 2, 2019, 2:01pm

If you would like to create new tensors inside forward, you should pass the device explicitly, taking it from an already registered tensor:

...
reward = torch.zeros((1,), dtype=torch.float, device=self.num.device)
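
As a sketch of how this might look in a complete module (CounterEnv, count and target are hypothetical names, not taken from the code above), every tensor created inside forward takes its device from a registered buffer, so the same code runs unchanged on CPU or GPU:

```python
import torch
import torch.nn as nn


class CounterEnv(nn.Module):  # hypothetical environment, for illustration only
    def __init__(self, a=3, target=2):
        super().__init__()
        self.register_buffer('count', torch.zeros((a,), dtype=torch.int64))
        self.target = target  # plain Python int, no device involved

    def forward(self, action):
        self.count[action] += 1
        # New tensors are created directly on the buffer's device.
        reward = torch.zeros((1,), dtype=torch.float, device=self.count.device)
        # Elementwise comparison and torch.where keep the result on that device.
        reached = (self.count[action] >= self.target).reshape(1)
        reward = torch.where(reached, torch.full_like(reward, 100.0), reward)
        return self.count, reward


device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
env = CounterEnv().to(device)
_, first_reward = env(0)
_, second_reward = env(0)
print(first_reward.device, second_reward)
```

One caveat worth keeping in mind: calls that return a Python value, such as torch.equal or .item(), have to copy their result back to the host, so a Python if statement that branches on them still synchronizes with the GPU.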

ShawnGuo · March 2, 2019, 2:08pm

Wow, I see, this is really convenient. Are “buffers” like “no update required” parameters in an nn.Module?

ptrblck · March 2, 2019, 2:10pm

Yeah, basically they are just tensors registered with the module, so that they will be moved to the host or device and saved in the state_dict in case you would like to serialize your model.
The running estimates of nn.BatchNorm layers are a good example. While they don’t need gradients to be updated, they should still be moved with the layer and saved to disk.
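
The BatchNorm case can be inspected directly. A small sketch, assuming nn.BatchNorm1d with 4 features (the sizes and the random input are arbitrary choices for illustration):

```python
import torch
import torch.nn as nn

bn = nn.BatchNorm1d(4)

buffer_names = sorted(name for name, _ in bn.named_buffers())
param_names = sorted(name for name, _ in bn.named_parameters())
print(buffer_names)  # running statistics are buffers
print(param_names)   # weight and bias are parameters

# Buffers do not require gradients, yet a forward pass in training mode updates them.
mean_before = bn.running_mean.clone()
bn(torch.randn(8, 4) + 5.0)
mean_changed = not torch.equal(mean_before, bn.running_mean)

# Both parameters and buffers end up in the state_dict.
state_keys = sorted(bn.state_dict().keys())
print(mean_changed, state_keys)
```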

ShawnGuo · March 2, 2019, 2:13pm

Cool! Thanks very much for your help.