How to write compact state initialization?

My code is inside a class method and, after two ifs, it looks like this:

        if self.is_cuda:
            states = (torch.autograd.Variable(
                torch.zeros(self.model.num_layers, batch_size,
                            self.model.hidden_size)).cuda(),
                      torch.autograd.Variable(
                torch.zeros(self.model.num_layers, batch_size,
                            self.model.hidden_size)).cuda())
            inputs = torch.autograd.Variable(
                torch.from_numpy(context)).cuda()
        else:
            states = (torch.autograd.Variable(
                torch.zeros(self.model.num_layers, batch_size,
                            self.model.hidden_size)),
                      torch.autograd.Variable(
                torch.zeros(self.model.num_layers, batch_size,
                            self.model.hidden_size)))
            inputs = torch.autograd.Variable(
                torch.from_numpy(context))

which I find really awful. How can I refactor it into something more compact?

EDIT: I am initializing the states of LSTMs for language models.

Maybe states could be (None, None)?

If you use nn.LSTM and you pass states = None, it will automatically initialise the hidden state (both h_0 and c_0) to zeros.
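A minimal sketch of that behaviour (the layer sizes below are arbitrary placeholders, and it assumes a recent PyTorch where plain tensors work without Variable): passing None, or simply omitting the hidden-state argument, makes nn.LSTM build zero h_0 and c_0 of shape (num_layers, batch, hidden_size) by itself.

    import torch
    import torch.nn as nn

    lstm = nn.LSTM(input_size=10, hidden_size=20, num_layers=2)
    inputs = torch.zeros(5, 3, 10)  # (seq_len, batch, input_size)

    # Explicit None and the no-argument call are equivalent:
    # the LSTM creates zero h_0 and c_0 on its own.
    output, (h_n, c_n) = lstm(inputs, None)
    output, (h_n, c_n) = lstm(inputs)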

How can it know whether it will execute on CUDA, then?

It creates the hidden state using input.data.new(), which basically makes a new tensor of the same type and on the same device as input.data.

So if your LSTM input is on CUDA, then the hidden state will be created on CUDA too.
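To make that concrete, here is a hedged sketch of how the snippet from the question could collapse; a plain nn.LSTM stands in for self.model and the sizes are made up. Only inputs needs the explicit .cuda() move, and the hidden state follows it automatically.

    import numpy as np
    import torch
    import torch.nn as nn

    model = nn.LSTM(input_size=8, hidden_size=16, num_layers=2)  # stand-in for self.model
    context = np.random.randn(5, 3, 8).astype(np.float32)

    inputs = torch.autograd.Variable(torch.from_numpy(context))
    if torch.cuda.is_available():  # plays the role of self.is_cuda
        model, inputs = model.cuda(), inputs.cuda()

    # No explicit states: the LSTM builds h_0/c_0 from input.data.new(),
    # so they end up on the same device and of the same type as inputs.
    output, (h_n, c_n) = model(inputs, None)
    print(h_n.is_cuda == inputs.is_cuda)  # True on both CPU and GPU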