Simple RNN example not working

Hello,

I was trying to learn the RNN interfaces that libtorch provides. Sadly, simple code like the following throws a runtime error :frowning:

#include <torch/torch.h>
int main(int /*argc*/, char* /*argv*/[]) {
  // Use GPU when present, CPU otherwise.
  torch::Device device(torch::kCPU);
  if (torch::cuda::is_available()) {
    device = torch::Device(torch::kCUDA);
    std::cout << "CUDA is available! Training on GPU." << std::endl;
  }

  torch::nn::Sequential time_serie_detector(
      torch::nn::RNN(torch::nn::RNNOptions(1, 10).dropout(0.2).layers(2).tanh()));
  time_serie_detector->to(device);
  std::cout << time_serie_detector << std::endl;

  auto x = torch::ones(1).toBackend(c10::Backend::CUDA);
  auto a = torch::ones(10).toBackend(c10::Backend::CUDA);
  std::cout << "x = " << x << std::endl;
  std::cout << "a = " << a << std::endl;

  time_serie_detector->forward(x, a);
  time_serie_detector->zero_grad();

  return 0;
}

The error is as follows:

terminate called after throwing an instance of 'c10::IndexError'
  what():  Dimension out of range (expected to be in range of [-1, 0], but got 2) (maybe_wrap_dim at ../../c10/core/WrapDimMinimal.h:20)

I am pretty new to libtorch and pytorch in general. I guess I made some terribly stupid mistakes here… It would be nice if someone could help me out.

By the way, I was trying to find a C++ example using torch::nn::RNN online but failed to find any. I would really appreciate some existing examples.

Cheers

I think you are not passing in the right shapes. RNNs expect a seq_len x batch x feature input and produce an output of that shape (with the number of features potentially different). The hidden state has shape num_layers x batch x hidden_size (or a tuple of two such tensors for an LSTM).
Bidirectional RNNs double the output features and the first hidden-state dimension (num_layers * 2).

We all started somewhere, but yes, it may be wise to follow one of the RNN-based tutorials, e.g. the name-generation one.
Given that the C++ library follows the Python one so closely, the Python tutorials are applicable here.

Best regards

Thomas


I got a working code snippet, in case anyone else faces the same issue.

#include <torch/torch.h>

template <typename T>
void pretty_print(const std::string& info, T&& data) {
  std::cout << info << std::endl;
  std::cout << data << std::endl << std::endl;
}

int main(int /*argc*/, char* /*argv*/[]) {
  // Use GPU when present, CPU otherwise.
  torch::Device device(torch::kCPU);
  if (torch::cuda::is_available()) {
    device = torch::Device(torch::kCUDA);
    std::cout << "CUDA is available! Training on GPU." << std::endl;
  }

  const int64_t kSequenceLen = 3;
  const int64_t kInputDim = 1;
  const int64_t kHiddenDim = 5;
  const int64_t kNumLayers = 1;
  auto time_serie_detector = torch::nn::LSTM(torch::nn::LSTMOptions(kInputDim, kHiddenDim)
                                                 .dropout(0.2)  // only takes effect with more than one layer
                                                 .layers(kNumLayers)
                                                 .bidirectional(false));
  time_serie_detector->to(device);
  std::cout << time_serie_detector << std::endl;

  torch::Tensor input = torch::empty({kSequenceLen, kInputDim});
  // One state tensor for h and one for c, each num_layers x hidden.
  torch::Tensor state = torch::zeros({2, kNumLayers, kHiddenDim});
  auto input_acc = input.accessor<float, 2>();
  size_t count = 0;
  for (float i = 0.1f; i < 0.4f; i += 0.1f) {
    input_acc[count][0] = i;
    count++;
  }
  // Move to the selected device rather than unconditionally to CUDA.
  input = input.to(device);
  state = state.to(device);
  std::cout << "input = " << input << std::endl;
  time_serie_detector->zero_grad();

  // Reshape to seq_len x batch x feature and (h/c) x num_layers x batch x hidden.
  auto i_tmp = input.view({input.size(0), 1, -1});
  auto s_tmp = state.view({2, kNumLayers, 1, -1});

  pretty_print("input: ", i_tmp);
  pretty_print("state: ", s_tmp);

  auto rnn_output = time_serie_detector->forward(i_tmp, s_tmp);
  pretty_print("rnn_output/output: ", rnn_output.output);
  pretty_print("rnn_output/state: ", rnn_output.state);

  return 0;
}