I was trying to learn the RNN interfaces that libtorch provides. Sadly, a simple program like the following throws a runtime error:
#include <torch/torch.h>

int main(int /*argc*/, char* /*argv*/[]) {
  // Use GPU when present, CPU otherwise.
  torch::Device device(torch::kCPU);
  if (torch::cuda::is_available()) {
    device = torch::Device(torch::kCUDA);
    std::cout << "CUDA is available! Training on GPU." << std::endl;
  }

  torch::nn::Sequential time_serie_detector(torch::nn::RNN(
      torch::nn::RNNOptions(1, 10).dropout(0.2).layers(2).tanh()));
  time_serie_detector->to(device);
  std::cout << time_serie_detector << std::endl;

  auto x = torch::ones(1).toBackend(c10::Backend::CUDA);
  auto a = torch::ones(10).toBackend(c10::Backend::CUDA);
  std::cout << "x = " << x << std::endl;
  std::cout << "a = " << a << std::endl;
  time_serie_detector->forward(x, a);
  time_serie_detector->zero_grad();
  return 0;
}
The error is as follows:
terminate called after throwing an instance of 'c10::IndexError'
what(): Dimension out of range (expected to be in range of [-1, 0], but got 2) (maybe_wrap_dim at ../../c10/core/WrapDimMinimal.h:20)
I am pretty new to libtorch and pytorch in general. I guess I made some terribly stupid mistake here… It would be nice if someone could help me out.
By the way, I was trying to find C++ examples using torch::nn::RNN online but failed to find any. I would really appreciate some existing examples.
I think you are not passing in the right shapes: RNNs expect a seq x batch x feature input and produce an output of that same rank (with the number of features potentially different). The initial hidden state has shape (num_layers * num_directions) x batch x hidden_size (or, for an LSTM, a tuple of two such tensors). Your x and a are both 1-D, which is what triggers the "Dimension out of range" error.
Bidirectional RNNs double the number of output features (and use num_directions = 2 in the hidden state's first dimension).
We all started somewhere, but yes, it may be wise to follow one of the RNN-based tutorials, e.g. the name-generation one.
Given that the C++ library follows the Python one so closely, the Python tutorials are largely applicable here.