I cannot find a way to put the model on the CUDA device and keep it there. I cannot send a tensor that is already on CUDA through the model without getting the "found at least two devices, cpu and cuda" error. The only thing that works is starting with both the tensor and the model on the CPU and moving them to CUDA at call time. But I cannot keep moving tensors that are already on CUDA back to the CPU just to put them on CUDA again; there is no point in even using CUDA at that point, right?
The full, reproducible example is below, but the lines in question are quite simple, as shown here.
I have a tensor on CUDA and I want to send it through the model. This causes an error:
auto the_tensor = torch::rand({42, 427}).to(device);
std::cout << net.forward(the_tensor).to(device);
terminate called after throwing an instance of 'c10::Error'
what(): Expected all tensors to be on the same device, but found at least two devices, cpu and
cuda:0! (when checking argument for argument mat1 in method wrapper_addmm)
If I do NOT put the tensor on CUDA first, I can run the tensor and the model on CUDA like so:
auto the_tensor = torch::rand({42, 427});
std::cout << net.forward(the_tensor).to(device);
I can also send the tensor back to the CPU, and this also does NOT create an error. But I have a large script with a lot of tensors that will already be on the CUDA device, and I DO NOT want to be sending tensors from CUDA back to the CPU and then back to the CUDA device. This is why I call it a bug. How do I put the model on the CUDA device and keep it there, other than tacking .to(device) onto the model's output every time it is called with net.forward(tensor)?
auto the_tensor = torch::rand({42, 427}).to(device);
std::cout << net.forward(the_tensor.to(torch::kCPU)).to(device);
I have tried permanently putting the model on the device, but nothing I try works:
net.to(device);
net->to(device);
Critic_Net().to(device);
I've tried many variations like the ones above to put the model on the CUDA device and keep it there, but nothing works except moving things at call time with net.forward(the_tensor).to(device);
The full, reproducible example:
#include <torch/torch.h>

using namespace torch::indexing;

torch::Device device(torch::kCUDA);

struct Critic_Net : torch::nn::Module {
    torch::Tensor next_state_batch__sampled_action;

public:
    Critic_Net() {
        lin1 = torch::nn::Linear(427, 42);
        lin2 = torch::nn::Linear(42, 286);
        lin3 = torch::nn::Linear(286, 1);
    }
    torch::Tensor forward(torch::Tensor next_state_batch__sampled_action) {
        auto h = next_state_batch__sampled_action;
        h = torch::relu(lin1->forward(h));
        h = torch::tanh(lin2->forward(h));
        h = lin3->forward(h);
        return torch::nan_to_num(h);
    }
    torch::nn::Linear lin1{nullptr}, lin2{nullptr}, lin3{nullptr};
};

auto net = Critic_Net();

int main() {
    net.to(device);
    auto the_tensor = torch::rand({42, 427}).to(device);
    std::cout << net.forward(the_tensor).to(device);
}
Versions
I’m on Ubuntu 22.04
My CUDA version is 11.7
Using Libtorch 1.12.1+cu116