Please find below exact steps and code to reproduce this issue. In my original code, I have transferred my model weights from a caffe2 network, so below python code is simulating this thing. First we will create our model file from this python code:
My execution environment:
Python torch version: 1.7.0+cu101
Libtorch version: 1.7.0 cuda 10.1
class TestNet(nn.Module):
def __init__(self):
super(TestNet, self).__init__()
self.conv1 = nn.Conv2d(2,10,3,padding=1)
def forward(self,x):
x = self.conv1(x)
return x
def get_arr():
arr = np.random.rand(10,3,3,2)
arr = np.transpose(arr,(0,3,1,2))
return arr
model = network.TestNet()
arr = get_arr()
model._modules["conv1"].weight.data = torch.from_numpy(arr.astype(np.float32))
print (arr.shape)
print (arr.strides)
print (arr.data.contiguous)
model.eval()
arr = np.zeros((1,2,256,256)).astype(np.float32)
inp_T = torch.from_numpy(arr)
traced_script_module = torch.jit.trace(model, inp_T)
traced_script_module.save("TestNet.pt")
We have created our model file from above code, now we will execute below C++ code:
torch::jit::script::Module model = torch::jit::load("TestNet.pt");
model.to(at::kCUDA);
model.eval();
torch::NoGradGuard no_grad;
torch::Tensor tensor_in;
float* in_data = new float[2 * 256 * 256];
tensor_in = torch::from_blob(in_data, { 1, 2, 256, 256 }, torch::kFloat);
tensor_in = tensor_in.to(at::kCUDA);
tensor_in.set_requires_grad(0);
cout << tensor_in.is_contiguous() << endl;
cout << tensor_in.strides() << endl;
cout << tensor_in.sizes() << endl;
std::vector<torch::jit::IValue> inputs;
inputs.push_back(tensor_in);
torch::Tensor pred_out = model.forward(inputs).toTensor();
cout << pred_out.is_contiguous() << endl;
cout << pred_out.strides() << endl;
cout << pred_out.sizes() << endl;
Output of python code:
(10, 2, 3, 3)
(144, 8, 48, 16)
False
Output of C++ code:
1
131072,65536,256,1
1,2,256,256
0
655360,1,2560,10
1,10,256,256
We can see from above C++ output that although our input tensor is contiguous, output tensor is not. Interestingly, I found that get_arr() function in python code is responsible for all this. For example, if we replace above get_arr() function with below one, we see that now our output tensor is also contiguous.
Replace above get_arr() with this:
def get_arr():
arr = np.random.rand(10,3,3,2)
arr = np.transpose(arr,(0,3,1,2))
arr = np.ascontiguousarray(arr)
return arr
Now. output of python code:
(10, 2, 3, 3)
(144, 72, 24, 8)
True
Output of C++ code:
1
131072,65536,256,1
1,2,256,256
1
655360,65536,256,1
1,10,256,256
Observe that our output tensor is contiguous now. How is this behaviour connected to our model weights? We can infer that if our model weights are not contiguous then output tensor is also not contiguous. But, how do we explain below behaviour then:
If get_arr() is like this:
def get_arr():
arr = np.random.rand(10,2,3,3)
arr = np.transpose(arr,(0,1,3,2))
return arr
Output of python code:
(10, 2, 3, 3)
(144, 72, 8, 24)
False
Output of C++ code:
1
131072,65536,256,1
1,2,256,256
1
655360,65536,256,1
1,10,256,256
So, we can see that in this case model weights are not contiguous but output tensor is contiguous. It would be great if someone can explain this weird behaviour.