Permuting the output is the right approach.
What do you mean by “not performing well”? Is the model bad in terms of speed or accuracy?
You could use the channels_last memory format as described here, which internally permutes the data to the channels last format. Note that the shape of the tensor would still indicate the standard contiguous format (channels first).
Note that changing the memory layout will not fix the accuracy issue, so you might want to fix this first e.g. by playing around with some hyperparameters.
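As a rough, self-contained sketch of what the conversion usually looks like (the conv layer and the shapes below are just placeholders, not your model):

import torch
import torch.nn as nn

# hypothetical layer and input, only to illustrate the memory format conversion
conv = nn.Conv2d(3, 8, kernel_size=3).to(memory_format=torch.channels_last)

x = torch.randn(2, 3, 24, 24)               # standard contiguous (channels-first) tensor
print(x.stride())                           # (1728, 576, 24, 1)

x = x.to(memory_format=torch.channels_last) # restrides the data to NHWC in memory
print(x.shape)                              # still torch.Size([2, 3, 24, 24])
print(x.stride())                           # (1728, 1, 72, 3)

out = conv(x)                               # output will also use channels_last

Note that only the strides change; the shape still reports [batch_size, channels, height, width].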
So I’m using the channels_last memory format in my custom DataLoader, but I get this error:
RuntimeError: Given groups=1, weight of size 64 1 3 6, expected input[2048, 10, 6, 1] to have 1 channels, but got 10 channels instead
so apparently it’s not working.
Right now I’m mimicking your code like this:
model = Network()
model = model.to(memory_format=torch.channels_last)
model = model.double()

criterion = nn.BCELoss()
optimizer = optim.Adam(params=model.parameters(), lr=0.01)
…
for epoch in range(epochs):  # loop over the dataset multiple times
    model.train()
    for j, data in enumerate(trainloader, 0):
        # Get the inputs; data is a list of [inputs, labels]
        inputs, labels = data
        print(inputs.stride())
        inputs = inputs.to(memory_format=torch.channels_last)
        print(inputs.stride())

        # Zero the parameter gradients
        optimizer.zero_grad()

        # Forward + backward + optimize
        outputs = model(inputs.double())
        loss = criterion(outputs, labels.double())
        loss.backward()
        optimizer.step()

        print("epoch\t{}\t\tbatch\t{}\nloss\t{}\n---".format(epoch, j, loss.item()))
and it’s still not working. I even tried to print the stride before and after the channels_last conversion, and this is what I get:
(60, 6, 1, 1)
(60, 6, 1, 1)
Error:
RuntimeError: Given groups=1, weight of size 64 1 3 6, expected input[2048, 10, 6, 1] to have 1 channels, but got 10 channels instead
As described before: your input has to be in the shape [batch_size, channels, height, width] before using to(memory_format=torch.channels_last) (have another look at my code snippet).
You should not manually permute the tensor to the channels-last format; the to() operation will handle it internally for you.
Since the conv layer has in_channels=1, your data should have the shape [2048, 1, 10, 6].
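A minimal sketch of that fix, assuming the shapes from your error message (the squeeze/unsqueeze calls are just one way to move the channel dimension into the right place):

import torch

# hypothetical input shaped like in the error message: [2048, 10, 6, 1]
inputs = torch.randn(2048, 10, 6, 1, dtype=torch.float64)

# move it to [batch_size, channels=1, height=10, width=6] first
inputs = inputs.squeeze(-1).unsqueeze(1)        # -> [2048, 1, 10, 6]

# only then let to() restride the tensor to channels last
inputs = inputs.to(memory_format=torch.channels_last)
print(inputs.shape)                             # torch.Size([2048, 1, 10, 6])
print(inputs.stride())                          # channels-last strides, e.g. (60, 1, 6, 1)

With the input in [2048, 1, 10, 6], it matches the conv layer’s in_channels=1 and the shape mismatch error should go away.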