I am trying to write a custom CNN layer that applies softmax to each convolution operation. So each pixel in the output image should have a value in [0, 1], computed as the sum over the convolved pixels. An example of TensorFlow implementation can be seen here.
Ideally, this should be trained with binary cross-entropy loss. I tried the code below, but it does not train.
class TransitionModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.weight = nn.Parameter(torch.Tensor(1, 1, 3, 3))
        self.softmax = nn.Softmax(dim=1)

    def forward(self, s, test=False):
        return F.conv2d(s, self.softmax(self.weight), padding=1)
model = TransitionModel()
pred_s_p = model(s)
pred_s_p = pred_s_p.squeeze()
loss = F.binary_cross_entropy(pred_s_p, s_p)
s.shape = [1, 1, 32, 32]
pred_s_p.shape = [32, 32]
It throws the following error:
RuntimeError: reduce failed to synchronize: cudaErrorAssert: device-side assert triggered
Could you check whether your code runs on the CPU?
F.binary_cross_entropy might throw an error, as it expects probabilities as the model output, while you are passing the raw output as
Try to use
The model and data are on GPU.
F.binary_cross_entropy_with_logits works, but the loss does not change. Since I am applying softmax, the values in the output should be probabilities, which is why I thought I should use
You are applying the softmax on the weights, not the output.
Depending on the distribution of your input, the output will not necessarily contain probabilities, which would raise an error such as:
RuntimeError: Assertion `x >= 0. && x <= 1.' failed. input value should be between 0~1, but got -1.429985
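To make the point above concrete, here is a minimal sketch of what the softmax on the weights does in the posted layer. The random inputs, the seed, and the names `w_dim1`, `w_flat`, `out_dim1` and `out_flat` are assumptions for illustration only. Since the weight has shape [1, 1, 3, 3], `dim=1` has size 1, so the softmax turns every kernel entry into 1 and passes no gradient back to the weight, which would also be consistent with a loss that does not change. Normalizing over the 9 kernel entries instead is just one possible alternative, not something the thread confirms as the intended layer.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
weight = torch.randn(1, 1, 3, 3, requires_grad=True)
s = torch.rand(1, 1, 32, 32)  # assumed input with values in [0, 1)

# Softmax over dim=1 (size 1): every kernel entry becomes exactly 1.
w_dim1 = nn.Softmax(dim=1)(weight)
out_dim1 = F.conv2d(s, w_dim1, padding=1)  # plain 3x3 neighborhood sums

# The gradient reaching the weight through this softmax is zero.
out_dim1.sum().backward()
grad_is_zero = torch.allclose(weight.grad, torch.zeros_like(weight.grad))

# Alternative (assumption): normalize over the 9 kernel entries instead,
# so each output pixel is a convex combination of input pixels.
w_flat = F.softmax(weight.detach().view(1, -1), dim=1).view_as(weight)
out_flat = F.conv2d(s, w_flat, padding=1)

print(w_dim1.flatten())                            # all ones
print(out_dim1.max().item())                       # larger than 1
print(out_flat.min().item(), out_flat.max().item())  # stays within [0, 1]
print(grad_is_zero)
```

With inputs in [0, 1] the second variant keeps the output in [0, 1], so `F.binary_cross_entropy` would accept it; for inputs outside that range this no longer holds.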
I see, any idea how I can apply softmax to the output of each conv operation?
Thanks for quick responses btw.
You could just call it directly on the output of
return self.softmax(F.conv2d(s, self.weight, padding=1))
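As a quick check of which elements that softmax actually normalizes over, here is a small sketch under the shapes from this thread (one output channel, 32x32 image). The random tensors and the names `over_channels` and `over_pixels` are made up for illustration. With `dim=1` the softmax runs over the channel dimension, and with a single channel every output value becomes 1. A softmax over all pixels would need the spatial dimensions flattened first.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
s = torch.rand(1, 1, 32, 32)
weight = torch.randn(1, 1, 3, 3)
out = F.conv2d(s, weight, padding=1)  # shape [1, 1, 32, 32]

# dim=1 is the channel dimension; with one channel each value maps to 1.
over_channels = nn.Softmax(dim=1)(out)

# Softmax over all 32*32 pixels: flatten, normalize, restore the shape.
over_pixels = F.softmax(out.flatten(start_dim=1), dim=1).view_as(out)

print(over_channels.unique())    # tensor([1.])
print(over_pixels.sum().item())  # approximately 1.0
```

Neither variant normalizes per convolution window, which is what the follow-up question is about.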
Doesn’t this apply the softmax over all the pixels in the output image? I need to apply it to each conv operation. Let’s say we have a kernel of size [3, 3] and an image of size [10, 10].
The layer I want should do:
softmax(image[0:3, 0:3] * kernel), softmax(image[0:3, 1:4] * kernel) ...
softmax(image[1:4, 0:3] * kernel), softmax(image[1:4, 1:4] * kernel) ...
Does this make sense?
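One possible reading of the layer described above, sketched with `F.unfold`: extract every 3x3 window, multiply it elementwise with the kernel, and take a softmax over the 9 products of each window. This is only an interpretation of the pseudocode in the question, not a confirmed solution; the random tensors and the names `patches`, `prod` and `per_window` are hypothetical.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
image = torch.rand(1, 1, 10, 10)
kernel = torch.randn(3, 3)

# Each column of `patches` is one flattened 3x3 window.
# Without padding there are 8 * 8 = 64 windows: shape [1, 9, 64].
patches = F.unfold(image, kernel_size=3)

# Elementwise product of every window with the kernel (broadcast over windows).
prod = patches * kernel.reshape(1, 9, 1)

# Softmax over the 9 entries of each window, independently per window.
per_window = F.softmax(prod, dim=1)

print(per_window.shape)            # torch.Size([1, 9, 64])
print(per_window.sum(dim=1)[0, :3])  # each window sums to 1
```

This yields 9 values per window rather than a single pixel. How those values should be reduced to one output pixel is not specified in the question, so that step is left open here.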