I would like to define a custom layer that works a bit like MaxPooling, but differs in that it does not have a fixed kernel size. Let me explain with an example.
Given an input of shape (1, 7), I would like to perform MaxPooling, but not with a fixed window size: instead, the pooling should run over a custom set of windows, for example [0, 1, 2], [1, 2, 3], [2, 4], [3, 5], [4, 5, 6] (each list giving the input indices of one window).
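For concreteness, here is the computation I am after, sketched in plain NumPy (the example input values are mine, chosen just to illustrate):

```python
import numpy as np

# Each inner list holds the input indices of one pooling window.
windows = [[0, 1, 2], [1, 2, 3], [2, 4], [3, 5], [4, 5, 6]]

x = np.array([3., 1., 4., 1., 5., 9., 2.])  # example input of shape (7,)

# Desired output: the max over each window.
out = np.array([x[w].max() for w in windows])
print(out)  # [4. 4. 5. 9. 9.]
```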
It would be ideal if I could use max as an activation function. In that case, I would have a weight matrix that looks as follows:
[[1, 1, 1, 0, 0, 0, 0]
[0, 1, 1, 1, 0, 0, 0]
[0, 0, 1, 0, 1, 0, 0]
[0, 0, 0, 1, 0, 1, 0]
[0, 0, 0, 0, 1, 1, 1]]
This would result in 5 neurons that compute
max(input[0], input[1], input[2])
max(input[1], input[2], input[3])
max(input[2], input[4])
max(input[3], input[5])
max(input[4], input[5], input[6])
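To make the intended forward pass concrete, here is a minimal sketch in TensorFlow 2 of what the layer should compute (the window_max name and the -inf masking trick are my own illustration, using the binary matrix above as a boolean mask):

```python
import tensorflow as tf

# The binary matrix from above as a boolean mask:
# mask[i, j] is True iff input index j belongs to window i.
mask = tf.constant([
    [1, 1, 1, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 0, 1, 0, 1, 0, 0],
    [0, 0, 0, 1, 0, 1, 0],
    [0, 0, 0, 0, 1, 1, 1],
], dtype=tf.bool)

def window_max(x):
    """Map inputs of shape (batch, 7) to (batch, 5): the max over each window."""
    x = tf.expand_dims(x, axis=1)                        # (batch, 1, 7)
    neg_inf = tf.constant(float("-inf"), dtype=x.dtype)
    masked = tf.where(mask, x, neg_inf)                  # (batch, 5, 7); -inf outside each window
    return tf.reduce_max(masked, axis=-1)                # (batch, 5)
```

For x = tf.constant([[3., 1., 4., 1., 5., 9., 2.]]), this returns [[4., 4., 5., 9., 9.]], matching the five expressions above.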
However, since max is non-differentiable, I do not know how to implement such a custom layer with max as an activation function.
I would be extremely grateful if anyone could share some pointers regarding this.