I am using custom audio data for model training. The time-series data ranges between ~-1.5 and ~+1.5 with the mean around 0. The model I am using consists of Conv1d layers with ReLU activation. After the ReLU, I am left with only the positive side of the signal. Is this OK? Or should the input signal be normalized so that it consists of only positive values? I feel like I am losing half of the information now.
The Rectified Linear Activation function (aka ReLU), like most activation functions, clamps any value below a certain threshold (zero, in ReLU's case) to zero as data passes through the model. Think of it as a way for a model to make clear-cut decisions based on dynamic variables.
However, it is probably not ideal to place an activation before a learnable layer, as you will lose information that the model may need.
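To make the clamping behavior concrete, here is a minimal sketch in PyTorch (the values are just illustrative):

```python
import torch

# ReLU zeroes everything below 0 and passes positives through unchanged
x = torch.tensor([-1.5, -0.2, 0.0, 0.7, 1.4])
print(torch.relu(x))  # negatives become 0.0; 0.7 and 1.4 survive
```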
So does that mean that the input data should be shifted to include only positive values or not? I understand what ReLU does (thanks for the explanation, anyways!), but I am not sure if this means that the input data should be positive-only by default. Previously, I was working with images (and ReLU), where this question was not relevant as all pixel values are positive anyways.
Whether you normalize the values between -1 and 1 or between 0 and 1 makes no difference, because of the nature of a convolution operation (plus a possible bias).
For a really simple example, suppose we have the sequence [-1.0, -1.0, 1.0] and the Conv1d kernel at output channel 0 is [-1.0, -1.0, -1.0]. Multiply those elementwise and sum, and you get 1.0. Now suppose the kernel at channel 1 is [1.0, 1.0, 1.0]; the same operation gives -1.0. So before any activation, an appropriately trained kernel will decide whether that information is important to pass along or not, for that given channel, while another channel, trained to identify a different feature, might pass it along instead. And even if your sequence were between 0 and 1, the kernel can have negative elements and push the output outside of that range.
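You can verify that worked example directly with PyTorch's functional conv (shapes are batch × channels × length; no padding, no bias):

```python
import torch
import torch.nn.functional as F

# one sample, one input channel, length 3
x = torch.tensor([[[-1.0, -1.0, 1.0]]])

# two output channels, each with a single-input-channel kernel of size 3
kernels = torch.tensor([[[-1.0, -1.0, -1.0]],   # channel 0 from the example
                        [[ 1.0,  1.0,  1.0]]])  # channel 1 from the example

out = F.conv1d(x, kernels)
print(out)  # channel 0 gives 1.0, channel 1 gives -1.0
```

Note that a ReLU after this layer would keep channel 0's output and zero out channel 1's, which is exactly the "per-channel decision" described above.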
So, as long as your activation comes after a learnable layer, either initial normalization should be fine, as long as you stay consistent.
I see, thanks for the insight! I am using activation as a part of a convblock sequence: Conv1d, BatchNorm1d, ReLU, which is very standard I believe, so the order is fine. Do you maybe know if there’s a list of what filters are used in Conv1d? Or are they just random? Could not find anything online.
The filters are a learnable component: they start out randomly initialized and are updated during training. That's why it doesn't matter that the numbers coming out of the activation layer are all positive; the next layer's filters can make them negative again, if necessary.
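You can see this for yourself by inspecting a Conv1d module: its weights are randomly initialized gradient-tracked parameters, and a training step changes them (the layer sizes and the 0.1 step size below are arbitrary, just for illustration):

```python
import torch
import torch.nn as nn

conv = nn.Conv1d(in_channels=1, out_channels=4, kernel_size=3)
print(conv.weight.shape)          # torch.Size([4, 1, 3]): one kernel per output channel
print(conv.weight.requires_grad)  # True: the filters are trained, not fixed

# a single manual gradient step changes the filter values
before = conv.weight.detach().clone()
loss = conv(torch.randn(1, 1, 10)).sum()
loss.backward()
with torch.no_grad():
    conv.weight -= 0.1 * conv.weight.grad
assert not torch.equal(before, conv.weight)
```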