Per-time-sample binary classification / multi-target regression

Hi!
First of all, thank you to everyone who posts and helps on this forum! I am new to deep learning and PyTorch, and this forum has been one of the best guides during my learning period.

I am trying to develop a network that predicts, for each time sample, the presence or absence of the onset of my positive class (the problem is similar to segmentation, but in only one dimension). For examples with label 1, the target contains an onset; for examples with label 0, the target is a flat line (all zeros).
My inputs are spectrograms. I use a sampler in the DataLoader to balance the classes. I have built a CNN+RNN network where output_size = number_of_time_samples (i.e. the predictions have shape batch size x time samples). The activation function is ReLU. I have tried different loss functions, such as Dice (treating the problem as segmentation) and MSE (treating it as regression). In both cases, the network always tries to find a peak, even in those cases where it shouldn't.
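To make the setup concrete, here is a minimal sketch of it. The layer sizes, the names `OnsetNet` and `soft_dice_loss`, the tensor shapes, and the random data are illustrative assumptions, not the actual code; only the overall structure (class-balancing sampler, CNN+RNN, ReLU output of shape batch size x time samples, MSE or Dice loss) follows the description above.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset, WeightedRandomSampler

# Hypothetical sizes, chosen only for illustration.
N_FREQ, N_TIME, N_ITEMS = 64, 100, 32


class OnsetNet(nn.Module):
    """CNN + RNN that emits one non-negative value per time sample (assumed layout)."""

    def __init__(self, n_freq=N_FREQ, hidden=32):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=(2, 1)),  # pool frequency only, keep every time step
        )
        self.rnn = nn.GRU(8 * (n_freq // 2), hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, 1)
        self.act = nn.ReLU()

    def forward(self, spec):  # spec: (batch, 1, n_freq, n_time)
        feats = self.conv(spec)  # (batch, 8, n_freq // 2, n_time)
        b, c, f, t = feats.shape
        feats = feats.permute(0, 3, 1, 2).reshape(b, t, c * f)  # one feature vector per time step
        out, _ = self.rnn(feats)  # (batch, n_time, 2 * hidden)
        return self.act(self.head(out)).squeeze(-1)  # (batch, n_time)


def soft_dice_loss(pred, target, eps=1.0):
    """Soft Dice loss computed along the time axis, averaged over the batch."""
    inter = (pred * target).sum(dim=1)
    denom = pred.sum(dim=1) + target.sum(dim=1)
    return (1 - (2 * inter + eps) / (denom + eps)).mean()


# Random stand-in data: label 1 = target has an onset, label 0 = flat line.
torch.manual_seed(0)
specs = torch.randn(N_ITEMS, 1, N_FREQ, N_TIME)
labels = (torch.arange(N_ITEMS) % 4 == 0).long()
targets = torch.zeros(N_ITEMS, N_TIME)
pos_idx = labels.nonzero(as_tuple=True)[0]
targets[pos_idx, torch.randint(0, N_TIME, (len(pos_idx),))] = 1.0

# Sampler that draws both classes with roughly equal probability.
class_counts = torch.bincount(labels, minlength=2).float()
sample_weights = (1.0 / class_counts)[labels]
sampler = WeightedRandomSampler(sample_weights, num_samples=N_ITEMS, replacement=True)
loader = DataLoader(TensorDataset(specs, targets), batch_size=8, sampler=sampler)

# One training step with MSE; soft_dice_loss(pred, y) can be swapped in.
model = OnsetNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
x, y = next(iter(loader))
pred = model(x)
loss = nn.functional.mse_loss(pred, y)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Here `pred` and `y` both have shape batch size x time samples, so every time sample is its own regression target.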
So, is there a way to train the network for this kind of multi-target regression problem, so that it outputs a flat line when there is no onset?

Thank you!