Hello, I want to add a shallow classifier (SVM, RF, etc.) as the last layer of my NN and backpropagate through it. My initial thought is to use sklearn’s `predict_proba`, whose output can be viewed as the output of a softmax layer.

Let’s say I have the output tensor from forwarding the input data through the base network, and I feed it to sklearn. After that, I change the output tensor’s values to match the sklearn predictions. Will this break the computation graph? I mean, I can’t pass the new tensor to a BCE loss and backpropagate, right?
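To make this concrete, here is a minimal sketch of the first idea. It assumes PyTorch, and `net`, `x`, `y`, and the choice of `RandomForestClassifier` are hypothetical stand-ins rather than my real setup. As far as I can tell, the tensor built from the sklearn predictions has no link back to the network, so calling `backward()` on the loss fails.

```python
import torch
import torch.nn as nn
from sklearn.ensemble import RandomForestClassifier

torch.manual_seed(0)

# Hypothetical stand-ins: a tiny feature extractor and random binary labels.
net = nn.Linear(4, 3)
x = torch.randn(16, 4)
y = torch.randint(0, 2, (16,))
y[0], y[1] = 0, 1  # make sure both classes are present

features = net(x)  # this tensor is part of the autograd graph

# sklearn only sees a NumPy copy, so autograd cannot follow this step.
clf = RandomForestClassifier(n_estimators=10, random_state=0)
clf.fit(features.detach().numpy(), y.numpy())
proba = clf.predict_proba(features.detach().numpy())[:, 1]

# A fresh tensor built from the predictions carries no graph history.
preds = torch.from_numpy(proba).float()
loss = nn.functional.binary_cross_entropy(preds, y.float())

try:
    loss.backward()
    backward_failed = False
except RuntimeError:
    # The loss does not require grad, so there is nothing to backpropagate.
    backward_failed = True

print(preds.requires_grad, backward_failed, net.weight.grad)
```

So in this sketch the network’s weights never receive a gradient, which is the problem I am worried about.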

Maybe instead of replacing the network’s output tensor with sklearn’s predictions, I could multiply the network’s tensor elementwise by an appropriate vector so that it matches sklearn’s predictions. Any ideas?
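Here is a rough sketch of this second idea, again assuming PyTorch. The vector `sk_proba` is a hard-coded stand-in for what `predict_proba` would return, and `logits`, `target`, and `scale` are hypothetical names. The scaling factor is detached, so autograd treats it as a constant: gradients do reach the network, but they only describe the network’s own output rescaled, not how the sklearn model would react to a change in its input.

```python
import torch

torch.manual_seed(0)

# Hypothetical stand-ins: network output probabilities and a fixed vector
# playing the role of sklearn's predict_proba output.
logits = torch.randn(5, requires_grad=True)
net_out = torch.sigmoid(logits)
sk_proba = torch.tensor([0.9, 0.2, 0.6, 0.1, 0.7])
target = torch.tensor([1.0, 0.0, 1.0, 0.0, 1.0])

# Elementwise factor that maps net_out onto sk_proba.
# It is detached, so autograd treats it as a constant.
scale = (sk_proba / net_out).detach()
matched = net_out * scale  # same values as sk_proba, still in the graph

loss = torch.nn.functional.binary_cross_entropy(matched, target)
loss.backward()

print(torch.allclose(matched, sk_proba), logits.grad is not None)
```

I am not sure whether gradients obtained this way are meaningful for training, which is really what I am asking.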

Thanks in advance!