I don't think you can even customize your backprop here, because the tensors were already detached from the graph (i.e. the history is already lost, nothing remains). Backprop needs the graph to calculate the gradients. That's why I asked why you call detach in the middle of the forward function.
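To illustrate what I mean, here is a minimal sketch. The tensors `w` and `x` and the toy computation are made up for the example and are not your model. It only shows that once `detach()` is called, autograd has nothing left to walk back through.

```python
import torch

# Hypothetical toy parameter and input, for illustration only.
w = torch.tensor([2.0], requires_grad=True)
x = torch.tensor([3.0])

# Case 1: graph intact, so backward() can reach w.
y = (w * x).sum()
y.backward()
grad_intact = w.grad.clone()  # d(w*x)/dw = x

# Case 2: detach() in the middle cuts the graph.
w.grad = None
h = (w * x).detach()      # h has no grad_fn and does not require grad
loss = (h * 2).sum()

backward_failed = False
try:
    loss.backward()       # nothing in loss requires grad, so this raises
except RuntimeError as err:
    backward_failed = True
    print("backward failed:", err)

print(h.requires_grad, h.grad_fn, w.grad)
```

In the second case `w.grad` stays `None`, because no path from `loss` back to `w` exists anymore.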
I think the XGBClassifier you use here does not operate on tensors, yes?