How to use AMP with BCE that does **not** directly operate on `sigmoid` results

In that case you might want to disable autocast for these operations and cast their inputs to `float32`. Note, however, that even without AMP, `nn.BCELoss` is less numerically stable than `nn.BCEWithLogitsLoss`.
See e.g. this recent topic.
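
Disabling autocast just for the loss could look roughly like the sketch below. The model, the tensor shapes, and the `clamp` standing in for "some operation between `sigmoid` and the loss" are all made up for illustration. The dtype choice (`float16` on GPU, `bfloat16` on CPU) is likewise an assumption so the snippet also runs without a GPU.

```python
import torch
import torch.nn as nn

# Pick a device so the sketch runs with or without a GPU (assumption for illustration).
device_type = "cuda" if torch.cuda.is_available() else "cpu"
amp_dtype = torch.float16 if device_type == "cuda" else torch.bfloat16

model = nn.Linear(8, 1).to(device_type)  # hypothetical toy model
criterion = nn.BCELoss()

x = torch.randn(4, 8, device=device_type)
target = torch.rand(4, 1, device=device_type)

with torch.autocast(device_type=device_type, dtype=amp_dtype):
    # The forward pass runs under autocast, so probs may be in reduced precision.
    probs = torch.sigmoid(model(x))
    # Placeholder for whatever happens between sigmoid and the loss.
    probs = probs.clamp(0.0, 1.0)

    # Locally disable autocast and compute the loss in float32.
    with torch.autocast(device_type=device_type, enabled=False):
        loss = criterion(probs.float(), target.float())

loss.backward()
```

The explicit `.float()` calls matter: disabling autocast only stops further automatic casting, it does not convert tensors that are already in reduced precision. In a real training loop on GPU with `float16` you would usually also wrap the backward pass with a gradient scaler, which is left out here to keep the sketch short.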