BCELoss + Automatic Mixed Precision

SofiaCP · May 5, 2022, 10:43am

Dear all,

I wanted to use automatic mixed precision to train my model.

The output of my model is a weighted average of the outputs of several components, Y = w1*y1 + w2*y2 + ... + wk*yk,

where y1, ..., yk are the outputs of the components after applying the sigmoid function (like a weighted-average ensemble).

For this reason, I cannot use BCEWithLogitsLoss: applying a sigmoid to a weighted sum of the components' logits is not equivalent to taking the weighted average of their sigmoid outputs. That leaves BCELoss, which does not work under autocast.

Any suggestions?
Many thanks!

ptrblck · May 5, 2022, 1:31pm

From the docs:

The backward passes of torch.nn.functional.binary_cross_entropy() (and torch.nn.BCELoss, which wraps it) can produce gradients that aren’t representable in float16. In autocast-enabled regions, the forward input may be float16, which means the backward gradient must be representable in float16 (autocasting float16 forward inputs to float32 doesn’t help, because that cast must be reversed in backward). Therefore, binary_cross_entropy and BCELoss raise an error in autocast-enabled regions.

Assuming you can guarantee the numerical stability, remove the loss calculation (and maybe the weighting operations beforehand) from the autocast region to use float32 for them.
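
A minimal sketch of that suggestion, not a definitive implementation. The ensemble here (three `nn.Linear` components, softmax-normalized mixing weights, random data) is made up for illustration and is not from the original post. The only part that matters is the nested `torch.autocast(..., enabled=False)` block, where the sigmoid, the weighting, and `BCELoss` all run in float32:

```python
import torch
import torch.nn as nn

# Fall back to CPU autocast (bfloat16) so the sketch also runs without a GPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
amp_dtype = torch.float16 if device == "cuda" else torch.bfloat16

torch.manual_seed(0)

# Hypothetical ensemble: k small components plus learnable mixing weights.
k, in_features = 3, 8
components = nn.ModuleList([nn.Linear(in_features, 1) for _ in range(k)]).to(device)
mix_logits = nn.Parameter(torch.zeros(k, device=device))
criterion = nn.BCELoss()

# Random stand-in data.
x = torch.randn(16, in_features, device=device)
target = torch.randint(0, 2, (16, 1), device=device).float()

with torch.autocast(device_type=device, dtype=amp_dtype):
    # The component forward passes run in reduced precision.
    logits = torch.cat([m(x) for m in components], dim=1)  # shape (16, k)

    # Leave autocast for the sigmoid, the weighting, and the loss.
    with torch.autocast(device_type=device, enabled=False):
        probs = torch.sigmoid(logits.float())          # y1..yk in float32
        w = torch.softmax(mix_logits.float(), dim=0)   # assumed: w1..wk sum to 1
        y = (probs * w).sum(dim=1, keepdim=True)
        # Guard against rounding pushing y marginally outside [0, 1].
        y = y.clamp(0.0, 1.0)
        loss = criterion(y, target)

loss.backward()
```

On CUDA with float16 you would normally still wrap the backward pass and optimizer step with a gradient scaler (`scaler.scale(loss).backward()`, `scaler.step(optimizer)`, `scaler.update()`); that part is omitted here to keep the sketch short.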

SofiaCP · May 5, 2022, 1:43pm

Sounds good. Will give it a try. Thank you!