Dear all,

I wanted to use automatic mixed precision to train my model.

The output of my model is a weighted average of the outputs of several components, `Y = w1*y1 + w2*y2 + ... + wk*yk`, where `y1, ..., yk` are the outputs of the components **after applying the sigmoid function** (like a weighted-average ensemble).

For this reason, I cannot use `BCEWithLogitsLoss`, since taking the sigmoid of a weighted average of the raw logits is not equivalent to the weighted average of the sigmoid outputs.
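For example, a quick numerical check of the mismatch (the logits and weights are made up):

```python
import torch

# Made-up logits and weights for two components.
logits = torch.tensor([2.0, -1.0])
w = torch.tensor([0.7, 0.3])

# What my model computes: weighted average of the sigmoid outputs.
avg_of_sigmoids = (w * torch.sigmoid(logits)).sum()  # ~0.697

# What BCEWithLogitsLoss would assume: sigmoid of the combined logit.
sigmoid_of_avg = torch.sigmoid((w * logits).sum())   # ~0.750
```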

Any suggestions?

Many thanks!

From the docs:

The backward passes of `torch.nn.functional.binary_cross_entropy()` (and `torch.nn.BCELoss`, which wraps it) can produce gradients that aren't representable in `float16`. In autocast-enabled regions, the forward input may be `float16`, which means the backward gradient must be representable in `float16` (autocasting `float16` forward inputs to `float32` doesn't help, because that cast must be reversed in backward). Therefore, `binary_cross_entropy` and `BCELoss` raise an error in autocast-enabled regions.

Assuming you can guarantee the numerical stability, remove the loss calculation (and maybe the weighting operations beforehand) from the `autocast` region to use `float32` for them.
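A minimal sketch of that pattern, assuming a CUDA device; the tensor names, shapes, and weights are placeholders, not your actual model:

```python
import torch
import torch.nn.functional as F

# Hypothetical stand-ins for the k component logits, ensemble weights,
# and binary targets; shapes and values are illustrative only.
logits = [torch.randn(8, 1, device="cuda", requires_grad=True) for _ in range(3)]
weights = torch.tensor([0.5, 0.3, 0.2], device="cuda")
targets = torch.rand(8, 1, device="cuda")

with torch.autocast(device_type="cuda", dtype=torch.float16):
    # The forward pass of each component runs in mixed precision.
    probs = [torch.sigmoid(l) for l in logits]

# Outside the autocast region: cast to float32 before the weighting and the
# loss, so binary_cross_entropy only ever sees full-precision inputs.
Y = sum(w * p.float() for w, p in zip(weights, probs))
loss = F.binary_cross_entropy(Y, targets)
loss.backward()
```

Equivalently, if the weighting and loss have to stay inside a larger `autocast` context, you can wrap just those ops in `with torch.autocast(device_type="cuda", enabled=False):` and cast the inputs to `float32` there.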


Sounds good. Will give it a try. Thank you!