Loss averaging in VAE example

Are there any (theoretical) reasons for not taking the batch average loss in the VAE example?
Right now neither the KL divergence nor the BCE is averaged over the batch.

I don't think there are strong theoretical reasons. Joost (the original author of that code) was porting some code over exactly.
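To make the point concrete, here is a minimal sketch in plain Python (not the actual example code) of why summing versus averaging should not matter much in theory. It assumes the usual closed-form Gaussian KL term and an element-wise BCE; the toy values and the helper names `bce_sum` and `kld_sum` are made up for illustration. Dividing the whole loss by the batch size only rescales it by a constant, which behaves like a change in learning rate. The one thing to watch is averaging the two terms differently, since that changes their relative weight.

```python
import math

def bce_sum(recon, target):
    """Binary cross-entropy summed over every element of every sample."""
    total = 0.0
    for r_row, t_row in zip(recon, target):
        for r, t in zip(r_row, t_row):
            total -= t * math.log(r) + (1 - t) * math.log(1 - r)
    return total

def kld_sum(mu, logvar):
    """KL(N(mu, sigma^2) || N(0, 1)) summed over latent dims and samples."""
    total = 0.0
    for m_row, lv_row in zip(mu, logvar):
        for m, lv in zip(m_row, lv_row):
            total += -0.5 * (1 + lv - m * m - math.exp(lv))
    return total

# Hypothetical toy batch: 2 samples, 3 "pixels", 2 latent dimensions.
recon = [[0.9, 0.2, 0.7], [0.4, 0.6, 0.1]]
target = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
mu = [[0.5, -0.3], [0.1, 0.8]]
logvar = [[-0.2, 0.1], [0.3, -0.5]]

batch_size = len(recon)
n_features = len(recon[0])

bce = bce_sum(recon, target)
kld = kld_sum(mu, logvar)

# Summed loss (no averaging) and the same loss averaged over the batch.
summed_loss = bce + kld
averaged_loss = summed_loss / batch_size

# Inconsistent averaging: BCE divided by every element, KL only by batch size.
# This shrinks the reconstruction term relative to the KL term.
mismatched_loss = bce / (batch_size * n_features) + kld / batch_size

print("summed:    ", summed_loss)
print("averaged:  ", averaged_loss)
print("mismatched:", mismatched_loss)
```

The summed and averaged losses differ only by the factor `batch_size`, so their gradients point in the same direction. The mismatched version is a different objective, because the reconstruction term is down-weighted by the number of features.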

Alright, I was just wondering, thanks!