I use @autocast() to decorate my modules' forward functions, and I'm wondering: during inference (when there are no weight updates), are the model's half-precision weights reused, or are they recomputed on every forward pass? (I.e., will float2half be called for each weight every time?)
You could wrap the entire inference evaluation in autocast, which would then use the internal caching to avoid casting the parameters on every forward pass.
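A minimal sketch of this pattern (the model and input tensors are placeholders; CUDA users would typically use `device_type="cuda"` with `torch.float16`, here CPU with `bfloat16` is shown so it runs anywhere):

```python
import torch

model = torch.nn.Linear(16, 4).eval()
batches = [torch.randn(8, 16) for _ in range(3)]

# One autocast region around the whole evaluation loop: the internal
# weight-cast cache stays alive across all forward calls inside it,
# so each parameter is cast to the lower precision only once.
with torch.no_grad(), torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    outputs = [model(batch) for batch in batches]

print(outputs[0].dtype)  # autocast-eligible ops run in bfloat16
```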
So, if I use @autocast() to decorate a forward function (without wrapping the entire model), float2half will be called every time that forward is invoked?
Yes, I think exiting the outermost autocast context clears the cache.
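To illustrate the decorator case (the module is a hypothetical example; again using CPU with `bfloat16` so it runs without a GPU): when only `forward` is decorated, each call opens and then exits its own outermost autocast region, so the weight-cast cache does not survive between calls and the casts are redone.

```python
import torch

class Net(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(16, 4)

    # Each call to forward enters a fresh outermost autocast region;
    # on exit the internal cast cache is cleared, so the next call
    # casts the weights again.
    @torch.autocast(device_type="cpu", dtype=torch.bfloat16)
    def forward(self, x):
        return self.fc(x)

net = Net().eval()
with torch.no_grad():
    y1 = net(torch.randn(2, 16))  # weights cast here...
    y2 = net(torch.randn(2, 16))  # ...and cast again here

print(y1.dtype)
```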