The placeholder trick for memory efficiency

I found an AdamW implementation by LiyuanLucasLiu.

Comparing that implementation with Adam, there is one thing I wonder about…

Why does the AdamW implementation use p_data_fp32 = and later on

Is this a placeholder trick to make the optimizer memory efficient?

Would this improve the original Adam implementation, or is it not needed?