Modifying the data type of model parameters or buffers

iammano · September 4, 2023, 6:15am

Hi,
I’m trying to pass a model to PyTorch DDP for distributed training. My script throws an "invalid scalar type" error. When I looked into the buffers of the model (GPT-Neo 1.3B), I found that they are boolean, so I looped through the buffers and changed their data type to int8. Now DDP works fine.
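For reference, a minimal sketch of the kind of buffer-casting loop described above. It runs on a toy module, not on GPT-Neo itself; `ToyAttention` and `cast_bool_buffers` are hypothetical names used only for illustration, and the actual script may differ.

```python
from typing import List

import torch
import torch.nn as nn


class ToyAttention(nn.Module):
    """Hypothetical stand-in for a block that stores a boolean causal mask."""

    def __init__(self, size: int = 4):
        super().__init__()
        self.proj = nn.Linear(size, size)
        # Lower-triangular boolean mask registered as a buffer.
        self.register_buffer("bias", torch.tril(torch.ones(size, size, dtype=torch.bool)))


def cast_bool_buffers(model: nn.Module, dtype: torch.dtype = torch.int8) -> List[str]:
    """Cast every boolean buffer to `dtype` and return the affected buffer names."""
    changed = []
    for mod_name, module in model.named_modules():
        for buf_name, buf in list(module.named_buffers(recurse=False)):
            if buf.dtype == torch.bool:
                # Assigning a tensor to an existing buffer name replaces the buffer.
                setattr(module, buf_name, buf.to(dtype))
                changed.append(f"{mod_name}.{buf_name}" if mod_name else buf_name)
    return changed


model = ToyAttention()
changed = cast_bool_buffers(model)
print(changed, model.bias.dtype)
```

Returning the list of changed names makes it possible to undo the cast later.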

problem:
After training, I’m not getting any response from the model: it does not generate any output or predictions at inference time. I don’t know whether what I’m doing is correct.
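One thing that could be checked here (an assumption, not a confirmed cause) is whether the buffers of the trained model still have the dtypes the model code expects at inference time. The sketch below compares a model against a freshly built reference copy and casts mismatching buffers back; `buffer_dtype_mismatches` and `restore_buffer_dtypes` are hypothetical helpers, and the toy modules stand in for the real model.

```python
import torch
import torch.nn as nn


def buffer_dtype_mismatches(model: nn.Module, reference: nn.Module) -> dict:
    """Map buffer name -> (current dtype, reference dtype) where the two differ."""
    ref = {name: buf.dtype for name, buf in reference.named_buffers()}
    return {
        name: (buf.dtype, ref[name])
        for name, buf in model.named_buffers()
        if name in ref and buf.dtype != ref[name]
    }


def restore_buffer_dtypes(model: nn.Module, reference: nn.Module) -> None:
    """Cast each mismatching buffer back to the dtype used in `reference`."""
    ref = {name: buf.dtype for name, buf in reference.named_buffers()}
    for mod_name, module in model.named_modules():
        for buf_name, buf in list(module.named_buffers(recurse=False)):
            full_name = f"{mod_name}.{buf_name}" if mod_name else buf_name
            if full_name in ref and buf.dtype != ref[full_name]:
                setattr(module, buf_name, buf.to(ref[full_name]))


def make_toy() -> nn.Module:
    """Toy module with one boolean mask buffer (illustration only)."""
    toy = nn.Module()
    toy.register_buffer("mask", torch.tril(torch.ones(3, 3, dtype=torch.bool)))
    return toy


reference_model = make_toy()
trained_model = make_toy()
trained_model.mask = trained_model.mask.to(torch.int8)  # simulate the earlier cast

mismatches = buffer_dtype_mismatches(trained_model, reference_model)
restore_buffer_dtypes(trained_model, reference_model)
print(mismatches, trained_model.mask.dtype)
```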

Kindly clarify my understanding. Thanks.

Note: To synchronise weights among the nodes, DDP needs a data type such as float32 or int. Boolean is not accepted.
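A possible alternative to changing dtypes, sketched under assumptions: leave the buffers as boolean and ask DDP to skip them during synchronisation. This relies on `DistributedDataParallel._set_params_and_buffers_to_ignore_for_model`, a private static helper that may change or disappear between PyTorch versions; `ignore_bool_buffers` and `ToyModel` are hypothetical names, and whether this avoids the error in this particular setup is untested.

```python
from typing import List

import torch
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP


class ToyModel(nn.Module):
    """Hypothetical model with one boolean buffer and one float buffer."""

    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(2, 2)
        self.register_buffer("mask", torch.ones(2, 2, dtype=torch.bool))
        self.register_buffer("scale", torch.ones(1))


def ignore_bool_buffers(model: nn.Module) -> List[str]:
    """Mark all boolean buffers so that DDP leaves them out of synchronisation."""
    names = [name for name, buf in model.named_buffers() if buf.dtype == torch.bool]
    # Private API: must be called before the model is wrapped in DDP.
    DDP._set_params_and_buffers_to_ignore_for_model(model, names)
    return names


toy = ToyModel()
ignored = ignore_bool_buffers(toy)
print(ignored)
# The DDP wrapper itself would be created afterwards, inside an initialised
# process group, e.g. ddp_model = DDP(toy, device_ids=[local_rank]).
```

This only makes sense for buffers that are identical on every rank, such as a fixed causal mask, since ignored buffers are never broadcast.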