I started looking into the state_dict of the Adam optimizer and found this:
d['optimizerG']['state'].keys()
returns
dict_keys([140266712220800, 140266712220224, 140266712159432, 140266712219792, 140266712162168, 140266712220008, 140266712158784, 140266712219864, 140266712162096, 140267508645536, 140266712160656, 140266712220584, 140266712161376, 140267508643376, 140267508644456, 140266712161664, 140266712159648, 140266712220728, 140266712220152, 140266712159936, 140266712220368, 140266712161592])
After printing them out, I found that the keys correspond to the weights and biases in the model and give access to the stored exp_avg and exp_avg_sq for each weight/bias. What I don't quite understand is where the values above come from. Do they represent the time at which the reference to the tensor was created?
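
For reference, here is a minimal sketch of how this per-parameter state can be inspected. It assumes a recent PyTorch install; the tiny Linear model and the names model, opt and sd_keys are hypothetical and only there for illustration. It prints the keys of state_dict()['state'] next to id(p) for each parameter, so the two can be compared directly. As far as I can tell the result depends on the PyTorch version (older releases seem to use the Python object id of each parameter as the key, newer ones plain integer indices), so treat the comparison as a check rather than a guarantee.

```python
import torch

# Hypothetical toy model, used only to have a weight and a bias to optimize.
model = torch.nn.Linear(3, 2)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Adam creates its per-parameter state lazily, so take one step first.
loss = model(torch.randn(4, 3)).sum()
loss.backward()
opt.step()

# The live optimizer state is keyed by the parameter tensors themselves.
for name, p in model.named_parameters():
    s = opt.state[p]
    print(name, id(p), tuple(s["exp_avg"].shape), tuple(s["exp_avg_sq"].shape))

# The serialized form is what ends up in a checkpoint such as d['optimizerG'].
sd_keys = list(opt.state_dict()["state"].keys())
print("state_dict keys:", sd_keys)

# Check whether the serialized keys are the Python ids of the parameters.
keys_are_ids = set(sd_keys) == {id(p) for p in model.parameters()}
print("keys match id(param):", keys_are_ids)
```

If the last line prints True, the large integers are object ids (memory addresses in CPython) rather than timestamps, which would also explain why they change from run to run.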