Feeding dictionary of tensors to model on GPU

Hi all,

I am using a DataLoader to feed x and y to my model during training. However, the y object is a built-in Python dict containing 2 types of labels: y = {'target1': Tensor1, 'target2': Tensor2}.

I want to load y onto the GPU. However, this is not possible directly on the dict. I know that I could extract target 1 and 2, move them to CUDA separately, and provide this data to the model like so:

target1 = y['target1'].to(device)
target2 = y['target2'].to(device)
model(x, target1, target2)

But, for several reasons, I have made the design decision to feed the dict to the model directly.


Is there an elegant solution for this? I see 2 options, and for both I have no clue whether they are sound:

  1. Untangle y within the .forward() method of my model, i.e. extract the targets from the y dictionary inside .forward() and send them to the GPU there.

  2. Send the values in my y dictionary to the GPU individually but keep them in the dict structure:

y['target1'] = y['target1'].to(device)
y['target2'] = y['target2'].to(device)
model(x, y)
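To make option 1 concrete, here is a minimal sketch of what I mean (MyModel and its layer sizes are placeholders, not my real model):

```python
import torch
import torch.nn as nn

class MyModel(nn.Module):  # placeholder model for illustration
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 2)

    def forward(self, x, y):
        # option 1: extract the targets from the dict inside forward
        # and move them to whatever device x already lives on
        target1 = y['target1'].to(x.device)
        target2 = y['target2'].to(x.device)
        out = self.linear(x)
        return out, target1, target2
```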

For both methods I am worried about unwanted inefficiency. Could anyone help me understand whether these methods make sense, and if not, why not and what else I could try?

Thanks !


Both approaches should work, and even nn.DataParallel should create chunks of your dict in case you want to unwrap it inside the forward method.
I don't know of any advantage of one approach over the other, if you need to use dicts.
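If you go with option 2, a small helper keeps the call site clean; a minimal sketch (the name dict_to_device is my own, not a PyTorch API):

```python
import torch

def dict_to_device(y, device):
    """Move every tensor value of a dict to `device`, keeping the dict structure.

    Hypothetical helper, not part of PyTorch. Note that .to() is a no-op
    (returns the same tensor) when the tensor is already on `device`.
    """
    return {k: v.to(device) for k, v in y.items()}
```

You would call it once per batch before the forward pass, e.g. `y = dict_to_device(y, device)` followed by `model(x, y)`. Either way the same per-tensor `.to(device)` copies happen, so neither option should add meaningful overhead.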