I have a problem when using PyTorch Lightning in a DDP setting. My pl.LightningModule class looks like the following.
import pytorch_lightning as pl

class MyClass(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.id2emb = {}  # video id -> embedding tensor

    def validation_step(self, batch, batch_idx):
        …
        self.id2emb[vid] = self(x)

    def validation_epoch_end(self, outputs):
        …
        for vid in video_list:
            emb = self.id2emb[vid]  # KeyError here!
Since the validation data is sharded across ranks in DDP, each rank's self.id2emb contains only the video ids that rank processed; the dictionaries are not shared between ranks, which is why the lookup fails.
How can I solve this problem?
What I want is for only rank 0 to hold the complete self.id2emb, i.e. a mapping from every video id to its embedding, and for only rank 0 to iterate over the entire video id list.
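The direction I am imagining (just a sketch, I have not tested it) is to gather each rank's partial dictionary with torch.distributed.all_gather_object and merge the pieces only on rank 0. Moving the embeddings to CPU before gathering is my own guess, to avoid pickling CUDA tensors:

import torch.distributed as dist

def validation_epoch_end(self, outputs):
    # each rank contributes the partial {video id: embedding} dict it built
    local = {vid: emb.cpu() for vid, emb in self.id2emb.items()}
    gathered = [None] * dist.get_world_size()
    dist.all_gather_object(gathered, local)  # every rank receives all partial dicts
    if self.trainer.is_global_zero:
        # only rank 0 merges the pieces into the complete mapping
        full_id2emb = {}
        for part in gathered:
            full_id2emb.update(part)
        for vid in video_list:  # rank 0 now has an embedding for every id
            emb = full_id2emb[vid]
            …

Is something like this reasonable, or is there a more Lightning-native way to do it (e.g. self.all_gather)?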
Please help me…