DDP Error: torch.distributed.elastic.agent.server.api:Received 1 death signal, shutting down workers

How do you submit the job? I ran into the same problem when launching with nohup (possibly because the terminal was closed?). I am now trying the screen command instead.
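In case it helps, here is a minimal sketch of another way to keep the job away from the terminal, assuming the "1" in the message is the signal number (signal 1 is SIGHUP on Linux, which is what a closing terminal sends). I have not confirmed that this is the cause in your case. The helper name `launch_detached` and the log path are made up for illustration. It starts the command in its own session and sends output to a log file:

```python
import subprocess
from pathlib import Path


def launch_detached(cmd, log_path):
    """Start cmd in a new session, with stdout/stderr appended to log_path.

    Hypothetical helper for illustration; POSIX only.
    """
    log_file = Path(log_path).open("ab")
    try:
        proc = subprocess.Popen(
            cmd,
            stdin=subprocess.DEVNULL,   # do not keep the terminal as stdin
            stdout=log_file,
            stderr=subprocess.STDOUT,   # merge stderr into the same log
            start_new_session=True,     # child calls setsid(), so it has no controlling terminal
        )
    finally:
        # The child has its own copy of the descriptor, so the parent can close its handle.
        log_file.close()
    return proc
```

You would call it with something like `launch_detached(["torchrun", "--nproc_per_node=2", "train.py"], "train.log")`, where `train.py` and the process count are placeholders for your own setup. This only guards against a hangup from the terminal. If the signal comes from somewhere else (a scheduler or the OOM killer, for example), it will not help.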
