Layer identifier in powerSGD_hook

Hello everyone,

I’m looking for an identifier that specifies a neural net’s layer in this hook.

Also, does this loop iterate over all layers?

Cheers,

Hello hamidreza_ramezani,

> I’m looking for an identifier that specifies a neural net’s layer in this hook.

That is a DDP communication hook; it runs during the backward pass.

> Also, does this loop iterate over all layers?

That loop iterates over all the parameters stored in the bucket. The parameters in the bucket are determined at construction time.
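For reference, here is a minimal sketch of a custom communication hook that does the same kind of per-bucket iteration. It assumes the GradBucket accessors from recent PyTorch releases (`bucket.gradients()`, `bucket.buffer()`, `bucket.index()`); older releases used `get_`-prefixed names such as `bucket.get_per_parameter_tensors()`. The hook name is hypothetical:

```python
import torch
import torch.distributed as dist

def inspect_bucket_hook(state, bucket: dist.GradBucket) -> torch.futures.Future[torch.Tensor]:
    # Walk the per-parameter gradient tensors held in this bucket,
    # the same way the loop in powerSGD_hook walks a bucket's contents.
    for grad in bucket.gradients():
        print(f"bucket {bucket.index()}: gradient of shape {tuple(grad.shape)}")
    # Plain all-reduce of the flattened bucket buffer, averaged over ranks.
    fut = dist.all_reduce(bucket.buffer(), async_op=True).get_future()
    return fut.then(lambda f: f.value()[0] / dist.get_world_size())

# Registration (ddp_model is a DistributedDataParallel instance):
# ddp_model.register_comm_hook(state=None, hook=inspect_bucket_hook)
```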

DDP docs: Distributed Data Parallel — PyTorch 2.1 documentation

DDP communication hooks docs: DDP Communication Hooks — PyTorch 2.1 documentation


Hey Garrett,

Thanks for the reply.

> That is a DDP communication hook; it runs during the backward pass.

I see. I guess there should be a variable that represents a layer, like this one. I’m not sure whether bucket_index represents a layer, though. Are a bucket and a layer the same thing in this context? I printed the value of bucket_index and noticed that it only takes two values, 0 and 1 (the application was training ResNet-20 on CIFAR-10).

bucket.get_index() does not necessarily correspond to layer indices.

We use GradBuckets to store gradients in DDP’s reducer, which passes them to the gradient communication hook. A bucket can contain gradients from one or more parameters, which in turn can correspond to one or more layers, and the index is just the position of the bucket in the list of all buckets.
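To see which layers land in which bucket, one option is to build a parameter-to-name map from `model.named_parameters()` before training and look it up inside a hook. A sketch under that assumption (the hook name and the `param_names` state dict are hypothetical; `bucket.parameters()` and `bucket.index()` are the accessors in recent PyTorch releases):

```python
import torch.distributed as dist

def bucket_layers_hook(state, bucket):
    # `state` is assumed to be a {id(param): name} map built from
    # model.named_parameters() before training (a helper convention,
    # not part of the DDP API).
    names = [state.get(id(p), "<unknown>") for p in bucket.parameters()]
    print(f"bucket {bucket.index()} covers: {names}")
    fut = dist.all_reduce(bucket.buffer(), async_op=True).get_future()
    return fut.then(lambda f: f.value()[0] / dist.get_world_size())

# param_names = {id(p): name for name, p in ddp_model.module.named_parameters()}
# ddp_model.register_comm_hook(state=param_names, hook=bucket_layers_hook)
```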

In addition, there is an API, bucket.get_per_parameter_tensors() (pytorch/powerSGD_hook.py at master · pytorch/pytorch · GitHub), that lets you retrieve the per-parameter gradient tensors held in the bucket.
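For completeness, registering the PowerSGD hook itself follows the DDP communication hooks documentation linked above; the hyperparameter values below are just example settings:

```python
from torch.distributed.algorithms.ddp_comm_hooks import powerSGD_hook as powerSGD

state = powerSGD.PowerSGDState(
    process_group=None,            # use the default process group
    matrix_approximation_rank=1,   # rank of the low-rank gradient approximation
    start_powerSGD_iter=10,        # run plain allreduce for the first 10 steps
)
# ddp_model is a torch.nn.parallel.DistributedDataParallel instance
ddp_model.register_comm_hook(state, powerSGD.powerSGD_hook)
```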


I see. Thank you very much.