For most self-supervised learning algorithms (SimCLR, MoCo, BYOL, SimSiam, SwAV, etc.), it's common to have a projection head after the base encoder (which in most cases is a vanilla ResNet-50 CNN). An example of such a projection head (taken from SwAV) is:

```
projection_head = nn.Sequential(
    nn.Linear(2048, 512),
    nn.BatchNorm1d(512),
    nn.ReLU(inplace=True),
    nn.Linear(512, 128),
)
```
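For context, a dummy forward pass through this head (with a hypothetical batch of 32 encoder features; the batch size is just for illustration) produces 128-dim outputs:

```python
import torch
import torch.nn as nn

projection_head = nn.Sequential(
    nn.Linear(2048, 512),
    nn.BatchNorm1d(512),
    nn.ReLU(inplace=True),
    nn.Linear(512, 128),
)

# Hypothetical batch of 2048-dim features from the ResNet-50 encoder.
x = torch.randn(32, 2048)
out = projection_head(x)
print(out.shape)  # torch.Size([32, 128])
```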

The output of this projection head is L2-normalized:

```
x = projection_head(x)
x = nn.functional.normalize(x, p=2, dim=1)
```
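As a quick sanity check (on random data here, not the actual encoder output), the normalized rows have unit L2 norm, which bounds every component to [-1, 1]:

```python
import torch
import torch.nn.functional as F

x = torch.randn(32, 128)
x = F.normalize(x, p=2, dim=1)

# Every row is a unit vector, so no single component can exceed the norm.
assert torch.allclose(x.norm(dim=1), torch.ones(32), atol=1e-5)
assert x.min().item() >= -1.0 and x.max().item() <= 1.0
```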

I am trying to initialize a layer after the projection head as:

```
# Each component of the L2-normalized projection lies in [-1, 1],
# so initialize the SOM weights in that range.
wts = nn.Parameter(data=torch.empty(40 * 40, 128), requires_grad=True)
wts.data.uniform_(-1.0, 1.0)
```
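One thing I noticed while checking this: even though the component ranges match, the norms do not. A vector drawn uniformly from [-1, 1]^128 has expected squared norm 128/3, so the randomly initialized SOM weights are nowhere near the unit sphere the projections live on:

```python
import torch
import torch.nn as nn

wts = nn.Parameter(data=torch.empty(40 * 40, 128), requires_grad=True)
wts.data.uniform_(-1.0, 1.0)

# Components are in [-1, 1], but the rows are NOT unit-norm:
# mean row norm ≈ sqrt(128 / 3) ≈ 6.5, far from the projections' norm of 1.
print(wts.norm(dim=1).mean().item())
```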

Since the output of the projection head is L2-normalized (i.e., it lies on the 128-dimensional unit sphere), each component of the input feeding into `wts` lies in [-1, 1], which is why I use the uniform initialization above.

Is this a correct approach, or am I missing something?