After randomly initializing the weight matrices of a GRU (let's call them `W`), I need to transform them by means of some function. For the sake of this discussion, let's simplify and say I want to multiply `W` by a scalar:

`W <- alpha * W`

where `alpha` is a scalar parameter that I want my optimization algorithm to optimize.
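To make the intent concrete, here is a minimal standalone sketch (the shapes are hypothetical) of the transform and the gradient I expect to flow into `alpha`:

```python
import torch

# A learnable scalar `alpha` scales a fixed weight matrix; the loss
# gradient should reach alpha through the multiplication.
W = torch.randn(4, 4)                          # randomly initialized weights
alpha = torch.tensor(0.5, requires_grad=True)  # scalar to optimize

W_scaled = alpha * W     # the transform W <- alpha * W
loss = W_scaled.sum()    # any downstream loss
loss.backward()
print(alpha.grad is not None)  # True: alpha is part of the graph
```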

Is there a way for me to accomplish this without rewriting the whole GRU code from scratch?

What I have tried so far:

```
def __init__(self):
    ...
    self.gru = nn.GRU(...)             # the stock GRU module
    alpha = torch.tensor([0.5])
    self.alpha = nn.Parameter(alpha)   # nn.Parameter sets requires_grad=True by default

def forward(self, x):
    # replace the hidden-to-hidden weights with a scaled copy
    self.gru.weight_hh_l0.data = self.alpha * self.gru.weight_hh_l0
    ...  # normal forward code
```

If I optimize all parameters, this doesn't work. If I optimize all parameters except `gru.weight_hh_l0`, it still doesn't work.
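For reference, the failure can be reproduced outside the GRU with a hypothetical minimal example: assignment through `.data` bypasses autograd, so the multiplication by `alpha` is never recorded in the graph and `alpha` receives no gradient:

```python
import torch
import torch.nn as nn

# Minimal reproduction: `.data` assignment is invisible to autograd,
# so alpha is disconnected from the loss.
w = nn.Parameter(torch.ones(2, 2))
alpha = nn.Parameter(torch.tensor(0.5))

w.data = alpha * w   # same pattern as in forward() above
loss = w.sum()
loss.backward()
print(alpha.grad)    # None -- alpha received no gradient
```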