Monkey-patching the `forward()` pass of an `nn.Module`

I am trying to monkey-patch the `forward()` method of an `nn.Module`. Here's my module:

import torch.nn as nn 

class GPT5(nn.Module):
    embed_dim = 768
    num_heads = 12
    q_proj = nn.Linear(embed_dim, embed_dim)
    head_dim = embed_dim // num_heads
    scale = head_dim**-0.5

    def forward(self, hidden_states):
        return self.q_proj(hidden_states) * self.scale

The following works as usual:

import torch 

gpt5 = GPT5()
gpt5(torch.randn(1, 10, 768)).size()

Now monkey-patch it:

gpt5 = GPT5()
new_forward = lambda x: gpt5.forward(x) + 1
gpt5.forward = new_forward

The following then raises an error:

gpt5(torch.randn(1, 10, 768)).size()
Traceback (most recent call last)
in <module>:1

/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py:1194 in _call_impl

  1191         # this function, and just call forward.
  1192         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks o
  1193                 or _global_forward_hooks or _global_forward_pre_hooks):
❱ 1194             return forward_call(*input, **kwargs)
  1195         # Do not call functions when jit is used
  1196         full_backward_hooks, non_full_backward_hooks = [], []
  1197         if self._backward_hooks or _global_backward_hooks:
in <lambda>:2
in <lambda>:2
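
The repeated <lambda> frames point at the root cause: after the assignment, gpt5.forward is the lambda itself, so gpt5.forward(x) inside the lambda calls the lambda again and the call never bottoms out. Here is a minimal, torch-free sketch of the same mistake (the Box class is made up purely for illustration):

class Box:
    def forward(self, x):
        return x * 2

box = Box()

# The lambda looks up box.forward at call time, but by then
# box.forward is this very lambda, so it keeps calling itself.
box.forward = lambda x: box.forward(x) + 1

try:
    box.forward(3)
except RecursionError as err:
    print("RecursionError:", err)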

Does the following work for you?

class GPT5(nn.Module):
    embed_dim = 768
    num_heads = 12
    q_proj = nn.Linear(embed_dim, embed_dim)
    head_dim = embed_dim // num_heads
    scale = head_dim**-0.5

    def forward(self, hidden_states):
        return self.q_proj(hidden_states) * self.scale

gpt5 = GPT5()
old_forward = gpt5.forward

new_forward = lambda x: old_forward(x) + 1
gpt5.forward = new_forward

gpt5(torch.randn(1, 10, 768))
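
As an aside, if all you need is to tweak the module's output, you can get the same "+ 1" effect without replacing forward at all by registering a forward hook. This is just a sketch; the hook name is my own:

gpt5 = GPT5()

def add_one(module, inputs, output):
    # A forward hook that returns a non-None value replaces the module's output.
    return output + 1

handle = gpt5.register_forward_hook(add_one)
print(gpt5(torch.randn(1, 10, 768)).size())  # torch.Size([1, 10, 768])
handle.remove()  # detach the hook once it is no longer needed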


Thanks! It works. I hadn't realized the fix would be to store the original forward in a separate variable first.