# From math to PyTorch

I know it's really simple, but I am new to neural networks and am having a lot of difficulty understanding the relation between the math and the PyTorch code. I am trying to replicate a paper that uses attention weights, and I need to implement this feed-forward neural network with two inputs:

$$c_i = W_1 \tanh(W_2 m_i + W_3 v_a + b_i)$$


$m_i$ is the embedding vector of a single word, one for each word in the sentence, and $v_a$ is the embedding vector of
The model parameters are: $$W_1 \in \mathbb{R}^{1 \times d}, W_2 \in \mathbb{R}^{d \times d}, W_3 \in \mathbb{R}^{d \times d}, b_1 \in {}$$

The resulting $\{c_1, c_2, \ldots, c_N\}$, after being passed through a softmax, will represent the weight given to each word in the sentence.

Hi, I guess your model will look something like this:

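```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class YourModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.lin1 = nn.Linear(5, 5)  # W_2 (plus a bias)
        self.lin2 = nn.Linear(5, 5)  # W_3 (plus a bias)

    def forward(self, m_i, v_a):
        y1 = self.lin1(m_i)
        y2 = self.lin2(v_a)
        y = torch.tanh(y1 + y2)  # F.tanh is deprecated
        # softmax needs an explicit dim; dim=0 matches the old implicit
        # behavior for a 1-D input
        y = F.softmax(y, dim=0)
        return y
```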


Then,

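```python
model = YourModel()
# Variable is no longer needed; plain float tensors work directly
m_i = torch.tensor([1., 2., 3., 4., 5.])
v_a = torch.tensor([6., 7., 8., 9., 10.])
output = model(m_i, v_a)  # call the module rather than .forward() directly
```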


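You get output like the following (the exact values depend on the random weight initialization, and recent PyTorch versions print it as a `tensor([...])`):

```
Variable containing:
 0.0563
 0.4156
 0.4156
 0.0563
 0.0563
[torch.FloatTensor of size 5]
```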
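As a side note on why only two distinct values show up: this is roughly what a softmax produces when every `tanh` output is close to -1 or +1, which is plausible here because the inputs are fairly large. The sketch below assumes fully saturated values (the vector `saturated` is made up for illustration, not taken from the model's actual activations) and gives numbers close to the ones printed above.

```python
import torch
import torch.nn.functional as F

# Assumed pre-softmax values: tanh fully saturated at -1 or +1.
# The sign pattern is chosen to mirror the printed output.
saturated = torch.tensor([-1.0, 1.0, 1.0, -1.0, -1.0])

# Softmax over the five entries; the result has only two distinct values.
probs = F.softmax(saturated, dim=0)
print(probs)
```

The small gap between these numbers and the printed output would come from `tanh` not being exactly at ±1 in the real forward pass.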



Thanks Ken, that's a nice example. I've got the intuition now!

One quick question: I think I should also pass the `y` variable through a linear layer, since $W_1$ multiplies it in the original function.

$$c_i = W_1 \tanh(W_2 m_i + W_3 v_a + b_i)$$

Do you think this is necessary, or is applying the softmax straight away OK?

```python
def forward(self, m_i, v_a):
    y1 = self.lin1(m_i)
    y2 = self.lin2(v_a)
    y = torch.tanh(y1 + y2)
    y = self.lin3(y)  # W_1; lin3 must also be defined in __init__
    y = F.softmax(y, dim=0)
    return y
```


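Yes, you need to add `lin3`. Since $W_1$ has shape $1 \times d$, it maps each word to a single score $c_i$, so `lin3` would be `nn.Linear(5, 1)` here. The softmax then has to run over the scores of all the words in the sentence, not over the output for a single word.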
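To put the pieces together, here is a minimal sketch of the full formula applied to a whole sentence at once. It is one possible reading of the equation, not the paper's reference implementation: the class name `AttentionScorer`, the layer names, the choice to stack the word embeddings into an `(N, d)` tensor, and the random inputs are all assumptions made for illustration. Only one of the two inner layers carries a bias, since the formula has a single bias term.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttentionScorer(nn.Module):
    """Hypothetical module: c_i = W_1 tanh(W_2 m_i + W_3 v_a + b) for every word."""

    def __init__(self, d):
        super().__init__()
        self.lin_m = nn.Linear(d, d)                 # W_2 and the bias term
        self.lin_a = nn.Linear(d, d, bias=False)     # W_3
        self.lin_out = nn.Linear(d, 1, bias=False)   # W_1, shape 1 x d

    def forward(self, m, v_a):
        # m: (N, d) stacked word embeddings, v_a: (d,) second input
        # lin_a(v_a) has shape (d,) and broadcasts across the N words
        h = torch.tanh(self.lin_m(m) + self.lin_a(v_a))  # (N, d)
        c = self.lin_out(h).squeeze(-1)                  # (N,) one score per word
        return F.softmax(c, dim=0)                       # normalize over the words


torch.manual_seed(0)
N, d = 4, 5                      # assumed sentence length and embedding size
model = AttentionScorer(d)
m = torch.randn(N, d)            # made-up word embeddings
v_a = torch.randn(d)             # made-up second input
weights = model(m, v_a)
print(weights.shape, weights.sum())
```

The result is one weight per word, and the weights sum to 1 across the sentence. For batches of sentences, the softmax dimension would need to be the word dimension of the batched tensor.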