Hi everyone,
I was wondering whether there is a way to build a bare-bones multi-layer perceptron, i.e. a network with a single hidden layer, without using nn.Module. I am interested in this because I am writing a PyTorch tutorial and I would really like to start as low-level as possible - I think the audience will be thrilled to see that PyTorch really lets you follow the training details intimately.
Here’s what I have so far:
import torch
import torch.nn.functional as F

# requires_grad=True so loss.backward() can populate .grad on the weights
w1 = torch.rand(28*28, 100, requires_grad=True)  # I am doing MNIST, hence 28*28
b1 = torch.ones(100, requires_grad=True)
o1 = torch.sigmoid(torch.matmul(x.view(-1, 28*28), w1) + b1)
w2 = torch.rand(100, 10, requires_grad=True)
b2 = torch.ones(10, requires_grad=True)
o2 = F.log_softmax(torch.sigmoid(torch.matmul(o1, w2) + b2), dim=1)
loss = torch.mean(-torch.sum(o2 * y, dim=1))  # assuming y is one-hot encoded
(full gist here)
Now I would like to train the model, or at the very least show a few iterations using loss.backward() to compute gradients and update my weights. I would also like to avoid the optim module.
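Something like the following sketch is what I have in mind for the training loop (the batch data, the one-hot targets, and the learning rate lr here are made-up placeholders, and the weights must have been created with requires_grad=True):

```python
import torch
import torch.nn.functional as F

# hypothetical setup: a random batch standing in for MNIST data
batch_size, lr = 64, 0.1
x = torch.rand(batch_size, 28 * 28)
y = F.one_hot(torch.randint(0, 10, (batch_size,)), num_classes=10).float()

w1 = torch.rand(28 * 28, 100, requires_grad=True)
b1 = torch.ones(100, requires_grad=True)
w2 = torch.rand(100, 10, requires_grad=True)
b2 = torch.ones(10, requires_grad=True)

for step in range(5):
    # forward pass, as in the snippet above
    o1 = torch.sigmoid(x @ w1 + b1)
    o2 = F.log_softmax(torch.sigmoid(o1 @ w2 + b2), dim=1)
    loss = torch.mean(-torch.sum(o2 * y, dim=1))

    loss.backward()

    # manual SGD step, no optim: done inside no_grad so the update
    # itself is not recorded in the autograd graph
    with torch.no_grad():
        for p in (w1, b1, w2, b2):
            p -= lr * p.grad
            p.grad.zero_()  # clear for the next iteration
```

The no_grad block and the explicit grad.zero_() are exactly the bookkeeping that optimizers normally hide, which is why this seems like a nice teaching step before introducing optim.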
However, the usual way I did my forward pass was with something like model(data), where model is an instance of some nn.Module subclass. I would like to show a bare-bones example before introducing nn.Module, the forward(x) function, and the whole functional-style API, though…
Thank you in advance!