Using linear layers? New user transferring from Keras

Thank you for responding.

In my setup I would like a stack of linear layers, each followed by a nonlinear but continuously differentiable activation function, so:
layer 1: y_1 = sigmoid(w_1^T x + b_1)
layer 2: y_2 = softmax(w_2^T y_1 + b_2), etc.
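
For concreteness, here is a minimal sketch of those two layers in PyTorch. It is only an illustration: the class name TwoLayerNet, the layer sizes, and the random input batch are made up, not taken from any particular example. The idea is that nn.Linear holds the weights and bias for the affine part, and the activation is applied to its output in forward.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TwoLayerNet(nn.Module):
    """Hypothetical two-layer perceptron; all sizes are illustrative."""

    def __init__(self, in_features=4, hidden=8, n_classes=3):
        super().__init__()
        # Each nn.Linear owns a weight matrix and bias: x -> x W^T + b
        self.fc1 = nn.Linear(in_features, hidden)
        self.fc2 = nn.Linear(hidden, n_classes)

    def forward(self, x):
        # layer 1: y_1 = sigmoid(w_1^T x + b_1)
        y1 = torch.sigmoid(self.fc1(x))
        # layer 2: softmax(w_2^T y_1 + b_2), normalised over the class dimension
        return F.softmax(self.fc2(y1), dim=1)


torch.manual_seed(0)
model = TwoLayerNet()
x = torch.randn(5, 4)   # made-up batch of 5 samples with 4 features each
probs = model(x)        # one row of class probabilities per sample
```

In this sketch there is no F. call for the linear part at all; the nn.Linear modules do that work, and only the activations are applied as functions. One caveat, assuming the loss is nn.CrossEntropyLoss: that loss applies log-softmax internally, so in that case forward would return self.fc2(y1) directly rather than the softmax output.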

Am I doing this wrong in the code? Instead of nn.Linear, should I use nn.Sigmoid etc.?
And what should the F. function be in the forward pass for the linear part?

This is what I was going by; it is the only example of a PyTorch multilayer perceptron I could find.

Thanks