Kindly understand that I really can’t figure out how to turn this neural network concept into code. Anyway, has anyone figured out how to write a Python implementation for this?
It corresponds to the following code:
from math import e
unk = -1
# Activations
def h_act(x):
    return 1.0 / (1.0 + e**-x)
def y_act(x):
    return unk
# Inputs
x0 = 2
x1 = 3
x2 = 4
# Hidden layer weights and biases
h_w00 = 0.1
h_w01 = 0.5
h_w02 = 0.9
h_b0 = -2.0
h_w10 = unk
h_w11 = unk
h_w12 = unk
h_b1 = unk
h_w20 = unk
h_w21 = unk
h_w22 = unk
h_b2 = unk
h_w30 = unk
h_w31 = unk
h_w32 = unk
h_b3 = unk
# Output layer weights and biases
y_w00 = 1.3
y_w01 = unk
y_w02 = unk
y_w03 = unk
y_b0 = unk
y_w10 = unk
y_w11 = unk
y_w12 = unk
y_w13 = unk
y_b1 = unk
# Forward
h0 = h_act(h_w00 * x0 + h_w01 * x1 + h_w02 * x2 + h_b0)
h1 = h_act(h_w10 * x0 + h_w11 * x1 + h_w12 * x2 + h_b1)
h2 = h_act(h_w20 * x0 + h_w21 * x1 + h_w22 * x2 + h_b2)
h3 = h_act(h_w30 * x0 + h_w31 * x1 + h_w32 * x2 + h_b3)
y0 = y_act(y_w00 * h0 + y_w01 * h1 + y_w02 * h2 + y_w03 * h3 + y_b0)
y1 = y_act(y_w10 * h0 + y_w11 * h1 + y_w12 * h2 + y_w13 * h3 + y_b1)
To make it work properly, you have to replace every unk with the network’s actual weights. They are not given in the figure.
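If you just want the script to run end to end before the real weights are known, one option is to substitute random placeholders above the forward pass and pick some output activation for y_act. The values below are invented, not from the figure, purely so the code executes:

```python
import random

random.seed(0)  # reproducible placeholders

# Hypothetical stand-in values, NOT the figure's real weights --
# replace each with the value read off the diagram once known.
h_w10, h_w11, h_w12, h_b1 = (random.uniform(-1, 1) for _ in range(4))
h_w20, h_w21, h_w22, h_b2 = (random.uniform(-1, 1) for _ in range(4))
h_w30, h_w31, h_w32, h_b3 = (random.uniform(-1, 1) for _ in range(4))
y_w01, y_w02, y_w03, y_b0 = (random.uniform(-1, 1) for _ in range(4))
y_w10, y_w11, y_w12, y_w13, y_b1 = (random.uniform(-1, 1) for _ in range(5))

# The output activation is also not given; the identity is one guess:
def y_act(x):
    return x
```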
@Bjorn_Lindqvist
You really do understand my circumstances - I’m so grateful for you not bashing me for posting that! (tears of joy emojis) Or worse - referring me links to more tutorials that further confuse my already fried brain (if not yet done in by COVID…)
I hope you won’t mind me asking these additional questions:
- If I want to increase the number of inputs, I see that they have to be declared beforehand. Is the same true for the number of hidden layers? (…especially if I need 15 levels of hidden layers…)
- When adding hidden layers, do I really have to write out the code for each layer? Or is it advisable to write the hidden layers as a loop? (That is, if the number of hidden layers is 15, do I loop the hidden-layer code 15 times?)
- Regarding the inputs, can a column of values stored in an array stream continuously through them? Like: all values inside array [N] go through input x0. Or must ALL the individual values in that array be summed up first and then passed through an input? Like: the sum of all values inside array [N] goes through input x0.
Thanks again for your earlier reply and please bear with me on my follow-up questions…
What you have is a conceptual diagram that shows individual “neurons”. PyTorch treats the neurons in the context of “layers” or “modules”. Here is how you would convert that diagram into functioning Python (and PyTorch) code.
- The first layer of neurons (with green bubbles) is just input neurons – you can look at it as just the input. It seems that in your case every input has 3 features, labeled x0, x1, and x2
- There are 2 computational layers in your network, one labeled “hidden” and the other “output”. Notice that the hidden layer has 3 inputs and 4 outputs. The output layer has 4 inputs and 2 outputs.
Simple example
Suppose your input layer has 3 elements [x0, x1, x2]. Your output layer has 2 neurons with outputs [y0, y1]. In that case your computation is
y0 = x0 * w00 + x1 * w01 + x2 * w02
y1 = x0 * w10 + x1 * w11 + x2 * w12
You can rewrite the equations above as vector (or matrix) multiplication:
                     [ w00 w10 ]
[y0 y1] = [x0 x1 x2] [ w01 w11 ]
                     [ w02 w12 ]
=>
Y = X W, (where Y = [y0 y1], X = [x0 x1 x2], and W is the 3x2 weight matrix)
(ignore the transposes and non-linearity functions for now)
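As a concrete sketch of that product in numpy (the weight values here are made up for illustration):

```python
import numpy as np

# The 3-input, 2-output example as one matrix product.
x = np.array([1.0, 2.0, 3.0])       # [x0, x1, x2]
W = np.array([[0.1, 0.4],           # column 0 feeds y0, column 1 feeds y1
              [0.2, 0.5],
              [0.3, 0.6]])          # made-up weights
y = x @ W                           # [y0, y1]
# y[0] = x0*w00 + x1*w01 + x2*w02 = 0.1 + 0.4 + 0.9 = 1.4
```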
Your example
Without writing the actual numbers, you can describe the network on your drawing in terms of matrix multiplications:
                           [ w00 w10 w20 w30 ]
[h0 h1 h2 h3] = [x0 x1 x2] [ w01 w11 w21 w31 ]
                           [ w02 w12 w22 w32 ]_ih  <= All these are hidden weights
                        [ w00 w10 ]
[o0 o1] = [h0 h1 h2 h3] [ w01 w11 ]
                        [ w02 w12 ]
                        [ w03 w13 ]_ho  <= All these are output weights
H = I W_ih, (where I is the input, W_ih is the hidden-layer weight matrix, and H is the hidden-layer result)
O = H W_ho, (where O is the output, W_ho is the output-layer weight matrix, and H is the hidden-layer result)
On your diagram, it actually says that every layer is represented by phi(sum(x * w) + b). Excluding the phi, the description above is exactly that weighted sum plus bias (the bias can even be folded into the weights by appending a column of ones to the input). The function phi is called a “nonlinear activation function”, or simply a “nonlinearity”. In the simplest terms, it prevents the consecutive matrix multiplications from collapsing into a single layer.
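To see why that matters, here is a small sketch (with random, made-up matrices) showing that two purely linear layers are equivalent to a single layer, while inserting phi breaks that equivalence:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((1, 3))
W1 = rng.standard_normal((3, 4))    # "hidden" weights (made up)
W2 = rng.standard_normal((4, 2))    # "output" weights (made up)

# Without phi, stacking two linear layers is the same as ONE
# layer whose weight matrix is W1 @ W2:
two_layers = (x @ W1) @ W2
one_layer = x @ (W1 @ W2)
assert np.allclose(two_layers, one_layer)

# With phi in between, the two layers no longer collapse:
phi = lambda z: 1.0 / (1.0 + np.exp(-z))
with_phi = phi(x @ W1) @ W2
assert not np.allclose(with_phi, one_layer)
```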
Python implementation
I am assuming you don’t want to implement the matrix multiplication, so you can use numpy
import numpy as np
def phi(x):
    return 1.0 / (1.0 + np.exp(-x))
in_features = 3
hidden_features = 4
outputs = 2
# Just a random input, but you can replace with `[2.00, 3.00, 4.00]`
x = np.random.randn(1, in_features)  # Replace "1" with any number of samples to compute simultaneously
# Just random weights
w_ih = np.random.randn(in_features, hidden_features)
w_oh = np.random.randn(hidden_features, outputs)
# You can add a random bias too
b_ih = np.random.randn(hidden_features)
b_oh = np.random.randn(outputs)
# Compute the result
hidden = phi(np.dot(x, w_ih) + b_ih)
output = phi(np.dot(hidden, w_oh) + b_oh)
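A side note on the batch dimension mentioned in the comments above: an entire array of samples can stream through in one call, one sample per row, with nothing summed across samples. A self-contained sketch (made-up weights, biases omitted for brevity):

```python
import numpy as np

def phi(x):
    # sigmoid nonlinearity, same as above
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
w_ih = rng.standard_normal((3, 4))  # hidden weights (made up)
w_oh = rng.standard_normal((4, 2))  # output weights (made up)

# 10 samples of 3 features each go through in a single call --
# each ROW is one sample; nothing is summed across samples.
batch = rng.standard_normal((10, 3))
out = phi(phi(batch @ w_ih) @ w_oh)
print(out.shape)  # (10, 2): one output row per input row
```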
At this point you are done. However, if you want your network to “learn”, you must implement backpropagation. Doing that in plain Python might prove cumbersome, so you can use PyTorch.
PyTorch
import torch
from torch import nn
in_features = 3
hidden_features = 4
outputs = 2
# It's a sequential model, meaning it goes layer-by-layer without loops and feedbacks
model = nn.Sequential(
    nn.Linear(in_features, hidden_features),  # First layer of neurons (hidden)
    nn.Sigmoid(),                             # "phi" nonlinearity
    nn.Linear(hidden_features, outputs),      # Second layer of neurons (output)
    nn.Sigmoid(),                             # "phi" nonlinearity
)
# Just a random input, but you can replace with `[2.00, 3.00, 4.00]`
x = torch.randn((1, in_features)) # Replace "1" with any number of samples to compute simultaneously
output = model(x)
To train the network you just run the training loop on the model, and autograd will take care of all the mathematics behind it (lookup the tutorials)
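For reference, a minimal training loop for a model like the one above might look like this; the data, loss function, learning rate, and epoch count are all made up for illustration:

```python
import torch
from torch import nn

torch.manual_seed(0)
model = nn.Sequential(
    nn.Linear(3, 4), nn.Sigmoid(),
    nn.Linear(4, 2), nn.Sigmoid(),
)

# Made-up training data: 8 samples, 3 features in, 2 targets out.
x = torch.randn(8, 3)
target = torch.rand(8, 2)

optimizer = torch.optim.SGD(model.parameters(), lr=0.5)
loss_fn = nn.MSELoss()

losses = []
for epoch in range(100):        # the "training loop"
    optimizer.zero_grad()       # clear gradients from the last step
    loss = loss_fn(model(x), target)
    loss.backward()             # autograd computes all gradients
    optimizer.step()            # update every weight and bias
    losses.append(loss.item())
```

Each pass over the data nudges the weights so the loss shrinks; autograd handles all the derivative bookkeeping.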
Thanks so much, Zafar - that really expanded on what @Bjorn_Lindqvist posted in his earlier reply…
I hope you won’t mind me asking these additional questions:
“To train the network you just run the training loop on the model…”
1) You mentioned “training loop” here - I assume that’s referring to the hidden layers, right?
2) Regarding the inputs, can a column of values stored in an array be made to stream continuously through those?
Like for example:
all values inside array [N] go through input x0
Or must ALL the individual values in that array be summed up and then passed through an input? Like below, for example:
Sum of all values inside array [N] go through input x0
“I am assuming you don’t want to implement the matrix multiplication…”
3) If there’s a need to implement matrix multiplication, how would the respective Python and PyTorch source code implementations look?
By the way, I forgot this additional question:
4) I am running a 32-bit Windows 7 machine with 2 GB of memory. What PyTorch configuration and machine learning libraries are best for my machine, ones that will not tax it much but can still run AI or machine learning programs at least nearly on par with better machines?
Thanks again for your earlier reply and please bear with me on my follow-up questions…
- The training loop is a loop over the “training data” that you use to show your neural network what the correspondence between the input and output should be. It applies to the whole model, not an individual hidden layer. A real-life example of a training loop would be teaching colors to a child: you show a color and say its name. The child tries to mimic it, and gets it wrong initially. You repeat it again, over and over, until the child starts getting it right. That is what the training loop is.
- I am not sure I understand what you mean by that. “All values inside array N go through input x0”? If this is the design you need, you can certainly do it – just write a for-loop, iterating over the array N, and pass its elements into x0 one-by-one. However, from the diagram that I see, x0 is an input for some feature at its appropriate position. You can have any number of elements in that column. Linear algebra will take care of it.
- If there is a need to implement the matrix multiplication from scratch, you can just write a loop. If you need an optimized implementation, there is a good explanation of the “Strassen algorithm” in the CLRS book that you should read (Strassen algorithm - Wikipedia). Either way, I don’t really see the point of writing your own matrix multiplication – there are so many libraries that do it very well.
- I will be honest here, 2GB memory might not be enough to train a large network. One might argue that if you have a GPU with a lot of memory, you might pull it off, but if it is a 32-bit OS with 2GB RAM, I doubt it has a very powerful GPU. You might be able to train simple networks, but nothing super-fancy. If you need a powerful machine to train a deep network, you can use https://colab.research.google.com. It limits the amount of time you can run the training loop for, but I think it will still give you better results than the machine you currently have.
With that said, although the machine you described might not be good for training, it is perfectly fine for inference. You can download a lot of pretrained models from the PyTorch Hub | PyTorch and use them on that machine.
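On point 3 above (writing the multiplication from scratch), a naive version is just three nested loops in plain Python. This is a sketch to show the idea, not something you would use in place of numpy:

```python
def matmul(a, b):
    """Naive O(n^3) matrix multiplication over lists of lists."""
    rows, inner, cols = len(a), len(b), len(b[0])
    assert len(a[0]) == inner, "inner dimensions must match"
    c = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):          # each output row
        for j in range(cols):      # each output column
            for k in range(inner): # dot product along the inner dim
                c[i][j] += a[i][k] * b[k][j]
    return c

# e.g. matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
# gives [[19.0, 22.0], [43.0, 50.0]]
```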
Gentlemen, kindly permit this last set of questions:
In fine-tuning a neural network, does that solely rely on…
1)…adjusting the training weights?
2)…reducing or increasing the hidden layers?
3)…a method to prune neurons from the neural network? (if possible… since a neural network is said to mimic the human brain, there should be some useless neurons that could be discarded, right?)
4)…all of the above-mentioned factors?
So these are clearly homework questions. No more answers for you.
@Bjorn_Lindqvist
That’s okay - I’m cool with that…
Thanks for your earlier explanation, and for @Zafar’s explanations as well.
Very grateful, in spite of it all, and my apologies for wasting each of your time…