torch::nn::Module::forward produces very similar results regardless of input

My Module:

struct Net : torch::nn::Module {
	// numIn/numOut: input/output sizes, numHid: hidden layer width, hid_count: number of hidden layers
	Net(int numIn, int numOut, int numHid, const size_t hid_count=1) {
		assert(hid_count > 0);
		first = register_parameter("inputW", torch::rand({numIn, numHid}))/numHid;
		middle = new torch::Tensor[hid_count];
		h_c = hid_count;
		for (int i = 1; i != hid_count; i++)
			middle[i] = register_parameter("hidW"+std::to_string(i), torch::rand({numHid, numHid}))/numHid;
		last = register_parameter("outputW", torch::rand({numHid, numOut}))/numOut;
	}

	// fully connected layers with sigmoid activations, no bias tensors
	torch::Tensor forward(torch::Tensor input) {
		torch::Tensor output_layer, h;
		h = torch::sigmoid(torch::mm(input, first));
		for (int i = 1; i != h_c; i++)
			h = torch::sigmoid(torch::mm(h, middle[i]));
		output_layer = torch::sigmoid(torch::mm(h, last));
		return output_layer;
	}
	torch::Tensor first, last, *middle;
	size_t h_c;
};

My unit test:

void test_forward() {
	const int input_nodes = 10, output_nodes = 1, hidden_count = 1;
	Net nn(input_nodes, output_nodes, hidden_count);

	std::cout << "Number of hidden layers: " << hidden_count << std::endl;
	for (int i = 0; i != 20; i++)
		std::cout << nn.forward(torch::rand({output_nodes, input_nodes})).item<float>() << std::endl;
}

Sample terminal output:

Number of hidden layers: 2
0.570727
0.570719
0.570772
0.570827
0.570737
0.570785
0.570691
0.570735
0.570691
0.570765
0.570737
0.57077
0.570822
0.570722
0.570809
0.570712
0.570836
0.570773
0.570783
0.570827
Number of hidden layers: 11
0.965095
0.965136
0.965416
0.964702
0.964923
0.96534
0.964738
0.965263
0.965538
0.965113
0.965482
0.965526
0.965461
0.965383
0.965067
0.964979
0.965134
0.965376
0.965572
0.96514
Number of hidden layers: 51
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1

The title and the sample outputs should illustrate the problem; if not, please ask for clarification.

  • Changing the activation function does not remove the problem (it just changes the bounds of the output)
  • Removing the hidden layer does not help
  • The nature of the input does not matter; even all 0s produces the same result
  • Same for all 1s
  •  torch::Tensor fir = torch::rand({10, 5});
     for (int i = 0; i != 15; i++)//this works fine
     	std::cout << torch::mm(fir, torch::rand({5,1})) << std::endl;
    

My project partner and I have spent hours on this issue. Are we missing something, or is this a bug?
Note: you can answer even if you do not know C++; anything helps.

You are initializing all parameters and the inputs with torch::rand, which samples values from a uniform distribution on the interval [0, 1) and will thus yield only positive numbers.
Increasing the layer size will therefore easily saturate the activation and yield these saturated values.
Use e.g. torch::randn to sample values from a Normal distribution and rerun the code.
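
For instance, a rough sketch of the effect (the sizes here are made up just to illustrate the scale of the pre-activations; assumes #include <torch/torch.h> and <iostream> as in the snippets above):

	auto x      = torch::rand({1, 100});           // inputs in [0, 1), all positive
	auto w_rand = torch::rand({100, 1});           // all-positive weights
	auto w_norm = torch::randn({100, 1});          // zero-mean weights
	std::cout << torch::mm(x, w_rand).item<float>() << std::endl; // ~25 on every call -> sigmoid saturates near 1
	std::cout << torch::mm(x, w_norm).item<float>() << std::endl; // zero-mean, varies in sign -> sigmoid output varies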

@ptrblck Thanks for the reply, but that is not my issue (I want numbers bounded to [0, 1)).

My issue is that regardless of the input, Module::forward() gives the same(ish) output.

@ptrblck
Hmm, the variance appears to be much higher; you were right.
Thanks man

Number of hidden layers: 2
0.626091
0.644468
0.631702
0.592448
0.654189
0.61956
0.615086
0.634765
0.616084
0.642732
0.667842
0.637472
0.633968
0.647593
0.674249

Though that does not explain why feeding the NN all zeros does not produce all zeros at the output (shouldn't 0*x = 0, where 0 and x are tensors?).

	for (int i = 0; i != 15; i++)
		std::cout << nn.forward(torch::zeros({output_nodes,input_nodes})).item<float>() << std::endl;
Number of hidden layers: 1
0.534954
0.534954
0.534954
0.534954
0.534954
0.534954
0.534954
0.534954
0.534954
0.534954
0.534954
0.534954
0.534954
0.534954
0.534954

If you are assigning zeros to the bias tensors, then yes. Otherwise the output would be the bias of the first layer and would then act as the output activation.
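
For illustration, a minimal sketch of that case, assuming a layer that does carry a bias (torch::nn::Linear does by default; the Net above registers no bias tensors):

	auto layer = torch::nn::Linear(10, 1);                     // registers weight and bias
	auto out   = torch::sigmoid(layer(torch::zeros({1, 10}))); // zero input -> sigmoid(bias)
	std::cout << out.item<float>() << std::endl;               // same value on every call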

Net(int numIn, int numOut, int numHid, const size_t hid_count=1) {
	first = register_parameter("inputW", torch::randn({numIn, numHid}))/numHid;
	middle = new torch::Tensor[hid_count-1];
	for (int i = 1; i != hid_count; i++)
		middle[i] = register_parameter("hidW"+std::to_string(i), torch::randn({numHid, numHid}))/numHid;
	last = register_parameter("outputW", torch::randn({numHid, numOut}))/numOut;
}

When I am initializing the module (and by extension its underlying parameters/tensors), I do not specify anything about biases. What is the default behavior?

I have been searching the docs the other day and could not find an answer.

You were probably referring to the bias of the sigmoid?

Anyway, I think I need to go back to the drawing board after reading more.

Even though the output of the first layer's matrix multiplication is all zeros, sigmoid(0) is 0.5, so the middle and last layers then work with non-zero values.
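
A quick way to see this (minimal snippet, assuming the same includes as above):

	// mm with an all-zero input gives zeros, but sigmoid(0) = 0.5
	auto z = torch::mm(torch::zeros({1, 10}), torch::rand({10, 4}));
	std::cout << z << std::endl;                 // all zeros
	std::cout << torch::sigmoid(z) << std::endl; // all 0.5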


That’s another good point.

@Kallinteris-Andreas By default, trainable layers (such as nn.Linear) use a bias parameter, as described in the docs.
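
For example, in the C++ frontend the counterpart is torch::nn::Linear, and the bias can be disabled through LinearOptions (a minimal sketch):

	// torch::nn::Linear registers a "weight" and, by default, a "bias" parameter
	auto with_bias    = torch::nn::Linear(10, 5);
	auto without_bias = torch::nn::Linear(torch::nn::LinearOptions(10, 5).bias(false));
	for (const auto& p : with_bias->named_parameters())
		std::cout << p.key() << " " << p.value().sizes() << std::endl; // weight [5, 10], bias [5]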

2 Likes