Common base class of Linear, Conv, etc.


(Afshin Oroojlooy) #1

Hi,

Is there a class that torch::nn::Linear, torch::nn::Conv1d, torch::nn::Conv2d, ..., torch::nn::GRU, etc. all inherit from? torch::nn::Module seems like a good option, but there is an intermediate class, torch::nn::Cloneable, so torch::nn::Module alone does not work. Also, torch::nn::Cloneable itself is a template, so it needs a type in the declaration.
I want to create a general model class that holds a std::vector<the common class> layers, so that I can later fill layers with any type of layer I want, e.g., Linear, LSTM, etc. Is there such a capability in the current API? This is easy to do in Python, but here we need declarations, which takes away some of Python's ease.

Thanks,
Afshin


(Will Feng) #2

torch::nn::AnyModule should be the class you are looking for, as a container for any type of layer. Please see https://github.com/pytorch/pytorch/blob/e312801453d1121f929b956c2862dfa58d6b3ac3/torch/csrc/api/include/torch/nn/modules/any.h#L116-L128 for its supported constructors.
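For illustration, here is a minimal sketch (just an assumption of typical usage, with made-up layer sizes) that keeps heterogeneous layers in a std::vector<torch::nn::AnyModule> and runs a tensor through them by hand:

#include <torch/torch.h>

#include <iostream>
#include <vector>

int main() {
    // A vector that can hold different layer types via AnyModule.
    std::vector<torch::nn::AnyModule> layers;
    layers.emplace_back(torch::nn::Linear(8, 16));
    layers.emplace_back(torch::nn::Linear(16, 4));

    // Call each layer's forward by hand.
    auto x = torch::randn({2, 8});
    for (auto& layer : layers) {
        x = layer.forward(x);
    }
    std::cout << x.sizes() << std::endl; // [2, 4]

    return 0;
}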


(Afshin Oroojlooy) #3

Thanks for the reply. I'll check this one too.
Besides, I found that nn::Sequential can also be used for the same purpose, though it does not need a forward implementation, which can be a positive point and at the same time a negative one.

Thanks,
Afshin


(Will Feng) #4

Yes, and indeed nn::Sequential just uses a std::vector<AnyModule> as its underlying module list: https://github.com/pytorch/pytorch/blob/master/torch/csrc/api/include/torch/nn/modules/sequential.h#L326
nn::Sequential should already require each module to have a forward implementation. Is that what you mean?
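For example (a small sketch with made-up layer sizes, not taken from the linked code), you can push modules into a Sequential and call its forward, which chains the modules in insertion order:

#include <torch/torch.h>

#include <iostream>

int main() {
    torch::nn::Sequential net;
    // push_back wraps each module in an AnyModule internally.
    net->push_back(torch::nn::Linear(8, 16));
    net->push_back(torch::nn::Functional(torch::relu));
    net->push_back(torch::nn::Linear(16, 4));

    // forward() calls each module's forward in the order they were added.
    auto y = net->forward(torch::randn({2, 8}));
    std::cout << y.sizes() << std::endl; // [2, 4]

    return 0;
}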


(Afshin Oroojlooy) #5

Yes, nn::Sequential already requires each module to have a forward implementation, and it calls those forward functions in the order in which the modules were added. So one cannot create an ad hoc, unusual forward pass like DenseNet with it, though it is good enough for general usage.
BTW, thanks for the new link; it was interesting to see the actual implementation of nn::Sequential.
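For reference, one way to get such an ad hoc forward pass is a custom torch::nn::Module that registers its layers and wires them up by hand; a rough sketch with hypothetical layer names and sizes:

#include <torch/torch.h>

#include <iostream>

// Hypothetical module with a DenseNet-style skip connection,
// wiring its layers together by hand in forward().
struct SkipNet : torch::nn::Module {
    SkipNet() {
        fc1 = register_module("fc1", torch::nn::Linear(8, 8));
        fc2 = register_module("fc2", torch::nn::Linear(16, 4));
    }

    torch::Tensor forward(torch::Tensor x) {
        auto h = torch::relu(fc1->forward(x));
        // Concatenate the input with fc1's output before the last layer.
        return fc2->forward(torch::cat({x, h}, /*dim=*/1));
    }

    torch::nn::Linear fc1{nullptr}, fc2{nullptr};
};

int main() {
    SkipNet net;
    std::cout << net.forward(torch::randn({2, 8})).sizes() << std::endl; // [2, 4]
    return 0;
}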


(Afshin Oroojlooy) #6

One more question,
If I define std::vector<torch::nn::Linear> linear_layers; and fill this vector with some torch::nn::Linear modules, then I can access the weight and bias values via linear_layers[k]->weight and linear_layers[k]->bias. The same works for other layer types, e.g., torch::nn::Conv2d.
Now, my question is: how can I access the weight and bias values of each layer when I have used nn::Sequential?

Thanks,
Afshin


(Martin Huber) #7

Hi @afshin67,

what do you want to access the parameters for? Here is what you could do:

#include <torch/torch.h>

using namespace torch;
using namespace torch::nn;

int main()
{
	auto net = Sequential(Conv2d(3 /*input channels*/, 16 /*output channels*/, 3 /*kernel size*/),
                              Conv2d(16, 32, 3));

	for (auto& p : net->parameters()) {
	
		NoGradGuard no_grad;

		// I assume you want to do some initialization
		p.normal_(0. /*mean*/, 1. /*standard deviation*/);
	}

	return 0;
}

Does it help?


(Afshin Oroojlooy) #8

Hi @mhubii,

Thanks for the reply. That is really helpful. I assume parameters() returns the parameters of the layers in the order they were added to the Sequential, right?
I need the weight and bias values to obtain some statistics on the weights of the network and how they change over time, something like what TensorBoard provides us in Python.

One more question: can I tell (without checking dim()) whether a tensor p is a bias or a weight of the network? How can I get its name? And how can I obtain the layer type, e.g. Linear, Conv2d, etc.?
I checked named_children() and named_modules() and they do not provide the name. It seems that named_parameters() has the name in p.key(), though I am not sure how to set the value. (I see that p.value() returns the value.)
I appreciate any help or comment.

Thanks,
Afshin


(Martin Huber) #9

hey @afshin67,

glad you asked :slight_smile:. Since torch::nn::Sequential stores its layers in a std::vector of torch::nn::AnyModule and is itself a torch::nn::Module, we have access to all the functions of the torch::nn::Module class.

For example, to access the names and parameters, do

#include <torch/torch.h>

using namespace torch;
using namespace torch::nn;

int main()
{
	auto net = Sequential(Conv2d(1 /*input channels*/, 1 /*output channels*/, 2 /*kernel size*/),
                              Conv2d(1, 1, 2));

	for (auto& p : net->named_parameters()) {
	
		NoGradGuard no_grad;

		// Access name.
		std::cout << p.key() << std::endl;

		// Access weight and bias.
		p.value().zero_(); // set all zero
		std::cout << p.value() << std::endl;
	}

	return 0;
}

The parameters of a Sequential follow the naming convention <layer number>.<weight/bias>; e.g., see the console output

0.weight # name of the layer
(1,1,.,.) = 
  0  0
  0  0
[ Variable[CPUFloatType]{1,1,2,2} ]
0.bias
 0
[ Variable[CPUFloatType]{1} ]
1.weight
(1,1,.,.) = 
  0  0
  0  0
[ Variable[CPUFloatType]{1,1,2,2} ]
1.bias
 0
[ Variable[CPUFloatType]{1} ]
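If you also want to tell weights from biases without checking dim(), or recover the layer type, one option (a sketch along the same lines, using the same toy net as above) is to look at the parameter name suffix and at named_children(), where name() gives the module's type name:

#include <torch/torch.h>

#include <iostream>

int main() {
    auto net = torch::nn::Sequential(torch::nn::Conv2d(1, 1, 2),
                                     torch::nn::Conv2d(1, 1, 2));

    // The suffix of the parameter name tells weight from bias.
    for (auto& p : net->named_parameters()) {
        bool is_bias = p.key().find("bias") != std::string::npos;
        std::cout << p.key() << (is_bias ? " -> bias" : " -> weight") << std::endl;
    }

    // named_children() yields the submodules; name() gives the layer type.
    for (auto& child : net->named_children()) {
        std::cout << child.key() << ": " << child.value()->name() << std::endl;
        // prints e.g. "0: torch::nn::Conv2dImpl"
    }

    return 0;
}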

(Afshin Oroojlooy) #10

Thanks, it works well as you described.