How to make parameters of PyTorch modules searchable?

When we define a PyTorch module, we have to specify the values of all its parameters.

But now I want to make those parameters searchable. For example, maybe we could define a new class, ValueChoice, and then define a convolution as follows:

out_channels = ValueChoice([16, 24])
kernel_size = ValueChoice([3, 5, 7])
convBN = nn.Sequential(
    nn.Conv2d(3, out_channels, kernel_size),
    nn.BatchNorm2d(out_channels)
)

My question is: can we implement a class like ValueChoice? If so, how should we do it?

I am not sure what you mean by searchable. Could you explain it again or give more examples?

Thanks for your reply.

I haven’t thought too much about the implementation details of ValueChoice. My idea is that ValueChoice receives a list of candidate values and can return a value selected from the candidates. Therefore, we can use it to search for an optimal architecture.

Searchable means that ValueChoice([3, 5, 7]) can return a value of 3, 5, or 7 according to a predefined search strategy. For example, nn.Conv2d(3, 16, ValueChoice([3, 5, 7])) can be decoded as nn.Conv2d(3, 16, 3), nn.Conv2d(3, 16, 5), or nn.Conv2d(3, 16, 7).

I have two questions about the ValueChoice implementation, and I hope you can give me some suggestions:

  1. How to decode ValueChoice into a value?
  2. During the search process, ValueChoice returns a different value at each epoch. My second question is how to change or inherit the weights of modules. For example, at the first epoch ValueChoice returns 3, so we obtain nn.Conv2d(3, 16, 3); but if ValueChoice returns 5 at the second epoch, we obtain nn.Conv2d(3, 16, 5). Can these two convolutional operations share weights?
  1. There are many ways of doing this. Here is one:
import numpy as np
import torch.nn as nn

class ValueChoice():
    def __init__(self, values):
        self.values = values
        # Shuffle the candidates in place, so the iteration order is random.
        np.random.shuffle(self.values)

    def __iter__(self):
        for i in self.values:
            yield i

a = ValueChoice([1, 2, 3])
i = iter(a)
print(nn.Conv2d(next(i), 2, 3))  # e.g. Conv2d(2, 2, kernel_size=(3, 3), stride=(1, 1))
print(nn.Conv2d(next(i), 2, 3))  # e.g. Conv2d(3, 2, kernel_size=(3, 3), stride=(1, 1))
print(nn.Conv2d(next(i), 2, 3))  # e.g. Conv2d(1, 2, kernel_size=(3, 3), stride=(1, 1))
print(nn.Conv2d(next(i), 2, 3))  # raises StopIteration: the candidates are exhausted

However, this will do what you ask, but not necessarily what you want. If you have something like
kernel = ValueChoice([3, 5, 7])
out_ch = ValueChoice([16, 32, 64])

You may want to search all possible combinations of these values, and my solution won’t do that.
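To make that concrete, here is a minimal sketch of a full grid enumeration with itertools.product (my own illustration, not an existing library API):

import itertools
import torch.nn as nn

kernel_sizes = [3, 5, 7]
out_channels = [16, 32, 64]

# Build one candidate module per (kernel_size, out_channels) combination.
for k, c in itertools.product(kernel_sizes, out_channels):
    convBN = nn.Sequential(
        nn.Conv2d(3, c, k),
        nn.BatchNorm2d(c)
    )
    print(convBN)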

Instead of coding something yourself, I would suggest you use a library for hyperparameter tuning, something like Tune: Scalable Hyperparameter Tuning — Ray v2.0.0.dev0.
Reinventing the wheel can be very fun and instructive, but also error-prone and inefficient in the long run :slight_smile:
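As a rough sketch of what this could look like with Tune (the training loop is a placeholder, and the exact API may differ between Tune versions, so check the docs):

from ray import tune
import torch.nn as nn

def train_fn(config):
    # Build the model from the hyperparameters sampled by Tune.
    model = nn.Sequential(
        nn.Conv2d(3, config["out_channels"], config["kernel_size"]),
        nn.BatchNorm2d(config["out_channels"])
    )
    # ... train and evaluate `model` here ...
    tune.report(accuracy=0.0)  # placeholder metric

analysis = tune.run(
    train_fn,
    config={
        "kernel_size": tune.choice([3, 5, 7]),
        "out_channels": tune.choice([16, 32, 64]),
    },
)

Tune samples a value for each key in config, runs train_fn with it, and tracks the reported metric, which is essentially the ValueChoice behaviour you describe.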

  2. You can find some ways of doing that, but not easily, and it probably wouldn’t make much sense. The kernels have different sizes, so it isn’t clear how one would translate into the other.
    If, however, you have a different number of filters, say nn.Conv2d(3, 16, 3) and nn.Conv2d(3, 5, 3), and want to copy the first into the second, you could copy 5 randomly sampled filters from the nn.Conv2d(3, 16, 3) into the nn.Conv2d(3, 5, 3). Again, it seems like quite a weird policy and I don’t personally think it’s a fruitful decision, but you are free to try.
    Also, doing this copy is not going to be easy. Standard torch uses load_state_dict to check that all shapes match, and in your case it will throw an error because, well, they don’t! So you need to write your own copy function: access the parameters through module.named_parameters() and copy them manually, as in the sketch below.
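For illustration, a minimal sketch of the random-filter copy described above, using plain tensor indexing (my own example, not a standard API):

import torch
import torch.nn as nn

src = nn.Conv2d(3, 16, 3)
dst = nn.Conv2d(3, 5, 3)

with torch.no_grad():
    # Sample 5 of the 16 output filters of `src` at random.
    idx = torch.randperm(src.out_channels)[:dst.out_channels]
    # Weight shapes: src is (16, 3, 3, 3), dst is (5, 3, 3, 3).
    dst.weight.copy_(src.weight[idx])
    dst.bias.copy_(src.bias[idx])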