How to make parameters of PyTorch modules searchable?

When we define a PyTorch module, we have to specify the values of all its parameters.

But now I want to make those parameters searchable. For example, maybe we could define a new class, ValueChoice, and then define a convolution as follows:

out_channels = ValueChoice([16, 24])
kernel_size = ValueChoice([3, 5, 7])
convBN = nn.Sequential(
    nn.Conv2d(3, out_channels, kernel_size),
    nn.BatchNorm2d(out_channels)
)

My question is: can we implement a class like ValueChoice? If so, how should we do it?

I am not sure what you mean by searchable. Could you explain it again or give more examples?

Thanks for your reply.

I haven’t thought too much about the implementation details of ValueChoice. My idea is that ValueChoice receives a list of candidate values and can return a value selected from the candidates. Therefore, we can use it to search for an optimal architecture.

Searchable means that ValueChoice([3, 5, 7]) can return a value of 3, 5, or 7 according to a predefined search strategy. For example, nn.Conv2d(3, 16, ValueChoice([3, 5, 7])) can be decoded as nn.Conv2d(3, 16, 3), nn.Conv2d(3, 16, 5), or nn.Conv2d(3, 16, 7).

I have two questions about the ValueChoice implementation, and I hope you can give me some suggestions:

  1. How to decode ValueChoice into a value?
  2. During the search process, ValueChoice returns a different value at each epoch. My second question is how to change or inherit the weights of modules. For example, at the first epoch ValueChoice returns 3, so we obtain nn.Conv2d(3, 16, 3); but if ValueChoice returns 5 at the second epoch, we obtain nn.Conv2d(3, 16, 5). Can these two convolutional operations share weights?
  1. There are many ways of doing this. Here is one:
import numpy as np
import torch.nn as nn

class ValueChoice():
    def __init__(self, values):
        self.values = values
        # Shuffle the candidates in place, so the iteration order is random.
        np.random.shuffle(self.values)

    def __iter__(self):
        for i in self.values:
            yield i

a = ValueChoice([1, 2, 3])
i = iter(a)
print(nn.Conv2d(next(i), 2, 3))  # e.g. Conv2d(2, 2, kernel_size=(3, 3), stride=(1, 1))
print(nn.Conv2d(next(i), 2, 3))  # e.g. Conv2d(3, 2, kernel_size=(3, 3), stride=(1, 1))
print(nn.Conv2d(next(i), 2, 3))  # e.g. Conv2d(1, 2, kernel_size=(3, 3), stride=(1, 1))
print(nn.Conv2d(next(i), 2, 3))  # raises StopIteration: the candidates are exhausted

However, this will do what you ask, but not necessarily what you want. If you have something like
kernel = ValueChoice([3, 5, 7])
out_ch = ValueChoice([16, 32, 64])

You may want to search all possible combinations of these values, and my solution won’t do that.
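To make that concrete, here is a minimal sketch of a full grid enumeration with itertools.product (my own illustration, not an existing library API):

import itertools
import torch.nn as nn

kernel_sizes = [3, 5, 7]
out_channels = [16, 32, 64]

# Build one candidate module per (kernel_size, out_channels) combination.
for k, c in itertools.product(kernel_sizes, out_channels):
    convBN = nn.Sequential(
        nn.Conv2d(3, c, k),
        nn.BatchNorm2d(c)
    )
    print(convBN)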

Instead of coding something yourself, I would suggest you use a library for hyperparameter tuning, something like Tune: Scalable Hyperparameter Tuning — Ray v2.0.0.dev0.
Reinventing the wheel can be very fun and instructive, but also error-prone and inefficient in the long run :slight_smile:
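As a rough sketch of what this could look like with Tune (the training loop is a placeholder, and the exact API may differ between Tune versions, so check the docs):

from ray import tune
import torch.nn as nn

def train_fn(config):
    # Build the model from the hyperparameters sampled by Tune.
    model = nn.Sequential(
        nn.Conv2d(3, config["out_channels"], config["kernel_size"]),
        nn.BatchNorm2d(config["out_channels"])
    )
    # ... train and evaluate `model` here ...
    tune.report(accuracy=0.0)  # placeholder metric

analysis = tune.run(
    train_fn,
    config={
        "kernel_size": tune.choice([3, 5, 7]),
        "out_channels": tune.choice([16, 32, 64]),
    },
)

Tune samples a value for each key in config, runs train_fn with it, and tracks the reported metric, which is essentially the ValueChoice behaviour you describe.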

  2. You can find some ways of doing that, but not easily, and it probably wouldn’t make much sense. The kernels have different sizes, so it isn’t clear how one would translate into the other.
    If, however, you have a different number of filters, say nn.Conv2d(3, 16, 3) and nn.Conv2d(3, 5, 3), and want to copy the first into the second, you could copy 5 randomly sampled filters from the nn.Conv2d(3, 16, 3) into the nn.Conv2d(3, 5, 3). Again, it seems like quite a weird policy and I don’t personally think it’s a fruitful decision, but you are free to try.
    Also, doing this copy is not going to be easy. Standard torch uses load_state_dict to check that all shapes match, and in your case it will throw an error because, well, they don’t! So you need to write your own copy function: access the parameters through module.named_parameters() and copy them manually, as in the sketch below.
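For illustration, a minimal sketch of the random-filter copy described above, using plain tensor indexing (my own example, not a standard API):

import torch
import torch.nn as nn

src = nn.Conv2d(3, 16, 3)
dst = nn.Conv2d(3, 5, 3)

with torch.no_grad():
    # Sample 5 of the 16 output filters of `src` at random.
    idx = torch.randperm(src.out_channels)[:dst.out_channels]
    # Weight shapes: src is (16, 3, 3, 3), dst is (5, 3, 3, 3).
    dst.weight.copy_(src.weight[idx])
    dst.bias.copy_(src.bias[idx])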