Sending arguments to Model(nn.Module)

Hi, I’m trying to iterate through different combinations of bit widths for my QuantLinear and QuantReLU layers. My current approach looks something like this:

#Define Model Class
class Model(nn.Module):
    def __init__(self, *args, **kwargs):
        super(Model, self).__init__(*args, **kwargs)
        self.layer1 = nn.Sequential(
            QuantLinear(40, 256, bias=True, weight_bit_width=ql_bits),
        )

    def forward(self, x):
        out = self.layer1(x)
        return out

#Creating each model within a nested loop in my training function
    max_ql_bits = 8
    max_relu_bits = 8

    # Cycle through different model quantisations
    for ql_bits in range(max_ql_bits): 
        for relu_bits in range(max_relu_bits):

            model = Model(ql_bits,relu_bits)

However, when I run this I get the error: TypeError: __init__() takes 1 positional argument but 3 were given

Is it possible to send parameters in this way to an inherited class?

You can send parameters to an inherited class. The problem is that you are passing those parameters on to the parent class, whose __init__ in fact receives none (self only).

In short:

class Model(nn.Module):
    def __init__(self, *args, **kwargs):
        super(Model, self).__init__()
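
The same behaviour can be reproduced without PyTorch, which may make the cause easier to see. This is only a minimal sketch: `Base` is a hypothetical stand-in for a parent class whose `__init__` accepts nothing but `self`, and `BrokenModel`, `try_broken` and the bit-width range are likewise illustrative names and values, not part of the original code.

```python
import itertools


class Base:
    """Hypothetical stand-in for a parent whose __init__ accepts only self."""

    def __init__(self):
        self.initialised = True


class BrokenModel(Base):
    def __init__(self, *args, **kwargs):
        # Forwards every argument to Base.__init__, which rejects them.
        super().__init__(*args, **kwargs)


class Model(Base):
    def __init__(self, ql_bits, relu_bits):
        # Initialise the parent with no extra arguments, then keep our own.
        super().__init__()
        self.ql_bits = ql_bits
        self.relu_bits = relu_bits


def try_broken():
    """Return the message of the TypeError raised by the broken constructor."""
    try:
        BrokenModel(4, 4)
    except TypeError as exc:
        return str(exc)
    return None


# Assumption: bit widths start at 1, since a width of 0 is unlikely to be meaningful.
models = [Model(q, r) for q, r in itertools.product(range(1, 9), repeat=2)]
```

Consuming the arguments in the subclass and calling the parent's `__init__` with none is the whole fix; `itertools.product` is just one way to replace the nested loop.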

Thanks Juan! I just got it working by changing the code to:

# Define Model Class
class Model(nn.Module):
    def __init__(self, ql_bits, relu_bits):
        # Initialise nn.Module first, then store the bit widths
        super(Model, self).__init__()
        self.ql_bits = ql_bits
        self.relu_bits = relu_bits

And creating the model like this:
model = Model(ql_bits=ql_bits,relu_bits=relu_bits)

This worked when I tested creating the model on its own. However, when I try to use it with the rest of my training code I get:

model = Model(ql_bits= ql_bits, relu_bits=relu_bits)
SyntaxError: invalid syntax

I’m also getting errors related to multiprocessing:

ERROR:torch.distributed.elastic.multiprocessing.api:failed (exit code: 1)


The strange thing is that creating the model originally worked with my full training code for the 1st process and failed for the 2nd, but then it stopped working altogether. To call my train function I use:
mp.spawn(train, nprocs=args.gpus, args=(args,))

Any ideas on what’s going wrong? Thanks

Hmm, in general, combining multiprocessing and CUDA leads to lots of problems.
Are you using PyTorch’s multiprocessing library or Python’s own?

Anyway, I have never used multiprocessing with PyTorch, so it would be better to open a new post.
In particular, explain whether you are using quantization anywhere, as it may be important.
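
On the SyntaxError: the line it points at is valid Python by itself, so one thing worth checking (a guess, not something confirmed in this thread) is whether a bracket was left open on an earlier line. Depending on the Python version, the error may be reported at a later, innocent-looking line rather than where the bracket was opened. The sketch below only compiles two source strings without running them; `Sequential`, `Linear` and `syntax_error_of` are placeholder names used for illustration.

```python
# Source text with the closing parenthesis of Sequential(...) left out,
# followed by a line that is valid on its own.
broken_source = (
    "layer = Sequential(\n"
    "    Linear(40, 256),\n"
    "\n"
    "model = Model(ql_bits=4, relu_bits=4)\n"
)

# The same text with the missing ")" restored.
fixed_source = broken_source.replace(
    "Linear(40, 256),\n", "Linear(40, 256),\n)\n"
)


def syntax_error_of(source):
    """Compile source without executing it; return the SyntaxError or None."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as exc:
        return exc
    return None


broken_error = syntax_error_of(broken_source)
fixed_error = syntax_error_of(fixed_source)
```

If `broken_error` is a SyntaxError and `fixed_error` is None, the only difference between the two versions is the closing bracket, which is the pattern to look for just above the failing line in the training script.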