Why? RuntimeError: expected scalar type Double but found Float

tl;dr: I am very new to PyTorch and this is really important for me. I know it is a very long question, but I am only providing the whole code to contextualize everything. The problem itself is probably very straightforward to solve and to understand, but I tried in another forum and read some references and still couldn't solve it. I am more used to TensorFlow, but I am learning PyTorch, I need this code to run, and I have already tried a lot of things. Probably anyone who works with torch can solve this in a few minutes. Any help will be highly appreciated. Thanks in advance.

The whole thing:

When I run the following code:

import torch
import torch.nn as nn
from torch.nn import init
import torch.optim as optim
from tqdm import tqdm

# DensityMatrix, Expectation, convert_to_complex_ops, generator_loss and
# discriminator_loss are custom helpers defined elsewhere in the notebook.

class Generator(nn.Module):
    
    def __init__(self, hilbert_size, num_points, noise=None):
        super(Generator, self).__init__()
    
        self.initializer = nn.init.normal_

        ops = nn.Parameter(torch.empty(1, hilbert_size, hilbert_size, num_points * 2))
        inputs = torch.zeros((1, num_points), requires_grad=True)
        inputs = torch.nn.init.uniform_(inputs, a=0.0, b=1.0)

        layer = nn.Linear(num_points, 16 * 16 * 2, bias=False)
        init.normal_(layer.weight, mean=0.0, std=0.02)

        self.x = nn.Sequential(
            layer,
            nn.LeakyReLU(),
            nn.Unflatten(1, (2,16,16))
            )

        self.conv_transpose_1 = nn.Sequential(
            nn.ConvTranspose2d(2, 64, kernel_size=4, stride=1, padding=1, bias=False),
            nn.InstanceNorm2d(64),
            nn.LeakyReLU(),
        )

        self.conv_transpose_2 = nn.Sequential(
            nn.ConvTranspose2d(64, 64, kernel_size=4, stride=1, padding=2, bias=False),
            nn.InstanceNorm2d(64),
            nn.LeakyReLU(),
        )

        self.conv_transpose_3 = nn.Sequential(
            nn.ConvTranspose2d(64, 32, kernel_size=4, stride=1, padding=1, bias=False),
        )

        self.conv_transpose_4 = nn.Sequential(
            nn.ConvTranspose2d(32, 2, kernel_size=4, stride=1, padding=2, bias=False),
        )

        self.density_matrix = DensityMatrix()
        self.expectation = Expectation()
    
    def forward(self, ops, inputs):
        x = self.x(inputs)
        x = self.conv_transpose_1(x)
        x = self.conv_transpose_2(x)
        x = self.conv_transpose_3(x)
        x = self.conv_transpose_4(x)
        x = self.density_matrix(x)
        complex_ops = convert_to_complex_ops(ops)
        prefactor = 1.0
        x = self.expectation(complex_ops, x, prefactor)
    
        return x

class Discriminator(nn.Module):
    def __init__(self, hilbert_size, num_points):
        super(Discriminator, self).__init__()

        initializer = nn.init.normal_

        self.inp = nn.Identity(num_points)
        self.tar = nn.Identity(num_points)
        self.ops = nn.Identity(hilbert_size, hilbert_size, num_points*2)

        self.fc1 = nn.Linear(num_points*2, 128)
        self.lrelu1 = nn.LeakyReLU()

        self.fc2 = nn.Linear(128, 128)
        self.lrelu2 = nn.LeakyReLU()

        self.fc3 = nn.Linear(128, 64)
        self.relu3 = nn.ReLU()
        self.fc4 = nn.Linear(64, 1)

        initializer(self.fc1.weight, mean=0.0, std=0.002)
        initializer(self.fc2.weight, mean=0.0, std=0.002)
        initializer(self.fc3.weight, mean=0.0, std=0.002)
        initializer(self.fc4.weight, mean=0.0, std=0.002)

    def forward(self, ops, inp, tar):
        x = torch.cat([inp, tar], dim=1)
        x = self.fc1(x)
        x = self.lrelu1(x)

        x = self.fc2(x)
        x = self.lrelu2(x)

        x = self.fc3(x)
        x = self.relu3(x)
        x = self.fc4(x)

        return x

def train_step(A, x):
    gen_output = generator(A, x)

    disc_real_output = discriminator([A, x, x])
    disc_generated_output = discriminator([A, x, gen_output])

    gen_total_loss, gen_gan_loss, gen_l1_loss = generator_loss(
        disc_generated_output, gen_output, x, lam=lam
    )
    disc_loss = discriminator_loss(disc_real_output, disc_generated_output)

    generator.zero_grad()
    discriminator.zero_grad()

    gen_total_loss.backward(retain_graph=True)
    disc_loss.backward()

    generator_optimizer.step()
    discriminator_optimizer.step()

And:

generator = Generator(hilbert_size, num_measurements, noise=0.)
discriminator = Discriminator(hilbert_size, num_measurements)

density_layer_idx = None

for i, (name, layer) in enumerate(generator.named_modules()):
    if "density_matrix" in name:
        density_layer_idx = i
        break

model_dm = nn.Sequential(*list(generator.children())[:density_layer_idx + 1])

initial_learning_rate = 0.0002
decay_steps = 10000
decay_rate = 0.96
lam = 100.0

generator_optimizer = optim.Adam(generator.parameters(), lr=initial_learning_rate, betas=(0.5, 0.5))
discriminator_optimizer = optim.Adam(discriminator.parameters(), lr=initial_learning_rate, betas=(0.5,0.5))

lr_scheduler_G = optim.lr_scheduler.StepLR(generator_optimizer, step_size=decay_steps, gamma=decay_rate)
lr_scheduler_D = optim.lr_scheduler.StepLR(discriminator_optimizer, step_size=decay_steps, gamma=decay_rate)

max_iterations = 1000

pbar = tqdm(range(max_iterations))
for i in pbar:
    train_step(A, x)
    density_matrix = model_dm([A, x])
    f = tf_fidelity(density_matrix, rho_tf)[-1]
    fidelities.append(f)
    pbar.set_description("Fidelity {} | Gen loss {} | L1 loss {} | Disc loss {}".format(f, loss.generator[-1], loss.l1[-1], loss.discriminator[-1]))

I’m giving this bunch of code just to contextualize the whole thing. The problems are here:

  • train_step(A,x)

  • model_dm([A, x])

The first raises the following error:

In [81]: train_step(A,x)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[81], line 1
----> 1 train_step(A,x)

Cell In[10], line 313, in train_step(A, x)
  266 def train_step(A, x):
267     """Takes one step of training for the full A matrix representing the
268     measurement operators and data x.
269 
   (...)
311     >> density_matrix = model_dm([A, x])    
312     """
--> 313     gen_output = generator([A, x])
    315     disc_real_output = discriminator([A, x, x])
    316     disc_generated_output = discriminator([A, x, gen_output])

File ~/.virtualenvs/cgan/lib/python3.10/site-packages/torch/nn/modules/module.py:1194, in Module._call_impl(self, *input, **kwargs)
   1190 # If we don't have any hooks, we want to skip the rest of the logic in
   1191 # this function, and just call forward.
   1192 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1193         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1194     return forward_call(*input, **kwargs)
   1195 # Do not call functions when jit is used
   1196 full_backward_hooks, non_full_backward_hooks = [], []

TypeError: Generator.forward() missing 1 required positional argument: 'inputs'

I really do not understand why: isn't inputs defined inside the Generator class as an input to the forward function?

The second raises the following error:

In [82]: model_dm([A, x])
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[82], line 1
----> 1 model_dm([A, x])

File ~/.virtualenvs/cgan/lib/python3.10/site-packages/torch/nn/modules/module.py:1194, in Module._call_impl(self, *input, **kwargs)
   1190 # If we don't have any hooks, we want to skip the rest of the logic in
   1191 # this function, and just call forward.
   1192 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1193         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1194     return forward_call(*input, **kwargs)
   1195 # Do not call functions when jit is used
   1196 full_backward_hooks, non_full_backward_hooks = [], []

File ~/.virtualenvs/cgan/lib/python3.10/site-packages/torch/nn/modules/container.py:204, in Sequential.forward(self, input)
    202 def forward(self, input):
    203     for module in self:
--> 204         input = module(input)
    205     return input

File ~/.virtualenvs/cgan/lib/python3.10/site-packages/torch/nn/modules/module.py:1194, in Module._call_impl(self, *input, **kwargs)
   1190 # If we don't have any hooks, we want to skip the rest of the logic in
   1191 # this function, and just call forward.
   1192 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1193         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1194     return forward_call(*input, **kwargs)
   1195 # Do not call functions when jit is used
   1196 full_backward_hooks, non_full_backward_hooks = [], []

File ~/.virtualenvs/cgan/lib/python3.10/site-packages/torch/nn/modules/container.py:204, in Sequential.forward(self, input)
    202 def forward(self, input):
    203     for module in self:
--> 204         input = module(input)
    205     return input

File ~/.virtualenvs/cgan/lib/python3.10/site-packages/torch/nn/modules/module.py:1194, in Module._call_impl(self, *input, **kwargs)
   1190 # If we don't have any hooks, we want to skip the rest of the logic in
   1191 # this function, and just call forward.
   1192 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1193         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1194     return forward_call(*input, **kwargs)
   1195 # Do not call functions when jit is used
   1196 full_backward_hooks, non_full_backward_hooks = [], []

File ~/.virtualenvs/cgan/lib/python3.10/site-packages/torch/nn/modules/linear.py:114, in Linear.forward(self, input)
    113 def forward(self, input: Tensor) -> Tensor:
--> 114     return F.linear(input, self.weight, self.bias)

TypeError: linear(): argument 'input' (position 1) must be Tensor, not list

I also can't understand this one. What I want is to access the output of the specific density_matrix layer.

If I replace gen_output = generator([A, x]) in the train_step function with gen_output = generator(A, x), then a different error is raised:

In [90]:     gen_output = generator(A, x)
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[90], line 1
----> 1 gen_output = generator(A, x)

File ~/.virtualenvs/cgan/lib/python3.10/site-packages/torch/nn/modules/module.py:1194, in Module._call_impl(self, *input, **kwargs)
   1190 # If we don't have any hooks, we want to skip the rest of the logic in
   1191 # this function, and just call forward.
   1192 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1193         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1194     return forward_call(*input, **kwargs)
   1195 # Do not call functions when jit is used
   1196 full_backward_hooks, non_full_backward_hooks = [], []

Cell In[73], line 88, in Generator.forward(self, ops, inputs)
     87 def forward(self, ops, inputs):
---> 88     x = self.x(inputs)
     89     x = self.conv_transpose_1(x)
     90     x = self.conv_transpose_2(x)

File ~/.virtualenvs/cgan/lib/python3.10/site-packages/torch/nn/modules/module.py:1194, in Module._call_impl(self, *input, **kwargs)
   1190 # If we don't have any hooks, we want to skip the rest of the logic in
   1191 # this function, and just call forward.
   1192 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1193         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1194     return forward_call(*input, **kwargs)
   1195 # Do not call functions when jit is used
   1196 full_backward_hooks, non_full_backward_hooks = [], []

File ~/.virtualenvs/cgan/lib/python3.10/site-packages/torch/nn/modules/container.py:204, in Sequential.forward(self, input)
    202 def forward(self, input):
    203     for module in self:
--> 204         input = module(input)
    205     return input

File ~/.virtualenvs/cgan/lib/python3.10/site-packages/torch/nn/modules/module.py:1194, in Module._call_impl(self, *input, **kwargs)
   1190 # If we don't have any hooks, we want to skip the rest of the logic in
   1191 # this function, and just call forward.
   1192 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1193         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1194     return forward_call(*input, **kwargs)
   1195 # Do not call functions when jit is used
   1196 full_backward_hooks, non_full_backward_hooks = [], []

File ~/.virtualenvs/cgan/lib/python3.10/site-packages/torch/nn/modules/linear.py:114, in Linear.forward(self, input)
    113 def forward(self, input: Tensor) -> Tensor:
--> 114     return F.linear(input, self.weight, self.bias)

RuntimeError: expected scalar type Double but found Float

The Generator.forward method is defined as:

def forward(self, ops, inputs):

so it expects two input arguments, while you are passing a single list to it:

gen_output = generator([A, x])

so you might want to pass both objects separately to this method.
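For example (a minimal sketch, assuming A and x are already tensors with the expected shapes):

```python
# Pass the operators and the data as two separate tensor arguments, not as a list:
gen_output = generator(A, x)

# Discriminator.forward is defined as (ops, inp, tar), so the same applies there:
disc_real_output = discriminator(A, x, x)
disc_generated_output = discriminator(A, x, gen_output)
```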

Related to the first issue also: you are again passing a list to the forward method:

model_dm([A, x])

and this list is eventually handed to the linear layer, which expects a tensor instead.
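A tiny illustration of why the list fails (a sketch with made-up shapes):

```python
import torch
import torch.nn as nn

seq = nn.Sequential(nn.Linear(4, 2))
x = torch.randn(1, 4)

out = seq(x)    # works: the single tensor is forwarded to nn.Linear
# seq([x, x])   # fails: the list reaches nn.Linear unchanged -> TypeError
```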

I also don’t know where the dtype mismatch mentioned in the title is raised.

Thanks for the answer.

The part about the generator expecting two input arguments:

def forward(self, ops, inputs):

is clear, but I think I am misusing it: it is the .forward method which expects the inputs, yet the inputs were already defined a few lines above as:

        inputs = torch.zeros((1, num_points), requires_grad=True)
        inputs = torch.nn.init.uniform_(inputs, a=0.0, b=1.0)

Is it missing something?

I don’t know if it is necessary, but the original code was in tensorflow:

def Generator(hilbert_size, num_points, noise=None):
    initializer = tf.random_normal_initializer(0.0, 0.02)

    ops = tf.keras.layers.Input(
        shape=[hilbert_size, hilbert_size, num_points * 2], name="operators"
    )
    inputs = tf.keras.Input(shape=(num_points), name="inputs")

    x = tf.keras.layers.Dense(
        16 * 16 * 2,
        use_bias=False,
        kernel_initializer=tf.random_normal_initializer(0.0, 0.02),
    )(inputs)
    x = tf.keras.layers.LeakyReLU()(x)
    x = tf.keras.layers.Reshape((16, 16, 2))(x)

    x = tf.keras.layers.Conv2DTranspose(
        64, 4, use_bias=False, strides=1, padding="same", kernel_initializer=initializer
    )(x)
    x = tfa.layers.InstanceNormalization(axis=3)(x)
    x = tf.keras.layers.LeakyReLU()(x)
    x = tf.keras.layers.Conv2DTranspose(
        64, 4, use_bias=False, strides=1, padding="same", kernel_initializer=initializer
    )(x)
    x = tfa.layers.InstanceNormalization(axis=3)(x)
    x = tf.keras.layers.LeakyReLU()(x)
    x = tf.keras.layers.Conv2DTranspose(
        32, 4, use_bias=False, strides=1, padding="same", kernel_initializer=initializer
    )(x)

    x = tf.keras.layers.Conv2DTranspose(
        2, 4, use_bias=False, strides=1, padding="same", kernel_initializer=initializer
    )(x)
    x = DensityMatrix()(x)
    complex_ops = convert_to_complex_ops(ops)
    # prefactor = (0.25*g**2/np.pi)
    prefactor = 1.0
    x = Expectation()(complex_ops, x, prefactor)
    x = tf.keras.layers.GaussianNoise(noise)(x)

    return tf.keras.Model(inputs=[ops, inputs], outputs=x)

And for the model_dm: how can I pass it as a tensor, since that is the expected type? What I want is to get the output of the desired layer and save it to model_dm, but I don't know how to extract this information. I wrote:

model_dm = nn.Sequential(*list(generator.children())[:density_layer_idx + 1])

and I thought it would be saved as a model and not as a list.

As for the dtype problem mentioned in the title, I edited the original question.

You are creating inputs in the __init__ method and are deleting it after leaving the method.
The forward method expects two input arguments and won’t try to use a tensor defined in another method as an input.
Here is a small example:

def fun(arg1, arg2):
    pass

# works
fun(arg1=1, arg2=2)

# fails
fun(arg1=1)
# TypeError: fun() missing 1 required positional argument: 'arg2'

# still fails
arg2 = 2
fun(arg1=1)
# TypeError: fun() missing 1 required positional argument: 'arg2'

# also fails if arguments are passed as a list
fun([1, 2])
# TypeError: fun() missing 1 required positional argument: 'arg2'

Rewrapping modules into an nn.Sequential container assumes all child modules are returned in the same order they are executed and also assumes a strict sequential execution path.
Additionally, only single inputs are supported, so you would need to make sure multiple inputs are accepted as lists or tuples.
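If the goal is only to read out what the density_matrix layer produces, a forward hook avoids the rewrapping altogether. A minimal sketch (assuming the Generator above and that A and x are float tensors):

```python
captured = {}

def save_density_matrix(module, inputs, output):
    # store the density_matrix output of the latest forward pass
    captured["density_matrix"] = output

handle = generator.density_matrix.register_forward_hook(save_density_matrix)

gen_output = generator(A, x)            # normal forward pass
density_matrix = captured["density_matrix"]

handle.remove()                          # detach the hook once it is no longer needed
```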

The dtype mismatch seems to be raised because x is initialized as a DoubleTensor, so transforming it to float32 might work: x = x.to(torch.float32).
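For example (a sketch, assuming A and x were created from NumPy arrays and therefore default to float64):

```python
# nn.Linear parameters are float32 by default, so the inputs have to match:
A = A.to(torch.float32)
x = x.to(torch.float32)

# Alternatively, keep the data in double precision and cast the model instead:
# generator = generator.double()
```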

Again, thanks for the answer and for your time.

Regarding the first point:

"You are creating inputs in the __init__ method and are deleting it after leaving the method. The forward method expects two input arguments and won’t try to use a tensor defined in another method as an input."

I didn't know it was going to be deleted. Would assigning it as self.inputs and using that as an argument to the forward function fix the problem? Those inputs are just initializer inputs to the network. In the original TensorFlow code it was:

inputs = tf.keras.Input(shape=(num_points), name="inputs")

x = tf.keras.layers.Dense(
    16 * 16 * 2,
    use_bias=False,
    kernel_initializer=tf.random_normal_initializer(0.0, 0.02),
)(inputs)

I tried to translate it to pytorch but I only created problems. As I said, I am learning torch so the structure is not exactly clear to me yet.

I don't know if I need the inputs in torch just like I need them in tf. I'm afraid that if I use these inputs in the forward function they will be rewritten every time, and what I want is to update them when I train the network. It is easy and clear in TensorFlow but not as easy and clear in torch, not yet.

As for the small example you gave, I understand it. My problem was that since the inputs were defined before the forward, I thought they would be used, but I understand now.

I didn't get this part of your answer: "Rewrapping modules into an nn.Sequential container assumes all child modules are returned in the same order they are executed and also assumes a strict sequential execution path. Additionally, only single inputs are supported, so you would need to make sure multiple inputs are accepted as lists or tuples."

Again, I will sample a code from tensorflow:

model_dm = tf.keras.Model(inputs=generator.input, outputs=generator.layers[density_layer_idx].output)

It gives me a model and it saves the output from the layer I want, in this case density_layer_idx, which is the 17th, or the DensityMatrix layer. What I want in torch is to access this layer and get its output, which is why I naively wrote this piece of code:

model_dm = nn.Sequential(*list(generator.children())[:density_layer_idx + 1])

but it doesn’t work as I thought it would.

I have tried, and I will keep trying for the rest of this week to fix all of this and get this network to work properly. The TensorFlow code works fine, but this translation to torch is important to me; that is why I am insisting. After some days I feel like I am getting nowhere, so I need an extra push.

Assigning inputs to an attribute via self.inputs will allow you to use it in the forward method, but it seems you want to use this tensor to initialize module parameters, which won't work with this approach.
Instead, create the module in the __init__ method, e.g. via:

```python
self.fc1 = nn.Linear(10, 10)
```

and initialize it via methods from `torch.nn.init`:

```python
torch.nn.init.normal_(self.fc1.weight)
```

either directly inside the __init__ method or outside of the model.

PyTorch doesn’t use placeholder objects but allows you to pass the tensor containing the actual data into the modules directly. I think it’s debatable which approach is easier and cleaner :wink:
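A rough sketch of that pattern with hypothetical sizes, just to show the structure: the layer is created and initialized once in __init__, and the actual data tensor is handed over at call time:

```python
import torch
import torch.nn as nn

class TinyGenerator(nn.Module):
    def __init__(self, num_points):
        super().__init__()
        # replaces the Keras Input placeholder + Dense pair
        self.fc = nn.Linear(num_points, 16 * 16 * 2, bias=False)
        nn.init.normal_(self.fc.weight, mean=0.0, std=0.02)

    def forward(self, inputs):
        # inputs is the real data tensor, passed in by the caller
        return self.fc(inputs)

model = TinyGenerator(num_points=32)
x = torch.rand(1, 32)   # the data that tf.keras.Input only described is created directly here
out = model(x)
```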

This tutorial explains the concepts in more detail.
