Good morning guys.
I am trying to run a bunch of (complicated) code but I keep getting the following weird error:
INFO:root:Random state initialized with seed 28794055
INFO:root:Ani1 will be loaded...
INFO:root:cached statistics was loaded...
Traceback (most recent call last):
File "schnetpack_ani1.py", line 296, in <module>
train(args, model, train_loader, val_loader, device)
File "schnetpack_ani1.py", line 156, in train
File "/home/kim.a.nicoli/Projects/Schnetpack_release/src/schnetpack/train.py", line 175, in train
File "/home/kim.a.nicoli/Projects/Schnetpack_release/src/schnetpack/train.py", line 118, in train
loss = self.loss_fn(train_batch, result)
File "schnetpack_ani1.py", line 149, in loss
diff = batch[args.property] - result
TypeError: sub() received an invalid combination of arguments - got (map), but expected one of:
* (Tensor other, float alpha)
* (float other, float alpha)
What I don't get is that if I insert some print statements at the point where it crashes, the printed types are Tensors, not map objects.
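Just to illustrate what the error message itself means: a lazy map object slipping into the subtraction reproduces the same kind of failure. This is a toy snippet of mine, not the SchNetPack code, and the exact wording of the error depends on the PyTorch version:

import torch

target = torch.randn(4)
prediction = map(float, [0.1, 0.2, 0.3, 0.4])  # a lazy map object, not a Tensor
diff = target - prediction  # raises a TypeError like the one above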
Do any of you have any suggestion on what to look for or what might be the problem? I didn't find much about this kind of issue elsewhere.
Thanks in advance for the help
Could you post some more code showing where you create result?
result comes from here:

from collections import namedtuple

y = self.atom_pool(yi, atom_mask)
result = [y]
props = ['y']
at_func = namedtuple('atomwise', props)
# at_func is used to wrap result before it is returned from forward
As for batch, it is just a tensor of a given batch_size created by the DataLoader.
Anyway, I feel the error might be related to how result is created.
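For more context, a stripped-down, runnable version of what the output module does would look roughly like this (the SchNetPack internals are simplified here: atom_pool is replaced by a masked sum and the shapes are made up):

from collections import namedtuple
import torch

props = ['y']
atomwise = namedtuple('atomwise', props)

def forward_output(yi, atom_mask):
    # simplified stand-in for self.atom_pool(yi, atom_mask)
    y = (yi * atom_mask.unsqueeze(-1)).sum(dim=1)
    result = [y]
    # wrapping the list into the namedtuple is the part under discussion
    return atomwise(*result)

yi = torch.randn(8, 20, 1)      # (batch, atoms, features), made-up shapes
atom_mask = torch.ones(8, 20)
out = forward_output(yi, atom_mask)
print(type(out), type(out.y), out.y.shape)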
Thanks for the update. Could you print out the type of result[0] just before the error is thrown?
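Something along these lines, right before the line that crashes (the placement inside your loss function is hypothetical):

# inside loss(), right before: diff = batch[args.property] - result
print(type(result))                 # namedtuple, list, or something else?
print(type(result[0]))              # Tensor, or a lazy map object?
print(type(batch[args.property]))   # the other operand of the subtraction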
I did it already.
That's actually why I was confused. It doesn't make sense to me that the error is raised, since the printed output shows that result is of type Tensor!
I assume you've already checked the type of the other operand, batch[args.property]?
Could you create a small executable code snippet so that I could run it on my machine?
It seems that there is a conflict with the multi-GPU setup.
If I run the code on a CPU or on a single GPU, it works.
When I try to go parallel on multiple GPUs, it raises the error I reported previously.
I double-checked and my guess was right. It seems the issue is related to the type returned in the code I showed you.
This means that if I simply return result as a list, it works (with DataParallel on multiple GPUs), whereas when I return it as a namedtuple it breaks.
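My suspicion is that this comes from how the per-GPU outputs are recombined after the forward pass. The snippet below is only a simplified, self-contained mimic of that recombination step, not the actual DataParallel source, but it reproduces the difference: with a single-field namedtuple the lazy map object ends up inside the field, while a plain list materialises it.

from collections import namedtuple
import torch

atomwise = namedtuple('atomwise', ['y'])

# pretend these are the outputs of two GPU replicas
outputs = [atomwise(torch.randn(4, 1)), atomwise(torch.randn(4, 1))]

def combine(per_device_tensors):
    # stand-in for gathering the per-device tensors back onto one device
    return torch.cat(per_device_tensors, dim=0)

# sequence outputs get rebuilt as type(out)(map(...))
rebuilt = type(outputs[0])(map(combine, zip(*outputs)))
print(type(rebuilt))     # atomwise
print(type(rebuilt.y))   # <class 'map'> -> this is what later breaks the subtraction

# with a plain list the map is materialised, so the entry stays a Tensor
list_outputs = [[o.y] for o in outputs]
rebuilt_list = type(list_outputs[0])(map(combine, zip(*list_outputs)))
print(type(rebuilt_list[0]))  # <class 'torch.Tensor'>

That would also explain why my prints showed a Tensor on a single GPU: without the gather step the namedtuple is returned as-is and its field is never rebuilt.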
Nonetheless, returning a plain list is not that nice; I would like to keep returning the namedtuple object. Any clue on how I could overcome the issue?
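One possible direction, just a sketch of a workaround rather than a tested fix: let forward keep returning the plain list so that DataParallel can gather it correctly, and rebuild the namedtuple outside the parallel model, right after the model call. ToyModel below is a made-up stand-in, not the SchNetPack model:

from collections import namedtuple
import torch
import torch.nn as nn

props = ['y']
atomwise = namedtuple('atomwise', props)

class ToyModel(nn.Module):
    # stand-in for the real model; forward keeps returning a plain list
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(3, 1)

    def forward(self, x):
        return [self.linear(x)]        # a list survives the multi-GPU gather

model = ToyModel()
x = torch.randn(8, 3)
if torch.cuda.device_count() > 1:      # wrap only when several GPUs are available
    model = nn.DataParallel(model.cuda())
    x = x.cuda()

raw = model(x)                         # list of gathered tensors
result = atomwise(*raw)                # rebuild the namedtuple outside DataParallel
print(type(result), type(result.y))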