Hi! I am trying to draw samples from a weight tensor and then use them in some further computation. This code is part of a very large project, and it occasionally crashes after running for millions of iterations.
To debug it, I tried printing sample_op and the weight tensor. However, the statement “print(sample_op)” itself raises an error, so it seems I cannot even print the sample_op tensor. I would appreciate any help. Thanks for your attention!
The code is like this:
# sample_weight and op_weight are both tensors
sample_op = torch.multinomial(sample_weight, 2, replacement=False)
try:
    probs_slice = F.softmax(torch.stack([
        op_weight.data[i, idx] for idx in sample_op]),
        dim=0)
except RuntimeError:
    print(sample_op)
    print(op_weight)
    exit(0)
The error log is like this:
  File "/cache/user-job-dir/nas-branch/models/model_search.py", line 109, in binarize
    print(sample_op)
  File "/home/work/anaconda/lib/python3.6/site-packages/torch/tensor.py", line 114, in __repr__
    return torch._tensor_str._str(self)
  File "/home/work/anaconda/lib/python3.6/site-packages/torch/_tensor_str.py", line 311, in _str
    tensor_str = _tensor_str(self, indent)
  File "/home/work/anaconda/lib/python3.6/site-packages/torch/_tensor_str.py", line 209, in _tensor_str
    formatter = _Formatter(get_summarized_data(self) if summarize else self)
  File "/home/work/anaconda/lib/python3.6/site-packages/torch/_tensor_str.py", line 83, in __init__
    value_str = '{}'.format(value)
  File "/home/work/anaconda/lib/python3.6/site-packages/torch/tensor.py", line 361, in __format__
    return self.item().__format__(format_spec)
RuntimeError: CUDA error: device-side assert triggered
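
A note on why the print itself fails, plus a possible way to check the inputs. Once a device-side assert has fired, later CUDA calls in the same process report the same error, including the .item() call that print uses, so the tensors can no longer be read inside the except block. CUDA kernels also run asynchronously, so the line that raises is not necessarily the line that caused the problem. The sketch below is only a guess at one likely cause: it assumes sample_weight is 1-D and sometimes contains NaN, inf, negative values, or too few positive entries, which torch.multinomial does not handle reliably. The helper name check_multinomial_weights is hypothetical, and it works on a plain Python list so the check itself never touches the GPU.

```python
import math


def check_multinomial_weights(values, num_samples):
    """Return a list of reasons why sampling without replacement could fail.

    `values` is a flat sequence of floats (assumed: the 1-D sample_weight
    copied to the CPU); an empty result means no problem was found.
    """
    problems = []
    if any(not math.isfinite(v) for v in values):
        problems.append("non-finite value (NaN or inf)")
    if any(v < 0 for v in values):
        problems.append("negative value")
    # Sampling without replacement needs enough entries with nonzero weight.
    if sum(1 for v in values if v > 0) < num_samples:
        problems.append("fewer positive entries than num_samples")
    return problems


# Intended use with the tensors from the post (not executed here), placed
# BEFORE the torch.multinomial call so the copy to the CPU still succeeds:
#   problems = check_multinomial_weights(sample_weight.detach().cpu().tolist(), 2)
#   if problems:
#       print(problems, sample_weight.detach().cpu())
```

If this check never reports anything, the cause is probably elsewhere, for example an out-of-range index i. Running the job with the environment variable CUDA_LAUNCH_BLOCKING=1 makes kernel launches synchronous, so the traceback should point at the operation that actually triggered the assert, at the cost of slower execution.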