Hello,
I'm trying to implement neural network pruning strategies. The code worked fine until I converted the pruning part from NumPy to PyTorch to leverage GPU speed.
Here is the function that does the sparsification:
def magnitude_pruning(A, drop=0.2):
    # clamp negative drop percentages to zero
    if drop < 0:
        drop = 0
    shape_a = A.shape
    n_elem = A.numel()
    A = A.view(-1)
    n_drop = int(n_elem * drop)
    # indices of the n_drop smallest entries
    drop_idxs = torch.topk(A, n_drop, largest=False, sorted=False)[-1]
    # keep the mask on the same device as A
    mask = torch.ones(n_elem, device=A.device)
    mask[drop_idxs] = 0
    A = A * mask
    A = A.view(shape_a)
    return A, mask.view(shape_a)
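For reference, the core topk-and-mask logic behaves as expected on CPU with a toy tensor (the values here are just an illustration):

```python
import torch

A = torch.tensor([3.0, -1.0, 0.5, 2.0, -4.0, 0.1])
n_drop = int(A.numel() * 0.5)  # 3
# indices of the 3 smallest entries: -4.0, -1.0, 0.1
idxs = torch.topk(A, n_drop, largest=False, sorted=False)[-1]
mask = torch.ones(A.numel())
mask[idxs] = 0
print(mask)  # tensor([1., 0., 1., 1., 0., 0.])
```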
And here is a snippet where I use the function:

os.environ["CUDA_LAUNCH_BLOCKING"] = "1"
for key, p_drop in p.items():
    attributepath = key.split(".")
    cur = models["resnet18_srn"]
    for attr in attributepath[:-1]:
        # numeric path components index into Sequential containers
        if attr.isdecimal():
            cur = cur[int(attr)]
        else:
            cur = getattr(cur, attr)
    W_ = magnitude_pruning(cur.weight.detach(), p_drop)[0]
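For what it's worth, the dotted-path traversal itself works on a toy module (the Sequential below is just an illustration, not my actual model):

```python
import torch.nn as nn

# hypothetical toy model to exercise the dotted-path lookup
model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU())
path = "0.weight".split(".")  # analogous to "layer1.0.conv1.weight"
cur = model
for attr in path[:-1]:
    # numeric components index into the Sequential; others are attributes
    cur = cur[int(attr)] if attr.isdecimal() else getattr(cur, attr)
print(cur.weight.shape)  # torch.Size([8, 3, 3, 3])
```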
I get the following error:
RuntimeError Traceback (most recent call last)
<ipython-input-13-899364821043> in <module>()
9 else:
10 cur = getattr(cur, attr)
---> 11 W_ = magnitude_pruning(cur.weight.detach(),value)[0]
<ipython-input-8-4d745d9774f3> in magnitude_pruning(A, drop)
7 A = A.view(-1)
8 n_drop = int(n_elem * drop)
----> 9 drop_idxs = torch.topk(A, n_drop, largest=False, sorted=False)[-1]
10 mask = torch.ones(n_elem, device=device)
11 mask[drop_idxs] = 0
RuntimeError: cuda runtime error (710) : device-side assert triggered at /pytorch/aten/src/THC/generic/THCTensorTopK.cu:188
I'm running on Colab, and I used both

os.environ["CUDA_LAUNCH_BLOCKING"] = "1"
!export CUDA_LAUNCH_BLOCKING=1

to ensure the CUDA error is reported at the spot where it actually happens.
The values of the p dictionary (which holds the per-layer drop percentages) are as follows:
{'conv1.weight': 0.5248447873548604,
'last_linear.0.weight': 0.004661282702045355,
'layer1.0.conv1.weight': 0.37808877777400185,
'layer1.0.conv2.weight': 0.06504905686978213,
'layer1.1.conv1.weight': 0.34116854079977843,
'layer1.1.conv2.weight': -7.333395402042697e-08,
'layer2.0.conv1.weight': 0.07128420402834179,
'layer2.0.conv2.weight': -2.1180268428011573e-08,
'layer2.0.downsample.0.weight': 0.5666850700441095,
'layer2.1.conv1.weight': -2.1861888077623348e-08,
'layer2.1.conv2.weight': 0.9691538018553951,
'layer3.0.conv1.weight': 0.4329534168219621,
'layer3.0.conv2.weight': 0.06542427263283024,
'layer3.0.downsample.0.weight': 0.3568506078785666,
'layer3.1.conv1.weight': 0.09328176622263307,
'layer3.1.conv2.weight': 0.6633195976051018,
'layer4.0.conv1.weight': 0.9914340620049251,
'layer4.0.conv2.weight': 0.8091775692583867,
'layer4.0.downsample.0.weight': 0.5552448512140766,
'layer4.1.conv1.weight': 0.9907790148874497,
'layer4.1.conv2.weight': 0.8827941427912476}
Note that the error does not occur on CPU; it also does not occur with a different p dictionary, such as this one:
{'conv1.weight': 0.7392659868638237,
'last_linear.0.weight': 0.10501561799105052,
'layer1.0.conv1.weight': 0.6349508179361485,
'layer1.0.conv2.weight': 0.44357375136506516,
'layer1.1.conv1.weight': 0.6290701837014776,
'layer1.1.conv2.weight': 0.36933085729608905,
'layer2.0.conv1.weight': 0.4509790372596437,
'layer2.0.conv2.weight': 0.42935426060761517,
'layer2.0.downsample.0.weight': 0.7065802232955889,
'layer2.1.conv1.weight': 0.3707469940086635,
'layer2.1.conv2.weight': 0.7227469103776935,
'layer3.0.conv1.weight': 0.7189300890084274,
'layer3.0.conv2.weight': 0.46585970510469366,
'layer3.0.downsample.0.weight': 0.6160370089895277,
'layer3.1.conv1.weight': 0.5312890654313192,
'layer3.1.conv2.weight': 0.8089039876828459,
'layer4.0.conv1.weight': 0.8378576175338739,
'layer4.0.conv2.weight': 0.7549824163148917,
'layer4.0.downsample.0.weight': 0.6779744785558353,
'layer4.1.conv1.weight': 0.9224107770405527,
'layer4.1.conv2.weight': 0.7217466674953182}
Note that, to ensure a non-negative percentage, magnitude_pruning clamps negative values of drop to zero.
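To make that clamping explicit, here is a standalone sketch of how k for topk is derived from drop (drop_to_k is a hypothetical helper name, not part of my code):

```python
def drop_to_k(n_elem, drop):
    # clamp drop into [0, 1] so the resulting k is always valid for topk
    drop = min(max(drop, 0.0), 1.0)
    return int(n_elem * drop)

# tiny negative percentages (like -7.3e-08 above) clamp to k = 0
print(drop_to_k(100, -7.3e-08))  # 0
print(drop_to_k(100, 0.5248))    # 52
```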