Segfault: PyTorch 0.4.0 on Raspberry Pi 3B

Ashutosh_Mishra · February 25, 2019, 3:11pm

I installed PyTorch 0.4.0a0+3749c58 from source on a Raspberry Pi 3B using this link. I created 10 GB of swap space.

When I run the simple MNIST example on the board, it runs for a varying number of iterations, fewer than 100, and then fails with:

Segmentation Fault: Core dumped

When I watch resource usage in htop, all 4 cores reach about 100% utilization and then the segfault happens.

Any suggestions to counter this segfault?

Kushaj · February 25, 2019, 5:17pm

Since you said it stops after about 100 iterations, my best guess is that the history object is growing. See this example for the explanation:

output = model(input)
loss = myLossFunction(output, labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()

# Here is the problem I am talking about
# I think you are doing something similar to this
running_loss += loss

When you add the loss this way, you are not detaching it from the graph first, so PyTorch keeps a trace of it. As you keep iterating, this trace grows in size.
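A minimal sketch of the usual workaround, assuming a training loop like the one above: accumulate a plain Python number with loss.item() instead of the loss tensor itself. The model, data, and hyperparameters here are hypothetical stand-ins, not the MNIST example's actual code, and this may or may not be what is causing your segfault.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins for the real model, loss, and optimizer.
model = nn.Linear(4, 2)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

running_loss = 0.0
for _ in range(5):
    # Random batch standing in for real MNIST data.
    inputs = torch.randn(8, 4)
    labels = torch.randint(0, 2, (8,), dtype=torch.long)

    optimizer.zero_grad()
    output = model(inputs)
    loss = criterion(output, labels)
    loss.backward()
    optimizer.step()

    # .item() copies the scalar out as a Python float, so running_loss
    # holds no reference to the autograd graph of any iteration.
    running_loss += loss.item()

print(running_loss / 5)
```

If you need to keep a tensor rather than a float, loss.detach() also drops the link to the graph.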

Ashutosh_Mishra · February 26, 2019, 7:47am

Hey Kushaj,

Thanks for the reply.

This

running_loss += loss

is being done during inference.

Are you suggesting I should not do this?

BTW, I am following the MNIST example given in the PyTorch repo.

Kushaj · February 26, 2019, 12:37pm

It is a bit hard to tell why you are getting a segmentation fault without the complete code. During inference, are you calling model.eval()?
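For reference, here is a rough sketch of what I would expect an inference loop to look like. Wrapping it in torch.no_grad() is my own assumption about what would help, since then no graph is built at all. The model and batches below are hypothetical placeholders, not your code, and I cannot say whether this changes anything about the segfault.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical model with dropout, so that eval() actually changes behavior.
model = nn.Sequential(nn.Linear(4, 8), nn.Dropout(0.5), nn.Linear(8, 2))
criterion = nn.CrossEntropyLoss()

# Random batches standing in for a real test DataLoader.
batches = [
    (torch.randn(8, 4), torch.randint(0, 2, (8,), dtype=torch.long))
    for _ in range(3)
]

model.eval()  # switch layers such as dropout to inference behavior
test_loss = 0.0
with torch.no_grad():  # do not record operations for autograd
    for inputs, labels in batches:
        output = model(inputs)
        loss = criterion(output, labels)
        test_loss += loss.item()  # accumulate a Python float

print(test_loss / len(batches))
```

Note that model.eval() only changes how layers like dropout and batch norm behave; it is torch.no_grad() that stops the history from being recorded.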