M1 pytorch jupyter notebook kernel dead

insung3511 · January 5, 2023, 3:57am

Hey guys, I'm running AlexNet on my MacBook Pro M1 and I'm hitting this error.


[I 12:52:49.616 NotebookApp] Starting buffering for 7919a2ac-1921-469f-838c-71c37fa7b337:71ee38e5-509c-4bb0-a69c-ae46406b8456
[I 12:52:49.623 NotebookApp] Restoring connection for 7919a2ac-1921-469f-838c-71c37fa7b337:71ee38e5-509c-4bb0-a69c-ae46406b8456
info 12:53:35.239: Restart kernel execution
info 12:53:35.239: Restart requested file:///Users/bahk_insung/Documents/Github/torch-study/conv1d.ipynb
info 12:53:35.239: Restarting 7919a2ac-1921-469f-838c-71c37fa7b337
[I 12:53:35.242 NotebookApp] Creating new notebook in 
info 12:53:35.249: installMissingDependencies /Users/bahk_insung/miniconda3/bin/python, ui.disabled=false for resource '/Users/bahk_insung/Documents/Github/torch-study/conv1d.ipynb'
info 12:53:35.250: Got env vars with python /Users/bahk_insung/miniconda3/bin/python, with env var count 67 and custom env var count 63 in 1ms
info 12:53:35.252: Process Execution: > ~/miniconda3/bin/python -c "import ipykernel"
> ~/miniconda3/bin/python -c "import ipykernel"
info 12:53:35.418: Spec argv[0] updated from '/Users/bahk_insung/miniconda3/bin/python' to '/Users/bahk_insung/miniconda3/bin/python'
info 12:53:35.419: Got env vars with python /Users/bahk_insung/miniconda3/bin/python, with env var count 67 and custom env var count 63 in 1ms
[I 12:53:35.446 NotebookApp] Kernel started: 707c1d61-5656-4eed-afaa-0e1f46b25298, name: pythonjvsc74a57bd0cbd03b52000256fffc5622fb1d5afa03ae770321afbfaac74e08013d54c137c2
[W 12:53:35.458 NotebookApp] delete /conv1d-jvsc-b30b9bca-0e40-410d-88a2-1396d6efd81457160c03-a7a1-4995-9268-10657842773d.ipynb
info 12:53:36.045: Got new session 707c1d61-5656-4eed-afaa-0e1f46b25298
info 12:53:36.045: Started new restart session
[I 12:53:36.049 NotebookApp] Kernel shutdown: 7919a2ac-1921-469f-838c-71c37fa7b337
info 12:53:36.073: UpdateWorkingDirectoryAndPath in Kernel
[I 12:53:41.434 NotebookApp] KernelRestarter: restarting kernel (1/5), new random ports
error 12:53:41.437: Error in waiting for cell to complete Error: Canceled future for execute_request message before replies were done
    at t.KernelShellFutureHandler.dispose (/Users/bahk_insung/.vscode/extensions/ms-toolsai.jupyter-2022.11.1003412109/out/extension.node.js:2:32353)
    at /Users/bahk_insung/.vscode/extensions/ms-toolsai.jupyter-2022.11.1003412109/out/extension.node.js:2:26572
    at Map.forEach (<anonymous>)
    at v._clearKernelState (/Users/bahk_insung/.vscode/extensions/ms-toolsai.jupyter-2022.11.1003412109/out/extension.node.js:2:26557)
    at /Users/bahk_insung/.vscode/extensions/ms-toolsai.jupyter-2022.11.1003412109/out/extension.node.js:2:29000
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
warn 12:53:41.437: Cell completed with errors {
  message: 'Canceled future for execute_request message before replies were done'
}
info 12:53:41.437: Cancel all remaining cells true || Error || undefined
[I 12:53:41.437 NotebookApp] Starting buffering for 707c1d61-5656-4eed-afaa-0e1f46b25298:439c0a18-ff80-4866-a6c0-166d700928b3
[I 12:53:41.445 NotebookApp] Restoring connection for 707c1d61-5656-4eed-afaa-0e1f46b25298:439c0a18-ff80-4866-a6c0-166d700928b3

And here is the code:

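import torch.nn as nn
import torch.optim as optim

# AlexNet, device, and trainloader are assumed to be defined earlier in the notebook.
criterion = nn.CrossEntropyLoss()
alexnet = AlexNet().to(device)
optimizer = optim.Adam(alexnet.parameters(), lr=1e-3)

# Average loss per epoch. This list needs its own name: the loop below
# assigns the loss tensor to `loss`, which would otherwise overwrite it.
loss_history = list()
n = len(trainloader)

for epoch in range(50):
    running_loss = 0.0
    for data in trainloader:
        inputs, labels = data[0].to(device), data[1].to(device)

        # Forward
        optimizer.zero_grad()
        outputs = alexnet(inputs)
        loss = criterion(outputs, labels)

        # Backward
        loss.backward()
        optimizer.step()

        running_loss += loss.item()

    loss_history.append(running_loss / n)
    print(f"{epoch + 1} loss : {running_loss / n}")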

I installed PyTorch with conda; the PyTorch version is ‘1.13.1’.
If anyone knows how to fix this, please let me know. Thank you.

Jarek_Production · March 14, 2023, 12:36pm

I’ve encountered the same problem. Can you confirm your macOS version? I’m currently using macOS Ventura. Torch works fine on my macOS Monterey.

MJK · March 19, 2023, 4:00pm

I’m encountering the same issue: the kernel crashes when I call data.to(device).
Really hope this gets fixed!

My code:

github.com/MJKingsbury/xray-classifier/blob/main/chest_xray_classifier.ipynb

@Jarek_Production I am on Ventura, but you’re saying this wasn’t a problem on Monterey? I’ve looked around and it might be a problem with the PyTorch version, so I am going to try an older version of PyTorch and report back with the results.

Environment information for the setup that is currently not working:

Platform: macOS-13.0-arm64-arm-64bit
PyTorch version: 2.1.0.dev20230318
Is MPS (Metal Performance Shader) built? True
Is MPS available? True
Using device: mps
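
For anyone who wants to post the same details, a report like the one above can be produced with a short script. This is only a sketch: the function name `environment_report` is made up for illustration, and it assumes a PyTorch build that exposes `torch.backends.mps`. If PyTorch is not installed in the active environment, it just says so instead of failing.

```python
import importlib.util
import platform


def environment_report():
    """Return the lines of a small platform / PyTorch / MPS report."""
    lines = [f"Platform: {platform.platform()}"]

    # Guard the import so the script still prints something useful
    # when PyTorch is not installed in the active environment.
    if importlib.util.find_spec("torch") is None:
        lines.append("PyTorch version: not installed")
        return lines

    import torch

    lines.append(f"PyTorch version: {torch.__version__}")

    # torch.backends.mps is missing on older builds, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    built = mps is not None and mps.is_built()
    available = mps is not None and mps.is_available()
    lines.append(f"Is MPS (Metal Performance Shader) built? {built}")
    lines.append(f"Is MPS available? {available}")

    # Fall back to the CPU when the Apple GPU backend cannot be used.
    device = "mps" if available else "cpu"
    lines.append(f"Using device: {device}")
    return lines


if __name__ == "__main__":
    print("\n".join(environment_report()))
```

Running it in the same conda environment that the notebook kernel uses matters here, since the kernel and the terminal can point at different Python installations.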

insung3511 · June 19, 2023, 9:15am

The model still fails during training on the GPU, but it works fine on the CPU. On the CPU the cells run and return their outputs, while on the GPU the kernel just dies. I don’t know why this happens.
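
A dead kernel hides whatever the Python process printed before it went down, so one way to narrow this down is to run the failing code in a separate Python process and look at its exit status and stderr. The sketch below uses only the standard library. `run_isolated` and `describe` are hypothetical helper names, and the snippet passed in is a placeholder for the real training step, not the code from this thread.

```python
import subprocess
import sys


def run_isolated(code, timeout=600):
    """Run `code` in a fresh Python process and capture its output."""
    return subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )


def describe(result):
    """Summarise how the child process ended."""
    if result.returncode == 0:
        return "finished normally"
    if result.returncode < 0:
        # On POSIX systems (macOS, Linux) a negative return code means
        # the process was killed by that signal number, e.g. a segfault.
        return f"killed by signal {-result.returncode}"
    return f"exited with status {result.returncode}"


if __name__ == "__main__":
    # Placeholder snippet: replace it with the smallest piece of code
    # that kills the kernel (for example, one forward/backward pass).
    snippet = "print('hello from the child process')"
    result = run_isolated(snippet)
    print(describe(result))
    print(result.stderr)  # anything the child wrote to stderr before exiting
```

If the same snippet finishes normally with the device set to the CPU but is killed by a signal on the GPU, that exit status and the captured stderr are more useful in a bug report than "kernel dead".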