I'm really not sure what's going on. I've installed PyTorch into a conda environment with pip, using the ROCm build. I have all of the ROCm packages installed.
I'm running the same simple code both as a Python file and in a Jupyter notebook (VS Code):
import torch
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')  # Use the GPU if available, otherwise fall back to the CPU
print(device)
When I run it with python file.py, it prints 'cuda', but in Jupyter it just crashes the whole kernel:
Canceled future for execute_request message before replies were done
The Kernel crashed while executing code in the the current cell or a previous cell. Please review the code in the cell(s) to identify a possible cause of the failure. Click here for more info. View Jupyter log for further details.
Jupyter log:
error 00:32:55.842: Disposing session as kernel process died ExitCode: undefined, Reason: /home/user/miniconda3/envs/ml/lib/python3.10/site-packages/traitlets/traitlets.py:2544: FutureWarning: Supporting extra quotes around strings is deprecated in traitlets 5.0. You can use 'hmac-sha256' instead of '"hmac-sha256"' if you require traitlets >=5.
warn(
/home/user/miniconda3/envs/ml/lib/python3.10/site-packages/traitlets/traitlets.py:2495: FutureWarning: Supporting extra quotes around Bytes is deprecated in traitlets 5.0. Use 'd5e9a513-9ebc-4563-8e95-6ab9113d9732' instead of 'b"d5e9a513-9ebc-4563-8e95-6ab9113d9732"'.
warn(
info 00:32:55.842: Dispose Kernel process 103962.
error 00:32:55.842: Raw kernel process exited code: undefined
error 00:32:55.843: Error in waiting for cell to complete [Error: Canceled future for execute_request message before replies were done
at t.KernelShellFutureHandler.dispose (/home/user/.vscode/extensions/ms-toolsai.jupyter-2023.2.1000592019/out/extension.node.js:2:33213)
at /home/user/.vscode/extensions/ms-toolsai.jupyter-2023.2.1000592019/out/extension.node.js:2:52265
at Map.forEach (<anonymous>)
at y._clearKernelState (/home/user/.vscode/extensions/ms-toolsai.jupyter-2023.2.1000592019/out/extension.node.js:2:52250)
at y.dispose (/home/user/.vscode/extensions/ms-toolsai.jupyter-2023.2.1000592019/out/extension.node.js:2:45732)
at /home/user/.vscode/extensions/ms-toolsai.jupyter-2023.2.1000592019/out/extension.node.js:17:127079
at ee (/home/user/.vscode/extensions/ms-toolsai.jupyter-2023.2.1000592019/out/extension.node.js:2:1552543)
at lh.dispose (/home/user/.vscode/extensions/ms-toolsai.jupyter-2023.2.1000592019/out/extension.node.js:17:127055)
at ph.dispose (/home/user/.vscode/extensions/ms-toolsai.jupyter-2023.2.1000592019/out/extension.node.js:17:134354)
at process.processTicksAndRejections (node:internal/process/task_queues:96:5)]
warn 00:32:55.843: Cell completed with errors {
message: 'Canceled future for execute_request message before replies were done'
}
info 00:32:55.844: Cancel all remaining cells true || Error || undefined
I didn't think this would be fit for a bug report, since I'm kind of using an unsupported GPU (gfx1010). Setting the environment variable HSA_OVERRIDE_GFX_VERSION=10.3.0 makes it work. I've tested it with actual code and it does utilize the GPU (when run as a Python file).
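One thing that might explain the difference (an assumption, not something confirmed here): the notebook kernel started by VS Code may not inherit HSA_OVERRIDE_GFX_VERSION from the shell where python file.py runs. A minimal sketch to check this is to set the variable from inside the notebook, in the first cell and before import torch. The helper name ensure_gfx_override is hypothetical and only for illustration.

```python
import os

def ensure_gfx_override(version="10.3.0", env=None):
    """Set HSA_OVERRIDE_GFX_VERSION if it is missing and return the effective value.

    Hypothetical helper for illustration. An already-set value is left untouched.
    """
    env = os.environ if env is None else env
    return env.setdefault("HSA_OVERRIDE_GFX_VERSION", version)

# Run this in the first notebook cell, before `import torch`.
# Assumption: the override has to be in the environment before the ROCm runtime starts.
print("HSA_OVERRIDE_GFX_VERSION =", ensure_gfx_override())
```

If the kernel no longer crashes when the torch snippet runs after this cell, the missing variable in the kernel's environment is the likely cause. If it still crashes, the problem is somewhere else.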