Installation > Compute Platform: ROCm 6.1

Hello! I am trying to install PyTorch. How do I upgrade my ROCm to 6.1 (to install PyTorch)?

I have an Ubuntu linux with the following specifications:

  • Hardware Model: Lenovo ThinkPad X13 Gen 4
  • Processor: AMD Ryzen™ 7 PRO 7840U w/ Radeon™ 780M Graphics × 16
  • Graphics: AMD Radeon™ Graphics
  • Firmware Version: R29ET56W (1.30)
  • OS Name: Ubuntu 24.04.1 LTS
  • OS Build: (null)
  • OS Type: 64-bit
  • GNOME Version: 46
  • Windowing System: Wayland
  • Kernel Version: Linux 6.8.0-47-generic

I seem to have older versions of some ROCm packages. When I use “apt show rocm…” on the command line I get:

1.)
Package: rocminfo
Version: 5.7.1-3build1
Priority: optional
Section: universe/devel
Origin: Ubuntu
Installed-Size: 95.2 kB
Depends: libc6 (>= 2.34), libgcc-s1 (>= 3.3.1), libhsa-runtime64-1 (>= 5.7.1~), libstdc++6 (>= 11), python3, pciutils, kmod
Download-Size: 25.6 kB

2.)
Package: rocm-cmake
Version: 6.0.0-1
Priority: optional
Section: universe/devel
Origin: Ubuntu
Installed-Size: 138 kB
Download-Size: 25.6 kB

3.)
Package: rocm-device-libs-17
Version: 6.0+git20231212.5a852ed-2
Priority: optional
Section: universe/libs
Source: rocm-device-libs
Origin: Ubuntu
Installed-Size: 3,324 kB
Download-Size: 549 kB

4.)
Package: rocm-smi
Version: 5.7.0-1
Priority: optional
Section: universe/utils
Source: rocm-smi-lib
Origin: Ubuntu
Installed-Size: 227 kB
Depends: python3:any, librocm-smi64-1 (= 5.7.0-1)
Download-Size: 52.9 kB

None of the version numbers are at 6.1 or higher, so I think I should upgrade before installing, yes? Is anyone familiar with upgrading ROCm who can help me here? :slight_smile:

The 780M Graphics is not supported! See the ROCm docs.
You may want to try a 7600; it looks like it’s on the list.

Hello Maksym_Oberemchenko,

Thank you for your reply!

I am looking at the ROCm docs and I’m not finding the relevant information on what hardware is supported; is it the list of (AMD) GPUs here?

How can I try a 7600? I don’t see it on any of the lists, and I don’t know how I would use it to install ROCm and then PyTorch :slight_smile:

Hey @dcpetit ,

There’s a good deal of information on installing ROCm at Quick start installation guide — ROCm installation (Linux)

Definitely check out the system requirements at System requirements (Linux) — ROCm installation (Linux)

We’ve had varied success with GPUs that are not in the “Supported GPUs” list; we’ve found that making sure your OS and the Linux kernel version are supported is definitely critical. From what you’ve shared, Ubuntu 24.04 with Linux kernel 6.8.0-47-generic is supported.
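
If anyone else reading wants to check their own setup, the OS description and kernel release come from these standard commands:

lsb_release -d
uname -r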

I’d recommend uninstalling the existing rocm and amdgpu-dkms packages you currently have installed (a rough sketch of that cleanup follows the install commands below). From there, use the amdgpu-install helper script to install the latest ROCm 6.1 release (6.1.4):

wget https://repo.radeon.com/amdgpu-install/6.1.4/ubuntu/jammy/amdgpu-install_6.1.60104-1_all.deb
sudo apt install ./amdgpu-install_6.1.60104-1_all.deb
sudo apt update
sudo apt install amdgpu-dkms rocm
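
For the cleanup step mentioned above (to be done before running the commands just listed), a minimal sketch; the package names come from the apt show output earlier in the thread, so adjust to whatever is actually installed on your system:

sudo apt purge rocminfo rocm-smi rocm-cmake rocm-device-libs-17
sudo apt autoremove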

Once installed, reboot your system. Check to see if your GPUs are picked up by rocm-smi and rocminfo
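
For example, something along these lines:

rocm-smi
rocminfo | grep -i gfx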

Hope this helps


Hi Joe! Thanks for your response. I’ve taken your advice and gotten some results that I think are good. My original post is quite old, so ROCm’s latest version is now 6.3, and I edited your recommended commands to reflect that. After a long installation I get this:

d...@...:~$ rocm-smi
=========================================== ROCm System Management Interface ===========================================
===================================================== Concise Info =====================================================
Device  Node  IDs              Temp    Power     Partitions          SCLK  MCLK    Fan  Perf  PwrCap       VRAM%  GPU%  
              (DID,     GUID)  (Edge)  (Socket)  (Mem, Compute, ID)                                                     
========================================================================================================================
0       1     0x15bf,   26266  43.0°C  12.073W   N/A, N/A, 0         None  800Mhz  0%   auto  Unsupported  73%    1%    
========================================================================================================================
================================================= End of ROCm SMI Log ==================================================
(base) d...@...:~$ 

and

:~$ rocm-smi -i
============================ ROCm System Management Interface ============================
=========================================== ID ===========================================
GPU[0]		: Device Name: 		Phoenix1
GPU[0]		: Device ID: 		0x15bf
GPU[0]		: Device Rev: 		0xdd
GPU[0]		: Subsystem ID: 	0x50d0
GPU[0]		: GUID: 		26266
==========================================================================================
================================== End of ROCm SMI Log ===================================

and

(base) d...@...:~$ rocminfo
ROCk module version 6.10.5 is loaded
=====================    
HSA System Attributes    
=====================    
Runtime Version:         1.14
Runtime Ext Version:     1.6
System Timestamp Freq.:  1000.000000MHz
Sig. Max Wait Duration:  18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model:           LARGE                              
System Endianness:       LITTLE                             
Mwaitx:                  DISABLED
DMAbuf Support:          YES

==========               
HSA Agents               
==========               
*******                  
Agent 1                  
*******                  
  Name:                    AMD Ryzen 7 PRO 7840U w/ Radeon 780M Graphics
  Uuid:                    CPU-XX                             
  Marketing Name:          AMD Ryzen 7 PRO 7840U w/ Radeon 780M Graphics
  Vendor Name:             CPU                                
  Feature:                 None specified                     
  Profile:                 FULL_PROFILE                       
  Float Round Mode:        NEAR                               
  Max Queue Number:        0(0x0)                             
  Queue Min Size:          0(0x0)                             
  Queue Max Size:          0(0x0)                             
  Queue Type:              MULTI                              
  Node:                    0                                  
  Device Type:             CPU                                
  Cache Info:              
    L1:                      32768(0x8000) KB                   
  Chip ID:                 0(0x0)                             
  ASIC Revision:           0(0x0)                             
  Cacheline Size:          64(0x40)                           
  Max Clock Freq. (MHz):   5132                               
  BDFID:                   0                                  
  Internal Node ID:        0                                  
  Compute Unit:            16                                 
  SIMDs per CU:            0                                  
  Shader Engines:          0                                  
  Shader Arrs. per Eng.:   0                                  
  WatchPts on Addr. Ranges:1                                  
  Memory Properties:       
  Features:                None
  Pool Info:               
    Pool 1                   
      Segment:                 GLOBAL; FLAGS: FINE GRAINED        
      Size:                    31452120(0x1dfebd8) KB             
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Recommended Granule:4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       TRUE                               
    Pool 2                   
      Segment:                 GLOBAL; FLAGS: EXTENDED FINE GRAINED
      Size:                    31452120(0x1dfebd8) KB             
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Recommended Granule:4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       TRUE                               
    Pool 3                   
      Segment:                 GLOBAL; FLAGS: KERNARG, FINE GRAINED
      Size:                    31452120(0x1dfebd8) KB             
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Recommended Granule:4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       TRUE                               
    Pool 4                   
      Segment:                 GLOBAL; FLAGS: COARSE GRAINED      
      Size:                    31452120(0x1dfebd8) KB             
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Recommended Granule:4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       TRUE                               
  ISA Info:                
*******                  
Agent 2                  
*******                  
  Name:                    gfx1103                            
  Uuid:                    GPU-XX                             
  Marketing Name:          AMD Radeon Graphics                
  Vendor Name:             AMD                                
  Feature:                 KERNEL_DISPATCH                    
  Profile:                 BASE_PROFILE                       
  Float Round Mode:        NEAR                               
  Max Queue Number:        128(0x80)                          
  Queue Min Size:          64(0x40)                           
  Queue Max Size:          131072(0x20000)                    
  Queue Type:              MULTI                              
  Node:                    1                                  
  Device Type:             GPU                                
  Cache Info:              
    L1:                      32(0x20) KB                        
    L2:                      2048(0x800) KB                     
  Chip ID:                 5567(0x15bf)                       
  ASIC Revision:           9(0x9)                             
  Cacheline Size:          64(0x40)                           
  Max Clock Freq. (MHz):   2700                               
  BDFID:                   49920                              
  Internal Node ID:        1                                  
  Compute Unit:            12                                 
  SIMDs per CU:            2                                  
  Shader Engines:          1                                  
  Shader Arrs. per Eng.:   2                                  
  WatchPts on Addr. Ranges:4                                  
  Coherent Host Access:    FALSE                              
  Memory Properties:       APU
  Features:                KERNEL_DISPATCH 
  Fast F16 Operation:      TRUE                               
  Wavefront Size:          32(0x20)                           
  Workgroup Max Size:      1024(0x400)                        
  Workgroup Max Size per Dimension:
    x                        1024(0x400)                        
    y                        1024(0x400)                        
    z                        1024(0x400)                        
  Max Waves Per CU:        32(0x20)                           
  Max Work-item Per CU:    1024(0x400)                        
  Grid Max Size:           4294967295(0xffffffff)             
  Grid Max Size per Dimension:
    x                        4294967295(0xffffffff)             
    y                        4294967295(0xffffffff)             
    z                        4294967295(0xffffffff)             
  Max fbarriers/Workgrp:   32                                 
  Packet Processor uCode:: 40                                 
  SDMA engine uCode::      21                                 
  IOMMU Support::          None                               
  Pool Info:               
    Pool 1                   
      Segment:                 GLOBAL; FLAGS: COARSE GRAINED      
      Size:                    15726060(0xeff5ec) KB              
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Recommended Granule:2048KB                             
      Alloc Alignment:         4KB                                
      Accessible by all:       FALSE                              
    Pool 2                   
      Segment:                 GLOBAL; FLAGS: EXTENDED FINE GRAINED
      Size:                    15726060(0xeff5ec) KB              
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Recommended Granule:2048KB                             
      Alloc Alignment:         4KB                                
      Accessible by all:       FALSE                              
    Pool 3                   
      Segment:                 GROUP                              
      Size:                    64(0x40) KB                        
      Allocatable:             FALSE                              
      Alloc Granule:           0KB                                
      Alloc Recommended Granule:0KB                                
      Alloc Alignment:         0KB                                
      Accessible by all:       FALSE                              
  ISA Info:                
    ISA 1                    
      Name:                    amdgcn-amd-amdhsa--gfx1103         
      Machine Models:          HSA_MACHINE_MODEL_LARGE            
      Profiles:                HSA_PROFILE_BASE                   
      Default Rounding Mode:   NEAR                               
      Default Rounding Mode:   NEAR                               
      Fast f16:                TRUE                               
      Workgroup Max Size:      1024(0x400)                        
      Workgroup Max Size per Dimension:
        x                        1024(0x400)                        
        y                        1024(0x400)                        
        z                        1024(0x400)                        
      Grid Max Size:           4294967295(0xffffffff)             
      Grid Max Size per Dimension:
        x                        4294967295(0xffffffff)             
        y                        4294967295(0xffffffff)             
        z                        4294967295(0xffffffff)             
      FBarrier Max Size:       32                                 
*** Done ***             
(base) d...@...:~$ 

Looks like it works. Is here the best place to learn how to install PyTorch? It suggests pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.2.4, however I just installed 6.3 instead of 6.2(.4). Does that make a difference?

@dcpetit ,
This is looking good. Yes, Start Locally | PyTorch is the easiest way to get started. Definitely match the major.minor version of your ROCm installation with the appropriate build of pytorch.

Since you’re working with ROCm 6.3, you can use the nightly installs of pytorch

pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/rocm6.3
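
Once that finishes, a quick sanity check you could run; these are just standard PyTorch calls, nothing here is specific to your machine:

import torch

print(torch.__version__)          # ROCm nightlies carry a local tag like +rocm6.3
print(torch.version.hip)          # the HIP/ROCm version the wheel was built against
print(torch.cuda.is_available())  # True if a ROCm-visible GPU was found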

Note that your GPU is not on AMD’s supported GPU list. What this means is that AMD does not regularly test ROCm against that GPU. In my experience, this simply means your mileage will vary.

For gfx1103 GPUs, like yours, you can try setting the environment variable HSA_OVERRIDE_GFX_VERSION to 11.0.0, e.g.

export HSA_OVERRIDE_GFX_VERSION=11.0.0

before running pytorch applications (you can even add this to your ~/.bashrc so you don’t have to remember it every time). This will force pytorch to view your GPU as a gfx1100 card. It’d be interesting to see what this does with some minimal pytorch examples. Let me know if you run into any more trouble. :slight_smile:

It’s looking rather good now, Joe; thanks for your help. I’ve gotten PyTorch up and running, but I think I’m running into some trouble with ROCm. I’ve made a very simple script to test this; it has:

import torch

if torch.version.hip is not None:
    device = torch.device("hip")
    a = torch.randn(3, 3, device=device)
    print(a)
else:
    print("ROCm/HIP not available.")

I was hoping it would print ‘a’ without any issues, but I get a very long error message when I do this. Do you know what Could not run 'aten::empty.memory_format' with arguments from the 'HIP' backend means, by chance? The full error message is pasted below.

---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
Cell In[11], line 7
      5 if torch.version.hip is not None:
      6     device = torch.device("hip")
----> 7     a = torch.randn(3, 3, device=device)
      8     print(a)
      9 else:

NotImplementedError: Could not run 'aten::empty.memory_format' with arguments from the 'HIP' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'aten::empty.memory_format' is only available for these backends: [CPU, CUDA, Meta, QuantizedCPU, QuantizedCUDA, QuantizedMeta, MkldnnCPU, SparseCPU, SparseCUDA, SparseMeta, SparseCsrCPU, SparseCsrCUDA, SparseCsrMeta, BackendSelect, Python, FuncTorchDynamicLayerBackMode, Functionalize, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradHIP, AutogradXLA, AutogradMPS, AutogradIPU, AutogradXPU, AutogradHPU, AutogradVE, AutogradLazy, AutogradMTIA, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, AutogradMeta, AutogradNestedTensor, Tracer, AutocastCPU, AutocastMTIA, AutocastXPU, AutocastMPS, AutocastCUDA, FuncTorchBatched, BatchedNestedTensor, FuncTorchVmapMode, Batched, VmapMode, FuncTorchGradWrapper, PythonTLSSnapshot, FuncTorchDynamicLayerFrontMode, PreDispatch, PythonDispatcher].

CPU: registered at /pytorch/build/aten/src/ATen/RegisterCPU_1.cpp:2517 [kernel]
CUDA: registered at /pytorch/build/aten/src/ATen/RegisterCUDA_0.cpp:9209 [kernel]
Meta: registered at /pytorch/build/aten/src/ATen/RegisterMeta_0.cpp:5433 [kernel]
QuantizedCPU: registered at /pytorch/build/aten/src/ATen/RegisterQuantizedCPU_0.cpp:311 [kernel]
QuantizedCUDA: registered at /pytorch/build/aten/src/ATen/RegisterQuantizedCUDA_0.cpp:194 [kernel]
QuantizedMeta: registered at /pytorch/build/aten/src/ATen/RegisterQuantizedMeta_0.cpp:116 [kernel]
MkldnnCPU: registered at /pytorch/build/aten/src/ATen/RegisterMkldnnCPU_0.cpp:230 [kernel]
SparseCPU: registered at /pytorch/build/aten/src/ATen/RegisterSparseCPU_0.cpp:843 [kernel]
SparseCUDA: registered at /pytorch/build/aten/src/ATen/RegisterSparseCUDA_0.cpp:881 [kernel]
SparseMeta: registered at /pytorch/build/aten/src/ATen/RegisterSparseMeta_0.cpp:187 [kernel]
SparseCsrCPU: registered at /pytorch/build/aten/src/ATen/RegisterSparseCsrCPU_0.cpp:737 [kernel]
SparseCsrCUDA: registered at /pytorch/build/aten/src/ATen/RegisterSparseCsrCUDA_0.cpp:835 [kernel]
SparseCsrMeta: registered at /pytorch/build/aten/src/ATen/RegisterSparseCsrMeta_0.cpp:702 [kernel]
BackendSelect: registered at /pytorch/build/aten/src/ATen/RegisterBackendSelect.cpp:792 [kernel]
Python: registered at /pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:194 [backend fallback]
FuncTorchDynamicLayerBackMode: registered at /pytorch/aten/src/ATen/functorch/DynamicLayer.cpp:479 [backend fallback]
Functionalize: registered at /pytorch/aten/src/ATen/FunctionalizeFallbackKernel.cpp:349 [backend fallback]
Named: registered at /pytorch/aten/src/ATen/core/NamedRegistrations.cpp:7 [backend fallback]
Conjugate: fallthrough registered at /pytorch/aten/src/ATen/ConjugateFallback.cpp:21 [kernel]
Negative: fallthrough registered at /pytorch/aten/src/ATen/native/NegateFallback.cpp:22 [kernel]
ZeroTensor: fallthrough registered at /pytorch/aten/src/ATen/ZeroTensorFallback.cpp:90 [kernel]
ADInplaceOrView: fallthrough registered at /pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:100 [backend fallback]
AutogradOther: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:20142 [autograd kernel]
AutogradCPU: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:20142 [autograd kernel]
AutogradCUDA: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:20142 [autograd kernel]
AutogradHIP: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:20142 [autograd kernel]
AutogradXLA: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:20142 [autograd kernel]
AutogradMPS: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:20142 [autograd kernel]
AutogradIPU: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:20142 [autograd kernel]
AutogradXPU: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:20142 [autograd kernel]
AutogradHPU: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:20142 [autograd kernel]
AutogradVE: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:20142 [autograd kernel]
AutogradLazy: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:20142 [autograd kernel]
AutogradMTIA: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:20142 [autograd kernel]
AutogradPrivateUse1: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:20142 [autograd kernel]
AutogradPrivateUse2: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:20142 [autograd kernel]
AutogradPrivateUse3: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:20142 [autograd kernel]
AutogradMeta: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:20142 [autograd kernel]
AutogradNestedTensor: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:20142 [autograd kernel]
Tracer: registered at /pytorch/torch/csrc/autograd/generated/TraceType_2.cpp:17801 [kernel]
AutocastCPU: fallthrough registered at /pytorch/aten/src/ATen/autocast_mode.cpp:322 [backend fallback]
AutocastMTIA: fallthrough registered at /pytorch/aten/src/ATen/autocast_mode.cpp:466 [backend fallback]
AutocastXPU: fallthrough registered at /pytorch/aten/src/ATen/autocast_mode.cpp:504 [backend fallback]
AutocastMPS: fallthrough registered at /pytorch/aten/src/ATen/autocast_mode.cpp:209 [backend fallback]
AutocastCUDA: fallthrough registered at /pytorch/aten/src/ATen/autocast_mode.cpp:165 [backend fallback]
FuncTorchBatched: registered at /pytorch/aten/src/ATen/functorch/LegacyBatchingRegistrations.cpp:731 [backend fallback]
BatchedNestedTensor: registered at /pytorch/aten/src/ATen/functorch/LegacyBatchingRegistrations.cpp:758 [backend fallback]
FuncTorchVmapMode: fallthrough registered at /pytorch/aten/src/ATen/functorch/VmapModeRegistrations.cpp:27 [backend fallback]
Batched: registered at /pytorch/aten/src/ATen/LegacyBatchingRegistrations.cpp:1075 [backend fallback]
VmapMode: fallthrough registered at /pytorch/aten/src/ATen/VmapModeRegistrations.cpp:33 [backend fallback]
FuncTorchGradWrapper: registered at /pytorch/aten/src/ATen/functorch/TensorWrapper.cpp:208 [backend fallback]
PythonTLSSnapshot: registered at /pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:202 [backend fallback]
FuncTorchDynamicLayerFrontMode: registered at /pytorch/aten/src/ATen/functorch/DynamicLayer.cpp:475 [backend fallback]
PreDispatch: registered at /pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:206 [backend fallback]
PythonDispatcher: registered at /pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:198 [backend fallback]

Have a great day!

It may seem odd, but the device needs to be set to cuda. The pytorch folks opted to stick with the CUDA naming convention to mean “running on a GPU backend” so that developers didn’t need to change existing pytorch code to run on AMD GPUs.

Here’s a modified version of your example code that should work

import torch

if torch.version.hip is not None:
    device = torch.device("cuda")
    a = torch.randn(3, 3, device=device)
    print(a)
else:
    print("ROCm/HIP not available.")

Never mind, I had a problem (below), but solved it by simply restarting the program and kernel.

### ### ### ### ### ### ### ### ### ###

That changes things, but unfortunately it does result in a different error at the same step:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[2], line 7
      5 if torch.version.hip is not None:
      6     device = torch.device("cuda")
----> 7     a = torch.randn(3, 3, device=device)
      8     print(a)
      9 else:

RuntimeError: HIP error: invalid device function
HIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing AMD_SERIALIZE_KERNEL=3
Compile with `TORCH_USE_HIP_DSA` to enable device-side assertions.

Do I have to change the “hip” to “cuda” in the if torch.version.hip is not None: line as well?

I’m glad it got resolved. Oddly enough, torch.version.hip is the API call to check for the HIP version used by the “cuda” backend. I know… odd decisions were made.
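
If it helps to see it, a tiny illustration of that check (the exact version string depends on the wheel you installed):

import torch

print(torch.version.hip)   # a ROCm/HIP version string on ROCm builds, None otherwise
print(torch.version.cuda)  # None on ROCm builds of PyTorch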

I hope this helps anyone reading this thread who also has a gfx1103 GPU, like mine.

This is so strange, Joe. I got the error in my last post, but then restarted the kernel and it worked great. Then later I shut down my computer. Today, after turning it back on, I ran the same code and got the same error again. Confused, I tried a few things and eventually remembered I had forced pytorch to view my GPU as gfx1100 with export HSA_OVERRIDE_GFX_VERSION=11.0.0. After setting that in the terminal again, my code ran fine… I hope the error message can point towards this solution in the future, or better yet that gfx1103 (11.0.3) becomes supported and this workaround is no longer necessary.


@dcpetit , in your home directory there is a file called ~/.bashrc. Whenever you start a shell session, this file gets sourced and can be used to configure your environment.

If you run the following from your terminal

cat "export HSA_OVERRIDE_GFX_VERSION=11.0.0" >> ~/.bashrc

this will ensure this environment variable is set each time you start a shell session.
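
One extra note: the new line only takes effect in new shell sessions, so either open a fresh terminal or reload the file in your current one:

source ~/.bashrc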