Torch Version: 2.0.1+rocm5.4.2
I am trying to train an RNN with torch 2.0.1+rocm5.4.2, but I keep getting this error:
File "train.py", line 297, in <module>
    generate_init_weight(model, init_weight_name)
File "site-packages/lightning_utilities/core/rank_zero.py", line 27, in wrapped_fn
    return fn(*args, **kwargs)
File "/model/src/trainer.py", line 165, in generate_init_weight
    mm = model.generate_init_weight()
File "/model/src/model.py", line 597, in generate_init_weight
    nn.init.orthogonal_(m[n], gain=gain * scale)
File "site-packages/torch/nn/init.py", line 484, in orthogonal_
    q, r = torch.linalg.qr(flattened)
RuntimeError: Calling torch.orgqr on a CUDA tensor requires compiling PyTorch with cuSOLVER. Please use PyTorch built with cuSOLVER support.
I thought installing MAGMA would fix it, but even with MAGMA present in my installation, it does not seem to be used for QR.
Is there a way to get torch.nn.init.orthogonal_() to work with hipSOLVER, rocSOLVER, or MAGMA?
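
In the meantime, one workaround I can think of is to run the initialization on a CPU tensor and copy the result onto the GPU, so the QR step never touches the GPU solver. This is only a sketch under my own assumptions, not a confirmed fix: orthogonal_init_via_cpu is a hypothetical helper name (not part of PyTorch), and I am assuming the weights have at least two dimensions, which orthogonal_ requires anyway.

```python
import torch
import torch.nn as nn


def orthogonal_init_via_cpu(param: torch.Tensor, gain: float = 1.0) -> torch.Tensor:
    """Hypothetical helper: orthogonal init computed on CPU, then copied into param."""
    with torch.no_grad():
        # Scratch tensor on the CPU, so torch.linalg.qr runs on the CPU backend.
        cpu_copy = torch.empty(param.shape, dtype=torch.float32, device="cpu")
        nn.init.orthogonal_(cpu_copy, gain=gain)
        # copy_ converts device and dtype to match the destination tensor.
        param.copy_(cpu_copy)
    return param


# Usage: falls back to CPU if no GPU is visible to this PyTorch build.
device = "cuda" if torch.cuda.is_available() else "cpu"
w = torch.empty(64, 32, device=device)
orthogonal_init_via_cpu(w, gain=1.0)
```

In my case this would replace the nn.init.orthogonal_(m[n], gain=gain * scale) call in model.py. It costs one extra host-to-device copy per weight, which should only matter at initialization time.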