Hello,
Torch Version: 2.0.1+rocm5.4.2
I am trying to train an RNN with torch 2.0.1+rocm5.4.2, but I keep getting this error:
File "train.py", line 297, in <module>
generate_init_weight(model, init_weight_name)
File "site-packages/lightning_utilities/core/rank_zero.py", line 27, in wrapped_fn
return fn(*args, **kwargs)
File "/model/src/trainer.py", line 165, in generate_init_weight
mm = model.generate_init_weight()
File "/model/src/model.py", line 597, in generate_init_weight
nn.init.orthogonal_(m[n], gain=gain * scale)
File "site-packages/torch/nn/init.py", line 484, in orthogonal_
q, r = torch.linalg.qr(flattened)
RuntimeError: Calling torch.orgqr on a CUDA tensor requires compiling PyTorch with cuSOLVER. Please use PyTorch built with cuSOLVER support.
I thought installing MAGMA would fix it, but even with MAGMA present in my installation, it does not appear to be used for QR.
Is there a way to get torch.nn.init.orthogonal_() working with hipSOLVER, rocSOLVER, or MAGMA?
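A possible stopgap, sketched under the assumption that the CPU QR path works in this build (not verified on ROCm): compute the orthogonal initialization on a CPU tensor and copy the result into the target tensor, so torch.linalg.qr never runs on the GPU. The helper name orthogonal_init_on_cpu is hypothetical and not part of PyTorch.

```python
import torch
import torch.nn as nn


def orthogonal_init_on_cpu(tensor, gain=1.0):
    """Hypothetical helper: fill `tensor` in place with an orthogonal
    initialization computed on the CPU, avoiding QR on the GPU."""
    with torch.no_grad():
        # Scratch tensor on the CPU with the same shape as the target.
        cpu_copy = torch.empty(tensor.shape, dtype=torch.float32, device="cpu")
        nn.init.orthogonal_(cpu_copy, gain=gain)
        # copy_ handles the transfer to the target's device and dtype.
        tensor.copy_(cpu_copy)
    return tensor


# Example on a CPU tensor; a tensor on the GPU would be passed the same way.
w = torch.empty(64, 32)
orthogonal_init_on_cpu(w, gain=2.0)
```

In model.py this would stand in for the nn.init.orthogonal_(m[n], gain=gain * scale) call, at the cost of one host-to-device copy per parameter during initialization.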
Thanks!