Budget machine for tinkering with LLMs? ROCm a good alternative?

tl;dr: I’m considering building a budget machine for tinkering with LLMs, but I’m not sure whether this is a good idea or how to go about it.

For context: I work in a university department. I currently have access to a 2080 Ti on a shared machine, and we’re in the process of acquiring a small server with two L40 cards. So for any larger experiments, I will be able to use that server.

However, I would like to have my own small machine for tinkering: trying different models and techniques, playing around, and preparing larger experiments to be run on the server. My focus is on teaching and education, not on state-of-the-art research.

Since I am aiming for a good amount of VRAM, the 4060 Ti 16GB seems to be the most obvious choice; I also like its low power requirements (regarding energy and cooling). But this card seems to have a poor reputation overall. I’m also not sure where the sweet spot for CPU and memory currently is; I have completely lost track of Intel’s and AMD’s generations over the last few years.

For the most VRAM for the money, an AMD 7900 XTX 24GB would also be an interesting alternative. I know that PyTorch supports ROCm, but the consensus still seems to be to stick with Nvidia for ML/AI, at least for the time being. Or what are the community’s opinions and experiences with ROCm in practice?
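To make the 16GB vs. 24GB comparison a bit more concrete, here is a rough back-of-the-envelope sketch of how much memory the model weights alone would take at different precisions. The bytes-per-parameter values are the standard sizes of those data types; the 20% reserve for activations, KV cache, and framework overhead (`overhead_fraction`) is purely an assumption for illustration, and real usage depends heavily on context length and batch size.

```python
# Rough estimate of VRAM needed for model weights only.
# Bytes per parameter for common precisions.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_gib(n_params_billion: float, dtype: str) -> float:
    """Return the size of the weights in GiB for a model of the given size."""
    return n_params_billion * 1e9 * BYTES_PER_PARAM[dtype] / 2**30


def fits(n_params_billion: float, dtype: str, vram_gib: float,
         overhead_fraction: float = 0.2) -> bool:
    """Check whether the weights fit, keeping a reserve for everything else.

    overhead_fraction is an assumed reserve (activations, KV cache,
    framework overhead), not a measured value.
    """
    usable = vram_gib * (1.0 - overhead_fraction)
    return weight_gib(n_params_billion, dtype) <= usable


if __name__ == "__main__":
    # Card capacities are treated as GiB here, which is an approximation.
    for size in (7, 13):
        for dtype in ("fp16", "int8", "int4"):
            for vram in (16, 24):
                verdict = "fits" if fits(size, dtype, vram) else "does not fit"
                print(f"{size}B {dtype} on {vram} GB: "
                      f"{weight_gib(size, dtype):.1f} GiB weights, {verdict}")
```

Under these assumptions, a 7B model in fp16 is already tight on a 16GB card, while quantized models leave noticeably more headroom.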


Or what are the community’s opinions and experiences with ROCm in practice?

Mine is not very positive. If you have access to an AMD-based cluster, it may be worth investing the time to deal with the bugs and workarounds. If it’s your own money you are spending, stay with Nvidia.

My preferred hardware setup is:

  • On my dev computer, any GPU that can run my model with batch size one. I only use it to debug my code.
  • Switch to a server/cluster as soon as the code runs. If you are a student or academic, you have access to a PyCharm Professional license, so you can even develop and debug directly on a remote server via SSH. Otherwise, use Jupyter if you need to update and run your code interactively.
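For the "debug locally, run on the server" workflow, it helps if the same script works unchanged on whatever hardware it finds. Below is a minimal sketch of such a check; `describe_backend` is a hypothetical helper name. It relies on the fact that ROCm builds of PyTorch expose the GPU through the regular `torch.cuda` API and set `torch.version.hip`, so the same code path covers Nvidia and AMD cards.

```python
def describe_backend():
    """Return (device_string, description) for the best available backend.

    Falls back to the CPU if PyTorch is missing or no GPU is visible.
    """
    try:
        import torch
    except ImportError:
        return "cpu", "PyTorch not installed"

    if torch.cuda.is_available():
        # On ROCm builds torch.version.hip is a version string;
        # on CUDA builds it is None.
        backend = "ROCm/HIP" if getattr(torch.version, "hip", None) else "CUDA"
        return "cuda", f"{backend}: {torch.cuda.get_device_name(0)}"

    return "cpu", "no GPU visible to PyTorch"


if __name__ == "__main__":
    device, note = describe_backend()
    print(f"Using device '{device}' ({note})")
```

The returned device string can be passed to `model.to(device)` and `tensor.to(device)`, so the local batch-size-one run and the server run share the same code.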