adsharma
(Arun Sharma)
June 11, 2024, 5:05pm
1
I’m debugging performance problems with an application related to advanced tensor indexing in the autograd engine. I wasn’t able to come up with a minimal repro I can share here.
One of the recent commits sounds promising based on what I see in the profile. I’d like to test if it really fixes the performance problem.
With some effort I got nightly builds of torch + torchvision installed. It appears that if I pick torch from date N, I need to go with the torchvision nightly from date N+1 to be compatible. However, I couldn’t figure out how to find compatible versions of xformers and pytorch3d (other libs the app uses). Any hints?
Also, I learned that all these nightlies support only Python 3.12 on Linux, with no support for Python 3.10. Is that accurate?
You can just copy/paste the install command from here to install the nightly binaries without worrying about matching version tags yourself.
No, nightly binaries support Python 3.8-3.12.
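For reference, the nightly install command for a given CUDA version looks like this (cu124 shown as an example; the exact index URL comes from the selector on pytorch.org):
```bash
# install matching torch + torchvision nightlies from the cu124 index
pip3 install --pre torch torchvision --index-url https://download.pytorch.org/whl/nightly/cu124
```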
adsharma
(Arun Sharma)
June 13, 2024, 10:39pm
3
Thanks. Trying it out now.
However, installing xformers pulls in torch-2.3.0 and cu121. I don’t know what happens when multiple CUDA and torch versions are in play.
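For what it’s worth, a quick way to check which builds actually ended up active in the environment (generic checks; the output will vary):
```bash
# report the active torch build and the CUDA version it was compiled against
python -c "import torch; print(torch.__version__, torch.version.cuda)"
# report the installed xformers build and which kernels are usable
python -m xformers.info
```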
Is this the recommended way to install xformers from source to work with pytorch nightly + cu124?
GitHub issue, opened 12 Jan 2024 (UTC):
I have seen many users in the community run into compilation and build problems, and many failures, especially when using new versions of CUDA or PyTorch, or when updating CUTLASS or Flash Attention.
So I wrote a tutorial on how to use Docker (NVIDIA’s monthly releases, which include the latest CUDA, Torch, etc.) to fully install and build xFormers.
## guide
- https://soulteary.com/2024/01/12/xformers-source-code-compilation-with-nvidia-docker.html
- (backup) https://zhuanlan.zhihu.com/p/677516241
- dockerfile example: https://github.com/soulteary/docker-stable-diffusion-webui/blob/main/docker/Dockerfile.xformers
I believe people who have run into the same problem should be able to complete the build by following the ideas in the article.
The article is in Chinese; if you prefer reading in English, Google Translate handles it well.
Even without translating, the command lines in the article should be enough to follow. I wish everyone good luck and hope it saves you time.
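For context, everything below runs inside one of NVIDIA’s monthly PyTorch containers; a sketch of starting one (the 23.12-py3 tag is an assumption, chosen to match the torch 2.2.0a0 build in the report below):
```bash
# start an NVIDIA monthly PyTorch container with GPU access,
# mounting the working directory at /app to match the paths below
docker run --gpus all -it --rm -v "$PWD":/app -w /app nvcr.io/nvidia/pytorch:23.12-py3
```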
## main steps
download the complete source code, which includes xformers, Flash Attention, CUTLASS, and so on:
```bash
git clone --recursive https://github.com/facebookresearch/xformers.git --depth 1
```
update the 3rd-party source code:
```bash
# update Flash Attention to the latest main
cd xformers/third_party/flash-attention
git pull origin main
# optionally pin CUTLASS to the version that matches Flash Attention
# cd ../cutlass
# git pull origin main
# git checkout v3.3.0
```
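If the pinned checkout fails because the tags are missing (which can happen with shallow clones), fetching the tags first helps (a sketch):
```bash
cd xformers/third_party/cutlass
git fetch --tags origin   # make sure the release tags are present locally
git checkout v3.3.0
```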
set the git config (safe.directory is needed when the repo is owned by a different user than the one building inside the container):
```bash
git config --global --add safe.directory /app/xformers
git config --global --add safe.directory /app/xformers/third_party/flash-attention
git config --global --add safe.directory /app/xformers/third_party/cutlass
```
install build deps:
```bash
pip install ninja
```
build and install in editable mode (this compiles the CUDA kernels, so it takes a while):
```bash
pip install -v -e .
```
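If you are building for a single GPU, limiting the architecture list cuts compile time considerably; a sketch (8.9 matches the RTX 4090 in the report below, and MAX_JOBS caps parallel compile jobs):
```bash
# build only the sm_89 kernels instead of the full arch list
TORCH_CUDA_ARCH_LIST="8.9" MAX_JOBS=8 pip install -v -e .
```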
finally, verify the build with `python -m xformers.info`:
```
xFormers 0.0.24+6600003.d20240112
memory_efficient_attention.cutlassF: available
memory_efficient_attention.cutlassB: available
memory_efficient_attention.decoderF: available
memory_efficient_attention.flshattF@v2.3.6: available
memory_efficient_attention.flshattB@v2.3.6: available
memory_efficient_attention.smallkF: available
memory_efficient_attention.smallkB: available
memory_efficient_attention.tritonflashattF: unavailable
memory_efficient_attention.tritonflashattB: unavailable
memory_efficient_attention.triton_splitKF: available
indexing.scaled_index_addF: available
indexing.scaled_index_addB: available
indexing.index_select: available
swiglu.dual_gemm_silu: available
swiglu.gemm_fused_operand_sum: available
swiglu.fused.p.cpp: available
is_triton_available: True
pytorch.version: 2.2.0a0+81ea7a4
pytorch.cuda: available
gpu.compute_capability: 8.9
gpu.name: NVIDIA GeForce RTX 4090
dcgm_profiler: unavailable
build.info: available
build.cuda_version: 1230
build.python_version: 3.10.12
build.torch_version: 2.2.0a0+81ea7a4
build.env.TORCH_CUDA_ARCH_LIST: 5.2 6.0 6.1 7.0 7.2 7.5 8.0 8.6 8.7 9.0+PTX
build.env.XFORMERS_BUILD_TYPE: None
build.env.XFORMERS_ENABLE_DEBUG_ASSERTIONS: None
build.env.NVCC_FLAGS: None
build.env.XFORMERS_PACKAGE_FROM: None
build.nvcc_version: 12.3.107
source.privacy: open source
```
@danthe3rd, hope it helps, and maybe you can pin this topic; I think there are many people who have failed to build from source.