How to get the queue for an XPU device

I have built PyTorch from source with XPU support. I want to write a PyTorch extension that uses XPU, and for that I need to get the queue. This worked with IPEX:

    auto device_type = c10::DeviceType::XPU;
    c10::impl::VirtualGuardImpl impl(device_type);
    c10::Stream c10_stream = impl.getStream(c10::Device(device_type));
    auto& q = torch::xpu::get_queue_from_stream(c10_stream);

But I get:

    /shared/tpi0/users/rscohn1/projects/cutlass/spex/examples/my-sycl/add_.cpp:12:27: error: no member named 'get_queue_from_stream' in namespace 'torch::xpu'
       12 |     auto& q = torch::xpu::get_queue_from_stream(c10_stream);
          |               ~~~~~~~~~~~~^

I also tried this:

    auto& q = at::xpu::getCurrentSYCLQueue();

and

    c10::xpu::getCurrentXPUStream().queue();

But neither of them is recognized. How do you get the queue? Should extensions that worked with IPEX still work? Are there examples?

From looking at pytorch xpu implementation, I think I want to do something like this:

    torch::Tensor add_(torch::Tensor a, torch::Tensor b) {
        // Check if the input tensors are on the same device
        if (a.device() != b.device()) {
            throw std::invalid_argument("Input tensors must be on the same device");
        }
        auto q = torch::xpu::getCurrentXPUStream(a.device().index());

Which means that I have to explicitly include:

    #include <c10/xpu/XPUStream.h>

But this file transitively includes xpu_cmake_macros.h, which cannot be found:

    In file included from /shared/tpi0/users/rscohn1/projects/cutlass/spex/.venv/lib/python3.10/site-packages/torch/include/c10/xpu/XPUStream.h:5:
    In file included from /shared/tpi0/users/rscohn1/projects/cutlass/spex/.venv/lib/python3.10/site-packages/torch/include/c10/xpu/XPUFunctions.h:4:
    In file included from /shared/tpi0/users/rscohn1/projects/cutlass/spex/.venv/lib/python3.10/site-packages/torch/include/c10/xpu/XPUDeviceProp.h:3:
    /shared/tpi0/users/rscohn1/projects/cutlass/spex/.venv/lib/python3.10/site-packages/torch/include/c10/xpu/XPUMacros.h:4:10: fatal error: 'c10/xpu/impl/xpu_cmake_macros.h' file not found
        4 | #include <c10/xpu/impl/xpu_cmake_macros.h>

and that file is not installed with PyTorch:

    rscohn1@gkdse-dnp-23:spex$ find .venv -name 'xpu_cmake*'
    rscohn1@gkdse-dnp-23:spex$

Hi @Robert_Cohn, sorry for the trouble. To clarify: get_queue_from_stream is an API from before IPEX 2.1. From IPEX 2.3 onward, please use c10::xpu::getCurrentXPUStream().queue() to obtain the SYCL queue.
BTW, the missing c10/xpu/impl/xpu_cmake_macros.h looks like a bug. I will check and fix it as soon as possible, and I will keep you informed of any updates.
Thanks very much.


Fixed in pytorch/pytorch pull request #132847, "Add xpu_cmake_macros.h to xpu build", by guangyey.

It works now. Thanks!


Glad I could help.
Any feedback is welcome.