PyTorch and Bazel

Hello!

I’m creating PyTorch C++ extensions and building them with Bazel.

The documentation (source) explains how to build your extensions with either setuptools or JIT. However, Bazel likes to take the building into its own hands.

I’d like to get some advice on how to proceed. I currently have two working solutions:

  1. Hacky solution
    Per extension, create a Bazel genrule that just invokes a python setup.py build and sets the resulting .so file as the output artifact. This can then be loaded in the code. This leverages all the nice abstractions and build arguments that are set through the torch utilities (torch.utils.cpp_extension.$). See the sketch right after this list.

  2. Proper solution
    Create the library through Bazel’s cc_library. This is nice, but everything (arguments, flags, includes, directories, etc.) needs to be set manually.
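
For reference, a minimal sketch of what the genrule in option 1 could look like. Every name here (my_op, my_op/setup.py) is a made-up placeholder, and depending on your sandbox settings and Python toolchain the command may need tweaking:

genrule(
    name = "my_op_ext",
    srcs = glob(["my_op/**"]),  # setup.py plus the C++/CUDA sources
    outs = ["my_op.so"],
    cmd = """
        out=$$(pwd)/$@
        cd $$(dirname $(location my_op/setup.py))
        python setup.py build
        cp build/lib*/my_op*.so $$out
    """,
)

All compiler and linker flags still come from torch.utils.cpp_extension inside setup.py, but the build is opaque to Bazel (no incrementality, no real dependency tracking), which is what makes it hacky.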

Is anyone using PyTorch extensions with Bazel already, or does anyone have any general advice here?


Just to help out some other people, here is the gist of it. The solution does require having already set up for Bazel (1) the Python headers, (2) the pip requirements, and (3) CUDA.
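
In my setup these come from the WORKSPACE: @pip_deps for the pip packages, @local_config_cuda for the CUDA toolchain, and a //third_party/python:headers target for the Python headers. As a rough, hypothetical illustration only (rule names and versions are assumptions; note also that the requirement(..., target = "cpp") helper used in the macro below comes from a custom pip setup, not from stock rules_python):

load("@rules_python//python:pip.bzl", "pip_parse")

pip_parse(
    name = "pip_deps",
    requirements_lock = "//:requirements.txt",  # pins torch, among others
)

load("@pip_deps//:requirements.bzl", "install_deps")

install_deps()

# @local_config_cuda is typically produced by a cuda_configure-style repository
# rule (as in the TensorFlow build) that detects the local CUDA toolkit.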

Create a .bzl file containing something like

load("@local_config_cuda//cuda:build_defs.bzl", "if_cuda")
load("@local_config_cuda//cuda:build_defs.bzl", "cuda_default_copts")

load("@pip_deps//:requirements.bzl", "requirement")

def pytorch_cpp_extension(name, srcs=[], gpu_srcs=[], deps=[], copts=[], defines=[],  
                          binary=True, linkopts=[]):
    """Create a pytorch cpp extension as a cpp and importable python library.
    
    All options defined below should stay close to the official torch cpp extension options as
    defined in https://github.com/pytorch/pytorch/blob/master/torch/utils/cpp_extension.py.
    """
    name_so = name + ".so"
    torch_deps = [
        requirement("torch", target = "cpp"),
    ]

    cuda_deps = [
        "@local_config_cuda//cuda:cudart_static",
        "@local_config_cuda//cuda:cuda_headers",
    ]

    # These flags mirror what torch.utils.cpp_extension passes to the compiler;
    # in particular, _GLIBCXX_USE_CXX11_ABI must match how your torch package was built.
    copts = copts + [
        "-fPIC",
        "-D_GLIBCXX_USE_CXX11_ABI=0",
        "-DTORCH_API_INCLUDE_EXTENSION_H",
        "-fno-strict-aliasing",
        "-fopenmp",
        "-fstack-protector-strong",
        "-fwrapv",
        "-O2",
        "-std=c++14",
        "-DTORCH_EXTENSION_NAME=" + name,
    ]

    if gpu_srcs:
        native.cc_library(
            name = name_so + "_gpu",
            srcs = gpu_srcs,
            deps = deps + torch_deps + if_cuda(cuda_deps),
            copts = copts + cuda_default_copts(),
            defines = defines,
            linkopts = linkopts,
        )
        cuda_deps.extend([":" + name_so + "_gpu"])

    if binary:
        native.cc_binary(
            name = name_so,
            srcs = srcs,
            deps = deps + torch_deps + if_cuda(cuda_deps),
            linkshared = 1,
            copts = copts,
            defines = defines,
            linkopts = linkopts,
        )
    else:
        native.cc_library(
            name = name_so,
            srcs = srcs,
            deps = deps + torch_deps + if_cuda(cuda_deps),
            copts = copts,
            defines = defines,
            linkopts = linkopts,
        )

    native.py_library(
        name = name,
        data = [":" + name_so],
    )
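
With that macro, a BUILD file for an extension then looks roughly like this (target and file names are made up, and I’m assuming you saved the macro as //tools:pytorch_cpp_extension.bzl):

load("//tools:pytorch_cpp_extension.bzl", "pytorch_cpp_extension")

pytorch_cpp_extension(
    name = "my_op",
    srcs = ["my_op.cpp"],
    gpu_srcs = ["my_op_kernel.cu.cc"],
)

Since -DTORCH_EXTENSION_NAME=my_op is set, my_op.so is a normal Python extension module, so code depending on the generated py_library can simply import my_op, provided the .so ends up somewhere on sys.path at runtime.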

And be sure you can actually require torch as a cpp target library, like so:

genrule_directory(
    name = "include",
    srcs = [":extracted"],
    cmd = "mkdir -p $@ && cp -a $</torch/lib/include/. $@",
)

# NOTE: Make sure this yields the same includes as `include_paths()`:
# See https://github.com/pytorch/pytorch/blob/master/torch/utils/cpp_extension.py#L494

cc_library(
    name = "cpp",
    hdrs = [":include"],
    visibility = ["//visibility:public"],
    includes = [
        "include",
        "include/torch/csrc/api/include",
        "include/TH",
        "include/THC",
    ],
    deps = [
        "@//third_party/python:headers",
    ]
)
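
Here, :extracted is expected to be a directory containing the unpacked torch pip package (it is what $< points at above). One hypothetical way to produce it with the same genrule_directory helper (the @torch_whl label is an assumption about how you fetch the wheel, e.g. via http_file):

genrule_directory(
    name = "extracted",
    srcs = ["@torch_whl//file"],  # the downloaded torch wheel, which is a zip archive
    cmd = "mkdir -p $@ && unzip -q $< -d $@",
)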

I don’t understand where :extracted comes from?

Does anyone else have more context on this?

@TimZaman I’d be extremely interested in understanding your solution to this and helping write a post about this to help other people who might face this issue.

Thanks!

@TimZaman Could you share a little bit more about how the hacky solution is done exactly?

For example, in the description:

Per extension, create a Bazel genrule that just invokes a python setup.py build and sets the resulting .so file as the output artifact. This can then be loaded in the code. This leverages all the nice abstractions and build arguments that are set through the torch utilities (torch.utils.cpp_extension.$)

What do these files look like?

  1. BUILD
  2. The .cpp file of the custom C++ extension
  3. The .py file calling the custom C++ extension

I am also interested in writing a post about this for future reference. Thank you.

@TimZaman Thanks for sharing your solution.
But there’s a step I don’t quite understand: should we put the genrule_directory in each C++ extension’s BUILD file? And how does genrule_directory work?