I’m trying to study the low-level implementation of some neural network computations. I built PyTorch from source with the following command:
DEBUG=1 NO_CUDA=1 python setup.py develop
I’ve succeeded in debugging torch.add at the C++ source level with cgdb (I set a breakpoint on the at::native::add function). But what I want to analyze is the SIMD instructions. I know PyTorch has added specialized AVX and AVX2 intrinsics for Tensor operations. AVX (not AVX2) is what I’m interested in, but I cannot step down to that level.
- aten/src/TH/vector/AVX.cpp
- aten/src/ATen/cpu/vec256/*
I think the source files above are the candidates, but setting breakpoints on functions inside them failed. The code I’m running is a simple add/sub operation and the CNN code from the PyTorch tutorial, but neither seems to reach the functions in those source files.
I’m curious which PyTorch operations (called from Python) actually end up executing the AVX code paths. I’d also like to know whether I missed some option when I built it.
I’ve googled this, but I couldn’t figure it out. A lot of the advice on this kind of topic deals with CMake, which I couldn’t follow because I’m not familiar with it.
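One thing that may be worth ruling out on the hardware side is which SIMD extensions the machine itself reports. I’m assuming the kernel that actually runs depends on what the CPU supports, so a CPU that reports AVX2 might never reach the plain AVX path. The following is only a minimal sketch for Linux: it reads /proc/cpuinfo, and the helper name simd_flags_from_cpuinfo and the list of flags are my own choices, not anything from PyTorch.

```python
from pathlib import Path

# Flags of interest; this list is an illustrative assumption, not exhaustive.
SIMD_FLAGS = ("sse4_1", "sse4_2", "avx", "avx2", "fma", "avx512f")


def simd_flags_from_cpuinfo(text):
    """Return the SIMD flags found on the first 'flags' line of cpuinfo text.

    Hypothetical helper: it only understands the x86 Linux layout, where
    each logical CPU has a line such as 'flags : fpu ... avx avx2 ...'.
    """
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            present = set(value.split())
            # Keep the order of SIMD_FLAGS so the output is stable.
            return [flag for flag in SIMD_FLAGS if flag in present]
    return []


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        print(simd_flags_from_cpuinfo(cpuinfo.read_text()))
    else:
        print("no /proc/cpuinfo here; this sketch only covers Linux")
```

If the printed list contains both avx and avx2, that could be one reason the AVX-only functions are never hit, though I have not confirmed that this is what happens in my build.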