Problems with GPU allocation in Windows Visual Studio 2017

I am getting the error

'is_available': is not a member of 'at::cuda'

when compiling the following code:

int main(int argc, const char* argv[]) {
	//if (argc != 2) {
	//	std::cerr << "usage: example-app <path-to-exported-script-module>\n";
	//	return -1;
	//}

	torch::Device device = torch::kCPU;
	if (torch::cuda::is_available()) {
		std::cout << "CUDA is available! Training on GPU." << std::endl;
		device = torch::kCUDA;
	}

	// Hard-coded model path (overrides any command-line argument)
	argv[1] = "C:/Users/gjben/Documents/PytorchCPP/torch_example/x64/Release/model.pt";
	// Deserialize the ScriptModule from a file using torch::jit::load().
	std::shared_ptr<torch::jit::script::Module> module = torch::jit::load(argv[1]);

	assert(module != nullptr);
	std::cout << "ok\n";

	// Create a vector of inputs.
	std::vector<torch::jit::IValue> inputs;
	inputs.push_back(torch::ones({ 1, 3, 224, 224 }));

	auto begin = std::chrono::high_resolution_clock::now();
	// Execute the model and turn its output into a tensor.
	auto output = module->forward(inputs).toTensor();
	auto end = std::chrono::high_resolution_clock::now();
	std::cout << std::chrono::duration_cast<std::chrono::milliseconds>(end - begin).count() << " ms" << std::endl;

	std::cout << output.slice(/*dim=*/1, /*start=*/0, /*end=*/5) << '\n';

	return 0;
}
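(For reference, torch::cuda::is_available() is declared in torch/cuda.h in the LibTorch C++ API, not in at::cuda as the error message suggests. A minimal sketch that isolates the failing call, assuming that header ships with this build:)

#include <torch/cuda.h>  // declares torch::cuda::is_available() (assumption: present in this build)
#include <iostream>

int main() {
	// Print whether LibTorch can see a CUDA device.
	std::cout << std::boolalpha
		<< "CUDA available: " << torch::cuda::is_available() << '\n';
	return 0;
}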

Interestingly, if I use Python I can allocate tensors on the GPU.
I have tested with the following code, and nvidia-smi reports the usage of my GPU without problems.

import torch
a = torch.rand(20000,20000).cuda()
while True:
	a += 1
	a -= 1

Is this a problem with LibTorch?
Should I rebuild PyTorch, manually setting TORCH_CUDA_ARCH_LIST?

I have installed PyTorch from source (using Ninja), and here is my environment:

PyTorch version: 1.2.0a0+f2623c7
Is debug build: No
CUDA used to build PyTorch: 10.0

OS: Microsoft Windows 10 Home
GCC version: (tdm64-1) 5.1.0
Visual Studio version: 2017 v15.9.12
CMake version: version 3.14.5

Python version: 3.6
Is CUDA available: Yes
CUDA runtime version: 10.0.130
GPU models and configuration: GPU 0: GeForce GTX 1070
Nvidia driver version: 425.25
cuDNN version: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\bin\cudnn64_7.dll

Versions of relevant libraries:
[pip] numpy==1.12.1
[pip] numpydoc==0.6.0
[pip] torch==1.2.0a0+f2623c7
[conda] mkl                       2019.0                    <pip>
[conda] mkl                       2017.0.1                      0
[conda] mkl-include               2019.0                    <pip>
[conda] mkl-service               1.1.2                    py36_3
[conda] torch                     1.2.0a0+f2623c7           <pip>

@peterjc123 Sorry for bothering you again, but maybe you can help me.

If you are using the VS GUI, please remember to add the include dirs and the link libraries (e.g. c10.lib, torch.lib, etc.). Also, since the link between caffe2 and caffe2_gpu is broken, please write LoadLibraryA("caffe2_gpu.dll"); as the first line of the main function.
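A minimal sketch of that first line, with an error check added (the check is an addition here, not part of the original advice):

#include <windows.h>
#include <iostream>

int main() {
	// Force-load the CUDA backend before any torch call, because the
	// implicit link between caffe2.dll and caffe2_gpu.dll is broken.
	if (LoadLibraryA("caffe2_gpu.dll") == NULL) {
		std::cerr << "could not load caffe2_gpu.dll, error " << GetLastError() << '\n';
		return -1;
	}
	// ... rest of the program (torch::jit::load etc.) goes here ...
	return 0;
}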

Thanks for your reply.
I’ve already added the directories as follows:

- Linker/Input/Additional Dependencies:
torch.lib;caffe2.lib;caffe2_gpu.lib;c10.lib;c10_cuda.lib
- VC++ Directories/Library Directories:
C:\pytorch\torch\lib
- VC++ Directories/Include Directories:
C:\pytorch\torch\csrc\api\include; C:\pytorch\torch\include;

Also, I added LoadLibraryA("caffe2_gpu.dll"); as the first line of the main function. However, I still get the same errors.

Do you have #include <ATen/ATen.h>?

Yes, these are the headers I include:

#include "pch.h"
#include <torch/script.h> // One-stop header.
#include <ATen/ATen.h>
#include <windows.h>
#include <iostream>
#include <memory>

Would you please try replacing torch/script.h with torch/extension.h?

Maybe it is #include <torch/torch.h>?
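(Side by side, the two suggestions amount to changing only the first LibTorch include, keeping the rest of the list unchanged; torch/torch.h is the C++ frontend umbrella header, which should pull in torch/cuda.h where is_available() lives:)

#include "pch.h"
//#include <torch/script.h>   // original one-stop header
#include <torch/extension.h>  // suggestion 1 (note: intended for Python extension builds)
//#include <torch/torch.h>    // suggestion 2: C++ frontend umbrella header
#include <windows.h>
#include <iostream>
#include <memory>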

You did not use the variable “device” anywhere in your code.

Try this:

// Loading your model
const std::string s_model_name = argv[1];
std::cout << " >>> Loading " << s_model_name << std::endl;
auto module = torch::jit::load(s_model_name, torch::kCUDA); // this is what you should use
assert(module != nullptr);

And then, for your input tensor:

inputs.emplace_back(input_tensor.to(torch::kCUDA)); // move the input to the GPU as well
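Putting the replies together, a minimal end-to-end sketch (assuming the same LibTorch build as above, where torch::jit::load returns a shared_ptr, and that torch/cuda.h is available):

#include <torch/script.h>
#include <torch/cuda.h>  // for torch::cuda::is_available() (assumption: present in this build)
#include <iostream>
#include <memory>
#include <vector>

int main(int argc, const char* argv[]) {
	// Pick the device once and use it everywhere.
	torch::Device device = torch::cuda::is_available() ? torch::kCUDA : torch::kCPU;

	// Load the ScriptModule directly onto the chosen device
	// (this LibTorch version returns a shared_ptr, hence the ->).
	auto module = torch::jit::load(argv[1], device);

	// Build the input on the same device so forward() does not mix devices.
	std::vector<torch::jit::IValue> inputs;
	inputs.emplace_back(torch::ones({ 1, 3, 224, 224 }).to(device));

	auto output = module->forward(inputs).toTensor();
	std::cout << output.slice(/*dim=*/1, /*start=*/0, /*end=*/5) << '\n';
	return 0;
}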