I have a card with CUDA compute capability 3.0, so I always have to build PyTorch from source for it.
But there is no magma package for CUDA 10.2, so what do people do if they want to build from source?
Installing CUDA on Fedora pulls in the latest version, and downgrading it would be a major hassle.
Edit: When I proceed without magma, as suggested, the build fails after a few seconds:
-- Generating done
-- Build files have been written to: /home/aaa/000git/pytorch/build
cmake --build . --target install --config Release -- -j 8
[16/3522] Performing build step for 'nccl_external'
FAILED: nccl_external-prefix/src/nccl_external-stamp/nccl_external-build nccl/lib/libnccl_static.a
cd /home/aaa/000git/pytorch/third_party/nccl/nccl && env CCACHE_DISABLE=1 SCCACHE_DISABLE=1 make CXX=/usr/bin/c++ CUDA_HOME=/usr/local/cuda NVCC=/usr/local/cuda/bin/nvcc NVCC_GENCODE=-gencode=arch=compute_30,code=sm_30 BUILDDIR=/home/aaa/000git/pytorch/build/nccl VERBOSE=0 -j && /home/aaa/anaconda3/bin/cmake -E touch /home/aaa/000git/pytorch/build/nccl_external-prefix/src/nccl_external-stamp/nccl_external-build
make -C src build BUILDDIR=/home/aaa/000git/pytorch/build/nccl
make[1]: Entering directory '/home/aaa/000git/pytorch/third_party/nccl/nccl/src'
Grabbing include/nccl_net.h > /home/aaa/000git/pytorch/build/nccl/include/nccl_net.h
Compiling init.cc > /home/aaa/000git/pytorch/build/nccl/obj/init.o
Generating nccl.h.in > /home/aaa/000git/pytorch/build/nccl/include/nccl.h
Compiling channel.cc > /home/aaa/000git/pytorch/build/nccl/obj/channel.o
Compiling bootstrap.cc > /home/aaa/000git/pytorch/build/nccl/obj/bootstrap.o
Compiling transport.cc > /home/aaa/000git/pytorch/build/nccl/obj/transport.o
Compiling enqueue.cc > /home/aaa/000git/pytorch/build/nccl/obj/enqueue.o
Compiling misc/group.cc > /home/aaa/000git/pytorch/build/nccl/obj/misc/group.o
Compiling misc/nvmlwrap.cc > /home/aaa/000git/pytorch/build/nccl/obj/misc/nvmlwrap.o
Compiling misc/rings.cc > /home/aaa/000git/pytorch/build/nccl/obj/misc/rings.o
Compiling misc/ibvwrap.cc > /home/aaa/000git/pytorch/build/nccl/obj/misc/ibvwrap.o
Compiling misc/argcheck.cc > /home/aaa/000git/pytorch/build/nccl/obj/misc/argcheck.o
Compiling misc/utils.cc > /home/aaa/000git/pytorch/build/nccl/obj/misc/utils.o
Compiling misc/trees.cc > /home/aaa/000git/pytorch/build/nccl/obj/misc/trees.o
Compiling misc/topo.cc > /home/aaa/000git/pytorch/build/nccl/obj/misc/topo.o
Compiling transport/p2p.cc > /home/aaa/000git/pytorch/build/nccl/obj/transport/p2p.o
Compiling transport/shm.cc > /home/aaa/000git/pytorch/build/nccl/obj/transport/shm.o
Compiling transport/net.cc > /home/aaa/000git/pytorch/build/nccl/obj/transport/net.o
Compiling transport/net_socket.cc > /home/aaa/000git/pytorch/build/nccl/obj/transport/net_socket.o
Compiling transport/net_ib.cc > /home/aaa/000git/pytorch/build/nccl/obj/transport/net_ib.o
Compiling collectives/all_reduce.cc > /home/aaa/000git/pytorch/build/nccl/obj/collectives/all_reduce.o
Compiling collectives/all_gather.cc > /home/aaa/000git/pytorch/build/nccl/obj/collectives/all_gather.o
Compiling collectives/broadcast.cc > /home/aaa/000git/pytorch/build/nccl/obj/collectives/broadcast.o
Compiling collectives/reduce.cc > /home/aaa/000git/pytorch/build/nccl/obj/collectives/reduce.o
Compiling collectives/reduce_scatter.cc > /home/aaa/000git/pytorch/build/nccl/obj/collectives/reduce_scatter.o
Generating nccl.pc.in > /home/aaa/000git/pytorch/build/nccl/lib/pkgconfig/nccl.pc
make[2]: Entering directory '/home/aaa/000git/pytorch/third_party/nccl/nccl/src/collectives/device'
Generating rules > /home/aaa/000git/pytorch/build/nccl/obj/collectives/device/Makefile.rules
In file included from bootstrap.cc:12:
include/socket.h: In function ‘ncclResult_t connectAddress(int*, socketAddress*)’:
include/socket.h:41:19: warning: ‘<’ directive writing 1 byte into a region of size between 0 and 1024 [-Wformat-overflow=]
41 | sprintf(buf, "%s<%s>", host, service);
| ^
include/socket.h:41:10: note: ‘sprintf’ output between 3 and 1058 bytes into a destination of size 1024
41 | sprintf(buf, "%s<%s>", host, service);
| ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
include/socket.h: In function ‘int findInterfaceMatchSubnet(char*, socketAddress*, socketAddress, int, int)’:
include/socket.h:41:19: warning: ‘<’ directive writing 1 byte into a region of size between 0 and 1024 [-Wformat-overflow=]
41 | sprintf(buf, "%s<%s>", host, service);
| ^
include/socket.h:41:10: note: ‘sprintf’ output between 3 and 1058 bytes into a destination of size 1024
41 | sprintf(buf, "%s<%s>", host, service);
| ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from transport/net_socket.cc:9:
include/socket.h: In function ‘ncclResult_t connectAddress(int*, socketAddress*)’:
include/socket.h:41:19: warning: ‘<’ directive writing 1 byte into a region of size between 0 and 1024 [-Wformat-overflow=]
41 | sprintf(buf, "%s<%s>", host, service);
| ^
include/socket.h:41:10: note: ‘sprintf’ output between 3 and 1058 bytes into a destination of size 1024
41 | sprintf(buf, "%s<%s>", host, service);
| ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /usr/local/cuda/bin/../targets/x86_64-linux/include/cuda_runtime.h:83,
from <command-line>:
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/host_config.h:138:2: error: #error -- unsupported GNU version! gcc versions later than 8 are not supported!
138 | #error -- unsupported GNU version! gcc versions later than 8 are not supported!
| ^~~~~
In file included from /usr/local/cuda/bin/../targets/x86_64-linux/include/cuda_runtime.h:83,
from <command-line>:
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/host_config.h:138:2: error: #error -- unsupported GNU version! gcc versions later than 8 are not supported!
138 | #error -- unsupported GNU version! gcc versions later than 8 are not supported!
| ^~~~~
In file included from /usr/local/cuda/bin/../targets/x86_64-linux/include/cuda_runtime.h:83,
from <command-line>:
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/host_config.h:138:2: error: #error -- unsupported GNU version! gcc versions later than 8 are not supported!
138 | #error -- unsupported GNU version! gcc versions later than 8 are not supported!
| ^~~~~
include/socket.h: In function ‘ncclResult_t ncclSocketInit(ncclDebugLogger_t)’:
include/socket.h:41:19: warning: ‘<’ directive writing 1 byte into a region of size between 0 and 1024 [-Wformat-overflow=]
41 | sprintf(buf, "%s<%s>", host, service);
| ^
include/socket.h:41:10: note: ‘sprintf’ output between 3 and 1058 bytes into a destination of size 1024
41 | sprintf(buf, "%s<%s>", host, service);
| ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
include/socket.h:41:19: warning: ‘<’ directive writing 1 byte into a region of size between 0 and 1024 [-Wformat-overflow=]
41 | sprintf(buf, "%s<%s>", host, service);
| ^
include/socket.h:41:10: note: ‘sprintf’ output between 3 and 1058 bytes into a destination of size 1024
41 | sprintf(buf, "%s<%s>", host, service);
| ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
transport/net_socket.cc:40:67: warning: ‘%s’ directive output may be truncated writing up to 1023 bytes into a region of size between 1017 and 1018 [-Wformat-truncation=]
40 | snprintf(line+strlen(line), 1023-strlen(line), " [%d]%s:%s", i, ncclNetIfNames+i*MAX_IF_NAME_SIZE,
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
41 | socketToString(&ncclNetIfAddrs[i].sa, addrline));
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
transport/net_socket.cc:40:19: note: ‘snprintf’ output 6 or more bytes (assuming 1030) into a destination of size 1023
40 | snprintf(line+strlen(line), 1023-strlen(line), " [%d]%s:%s", i, ncclNetIfNames+i*MAX_IF_NAME_SIZE,
| ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
41 | socketToString(&ncclNetIfAddrs[i].sa, addrline));
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /usr/local/cuda/bin/../targets/x86_64-linux/include/cuda_runtime.h:83,
from <command-line>:
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/host_config.h:138:2: error: #error -- unsupported GNU version! gcc versions later than 8 are not supported!
138 | #error -- unsupported GNU version! gcc versions later than 8 are not supported!
| ^~~~~
In file included from /usr/local/cuda/bin/../targets/x86_64-linux/include/cuda_runtime.h:83,
from <command-line>:
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/host_config.h:138:2: error: #error -- unsupported GNU version! gcc versions later than 8 are not supported!
138 | #error -- unsupported GNU version! gcc versions later than 8 are not supported!
| ^~~~~
In file included from /usr/local/cuda/bin/../targets/x86_64-linux/include/cuda_runtime.h:83,
from <command-line>:
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/host_config.h:138:2: error: #error -- unsupported GNU version! gcc versions later than 8 are not supported!
138 | #error -- unsupported GNU version! gcc versions later than 8 are not supported!
| ^~~~~
In file included from transport/net_ib.cc:9:
include/socket.h: In function ‘int findInterfaces(const char*, char*, socketAddress*, int, int, int)’:
include/socket.h:108:14: warning: ‘char* strncpy(char*, const char*, size_t)’ specified bound 16 equals destination size [-Wstringop-truncation]
108 | strncpy(names+found*maxIfNameSize, interface->ifa_name, maxIfNameSize);
| ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
include/socket.h:108:14: warning: ‘char* strncpy(char*, const char*, size_t)’ specified bound 16 equals destination size [-Wstringop-truncation]
include/socket.h: In function ‘ncclResult_t bootstrapNetInit()’:
include/socket.h:41:19: warning: ‘<’ directive writing 1 byte into a region of size between 0 and 1024 [-Wformat-overflow=]
41 | sprintf(buf, "%s<%s>", host, service);
| ^
include/socket.h:41:10: note: ‘sprintf’ output between 3 and 1058 bytes into a destination of size 1024
41 | sprintf(buf, "%s<%s>", host, service);
| ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
bootstrap.cc:44:67: warning: ‘%s’ directive output may be truncated writing up to 1023 bytes into a region of size between 1017 and 1018 [-Wformat-truncation=]
44 | snprintf(line+strlen(line), 1023-strlen(line), " [%d]%s:%s", i, bootstrapNetIfNames+i*MAX_IF_NAME_SIZE,
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
45 | socketToString(&bootstrapNetIfAddrs[i].sa, addrline));
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
bootstrap.cc:44:19: note: ‘snprintf’ output 6 or more bytes (assuming 1030) into a destination of size 1023
44 | snprintf(line+strlen(line), 1023-strlen(line), " [%d]%s:%s", i, bootstrapNetIfNames+i*MAX_IF_NAME_SIZE,
| ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
45 | socketToString(&bootstrapNetIfAddrs[i].sa, addrline));
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
make[2]: *** [Makefile:53: /home/aaa/000git/pytorch/build/nccl/obj/collectives/device/all_gather.dep] Error 1
make[2]: *** Waiting for unfinished jobs....
include/socket.h: In function ‘ncclResult_t ncclIbInit(ncclDebugLogger_t)’:
include/socket.h:41:19: warning: ‘<’ directive writing 1 byte into a region of size between 0 and 1024 [-Wformat-overflow=]
41 | sprintf(buf, "%s<%s>", host, service);
| ^
make[2]: *** [Makefile:53: /home/aaa/000git/pytorch/build/nccl/obj/collectives/device/all_reduce.dep] Error 1
include/socket.h:41:10: note: ‘sprintf’ output between 3 and 1058 bytes into a destination of size 1024
41 | sprintf(buf, "%s<%s>", host, service);
| ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
include/socket.h:41:19: warning: ‘<’ directive writing 1 byte into a region of size between 0 and 1024 [-Wformat-overflow=]
41 | sprintf(buf, "%s<%s>", host, service);
| ^
include/socket.h:41:10: note: ‘sprintf’ output between 3 and 1058 bytes into a destination of size 1024
41 | sprintf(buf, "%s<%s>", host, service);
| ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In function ‘int findInterfaces(const char*, char*, socketAddress*, int, int, int)’,
inlined from ‘int findInterfaces(char*, socketAddress*, int, int)’ at include/socket.h:294:26,
inlined from ‘ncclResult_t ncclIbInit(ncclDebugLogger_t)’ at transport/net_ib.cc:97:25:
include/socket.h:108:14: warning: ‘char* strncpy(char*, const char*, size_t)’ specified bound 16 equals destination size [-Wstringop-truncation]
108 | strncpy(names+found*maxIfNameSize, interface->ifa_name, maxIfNameSize);
| ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
include/socket.h:108:14: warning: ‘char* strncpy(char*, const char*, size_t)’ specified bound 16 equals destination size [-Wstringop-truncation]
In function ‘int findInterfaceMatchSubnet(char*, socketAddress*, socketAddress, int, int)’,
inlined from ‘int findInterfaces(char*, socketAddress*, int, int)’ at include/socket.h:302:40,
inlined from ‘ncclResult_t ncclIbInit(ncclDebugLogger_t)’ at transport/net_ib.cc:97:25:
include/socket.h:189:12: warning: ‘char* strncpy(char*, const char*, size_t)’ specified bound 16 equals destination size [-Wstringop-truncation]
189 | strncpy(ifNames+found*ifNameMaxSize, interface->ifa_name, ifNameMaxSize);
| ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
make[2]: *** [Makefile:53: /home/aaa/000git/pytorch/build/nccl/obj/collectives/device/functions.dep] Error 1
make[2]: *** [Makefile:53: /home/aaa/000git/pytorch/build/nccl/obj/collectives/device/reduce_scatter.dep] Error 1
make[2]: *** [Makefile:53: /home/aaa/000git/pytorch/build/nccl/obj/collectives/device/broadcast.dep] Error 1
make[2]: *** [Makefile:53: /home/aaa/000git/pytorch/build/nccl/obj/collectives/device/reduce.dep] Error 1
make[2]: Leaving directory '/home/aaa/000git/pytorch/third_party/nccl/nccl/src/collectives/device'
make[1]: *** [Makefile:49: /home/aaa/000git/pytorch/build/nccl/obj/collectives/device/colldevice.a] Error 2
make[1]: *** Waiting for unfinished jobs....
include/socket.h: In function ‘ncclResult_t connectAddress(int*, socketAddress*)’:
include/socket.h:41:19: warning: ‘<’ directive writing 1 byte into a region of size between 0 and 1024 [-Wformat-overflow=]
41 | sprintf(buf, "%s<%s>", host, service);
| ^
include/socket.h:41:10: note: ‘sprintf’ output between 3 and 1058 bytes into a destination of size 1024
41 | sprintf(buf, "%s<%s>", host, service);
| ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from init.cc:10:
include/param.h: In function ‘void setEnvFile(const char*)’:
include/param.h:37:12: warning: ‘char* strncpy(char*, const char*, size_t)’ specified bound 1024 equals destination size [-Wstringop-truncation]
37 | strncpy(envValue, line+s, 1024);
| ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~
make[1]: Leaving directory '/home/aaa/000git/pytorch/third_party/nccl/nccl/src'
make: *** [Makefile:25: src.build] Error 2
[23/3522] Building CXX object third_party/protobuf/cmake/CMakeFiles/libprotobuf-lite.dir/__/src/google/protobuf/extension_set.cc.o
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "setup.py", line 737, in <module>
    build_deps()
  File "setup.py", line 316, in build_deps
    cmake=cmake)
  File "/home/aaa/000git/pytorch/tools/build_pytorch_libs.py", line 62, in build_caffe2
    cmake.build(my_env)
  File "/home/aaa/000git/pytorch/tools/setup_helpers/cmake.py", line 339, in build
    self.run(build_args, my_env)
  File "/home/aaa/000git/pytorch/tools/setup_helpers/cmake.py", line 141, in run
    check_call(command, cwd=self.build_dir, env=env)
  File "/home/aaa/anaconda3/lib/python3.7/subprocess.py", line 347, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['cmake', '--build', '.', '--target', 'install', '--config', 'Release', '--', '-j', '8']' returned non-zero exit status 1.
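
Most of the output above is warnings. The only lines that actually stop the build seem to be the repeated `#error` from `host_config.h` about the GCC version. Below is a minimal sketch for pulling just the distinct compiler errors out of a log like this one, assuming the build output has been saved as text. The name `unique_errors` and the shortened sample are only for illustration and are not part of PyTorch's build tooling.

```python
import re

# Matches GCC-style diagnostics of the form "path:line:col: error: message".
# Warnings and notes use the same layout but a different severity word,
# so they do not match.
ERROR_RE = re.compile(
    r"^(?P<file>[^\s:]+):(?P<line>\d+):(?P<col>\d+): error: (?P<msg>.*)$"
)


def unique_errors(log_text):
    """Return distinct (file, line, message) tuples in first-seen order."""
    seen = []
    for raw in log_text.splitlines():
        match = ERROR_RE.match(raw.strip())
        if match is None:
            continue
        key = (match.group("file"), int(match.group("line")), match.group("msg"))
        if key not in seen:
            seen.append(key)
    return seen


# A few lines copied from the log above, shortened for illustration.
sample_log = """\
Compiling init.cc > /home/aaa/000git/pytorch/build/nccl/obj/init.o
include/socket.h:41:19: warning: ‘<’ directive writing 1 byte into a region of size between 0 and 1024 [-Wformat-overflow=]
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/host_config.h:138:2: error: #error -- unsupported GNU version! gcc versions later than 8 are not supported!
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/host_config.h:138:2: error: #error -- unsupported GNU version! gcc versions later than 8 are not supported!
make[2]: *** Waiting for unfinished jobs....
"""

errors = unique_errors(sample_log)
for path, line_no, message in errors:
    print(f"{path}:{line_no}: {message}")
```

Run against the full log, this should collapse the six identical `host_config.h` errors into one entry, which makes it easier to see that the failure comes from the host compiler check and not from the `sprintf`/`strncpy` warnings.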