[ROCm] [Build Error] [caffe2] `cuda.h`/`cuda_runtime_api.h` file not found

🐛 Bug

When building PyTorch with ROCm, compilation fails in Caffe2-related code, even with BUILD_CAFFE2_OPS=0. The HIPCC compile steps stop because `cuda.h` and `cuda_runtime_api.h` cannot be found.

Building wheel torch-1.6.0a0+b31f58d
-- Building version 1.6.0a0+b31f58d
cmake --build . --target install --config Release -- -j 4
[1/231] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/utils/math/torch_hip_generated_broadcast.cu.o
FAILED: caffe2/CMakeFiles/torch_hip.dir/utils/math/torch_hip_generated_broadcast.cu.o
cd /home/acxz/vcs/git/github/rocm-arch/python-pytorch-rocm/src/pytorch-1.6.0-rocm/build/caffe2/CMakeFiles/torch_hip.dir/utils/math && /usr/bin/cmake -E make_directory /home/acxz/vcs/git/github/rocm-arch/python-pytorch-rocm/src/pytorch-1.6.0-rocm/build/caffe2/CMakeFiles/torch_hip.dir/utils/math/. && /usr/bin/cmake -D verbose:BOOL=OFF -D build_configuration:STRING=RELEASE -D generated_file:STRING=/home/acxz/vcs/git/github/rocm-arch/python-pytorch-rocm/src/pytorch-1.6.0-rocm/build/caffe2/CMakeFiles/torch_hip.dir/utils/math/./torch_hip_generated_broadcast.cu.o -P /home/acxz/vcs/git/github/rocm-arch/python-pytorch-rocm/src/pytorch-1.6.0-rocm/build/caffe2/CMakeFiles/torch_hip.dir/utils/math/torch_hip_generated_broadcast.cu.o.cmake
In file included from /home/acxz/vcs/git/github/rocm-arch/python-pytorch-rocm/src/pytorch-1.6.0-rocm/caffe2/utils/math/broadcast.cu:3:
In file included from /home/acxz/vcs/git/github/rocm-arch/python-pytorch-rocm/src/pytorch-1.6.0-rocm/caffe2/core/context_gpu.h:8:
/home/acxz/vcs/git/github/rocm-arch/python-pytorch-rocm/src/pytorch-1.6.0-rocm/caffe2/core/common_gpu.h:5:10: fatal error: 'cuda.h' file not found
#include <cuda.h>
         ^~~~~~~~
1 error generated when compiling for host.
CMake Error at torch_hip_generated_broadcast.cu.o.cmake:138 (message):
  Error generating
  /home/acxz/vcs/git/github/rocm-arch/python-pytorch-rocm/src/pytorch-1.6.0-rocm/build/caffe2/CMakeFiles/torch_hip.dir/utils/math/./torch_hip_generated_broadcast.cu.o


[2/231] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/utils/math/torch_hip_generated_transpose.cu.o
FAILED: caffe2/CMakeFiles/torch_hip.dir/utils/math/torch_hip_generated_transpose.cu.o
cd /home/acxz/vcs/git/github/rocm-arch/python-pytorch-rocm/src/pytorch-1.6.0-rocm/build/caffe2/CMakeFiles/torch_hip.dir/utils/math && /usr/bin/cmake -E make_directory /home/acxz/vcs/git/github/rocm-arch/python-pytorch-rocm/src/pytorch-1.6.0-rocm/build/caffe2/CMakeFiles/torch_hip.dir/utils/math/. && /usr/bin/cmake -D verbose:BOOL=OFF -D build_configuration:STRING=RELEASE -D generated_file:STRING=/home/acxz/vcs/git/github/rocm-arch/python-pytorch-rocm/src/pytorch-1.6.0-rocm/build/caffe2/CMakeFiles/torch_hip.dir/utils/math/./torch_hip_generated_transpose.cu.o -P /home/acxz/vcs/git/github/rocm-arch/python-pytorch-rocm/src/pytorch-1.6.0-rocm/build/caffe2/CMakeFiles/torch_hip.dir/utils/math/torch_hip_generated_transpose.cu.o.cmake
In file included from /home/acxz/vcs/git/github/rocm-arch/python-pytorch-rocm/src/pytorch-1.6.0-rocm/caffe2/utils/math/transpose.cu:7:
/home/acxz/vcs/git/github/rocm-arch/python-pytorch-rocm/src/pytorch-1.6.0-rocm/caffe2/core/common_gpu.h:5:10: fatal error: 'cuda.h' file not found
#include <cuda.h>
         ^~~~~~~~
1 error generated when compiling for host.
CMake Error at torch_hip_generated_transpose.cu.o.cmake:138 (message):
  Error generating
  /home/acxz/vcs/git/github/rocm-arch/python-pytorch-rocm/src/pytorch-1.6.0-rocm/build/caffe2/CMakeFiles/torch_hip.dir/utils/math/./torch_hip_generated_transpose.cu.o


[3/231] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/core/torch_hip_generated_context_gpu.cu.o
FAILED: caffe2/CMakeFiles/torch_hip.dir/core/torch_hip_generated_context_gpu.cu.o
cd /home/acxz/vcs/git/github/rocm-arch/python-pytorch-rocm/src/pytorch-1.6.0-rocm/build/caffe2/CMakeFiles/torch_hip.dir/core && /usr/bin/cmake -E make_directory /home/acxz/vcs/git/github/rocm-arch/python-pytorch-rocm/src/pytorch-1.6.0-rocm/build/caffe2/CMakeFiles/torch_hip.dir/core/. && /usr/bin/cmake -D verbose:BOOL=OFF -D build_configuration:STRING=RELEASE -D generated_file:STRING=/home/acxz/vcs/git/github/rocm-arch/python-pytorch-rocm/src/pytorch-1.6.0-rocm/build/caffe2/CMakeFiles/torch_hip.dir/core/./torch_hip_generated_context_gpu.cu.o -P /home/acxz/vcs/git/github/rocm-arch/python-pytorch-rocm/src/pytorch-1.6.0-rocm/build/caffe2/CMakeFiles/torch_hip.dir/core/torch_hip_generated_context_gpu.cu.o.cmake
In file included from /home/acxz/vcs/git/github/rocm-arch/python-pytorch-rocm/src/pytorch-1.6.0-rocm/caffe2/core/context_gpu.cu:8:
In file included from /home/acxz/vcs/git/github/rocm-arch/python-pytorch-rocm/src/pytorch-1.6.0-rocm/c10/cuda/CUDACachingAllocator.h:4:
/home/acxz/vcs/git/github/rocm-arch/python-pytorch-rocm/src/pytorch-1.6.0-rocm/c10/cuda/CUDAStream.h:6:10: fatal error: 'cuda_runtime_api.h' file not found
#include <cuda_runtime_api.h>
         ^~~~~~~~~~~~~~~~~~~~
1 error generated when compiling for host.
CMake Error at torch_hip_generated_context_gpu.cu.o.cmake:138 (message):
  Error generating
  /home/acxz/vcs/git/github/rocm-arch/python-pytorch-rocm/src/pytorch-1.6.0-rocm/build/caffe2/CMakeFiles/torch_hip.dir/core/./torch_hip_generated_context_gpu.cu.o


[4/231] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/utils/math/torch_hip_generated_elementwise.cu.o
FAILED: caffe2/CMakeFiles/torch_hip.dir/utils/math/torch_hip_generated_elementwise.cu.o
cd /home/acxz/vcs/git/github/rocm-arch/python-pytorch-rocm/src/pytorch-1.6.0-rocm/build/caffe2/CMakeFiles/torch_hip.dir/utils/math && /usr/bin/cmake -E make_directory /home/acxz/vcs/git/github/rocm-arch/python-pytorch-rocm/src/pytorch-1.6.0-rocm/build/caffe2/CMakeFiles/torch_hip.dir/utils/math/. && /usr/bin/cmake -D verbose:BOOL=OFF -D build_configuration:STRING=RELEASE -D generated_file:STRING=/home/acxz/vcs/git/github/rocm-arch/python-pytorch-rocm/src/pytorch-1.6.0-rocm/build/caffe2/CMakeFiles/torch_hip.dir/utils/math/./torch_hip_generated_elementwise.cu.o -P /home/acxz/vcs/git/github/rocm-arch/python-pytorch-rocm/src/pytorch-1.6.0-rocm/build/caffe2/CMakeFiles/torch_hip.dir/utils/math/torch_hip_generated_elementwise.cu.o.cmake
In file included from /home/acxz/vcs/git/github/rocm-arch/python-pytorch-rocm/src/pytorch-1.6.0-rocm/caffe2/utils/math/elementwise.cu:10:
In file included from /home/acxz/vcs/git/github/rocm-arch/python-pytorch-rocm/src/pytorch-1.6.0-rocm/caffe2/core/context_gpu.h:8:
/home/acxz/vcs/git/github/rocm-arch/python-pytorch-rocm/src/pytorch-1.6.0-rocm/caffe2/core/common_gpu.h:5:10: fatal error: 'cuda.h' file not found
#include <cuda.h>
         ^~~~~~~~
1 error generated when compiling for host.
CMake Error at torch_hip_generated_elementwise.cu.o.cmake:138 (message):
  Error generating
  /home/acxz/vcs/git/github/rocm-arch/python-pytorch-rocm/src/pytorch-1.6.0-rocm/build/caffe2/CMakeFiles/torch_hip.dir/utils/math/./torch_hip_generated_elementwise.cu.o


ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "setup.py", line 732, in <module>
    build_deps()
  File "setup.py", line 311, in build_deps
    build_caffe2(version=version,
  File "/home/acxz/vcs/git/github/rocm-arch/python-pytorch-rocm/src/pytorch-1.6.0-rocm/tools/build_pytorch_libs.py", line 62, in build_caffe2
    cmake.build(my_env)
  File "/home/acxz/vcs/git/github/rocm-arch/python-pytorch-rocm/src/pytorch-1.6.0-rocm/tools/setup_helpers/cmake.py", line 345, in build
    self.run(build_args, my_env)
  File "/home/acxz/vcs/git/github/rocm-arch/python-pytorch-rocm/src/pytorch-1.6.0-rocm/tools/setup_helpers/cmake.py", line 141, in run
    check_call(command, cwd=self.build_dir, env=env)
  File "/usr/lib/python3.8/subprocess.py", line 364, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['cmake', '--build', '.', '--target', 'install', '--config', 'Release', '--', '-j', '4']' returned non-zero exit status 1.
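
The log above repeats the same diagnostic for several translation units. To see at a glance which headers are missing and which files pull them in, the log can be summarized with a small script. This is only a sketch: the function name `missing_headers` is hypothetical, and the regular expression assumes clang-style diagnostics of the form `<path>:<line>:<col>: fatal error: '<header>' file not found`, as they appear in this log.

```python
import re
from collections import defaultdict

# Matches clang-style diagnostics such as:
#   /path/to/common_gpu.h:5:10: fatal error: 'cuda.h' file not found
MISSING_HEADER = re.compile(
    r"^(?P<path>\S+):(?P<line>\d+):(?P<col>\d+): "
    r"fatal error: '(?P<header>[^']+)' file not found"
)


def missing_headers(log_text):
    """Map each missing header to the sorted list of files that include it."""
    found = defaultdict(set)
    for line in log_text.splitlines():
        match = MISSING_HEADER.match(line.strip())
        if match:
            found[match.group("header")].add(match.group("path"))
    return {header: sorted(paths) for header, paths in found.items()}
```

Applied to the log in this report, it should list `cuda.h` (included from `caffe2/core/common_gpu.h`) and `cuda_runtime_api.h` (included from `c10/cuda/CUDAStream.h`).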

To Reproduce

Steps to reproduce the behavior:
High level:

  1. git clone
  2. python tools/amd_build/build_amd.py
  3. python setup.py build

In practice, I run these steps through the PKGBUILD script in rocm-arch/python-pytorch-rocm.
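
One way to narrow this down after step 2 is to check which sources still include CUDA headers directly. The following is a minimal sketch, not part of the PyTorch tooling: the names `cuda_includes` and `scan_tree` are hypothetical, and it assumes that a "CUDA header" is any include whose name starts with `cuda`. A hit is not necessarily a bug, since the original CUDA sources may legitimately remain in the tree. It only shows where such includes live.

```python
import re
from pathlib import Path

# Assumption: a "CUDA header" is any include whose file name starts with "cuda",
# e.g. <cuda.h> or <cuda_runtime_api.h>.
CUDA_INCLUDE = re.compile(r'^\s*#\s*include\s*[<"](cuda[^>"]*)[>"]')


def cuda_includes(source_text):
    """Return (line_number, header) pairs for CUDA-style includes in one file."""
    hits = []
    for number, line in enumerate(source_text.splitlines(), start=1):
        match = CUDA_INCLUDE.match(line)
        if match:
            hits.append((number, match.group(1)))
    return hits


def scan_tree(root, suffixes=(".h", ".cu", ".cuh", ".cpp")):
    """Walk root and report CUDA-style includes per file, keyed by relative path."""
    root = Path(root)
    report = {}
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            hits = cuda_includes(path.read_text(errors="replace"))
            if hits:
                report[str(path.relative_to(root))] = hits
    return report
```

For example, calling `scan_tree("caffe2/core")` from the source root would show whether `common_gpu.h` still contains `#include <cuda.h>` at the point where the build picks it up.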

Expected behavior

The build should complete successfully.

Environment

  • PyTorch Version (e.g., 1.0): 1.6.0
  • OS (e.g., Linux): Arch Linux
  • How you installed PyTorch (conda, pip, source): source
  • Build command you used (if compiling from source): python setup.py build
  • Python version: 3.8.5
  • CUDA/cuDNN version: N/A
  • GPU models and configuration:
  • Any other relevant information: ROCm

Additional context

Downstream issue: rocm-arch/python-pytorch-rocm#5

cc @malfet @jeffdaily @sunway513 @t-vi
