[Repost] Building Cross-Platform CUDA Applications with CMake

Original articles:

https://developer.nvidia.com/zh-cn/blog/building-cuda-applications-cmake/

https://developer.nvidia.com/blog/building-cuda-applications-cmake/

An interesting introductory CUDA project: https://github.com/LitLeo/OpenCUDA

For me this is mainly a CMake primer. The project() command is not strictly required, but it lets you specify the languages to be compiled, which presumably governs which compilers CMake selects:

project(cmake_and_cuda LANGUAGES CXX CUDA)

In addition, the CUDACXX and CXX environment variables can be set to the paths of nvcc and the C++ compiler, respectively.
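Putting these pieces together, a minimal CMakeLists.txt might look like the following sketch. The project name comes from the line above; the target and source file names (particle_test, main.cpp, particle.cu) are assumptions for illustration:

```cmake
cmake_minimum_required(VERSION 3.8 FATAL_ERROR)

# Enabling both CXX and CUDA makes CMake look for a host C++ compiler and nvcc;
# their paths can be overridden with the CXX and CUDACXX environment variables.
project(cmake_and_cuda LANGUAGES CXX CUDA)

# .cu files are compiled by nvcc, .cpp files by the host C++ compiler.
add_executable(particle_test main.cpp particle.cu)
```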

I don't fully understand the next point; it appears to be about specifying the compiler for the host (CPU) side of the program:

You can explicitly specify a host compiler to use with NVCC using the CUDAHOSTCXX environment variable. (This controls the -ccbin option for NVCC.)

The following line makes all compilation associated with the build target particles use (at least) the C++11 standard:

target_compile_features(particles PUBLIC cxx_std_11)
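In context this sits on a library target; because the feature is PUBLIC, anything that links against particles inherits the C++11 requirement, and CMake passes the matching dialect flag to both the host compiler and nvcc. A sketch (the source file names are assumptions):

```cmake
add_library(particles STATIC randomize.cpp particle.cu v3.cu)
# PUBLIC: particles itself and every consumer compile as C++11;
# CMake adds the -std=c++11 flag for both host and CUDA compilations.
target_compile_features(particles PUBLIC cxx_std_11)

add_executable(particle_test test.cu)
target_link_libraries(particle_test PRIVATE particles)  # inherits cxx_std_11
```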

Position-independent code:

set_target_properties(particles PROPERTIES POSITION_INDEPENDENT_CODE ON)

This appears to be meant for the static library particles: CMake enables position-independent code automatically for shared libraries, but a static library that will be linked into a shared library has to request it explicitly via the line above. Support for this property with the CUDA language requires CMake 3.8 or later.
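A sketch of the situation just described: a static library whose objects end up inside a shared library must be built position-independent (library and file names are assumptions):

```cmake
add_library(particles STATIC particle.cu v3.cu)
# Shared libraries get position-independent code automatically; a static
# library that will be linked into one must request it explicitly
# (CUDA support for this property needs CMake >= 3.8).
set_target_properties(particles PROPERTIES POSITION_INDEPENDENT_CODE ON)

add_library(saxpy SHARED saxpy.cu)
target_link_libraries(saxpy PRIVATE particles)
```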

CMake 3.8 supports the POSITION_INDEPENDENT_CODE property for CUDA compilation, and builds all host-side code as relocatable when requested. This is great news for projects that wish to use CUDA in cross-platform projects or inside shared libraries, or desire to support esoteric C++ compilers.

Separable compilation (https://developer.nvidia.com/blog/separate-compilation-linking-cuda-device-code/): the main point seems to be that CUDA device code can be spread across multiple static libraries (chiefly by deferring device linking until the static libraries are linked into a shared library or executable), and that this works with CMake's incremental-build behavior.
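In CMake this is enabled per target with the CUDA_SEPARABLE_COMPILATION property; a minimal sketch (file names assumed):

```cmake
add_library(particles STATIC particle.cu v3.cu)
# Compile device code as relocatable (nvcc -rdc=true) and defer device
# linking until this static library is linked into an executable or
# shared library, so kernels may call __device__ functions defined in
# other translation units.
set_target_properties(particles PROPERTIES CUDA_SEPARABLE_COMPILATION ON)
```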

The article linked above explains the underlying problem:

One of the key limitations that device code linking lifts is the need to have all the code for a GPU kernel present when compiling the kernel, including all the device functions that the kernel calls. As C++ programmers, we are used to calling externally defined functions simply by declaring the functions’ prototypes (or including a header that declares them).

GPU compilation needs the concrete code of every function a kernel calls. In C++, by contrast, when compiling a given file you can get by with only a declaration of an external function (or a header that declares it); the linker resolves the actual function code later.

 

Source of this repost: https://www.cnblogs.com/hellcat/p/14839262.html