CUDA SDK VolumeRender Analysis (2)

This post looks at the graphics algorithms used in VolumeRender: ray casting and ray-plane intersection.

Figure: VolumeRender rendering result

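Volume rendering is typically used to draw effects that are hard to represent with geometry, such as fluids, clouds, fire, and smoke. Popular volume rendering algorithms include ray casting and texture-based volume rendering. The SDK sample uses ray casting.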

Ray casting volume renderers shoot rays from the camera through each pixel in the output image which then intersect the cells in the volume and are combined via the transfer function to make a color for the pixel. The ray casting volume renderer is slower than the hardware accelerated methods but it produces superior images and shows all of the data instead of a resampled version.

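Rendering a volume with ray casting takes roughly two steps:

1. Compute the two intersection points (near and far) of the view ray with the bounding volume.
2. Take sample points along the ray between near and far, use each point's coordinates to sample the 3D texture, and accumulate the colors of all points on the ray. The distance between points (the step) can be uniform or adaptive.

This SDK sample uses a fixed step. To get the color at each step, it first samples the 3D texture, then uses the resulting value to sample a 1D texture that returns the color.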

In VolumeRender, the d_render function computes the pixel colors.

It first computes the parameters of the eye ray:

    // calculate eye ray in world space
    Ray eyeRay;
    eyeRay.o = make_float3(mul(c_invViewMatrix, make_float4(0.0f, 0.0f, 0.0f, 1.0f)));
    eyeRay.d = normalize(make_float3(u, v, -2.0f));
    eyeRay.d = mul(c_invViewMatrix, eyeRay.d);

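Here o is the camera position and d is the view direction vector. As its name suggests, c_invViewMatrix is the inverse of the view matrix, taking view-space coordinates to world space (for a pure rotation this equals the transpose). (u, v, -2) is the ray direction, (0 0 2) is the view plane, and the camera is at (0 0 -4). u and v are computed as follows: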

    // calculate pixel position in view space
    uint x = blockIdx.x*blockDim.x + threadIdx.x;
    uint y = blockIdx.y*blockDim.y + threadIdx.y;
    if ((x >= imageW) || (y >= imageH)) return;

    float u = (x / (float) imageW)*2.0f-1.0f;
    float v = (y / (float) imageH)*2.0f-1.0f;

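For a detailed walkthrough of this part, see kheresy's analysis, which is very clear.

With the ray in hand, the next step is to compute the intersection points. In the SDK this is done by the intersectBox function: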
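To make the pixel-to-ray mapping concrete, here is a small CPU sketch of the same arithmetic. `Vec3`, `pixelToView`, and `viewRayDir` are hypothetical names used only for illustration; the sketch covers the view-space direction only and leaves out the multiplication by c_invViewMatrix.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

// Map a pixel index in [0, size) to [-1, 1), as the kernel does:
// u = (x / (float) imageW) * 2 - 1.
float pixelToView(unsigned int p, unsigned int size) {
    return (p / static_cast<float>(size)) * 2.0f - 1.0f;
}

// View-space ray direction through pixel (x, y): normalize((u, v, -2)).
Vec3 viewRayDir(unsigned int x, unsigned int y,
                unsigned int imageW, unsigned int imageH) {
    float u = pixelToView(x, imageW);
    float v = pixelToView(y, imageH);
    float len = std::sqrt(u * u + v * v + 4.0f);   // |(u, v, -2)|
    return { u / len, v / len, -2.0f / len };
}
```

For a 512x512 image, pixel (256, 256) maps to u = v = 0 and the direction (0, 0, -1), straight down the view axis, while pixel (0, 0) maps to u = v = -1, the corner of the view plane.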

    // calculate intersection with a box parallel to the coordinate axes
    __device__
    int intersectBox(Ray r, float3 boxmin, float3 boxmax, float *tnear, float *tfar)
    {
        // compute intersection of ray with all six bbox planes
        float3 invR = make_float3(1.0f) / r.d;
        float3 tbot = invR * (boxmin - r.o);
        float3 ttop = invR * (boxmax - r.o);

        // re-order intersections to find smallest and largest on each axis
        float3 tmin = fminf(ttop, tbot);
        float3 tmax = fmaxf(ttop, tbot);

        // find the largest tmin and the smallest tmax
        float largest_tmin = fmaxf(fmaxf(tmin.x, tmin.y), fmaxf(tmin.x, tmin.z));
        float smallest_tmax = fminf(fminf(tmax.x, tmax.y), fminf(tmax.x, tmax.z));

        *tnear = largest_tmin;
        *tfar = smallest_tmax;

        return smallest_tmax > largest_tmin;
    }

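As the code comments show, the function first intersects the ray with the six planes containing the faces of the box, then picks the real entry and exit points: the largest of the per-axis near values and the smallest of the per-axis far values. Intersecting the line with all six planes takes only three lines of code, which is very neat. Collision detection is normally expensive; here, by mapping the volume box to (-1 -1 -1), (1 1 1) and keeping it parallel to the coordinate axes, the test reduces to a fixed handful of arithmetic operations, i.e. O(1). It is a good example of elegant code.

Below is the code fragment that computes the color for each ray. The comments and code have been modified slightly to make it easier to follow.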
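A worked example helps to see why "largest tmin, smallest tmax" gives the entry and exit points. The following CPU sketch mirrors intersectBox with plain floats. `Vec3`, `Ray`, and `intersectBoxCpu` are stand-in names introduced here, and the sketch assumes IEEE 754 arithmetic, where a zero direction component produces an infinite quotient that the min/max logic tolerates (the 0/0 case is not handled).

```cpp
#include <algorithm>

struct Vec3 { float x, y, z; };
struct Ray  { Vec3 o, d; };   // origin and direction, as in the kernel's Ray

// CPU sketch of the slab test. Returns true when the ray's line crosses the
// box; tnear and tfar receive the entry and exit parameters along o + t*d.
bool intersectBoxCpu(const Ray& r, const Vec3& boxmin, const Vec3& boxmax,
                     float& tnear, float& tfar) {
    // Parameter t at which the ray crosses the two planes of each slab.
    float tx0 = (boxmin.x - r.o.x) / r.d.x, tx1 = (boxmax.x - r.o.x) / r.d.x;
    float ty0 = (boxmin.y - r.o.y) / r.d.y, ty1 = (boxmax.y - r.o.y) / r.d.y;
    float tz0 = (boxmin.z - r.o.z) / r.d.z, tz1 = (boxmax.z - r.o.z) / r.d.z;

    // The ray is inside the box only while it is inside all three slabs:
    // it enters at the latest per-axis entry and leaves at the earliest exit.
    tnear = std::max({std::min(tx0, tx1), std::min(ty0, ty1), std::min(tz0, tz1)});
    tfar  = std::min({std::max(tx0, tx1), std::max(ty0, ty1), std::max(tz0, tz1)});
    return tfar > tnear;
}
```

For a ray starting at (-3, -3, -3) with direction (1, 1, 1) and the box (-1 -1 -1), (1 1 1), every axis gives the interval [2, 4], so tnear = 2 and tfar = 4. Moving the origin to (-3, -3, 5) changes the z interval to [-6, -4]; now tnear = 2 and tfar = -4, the intervals do not overlap, and the function reports a miss.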

    // march along ray from front to back, accumulating color
    float4 sum = make_float4(0.0f);    // final color
    float t = tnear;                   // position between tnear and tfar
    float3 pos = eyeRay.o + eyeRay.d*tnear;
    float3 step = eyeRay.d*tstep;

    // accumulate color and opacity for each pos inside the [-1, 1] cube
    for(int i=0; i<maxSteps; i++) {
        // remap position to [0, 1] coordinates
        float sample = tex3D(tex, pos.x*0.5f+0.5f, pos.y*0.5f+0.5f, pos.z*0.5f+0.5f);

        // look up the color and opacity in transferTex
        float4 col = tex1D(transferTex, (sample-transferOffset)*transferScale);
        col.w *= density;

        // pre-multiply alpha
        col.x *= col.w;
        col.y *= col.w;
        col.z *= col.w;

        // front-to-back color blending
        sum = sum + col*(1.0f - sum.w);

        // exit early if opaque
        if (sum.w > opacityThreshold)
            break;

        t += tstep;
        if (t > tfar) break;

        pos += step;
    }

The way this code advances the position along the ray and accumulates color and opacity is worth studying closely.
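The blending step is easiest to follow with numbers. The sketch below isolates the pre-multiply and the `sum = sum + col*(1.0f - sum.w)` update from the loop above as a CPU function; `Color` and `blendFrontToBack` are hypothetical names, and the density scaling is left out.

```cpp
// RGBA color; w holds opacity, mirroring the float4 used in the kernel.
struct Color { float x, y, z, w; };

// One front-to-back blending step.
// sum: color accumulated so far (already premultiplied by alpha).
// col: the new sample with straight (non-premultiplied) color.
Color blendFrontToBack(Color sum, Color col) {
    // pre-multiply alpha, as in the kernel
    col.x *= col.w;
    col.y *= col.w;
    col.z *= col.w;

    // Only the fraction of light not yet blocked by earlier samples
    // (1 - sum.w) can still contribute.
    float remaining = 1.0f - sum.w;
    return { sum.x + col.x * remaining, sum.y + col.y * remaining,
             sum.z + col.z * remaining, sum.w + col.w * remaining };
}
```

Starting from (0, 0, 0, 0) and blending the sample (1, 0, 0, 0.5) three times gives accumulated opacities of 0.5, 0.75, and 0.875. Each sample contributes half of what the previous one did, which is why the loop can stop once sum.w passes opacityThreshold: later samples can barely change the result.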

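The drawback of ray casting for volume rendering is its computational cost: every ray needs an intersection test, and then every step needs a texture sample and a blend. The cost depends heavily on the image resolution (the number of rays) and on scene complexity. However, rays exchange no data with each other, so the work parallelizes easily. This sample exploits that by assigning one thread to each ray (CUDA manages the virtual 2D grid of threads), and it reaches 60+ fps at a resolution of 512*512.

That concludes this analysis of how volumeRender computes color. Comments and discussion are welcome.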
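The one-thread-per-ray mapping implies a launch grid large enough to cover the image. The following host-side sketch shows the usual arithmetic. `Dim2`, `divUp`, and `gridFor` are hypothetical helpers, and the 16x16 block size in the example is an assumption for illustration, not a value taken from the text above.

```cpp
// Two-component size, a plain stand-in for CUDA's dim3.
struct Dim2 { unsigned int x, y; };

// Number of blocks of size b needed to cover n items (rounds up).
unsigned int divUp(unsigned int n, unsigned int b) {
    return (n + b - 1) / b;
}

// Grid dimensions so that every pixel of an imageW x imageH image
// gets at least one thread.
Dim2 gridFor(unsigned int imageW, unsigned int imageH, Dim2 block) {
    return { divUp(imageW, block.x), divUp(imageH, block.y) };
}
```

With 16x16 blocks, a 512x512 image needs a 32x32 grid, i.e. 262144 threads, one per ray. A 500x500 image also needs 32x32 blocks; the 12 surplus threads per row and column are the ones discarded by the `if ((x >= imageW) || (y >= imageH)) return;` guard at the top of the kernel.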

References

Volume Render techniques

HERESY’s space