[Repost] Vertex Data Compression

http://www.cnblogs.com/oiramario/archive/2012/09/26/2703277.html

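I read Minmin's post on compressing the tangent frame: http://www.klayge.org/2012/09/21/%E5%8E%8B%E7%BC%A9tangent-frame/

Back in February and March of this year I worked on this myself and got as far as storing the handedness in tangent.w, which solved the UV mirroring problem.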
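
As a rough illustration of the handedness trick, here is a minimal sketch. It assumes the bitangent is rebuilt as cross(normal, tangent) * w; Vec3 and all function names are hypothetical, and the sign convention has to match whatever the exporter and shader actually use.

```cpp
struct Vec3 { float x, y, z; };

constexpr Vec3 Cross(const Vec3& a, const Vec3& b) {
    return Vec3{ a.y * b.z - a.z * b.y,
                 a.z * b.x - a.x * b.z,
                 a.x * b.y - a.y * b.x };
}

constexpr float Dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

constexpr Vec3 Scale(const Vec3& v, float s) {
    return Vec3{ v.x * s, v.y * s, v.z * s };
}

// Export side: returns +1 or -1, the sign that makes cross(n, t) * w
// point along the authored bitangent b. Mirrored UVs give -1.
constexpr float ComputeHandedness(const Vec3& n, const Vec3& t, const Vec3& b) {
    return Dot(Cross(n, t), b) < 0.0f ? -1.0f : 1.0f;
}

// Shader side (emulated on the CPU): rebuild the bitangent from the
// normal, the tangent and the sign stored in tangent.w.
constexpr Vec3 ReconstructBitangent(const Vec3& n, const Vec3& t, float w) {
    return Scale(Cross(n, t), w);
}

// A mirrored bitangent flips the stored sign.
static_assert(ComputeHandedness(Vec3{0.0f, 0.0f, 1.0f}, Vec3{1.0f, 0.0f, 0.0f},
                                Vec3{0.0f, -1.0f, 0.0f}) == -1.0f,
              "mirrored frame should yield -1");
```

With this in place the bitangent no longer needs to be stored per vertex; only its sign does.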

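I had no idea vertex data compression ran this deep, so I modified our Max plugin following the references, and the results exceeded my expectations.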

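So far the normal and the tangent are each stored as unsigned char x 4, and the texcoord as short x 2. A rough calculation:

Before: normal = float x 3, tangent = float x 4, texcoord = float x 2 (per UV layer, so it depends on how many layers there are), for a total of 12 + 16 + 8 = 36 bytes.

After compression: normal = unsigned char x 4, tangent = unsigned char x 4, texcoord = short x 2, for a total of 4 + 4 + 4 = 12 bytes.

These attributes drop from 36 bytes to 12 bytes per vertex, a reduction of two thirds. On a model with a little over 20,000 faces, the mesh shrank from 1388KB to 552KB, about 0.40 times the original size (the whole mesh shrinks less than the attributes alone, presumably because positions and indices are not compressed).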
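
The byte counts above can be checked at compile time. This is only a sketch: the struct names are made up for illustration, and it assumes a single UV layer and a platform with 4-byte float.

```cpp
#include <cstdint>

// Attribute layout before compression (one UV layer assumed).
struct AttributesUncompressed {
    float normal[3];    // 12 bytes
    float tangent[4];   // 16 bytes, w holds the handedness
    float texcoord[2];  //  8 bytes
};

// Attribute layout after compression.
struct AttributesCompressed {
    std::uint8_t normal[4];    // 4 bytes
    std::uint8_t tangent[4];   // 4 bytes
    std::int16_t texcoord[2];  // 4 bytes
};

static_assert(sizeof(AttributesUncompressed) == 36, "12 + 16 + 8");
static_assert(sizeof(AttributesCompressed) == 12, "4 + 4 + 4");
static_assert(sizeof(AttributesCompressed) * 3 == sizeof(AttributesUncompressed),
              "compressed attributes are one third of the original");
```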

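I have not yet gone as far as the article describes, where the whole tangent frame is compressed into just 8 bytes.

The advantage is that the amount of data drops sharply, which should improve the vertex cache hit rate; I observed an fps increase of about 5%.

The disadvantage is slightly more computation in the vertex shader, plus some loss of precision from the compression.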

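float f = 0.1234567f;
// Pack: remap [-1, 1] to [0, 1], then scale to the integer range (the cast truncates).
unsigned char uc = (unsigned char)((f * 0.5f + 0.5f) * 255);
short s = (short)((f * 0.5f + 0.5f) * 32767.0f);

// Unpack: scale back to [0, 2], then shift to [-1, 1].
float unpackuc = uc * 2.0f / 255.0f - 1.0f;
float unpacks = s * 2.0f / 32767.0f - 1.0f;

// Results:
// unpackuc = 0.12156863
// unpacks = 0.12344737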
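
To put a number on the precision loss, here is a small sketch of the same mapping wrapped in functions. Note one deliberate change from the snippet above: it rounds to nearest (by adding 0.5 before the cast) instead of truncating, which bounds the round-trip error by half a quantization step, i.e. 1/255 for the byte and 1/32767 for the short. The function names are hypothetical and inputs are assumed to lie in [-1, 1].

```cpp
#include <cstdint>

// [-1, 1] -> [0, 255], rounded to nearest. Input must be within [-1, 1].
constexpr std::uint8_t PackUnorm8(float v) {
    return static_cast<std::uint8_t>((v * 0.5f + 0.5f) * 255.0f + 0.5f);
}

constexpr float UnpackUnorm8(std::uint8_t q) {
    return q * 2.0f / 255.0f - 1.0f;
}

// [-1, 1] -> [0, 32767]: like the post, only the non-negative short range is used.
constexpr std::int16_t PackUnorm15(float v) {
    return static_cast<std::int16_t>((v * 0.5f + 0.5f) * 32767.0f + 0.5f);
}

constexpr float UnpackUnorm15(std::int16_t q) {
    return q * 2.0f / 32767.0f - 1.0f;
}

// Same sample value as above: both quantized integers match the truncating version.
static_assert(PackUnorm8(0.1234567f) == 143, "byte code for 0.1234567");
static_assert(PackUnorm15(0.1234567f) == 18406, "short code for 0.1234567");
// The endpoints survive the round trip exactly.
static_assert(UnpackUnorm8(PackUnorm8(1.0f)) == 1.0f, "+1 round-trips");
static_assert(UnpackUnorm8(PackUnorm8(-1.0f)) == -1.0f, "-1 round-trips");
```

One side effect of this mapping is that 0.0 has no exact code: it lands on 128 and comes back as roughly 0.0039.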

References:

http://www.humus.name/Articles/Persson_CreatingVastGameWorlds.pdf

http://www.crytek.com/download/izfrey_siggraph2011.pdf

http://fabiensanglard.net/dEngine/index.php

http://oddeffects.blogspot.com/2010/09/optimizing-vertex-formats.html

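Notes:

When declaring the vertex elements, use a packed type such as UBYTE4 or SHORT4 (the declaration below uses UBYTE4 and SHORT2N).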

D3DVERTEXELEMENT9 declExt[] = {
 // stream, offset, type, method, usage, usageIndex
 { 0, 0, D3DDECLTYPE_FLOAT3, D3DDECLMETHOD_DEFAULT, D3DDECLUSAGE_POSITION, 0 },
 { 0, 12, D3DDECLTYPE_UBYTE4, D3DDECLMETHOD_DEFAULT, D3DDECLUSAGE_NORMAL, 0 },
 // 2d uv
 { 0, 16, D3DDECLTYPE_FLOAT2, D3DDECLMETHOD_DEFAULT, D3DDECLUSAGE_TEXCOORD, 0 },
 { 0, 24, D3DDECLTYPE_SHORT2N, D3DDECLMETHOD_DEFAULT, D3DDECLUSAGE_TEXCOORD, 1 },
 // tangent
 { 0, 28, D3DDECLTYPE_UBYTE4, D3DDECLMETHOD_DEFAULT, D3DDECLUSAGE_TEXCOORD, 2 },
 D3DDECL_END()
};
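
A CPU-side struct that lines up with this declaration might look like the sketch below. PackedVertex and its member names are hypothetical; the static_asserts simply confirm that the offsets match the ones written in declExt (0, 12, 16, 24, 28) and that the stride comes out at 32 bytes.

```cpp
#include <cstddef>
#include <cstdint>

struct PackedVertex {
    float        position[3];  // offset 0,  FLOAT3,  POSITION 0
    std::uint8_t normal[4];    // offset 12, UBYTE4,  NORMAL 0
    float        uv0[2];       // offset 16, FLOAT2,  TEXCOORD 0
    std::int16_t uv1[2];       // offset 24, SHORT2N, TEXCOORD 1
    std::uint8_t tangent[4];   // offset 28, UBYTE4,  TEXCOORD 2
};

static_assert(offsetof(PackedVertex, position) == 0,  "position offset");
static_assert(offsetof(PackedVertex, normal)   == 12, "normal offset");
static_assert(offsetof(PackedVertex, uv0)      == 16, "uv0 offset");
static_assert(offsetof(PackedVertex, uv1)      == 24, "uv1 offset");
static_assert(offsetof(PackedVertex, tangent)  == 28, "tangent offset");
static_assert(sizeof(PackedVertex) == 32, "vertex stride");
```

Checks like these catch a declaration and a struct drifting apart when one of them is edited later.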

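In the shader, however, simply declare the input as float4 and the GPU converts it automatically. Note that UBYTE4 is not normalized, so its components arrive as values in 0..255 and the shader has to rescale them itself, whereas SHORT2N arrives already divided by 32767.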

float4 normal : NORMAL;

or:
float4 normal : BLENDINDICES;

Some graphics cards do not accept UBYTE4 input on the NORMAL semantic; in that case try BLENDINDICES instead, which is also the usual way UBYTE4 is used.
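
To make the "automatic conversion" concrete, here is a CPU-side emulation of what the float4 input would contain and how a shader might decode it. It is only a sketch: Float4 and the function names are invented for illustration, and it assumes the normal was packed as (n * 0.5 + 0.5) * 255 as shown earlier.

```cpp
#include <cstdint>

struct Float4 { float x, y, z, w; };

// D3DDECLTYPE_UBYTE4: the bytes are converted to float but NOT normalized,
// so the shader sees values in 0..255.
constexpr Float4 FetchUByte4(std::uint8_t x, std::uint8_t y,
                             std::uint8_t z, std::uint8_t w) {
    return Float4{ static_cast<float>(x), static_cast<float>(y),
                   static_cast<float>(z), static_cast<float>(w) };
}

// D3DDECLTYPE_SHORT2N: each short is divided by 32767; z = 0 and w = 1.
constexpr Float4 FetchShort2N(std::int16_t x, std::int16_t y) {
    return Float4{ x / 32767.0f, y / 32767.0f, 0.0f, 1.0f };
}

// Shader-side decode of a UBYTE4 normal: 0..255 back to [-1, 1].
constexpr Float4 DecodeUByte4Normal(const Float4& raw) {
    return Float4{ raw.x * (2.0f / 255.0f) - 1.0f,
                   raw.y * (2.0f / 255.0f) - 1.0f,
                   raw.z * (2.0f / 255.0f) - 1.0f,
                   raw.w * (2.0f / 255.0f) - 1.0f };
}

// The raw fetch is unnormalized; only the decode brings it back to [-1, 1].
static_assert(FetchUByte4(255, 0, 143, 255).x == 255.0f, "raw value is 0..255");
static_assert(DecodeUByte4Normal(FetchUByte4(255, 0, 143, 255)).y == -1.0f,
              "byte 0 decodes to -1");
```

In the shader itself the decode is a single multiply-add per component.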

================================================

Pack the normals into the w value of the position of each vertex, then you should be able to do something similar to this to read it back, and then you just need to convert it back to a normal vector in the shader (multiply by 2, then subtract 1).

To pack the normal into a float you should be able to use something like this (not tested and should probably use the proper casts instead of C style casts, and the normal needs to be normalized):

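#include <cstring>  // std::memcpy

// Vector3 is assumed to be the poster's own math type with float x, y, z members.
float PackNormal(const Vector3& normal)
{
   //Use 127.99999f instead of 128 so that if the value was 1 it won't be 256 which screws things up
   unsigned int packed = (unsigned int)((normal.x + 1.0f) * 127.99999f);
   packed += (unsigned int)((normal.y + 1.0f) * 127.99999f) << 8;
   packed += (unsigned int)((normal.z + 1.0f) * 127.99999f) << 16;

   // Copy the bits into a float; memcpy avoids the undefined behavior of *((float*)(&packed)).
   float result;
   std::memcpy(&result, &packed, sizeof(result));
   return result;
}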
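
For completeness, here is a sketch of the reverse direction on the CPU. The quoted post does not show its unpacking code, so this follows its description (scale each byte to 0..1, multiply by 2, subtract 1) under the assumption that the byte is divided by 255; the Vector3 struct is a hypothetical stand-in and the function names are mine.

```cpp
#include <cstdint>
#include <cstring>

struct Vector3 { float x, y, z; };  // hypothetical stand-in for the poster's type

// Decode the three bytes: byte / 255 gives 0..1, then * 2 - 1 gives [-1, 1].
constexpr Vector3 UnpackNormalBits(std::uint32_t bits) {
    return Vector3{ (bits & 0xFFu) * (2.0f / 255.0f) - 1.0f,
                    ((bits >> 8) & 0xFFu) * (2.0f / 255.0f) - 1.0f,
                    ((bits >> 16) & 0xFFu) * (2.0f / 255.0f) - 1.0f };
}

// Recover the bit pattern from the float, then decode it.
inline Vector3 UnpackNormal(float packedAsFloat) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "32-bit float assumed");
    std::uint32_t bits = 0;
    std::memcpy(&bits, &packedAsFloat, sizeof(bits));
    return UnpackNormalBits(bits);
}

// Bytes x = 0x80, y = 0x00, z = 0xFF decode to roughly (0.004, -1, 1).
static_assert(UnpackNormalBits(0x00FF0080u).y == -1.0f, "byte 0 decodes to -1");
```

One caveat with this scheme: because the top byte is always zero, the packed values are extremely small or denormal floats, which some hardware may flush to zero, so it is worth verifying on the target GPU before relying on it.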