HLSL Shader Programming Basics: A Summary

Reposted from: https://blog.csdn.net/Blues1021/article/details/47093487

Basic Concepts

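A shader is written in a high-level language that maps onto the GPU's hardware assembly language. Every variable in a shader declares both a type and a usage; the usage (semantic) binds the variable to a shader register, such as the input position register, the output position register, or the output color register. Colors in HLSL are in RGBA order; do not mix this up. Every type, function, and string in a shader carries a specific meaning.

In both vertex and pixel shaders, call SetDefaults(Device) on the constant table to give the constants in the table their initial values, so that no constant is left without a value because an assignment was forgotten.
Once a shader is known to work, pass D3DXSHADER_SKIPVALIDATION to D3DXCompileShaderFromFile to skip code validation and improve efficiency. Do not keep using D3DXSHADER_DEBUG; it is meant only for development and testing.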

HLSL Language Basics

I. Variable Types

1. Scalar types:
bool, int, half (16-bit floating point), float, double.
Initialization and assignment:

const static int g_nArraySize = 3;
const static int g_nArraySize2 = {3};
const static int g_nArraySize3 = int(3);

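2. Vector types:
1) Types
vector is a 4D vector whose components are all of type float.
vector<T, n> takes n from 1 to 4, with T the desired component type.
float2, float3, and float4 define 2D, 3D, and 4D vectors.
2) Initialization (direct, or constructor-style assignment)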

vector u = {0.6f, 0.3f, 1.0f, 1.0f};
vector v = {1.0f, 5.0f, 0.2f, 1.0f};
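// Constructor syntax; the names u2, v2, and p are used here only to avoid
// redeclaring u and v in the same scope
vector u2 = vector(0.6f, 0.3f, 1.0f, 1.0f);
vector v2 = vector(1.0f, 5.0f, 0.2f, 1.0f);
float3 p = float3(0, 0, 0);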

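3) Access and assignment
Use array subscript syntax to access a vector component; for example, to set the i-th component, write vec[i] = 2.0f.
A component of a vector vec can also be accessed like a struct member, using the predefined component names x, y, z, w, r, g, b, and a: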

vec.x = vec.r = 1.0f;
vec.y = vec.g = 2.0f;
vec.z = vec.b = 3.0f;
vec.w = vec.a = 4.0f;

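The components named r, g, b, and a correspond to x, y, z, and w respectively. When a vector represents a color, the RGBA notation is preferable because it makes clear that the vector holds a color.

Consider a vector u = (ux, uy, uz, uw), and suppose we want to copy its components into a vector v so that v = (ux, uy, uy, uw). The most direct approach is to copy each component from u to v one at a time. HLSL, however, provides a special syntax for this kind of out-of-order copy, called swizzles:

vector u = {1.0f, 2.0f, 3.0f, 4.0f};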
vector v = {0.0f, 0.0f, 5.0f, 6.0f};
v = u.xyyw; // v = {1.0f, 2.0f, 2.0f, 4.0f}

When copying a vector, we do not have to copy every component. For example, we can copy only the x and y components:

vector u = {1.0f, 2.0f, 3.0f, 4.0f};
vector v = {0.0f, 0.0f, 5.0f, 6.0f};
v.xy = u; // v = {1.0f, 2.0f, 5.0f, 6.0f}
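
The two swizzle forms above (a reordering read and a masked write) can be mimicked on the CPU to check what a given pattern produces. The following is only a sketch: `Vec4`, `componentIndex`, `swizzle`, and `maskedWrite` are hypothetical helper names, not part of HLSL or D3DX.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical CPU-side stand-in for HLSL's 4D `vector`.
using Vec4 = std::array<float, 4>;

// Map a component name (x/y/z/w or r/g/b/a) to its index 0..3, or -1 if unknown.
inline int componentIndex(char c) {
    switch (c) {
        case 'x': case 'r': return 0;
        case 'y': case 'g': return 1;
        case 'z': case 'b': return 2;
        case 'w': case 'a': return 3;
        default:            return -1;
    }
}

// Read swizzle: swizzle(u, "xyyw") mirrors `v = u.xyyw;`.
inline Vec4 swizzle(const Vec4& u, const std::string& pattern) {
    assert(pattern.size() == 4);
    Vec4 out{};
    for (std::size_t i = 0; i < 4; ++i) {
        const int src = componentIndex(pattern[i]);
        assert(src >= 0);
        out[i] = u[static_cast<std::size_t>(src)];
    }
    return out;
}

// Masked write: maskedWrite(v, "xy", u) mirrors `v.xy = u;`.
// The leading components of u go to the named components of v; the rest of v is kept.
inline void maskedWrite(Vec4& v, const std::string& mask, const Vec4& u) {
    assert(!mask.empty() && mask.size() <= 4);
    for (std::size_t i = 0; i < mask.size(); ++i) {
        const int dst = componentIndex(mask[i]);
        assert(dst >= 0);
        v[static_cast<std::size_t>(dst)] = u[i];
    }
}
```

Calling `swizzle(u, "xyyw")` on u = {1, 2, 3, 4} yields {1, 2, 2, 4}, and `maskedWrite(v, "xy", u)` leaves the z and w components of v untouched, matching the comments in the HLSL snippets.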

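3. Matrix types:

1) Types
matrix is a 4x4 matrix whose elements are all of type float.
matrix<T, m, n> takes m and n between 1 and 4.

float2x2, float3x3, float4x4

Integer forms of the shape intmxn (for example int2x2) are also allowed.
2) Initialization (direct, or constructor-style assignment)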

int2x2 m = {1, 2, 3, 4};
float2x2 f2x2 = float2x2(1.0f, 2.0f, 3.0f, 4.0f);

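3) Access and assignment
Use two-dimensional array subscript syntax to access a matrix entry, for example M[i][j] = value;
In addition, the entries of a matrix M can be accessed like struct members. The following names are defined:

One-based: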

M._11 = M._12 = M._13 = M._14 = 0.0f;

M._21 = M._22 = M._23 = M._24 = 0.0f;

M._31 = M._32 = M._33 = M._34 = 0.0f;

M._41 = M._42 = M._43 = M._44 = 0.0f;

Zero-based:

M._m00 = M._m01 = M._m02 = M._m03 = 0.0f;

M._m10 = M._m11 = M._m12 = M._m13 = 0.0f;

M._m20 = M._m21 = M._m22 = M._m23 = 0.0f;

M._m30 = M._m31 = M._m32 = M._m33 = 0.0f;

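Sometimes we want to refer to one particular row of a matrix. One-dimensional array subscript syntax does this. For example, to refer to the i-th row vector of matrix M, we write: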

vector ithRow = M[i]; // get the ith row vector in M
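
To make the relationship between the one-based names (`_11` ... `_44`) and the zero-based names (`_m00` ... `_m33`) concrete, here is a small sketch that converts either spelling to a zero-based (row, column) pair. `matrixComponent` is a hypothetical helper written for this illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Returns {row, col} as zero-based indices for an HLSL matrix component name.
// "_23" (one-based) and "_m12" (zero-based) both denote row 1, column 2.
inline std::pair<int, int> matrixComponent(const std::string& name) {
    assert(name.size() >= 3 && name[0] == '_');
    const bool zeroBased = (name[1] == 'm');
    assert(name.size() == (zeroBased ? 4u : 3u));
    const int base = zeroBased ? 0 : 1;            // subtract 1 for the _11 style
    const std::size_t first = zeroBased ? 2 : 1;   // position of the row digit
    const int row = (name[first] - '0') - base;
    const int col = (name[first + 1] - '0') - base;
    assert(row >= 0 && row < 4 && col >= 0 && col < 4);
    return {row, col};
}
```

With this mapping, `M._23`, `M._m12`, and `M[1][2]` all refer to the same entry.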

4. Array types

Arrays are declared as in C++, and a declaration does not need an initializer.
For example:

float  M[4][4];
half   p[4];
vector v[12];

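However, array sizes are limited and cannot be too large.
For details see: https://en.wikipedia.org/wiki/High-Level_Shading_Language
See the "Constant registers" rows there: one constant register holds one vector, that is, four floats. A 4x4 matrix therefore takes four registers, so a pixel shader profile with 224 constant registers can hold 56 constant matrices, and that only in theory.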
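
The figure of 56 follows from simple register arithmetic. The sketch below spells it out, assuming one matrix row per float4 register and a budget of 224 registers (the count that yields 56; smaller shader profiles offer fewer). The function names are hypothetical.

```cpp
#include <cassert>

// One constant register holds one float4 (four floats).
// Assumption: a float matrix is stored one row per register,
// so a matrix with `rows` rows occupies `rows` registers.
inline int registersForMatrix(int rows) {
    assert(rows >= 1 && rows <= 4);
    return rows;
}

// Upper bound on how many such matrices fit into the given register budget.
// This ignores every other constant the shader needs, hence "only in theory".
inline int maxMatrices(int constantRegisters, int rows) {
    assert(constantRegisters >= 0);
    return constantRegisters / registersForMatrix(rows);
}
```

For example, `maxMatrices(224, 4)` gives 56, while a profile with only 32 constant registers would leave room for at most 8 full 4x4 matrices.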

A C++ program can pass a vector or matrix to HLSL as an array. For example:

FLOAT texSize[2] = {imageInfo.Width * 1.0f, imageInfo.Height * 1.0f};
MultiTexCT->SetFloatArray(Device, TexSizeHandle, texSize, 2);

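Variable-length float arrays: it is tempting to have the application pass a parameter that determines the size of an HLSL array or matrix, but this cannot be done; the problem has to be solved another way.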

const static int g_num = num;
float fVec[g_num][g_num];

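The size has to be given in this form; a one-dimensional float fVec[g_num * 2] cannot be used.
Setting the size from the application does not work either, because variable-length arrays are not supported and the array size must be explicit. If a size has to be computed, compute it in the application, pass it in, and let the HLSL code use it as a marker for how many elements are valid.
Alternatively, declare a two-dimensional array that is large enough and use only part of it; or declare a small array and draw in multiple passes, along these lines:

for each mesh being rendered
    for each light affecting the mesh
        if (first light)
            render first light with ambient and no blending
        else
            render nth light with no ambient and additive blending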
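
The multi-pass pseudocode above can be sketched as a small planning function that decides, for each light, whether the pass includes the ambient term and whether it blends additively. `LightPass` and `planLightPasses` are hypothetical names; actual rendering calls are deliberately left out.

```cpp
#include <cassert>
#include <vector>

// Settings for one lighting pass over a mesh.
struct LightPass {
    int  light;          // index of the light rendered in this pass
    bool ambient;        // include the ambient term?
    bool additiveBlend;  // add onto what is already in the frame buffer?
};

// First light: ambient on, blending off. Every later light: ambient off,
// additive blending on, so that its contribution accumulates.
inline std::vector<LightPass> planLightPasses(int lightCount) {
    assert(lightCount >= 0);
    std::vector<LightPass> passes;
    for (int i = 0; i < lightCount; ++i) {
        const bool first = (i == 0);
        passes.push_back(LightPass{i, first, !first});
    }
    return passes;
}
```

Including ambient only once keeps it from being added again for every light, which is why the first pass is treated specially.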

5. Structures

Structures are defined as in C++. However, structures in HLSL cannot have member functions.

struct MyStruct
{
     matrix T;
     vector n;
     float  f;
     int    x;
     bool   b;
};

MyStruct s; // instantiate
s.f = 5.0f; // member access

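II. The typedef Keyword
The HLSL typedef keyword works exactly as it does in C++. For example, we can give the type vector<float, 3> a name with the following syntax:

typedef vector<float, 3> point;

Then, instead of writing:

vector<float, 3> myPoint;

we only need to write:

point myPoint;

Here are two more examples, showing how to use typedef with a constant type and with an array type: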

typedef const float CFLOAT;
typedef float point2[2];

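III. Variable Prefixes

extern: a variable prefixed with extern can be accessed from outside the shader, for example by the C++ application. Only global variables can be prefixed with extern. Global variables that are not static are extern by default.

uniform: a variable prefixed with uniform is initialized outside the shader, for example by the C++ application, and then passed into the shader.

const: the const keyword in HLSL means the same as in C++. A variable prefixed with const is a constant and cannot be changed.

static: a global variable prefixed with static is not exposed outside the shader; in other words, it is local to the shader. A local variable prefixed with static behaves like a static local variable in C++: it is initialized once, the first time the function executes, and keeps its value across all calls of the function. If it has no initializer, it is automatically initialized to 0.

shared: a variable prefixed with shared hints to the effect framework that the variable will be shared across multiple effects. Only global variables can be prefixed with shared.

volatile: a variable prefixed with volatile hints to the effect framework that the variable will be modified often. Only global variables can be prefixed with volatile.

Other keywords:

Also note the usage of the built-in types and keywords sampler, texture, compile, and decl.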

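IV. Operators and Type Conversion

Operators: the % operator works on both integers and floating-point values; the left and right operands must have the same sign.
Many HLSL operations (+, -, *, /) work per component on vectors, for example v++, u++, u*v.
Comparison operators also work per component.
Type casts work as in C: explicit casts are allowed, and precision and dimension can be promoted.
Type promotion in binary operators:
Value type promotion: bool < int < half < float < double.
Dimension promotion: a conversion from float to float2 or float3 is defined (the scalar value is copied into every component), so a float can be promoted to float2 or float3. No conversion from float2 to float3 is defined, so that promotion is not possible.

V. Statements
A for loop cannot start from a negative number, and the loop variables declared in earlier and later for loops must not reuse the same name, otherwise they conflict.

VI. User-Defined and Intrinsic Functions
Arguments are passed by value, recursion is not supported, and functions are always inlined (so keep functions short).
Function parameters can carry the modifiers in, out, and inout; the default is in.
The intrinsic functions are overloaded for most scalar, vector, and matrix types, so they can be used directly.
More intrinsic functions: https://msdn.microsoft.com/en-us/library/windows/desktop/ff471376%28v=vs.85%29.aspx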

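How Vertex and Pixel Shaders Work, and How to Implement Them

1. Principles

The input to the vertex shader starts in object space and includes position, vertex normal, and texture UV values. Its output is the vertex position in homogeneous clip space (which becomes normalized device coordinates after the perspective divide), together with color and texture UV values; the vertex normal is gone. This stage covers transformation and lighting, as well as techniques such as vertex size, vertex welding, face splitting, LOD edge collapse, and LOD vertex split (the inverse of collapse). The vertices then go through the viewport transform, back-face culling, and depth testing, and UV values or colors (or lit material colors) are interpolated across each face from its vertices. Next comes the pixel shader, whose input is the interpolated per-pixel UV texture coordinates and color. In a narrow sense the pixel shader replaces the multitexture blending stage (it provides the ability to operate on individual pixels and on each pixel's texture coordinates). The association between a sampler index and texture stage i can be obtained in the program through a D3DXCONSTANT_DESC (for example TexDesc). The stage also includes alpha test, stencil test, and color blending. The output of the pixel shader is the pixel's color.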

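Vertex shaders (3D shaders)
Vertex data stream: the elements of the vertex format are mapped to the members of the HLSL vertex shader input structure by matching Usage and UsageIndex. For a vertex declaration that is not very complex, an FVF can be used in the program instead; the FVF is converted into a vertex declaration inside the rendering pipeline. With a vertex shader, its output goes through the viewport transform, back-face culling, depth testing, and per-face interpolation of UV or color; the result (which no longer carries a pixel position, only per-pixel UV and color) is the input to the pixel shader. With a pixel shader but no vertex shader, the pixel shader input corresponds to the texture UV and color elements of the vertex declaration (vertex position and vertex normal are not needed).
3D shaders act on 3D models or other geometry but may also access the colors and textures used to draw the model or mesh. Vertex shaders are the oldest type of 3D shader, generally modifying on a per-vertex basis. Geometry shaders can generate new vertices from within the shader. Tessellation shaders are newer 3D shaders that act on batches of vertices all at once to add detail, such as subdividing a model into smaller groups of triangles or other primitives at runtime, to improve things like curves and bumps, or change other attributes.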

As of OpenGL 4.0 and Direct3D 11, a new shader class called a Tessellation Shader has been added. It adds two new shader stages to the traditional model: Tessellation Control Shaders (also known as Hull Shaders) and Tessellation Evaluation Shaders (also known as Domain Shaders), which together allow simpler meshes to be subdivided into finer meshes at run-time according to a mathematical function. The function can be related to a variety of variables, most notably the distance from the viewing camera, to allow active level-of-detail scaling. This allows objects close to the camera to have fine detail, while those further away can have coarser meshes yet seem comparable in quality.

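Pixel shaders (2D shaders)
In a vertex declaration, multitexturing needs several UV sets, indicated with D3DFVF_TEX3; textures of the same size can share one UV set.
In the texture sampling stage, samplers are declared with sampler, or with the more type-safe sampler2D and sampler3D; samplers are used through the tex* family of intrinsic functions. The pixel shader is the only stage that can modify and filter pixel values after rasterization. Although the pixel shader runs after rasterization, if it changes depth, stencil, or color values, the hardware can still apply further processing afterwards to achieve richer color effects.
Pixel shaders, also known as fragment shaders, compute color and other attributes of each "fragment", a technical term usually meaning a single pixel.
For instance, a pixel shader is the only kind of shader that can act as a postprocessor or filter for a video stream after it has been rasterized.

A Draw call in the program only submits rendering work; it is merely the first step of rendering, a batch submission. The command buffer and driver buffer still have to deliver it to the graphics hardware; only then do transformation and lighting (in camera space), the viewport transform, back-face culling, depth culling, face interpolation, and pixel shading begin.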

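2. Key Functions and Implementation Steps for Vertex and Pixel Shaders

0) Write the HLSL shader script, for example in Notepad++ with an HLSL syntax highlighting plug-in (http://www.discoverthat.co.uk/games/edit-shaders.htm).
For example: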

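// Globals referenced by Main below. The application sets ViewProjMatrix
// through the constant table. The definition of Blue is an assumed one,
// added so that the listing is complete.
matrix ViewProjMatrix;
vector Blue = {0.0f, 0.0f, 1.0f, 1.0f};

struct VS_INPUT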
{
    vector position  : POSITION;
};
// Output structure describes the vertex that is
// output from the shader.  Here the output
// vertex contains a position and color component.
struct VS_OUTPUT
{
    vector position : POSITION;
    vector diffuse  : COLOR;
};
//
// Main Entry Point, observe the main function
// receives a copy of the input vertex through
// its parameter and returns a copy of the output
// vertex it computes.
//
VS_OUTPUT Main(VS_INPUT input)
{
    // zero out members of output
    VS_OUTPUT output = (VS_OUTPUT)0;
 
    // transform to view space and project
    output.position  = mul(input.position, ViewProjMatrix);
    // set vertex diffuse color to blue
    output.diffuse = Blue;
    return output;
}

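If input and output structures that declare the registers (semantics) are not used, the entry function itself must specify its input and output registers, for example: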

float4 Main(in float2 base:TEXCOORD0,
                 in float2 spot:TEXCOORD1,
                 in float2 text:TEXCOORD2):COLOR
                 {
                 }

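1) In the C++ code, compile the HLSL script with D3DXCompileShaderFromFile to obtain the compiled shader bytecode and the constant table object, ID3DXConstantTable.
2) Use CreateVertexShader or CreatePixelShader to create the IDirect3DVertexShader9 vertex shader or the IDirect3DPixelShader9 pixel shader from the bytecode.
3) Initialize the shader constants through the constant table: TransformConstantTable->SetDefaults(Device);
4) Use the constant table to get handles to constants inside the shader and set their values, for example: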

D3DXHANDLE TransformViewProjHandle = 0;
TransformViewProjHandle = TransformConstantTable->GetConstantByName(0, "ViewProjMatrix");

TransformConstantTable->SetMatrix(
   Device,
   TransformViewProjHandle,
   &ViewProj);
FLOAT texSize[2] = {imageInfo.Width * 1.0f, imageInfo.Height * 1.0f};
 MultiTexCT->SetFloatArray(Device, TexSizeHandle, texSize, 2);
 MultiTexCT->SetInt(Device, ArraySizeHandle, 3);

IDirect3DPixelShader9* MultiTexPS = 0;
ID3DXConstantTable* MultiTexCT    = 0;

 ID3DXBuffer* shader      = 0;
 ID3DXBuffer* errorBuffer = 0;
 hr = D3DXCompileShaderFromFile(
  "mohu_texture.txt",
  0,
  0,
  "main", // entry point function name
  "ps_2_0",
  D3DXSHADER_ENABLE_BACKWARDS_COMPATIBILITY/*D3DXSHADER_DEBUG*/,
  &shader,
  &errorBuffer,
  &MultiTexCT);

5) If the pixel shader uses textures, associate the texture objects with the samplers in the shader and set the sampling parameters:

IDirect3DTexture9* BaseTex   = 0;
D3DXHANDLE BaseTexHandle     = 0;
D3DXCONSTANT_DESC BaseTexDesc;
UINT count = 0;

// The actual texture object
D3DXCreateTextureFromFile(Device, "crate.bmp", &BaseTex);

// Handle to the sampler declared in the shader
BaseTexHandle = MultiTexCT->GetConstantByName(0, "BaseTex");
MultiTexCT->GetConstantDesc(BaseTexHandle, &BaseTexDesc, &count);

// At draw time, use the sampler description to bind the actual texture to the
// sampler's texture stage; this associates the texture object with the sampler
Device->SetTexture(BaseTexDesc.RegisterIndex, BaseTex);
// At draw time, set the sampler state parameters
Device->SetSamplerState(BaseTexDesc.RegisterIndex, D3DSAMP_MAGFILTER, D3DTEXF_LINEAR);
Device->SetSamplerState(BaseTexDesc.RegisterIndex, D3DSAMP_MINFILTER, D3DTEXF_LINEAR);
Device->SetSamplerState(BaseTexDesc.RegisterIndex, D3DSAMP_MIPFILTER, D3DTEXF_LINEAR);

6) Enable the vertex shader or pixel shader

Device->SetPixelShader(MultiTexPS);

7) Clean up: release the shaders, constant tables, and texture objects

d3d::Release<ID3DXMesh*>(Teapot); // d3d::Release<IDirect3DVertexBuffer9*>(QuadVB);
d3d::Release<IDirect3DVertexShader9*>(TransformShader);
d3d::Release<ID3DXConstantTable*>(TransformConstantTable);
d3d::Release<IDirect3DTexture9*>(BaseTex);

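The Effect Framework: Principles and Implementation
An effect combines two things: control of render state and control of shaders.
Render state
For the built-in HLSL types (used heavily in effect files) see the DX Graphics Documentation under HLSL > Reference > Language Syntax > Variables > Data Types, which lists sampler, texture, PixelShader, and VertexShader.

For intrinsic functions see:
DX Graphics Documentation under HLSL > Reference > HLSL Intrinsic Functions.

For the built-in states and values of HLSL effect files see:
DX Graphics Documentation under Direct3D 9 > Reference > Effect Reference > Effect Format > Effect States.
A state assignment has the form effect state [ [index] ] = expression; the outer [] mean "optional", and the inner brackets hold the index, which identifies the current element when the state is an array state (states that support several indices, such as texture stages or lights, need the index).
State is any of the states listed at the documentation path above.

Finally, remember that inside a pass, any reference to a variable of the effect file, whether set in the file itself or by the application, must be enclosed in parentheses; otherwise it is illegal.

Shaders
Create the effect and set its parameters. Several techniques exist because different hardware needs different techniques: obtain one with GetTechnique, activate it with SetTechnique, start it with Begin, and start each pass with BeginPass(i). There are several passes because drawing an object may need several draws with different render states, sampler settings, materials, lighting, and so on. Draw the object with DrawIndexedPrimitive, DrawSubset, or similar, which submits one batch; then close the pass with EndPass and the technique with End. In an effect file pass, external variables and sampler variables must be enclosed in parentheses.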
The first step is to organize the state you want to control in an effect. This includes shader state (vertex, hull, domain, geometry, pixel and compute shaders), texture and sampler state used by the shaders, and other non-programmable pipeline state.

The Geometry shader can generate new graphics primitives, such as points, lines, and triangles, from those primitives that were sent to the beginning of the graphics pipeline.
Geometry shader programs are executed after vertex shaders. They take as input a whole primitive, possibly with adjacency information. For example, when operating on triangles, the three vertices are the geometry shader's input. The shader can then emit zero or more primitives, which are rasterized and their fragments ultimately passed to a pixel shader.

The compute shader provides memory sharing and thread synchronization features to allow more effective parallel programming methods.

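Shader rendering itself supports multi-pass rendering. From the programmer's side, multi-pass can be thought of as serial or parallel. In the serial case (the GPU parallelizes large submitted workloads internally, so that is not a concern here), if the objects drawn are different objects at different positions, the world transform must be set for each one before setting the technique and pass and rendering. If the same object is drawn repeatedly, for example for a mirror or an orthographic projection, different draw states, texture sampling, stencil, blending, lighting, and material settings can be applied in each pass. Parallel multi-pass would presumably use different channels in the vertex declaration; how to control this for better performance is something I still need to learn more about.
A vertex shader can process many of a mesh's vertices in parallel, and many pixels of the back buffer can likewise be pixel-shaded at the same time.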
Shaders are written to apply transformations to a large set of elements at a time, for example, to each pixel in an area of the screen, or for every vertex of a model. This is well suited to parallel processing, and most modern GPUs have multiple shader pipelines to facilitate this, vastly improving computation throughput.

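Implementing an Effect
(0). Write the .fx script first.
(1). Obtain the effect object with D3DXCreateEffectFromFile, and use the effect object instead of a constant table to get and set the shader variables in the .fx file.
(2). Rendering with the effect:
Get the technique handle with GetTechniqueByName
Activate the technique handle with SetTechnique

ID3DXEffect::Begin starts the technique

ID3DXEffect::BeginPass starts a pass

ID3DXEffect::CommitChanges commits parameter changes made to the current pass; call it before drawing

ID3DXEffect::EndPass ends the pass

ID3DXEffect::End ends the technique

(3). Release the effect object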

 d3d::Release<ID3DXEffect*>(FogEffect);

0) Write the .fx script
For example, a version without shaders:

technique Fog
{
    pass P0
    {
        //
        // Set Misc render states.

        pixelshader      = null;
        vertexshader     = null;
        fvf              = XYZ | Normal;
        Lighting         = true;
        NormalizeNormals = true;
        SpecularEnable   = false;

        //
        // Fog States

        FogVertexMode = LINEAR; // linear fog function
        FogStart      = 50.0f;  // fog starts 50 units away from viewpoint
        FogEnd        = 300.0f; // fog ends 300 units away from viewpoint

        FogColor      = 0x00CCCCCC; // gray
        FogEnable     = true;       // enable
    }
}

A version with shaders:

extern matrix WorldViewMatrix;
extern matrix WorldViewProjMatrix;

extern vector Color;
extern vector LightDirection;
extern texture ShadeTex;
//
// Structures
//

struct VS_INPUT
{
    vector position : POSITION;
    vector normal   : NORMAL;
};

struct VS_OUTPUT
{
    vector position : POSITION;
    float2 uvCoords : TEXCOORD;
    vector diffuse  : COLOR;
};
//
// Main
//

VS_OUTPUT Main(VS_INPUT input)
{
    // zero out each member in output
    VS_OUTPUT output = (VS_OUTPUT)0;


    // transform vertex position to homogenous clip space
     output.position = mul(input.position, WorldViewProjMatrix);

    //
    // Transform lights and normals to view space.  Set w
    // components to zero since we're transforming vectors.
    // Assume there are no scalings in the world
    // matrix as well.
    //
    LightDirection.w = 0.0f;
    input.normal.w   = 0.0f;
    LightDirection   = mul(LightDirection, WorldViewMatrix);
    input.normal     = mul(input.normal, WorldViewMatrix);

    //
    // Compute the 1D texture coordinate for toon rendering.
    //
    float u = dot(LightDirection, input.normal);

    //
    // Clamp to zero if u is negative because u
    // negative implies the angle between the light
    // and normal is greater than 90 degrees.  And
    // if that is true then the surface receives
    // no light.
    //
    if( u < 0.0f )
        u = 0.0f;

    //
    // Set other tex coord to middle.
    //
    float v = 0.5f;


    output.uvCoords.x = u;
    output.uvCoords.y = v;

    // save color
    output.diffuse = Color;
   
    return output;
}

//
// Sampler
//

sampler ShadeSampler = sampler_state
{
    Texture   = (ShadeTex);
    MinFilter = POINT; // no filtering for cartoon shading
    MagFilter = POINT;
    MipFilter = NONE;
};


//
// Effect
//

technique Toon
{
    pass P0
    {
        vertexShader = compile vs_1_1 Main();

        Sampler[0] = (ShadeSampler);
    }
}

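In an effect pass, a variable set from outside the program must be enclosed in parentheses ().
A sampler constructed with sampler_state must likewise be enclosed in parentheses () when it is referenced in a pass.

1) Get the effect object, get parameter handles, and set their values
Use D3DXCreateEffectFromFile to obtain the effect object pointer from the .fx file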

ID3DXEffect* FogEffect   = 0;
ID3DXBuffer* errorBuffer = 0;
 hr = D3DXCreateEffectFromFile(
  Device,
  "fog.txt",
  0,                // no preprocessor definitions
  0,                // no ID3DXInclude interface
  D3DXSHADER_DEBUG, // compile flags
  0,                // don't share parameters
  &FogEffect,
  &errorBuffer);

Use the effect object instead of a constant table to get and set the shader variables in the .fx file.
For example:

ID3DXBuffer* errorBuffer = 0;
 hr = D3DXCreateEffectFromFile(
  Device,
  "light_tex.txt",
  0,                // no preprocessor definitions
  0,                // no ID3DXInclude interface
  D3DXSHADER_DEBUG, // compile flags
  0,                // don't share parameters
  &LightTexEffect,
  &errorBuffer);

WorldMatrixHandle  = LightTexEffect->GetParameterByName(0, "WorldMatrix");
 ViewMatrixHandle   = LightTexEffect->GetParameterByName(0, "ViewMatrix");
 ProjMatrixHandle   = LightTexEffect->GetParameterByName(0, "ProjMatrix");
 TexHandle          = LightTexEffect->GetParameterByName(0, "Tex");

LightTexEffect->SetMatrix( WorldMatrixHandle, &W);
LightTexEffect->SetMatrix(ViewMatrixHandle, &V);
LightTexEffect->SetMatrix( ProjMatrixHandle, &P);
 
IDirect3DTexture9* tex = 0;
 D3DXCreateTextureFromFile(Device, "Terrain_3x_diffcol.jpg", &tex);
 LightTexEffect->SetTexture(TexHandle, tex);

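2) Rendering
Get the technique handle with GetTechniqueByName

D3DXHANDLE FogTechHandle = 0;
FogTechHandle = FogEffect->GetTechniqueByName("Fog");

Activate the technique handle with SetTechnique

FogEffect->SetTechnique( FogTechHandle );

Use Begin to start the technique and get the number of passes

UINT numPasses = 0;
FogEffect->Begin(&numPasses, 0);

Use BeginPass to start a pass.

After BeginPass, call CommitChanges to commit any parameter changes made to the current pass; it must be called before drawing.

Use the draw functions to actually draw the object; the pass has already set up the vertex transform, render states, sampler states, material and lighting, and so on.

EndPass ends the pass; not ending it causes an internal error and a crash.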

for(int i = 0; i < numPasses; i++)
  {
   FogEffect->BeginPass(i);

   if( TheTerrain )
    TheTerrain->draw(&I, false);

   FogEffect->EndPass();
  }

Use End to end the technique.

FogEffect->End();

3) Clean up the effect object

void Cleanup()
{
 d3d::Delete<Terrain*>(TheTerrain);
 d3d::Release<ID3DXEffect*>(FogEffect);
}

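Development Notes
1) Vertex shader example: toon shading, using a grayscale texture
The cosine of the angle between the vertex normal and the light direction is used as the texture coordinate u of the grayscale texture, with v = 0.5, so that the texture is sampled according to the lighting.
Outlines for toon shading can be drawn by giving silhouette edges a thickness. For every edge of every face in the mesh, generate a quad that starts out degenerate (a line); the original faces stay unchanged. A test for whether the edge is a silhouette edge decides whether two of the new quad's vertices are offset. For a silhouette edge this produces a new outline face that gets shaded, which draws the outline.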
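
The texture-coordinate computation for toon shading can be tried out on the CPU. The sketch below assumes both vectors are unit length and that the light vector points from the surface toward the light; the three-texel ramp in the usage note is a made-up example of a grayscale shade texture. `Vec3`, `dot3`, `toonU`, and `sampleRamp` are hypothetical names.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

using Vec3 = std::array<float, 3>;

inline float dot3(const Vec3& a, const Vec3& b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// u = max(0, L . N): the cosine of the angle between light and normal,
// clamped to zero when the surface faces away from the light. v stays 0.5.
inline float toonU(const Vec3& toLight, const Vec3& normal) {
    return std::max(0.0f, dot3(toLight, normal));
}

// Point-sample (no filtering) a 1D ramp of N texels at coordinate u in [0, 1].
template <std::size_t N>
inline float sampleRamp(const std::array<float, N>& ramp, float u) {
    static_assert(N > 0, "ramp must contain at least one texel");
    assert(u == u);  // reject NaN
    const float clamped = std::min(std::max(u, 0.0f), 1.0f);
    const std::size_t idx =
        std::min(static_cast<std::size_t>(clamped * N), N - 1);
    return ramp[idx];
}
```

With a ramp such as {0.3, 0.6, 1.0}, every surface point falls into one of three flat intensity bands, which gives the characteristic cartoon look; point filtering keeps the band borders sharp.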

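2) Multitexturing in the pixel shader: for static objects and static scenes, blending the scene textures with light maps
gives a static lighting result. Dynamic objects still need light sources and real-time lighting to look good.

A pixel shader color vector is 4D.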

vector a = a0 + vector(-fMoHuScale, -fMoHuScale,0.0f,0.0f); 
vector b = a0 + vector(-fMoHuScale, fMoHuScale,0.0f,0.0f);
vector c = a0 + vector(fMoHuScale, -fMoHuScale,0.0f,0.0f);
vector d = a0 + vector(fMoHuScale, fMoHuScale,0.0f,0.0f);
// combine texel colors
vector cc = (a + b + c + d) / 16.0f;

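If a color value is of type D3DCOLOR and is assigned with D3DCOLOR_ARGB, it is in ARGB order; FogColor in an effect file is an example.
If a color value is of type D3DCOLORVALUE, for example the colors of lights and materials, it is in RGBA order.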
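
The difference between the two color layouts is easy to get wrong, so here is a small sketch of the conversion. `packArgb` reproduces the bit layout of a 32-bit ARGB color (alpha in the top byte), and `ColorRGBA` is a hypothetical stand-in with the same field order as D3DCOLORVALUE; neither name comes from the D3D headers.

```cpp
#include <cassert>
#include <cstdint>

// Float color in r, g, b, a field order (the order D3DCOLORVALUE uses).
struct ColorRGBA { float r, g, b, a; };

// Pack four 8-bit channels into a 32-bit value laid out as 0xAARRGGBB.
inline std::uint32_t packArgb(unsigned a, unsigned r, unsigned g, unsigned b) {
    assert(a <= 255u && r <= 255u && g <= 255u && b <= 255u);
    return (static_cast<std::uint32_t>(a) << 24) |
           (static_cast<std::uint32_t>(r) << 16) |
           (static_cast<std::uint32_t>(g) << 8)  |
            static_cast<std::uint32_t>(b);
}

// Unpack 0xAARRGGBB into floats in [0, 1], stored in RGBA order.
inline ColorRGBA toRgba(std::uint32_t argb) {
    ColorRGBA c;
    c.r = static_cast<float>((argb >> 16) & 0xFFu) / 255.0f;
    c.g = static_cast<float>((argb >> 8)  & 0xFFu) / 255.0f;
    c.b = static_cast<float>( argb        & 0xFFu) / 255.0f;
    c.a = static_cast<float>((argb >> 24) & 0xFFu) / 255.0f;
    return c;
}
```

For instance, the gray fog color 0x00CCCCCC used in the Fog technique is `packArgb(0x00, 0xCC, 0xCC, 0xCC)`: alpha 0 followed by equal red, green, and blue.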

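3) Constants in the constant table really are constant. To modify one inside HLSL, either declare another variable to work with, or load the shader file with the flags D3DXSHADER_ENABLE_BACKWARDS_COMPATIBILITY | D3DXSHADER_DEBUG; after debugging, use D3DXSHADER_SKIPVALIDATION to skip validation.

4) Keep computations as simple as possible; too many global variables, or too many temporaries inside an algorithm, can make compilation fail.

5) With several shaders, each DrawPrimitive or DrawSubset call submits one batch.
For pixel shaders this causes no problem. For vertex shaders drawing different objects, call SetTransform(D3DTS_WORLD, &curObjMatrix) to specify again how the current object's coordinate system maps into world space, and then render.

6) Outlining silhouette edges: notes on triangle mesh adjacency information
You need to know the vertex type of the mesh being processed, for example the vertex structure of the objects that DX creates by default, and you need to obtain or generate adjacency information with the mesh object's GenerateAdjacency. The vertex structures are: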

struct MeshVertex
{
 D3DXVECTOR3 position;
 D3DXVECTOR3 normal;
 static const DWORD FVF;
};
struct EdgeVertex
{
 D3DXVECTOR3 position;
 D3DXVECTOR3 normal;
 D3DXVECTOR3 faceNormal1;
 D3DXVECTOR3 faceNormal2;
};

Vertex declaration:

IDirect3DVertexDeclaration9* _decl;
D3DVERTEXELEMENT9 decl[] =
 {
  // offsets in bytes
  {0,  0, D3DDECLTYPE_FLOAT3, D3DDECLMETHOD_DEFAULT, D3DDECLUSAGE_POSITION, 0},
  {0, 12, D3DDECLTYPE_FLOAT3, D3DDECLMETHOD_DEFAULT, D3DDECLUSAGE_NORMAL,   0},
  {0, 24, D3DDECLTYPE_FLOAT3, D3DDECLMETHOD_DEFAULT, D3DDECLUSAGE_NORMAL,   1},
  {0, 36, D3DDECLTYPE_FLOAT3, D3DDECLMETHOD_DEFAULT, D3DDECLUSAGE_NORMAL,   2},
  D3DDECL_END()
 };
 hr = _device->CreateVertexDeclaration(decl, &_decl);
 if(FAILED(hr))
 {
  ::MessageBox(0, "CreateVertexDeclaration() - FAILED", 0, 0);
  return false;
 }
void SilhouetteEdges::render()
{
 // The vertex declaration must match the input of the vertex or pixel shader; see EdgeVertex
 _device->SetVertexDeclaration(_decl);
 _device->SetStreamSource(0, _vb, 0, sizeof(EdgeVertex));
 _device->SetIndices(_ib);
 _device->DrawIndexedPrimitive(
  D3DPT_TRIANGLELIST, 0, 0, _numVerts, 0, _numFaces);
}
d3d::Release<IDirect3DVertexDeclaration9*>(_decl);

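(1). Think in terms of edges rather than vertices: every edge gets 4 vertices. A non-silhouette edge stays a degenerate quad, whose degenerate triangles are not drawn; a silhouette edge has vertices moved apart, so its triangles
are drawn.
(2). For a silhouette edge, two of the four vertices are offset and the other two stay put, which forms triangles; rendering those triangles gives the outline.
(3). To test for a silhouette edge, take the vertex's position vector in camera space (the vector from the camera origin to the vertex) and, for each of the edge's 4 vertices, take its dot product with
the normal of the current triangle and with the normal of the triangle adjacent across the edge. If the two dot products have opposite signs, the edge is a silhouette edge, and the vertex is moved along its vertex normal and shaded.
If they have the same sign, it is not a silhouette edge; no offset is applied and it is treated as degenerate triangles. For example: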

extern matrix WorldViewMatrix;
extern matrix ProjMatrix;

static vector Black = {0.9f, 0.0f, 0.0f, 0.0f};

//
// Structures
//

struct VS_INPUT
{
 vector position    : POSITION;
 vector normal      : NORMAL0;
 vector faceNormal1 : NORMAL1;
 vector faceNormal2 : NORMAL2;
};

struct VS_OUTPUT
{
 vector position : POSITION;
 vector diffuse  : COLOR;
};

//
// Main
//

VS_OUTPUT Main(VS_INPUT input)
{
 // zero out each member in output
 VS_OUTPUT output = (VS_OUTPUT)0;

 // transform position to view space
 input.position = mul(input.position, WorldViewMatrix);

 // Compute a vector in the direction of the vertex
 // from the eye.  Recall the eye is at the origin
 // in view space - eye is just camera position.
 vector eyeToVertex = input.position;

 // transform normals to view space.  Set w
 // components to zero since we're transforming vectors.
 // Assume there are no scalings in the world
 // matrix as well.
 input.normal.w      = 0.0f;
 input.faceNormal1.w = 0.0f;
 input.faceNormal2.w = 0.0f;

 input.normal      = mul(input.normal,      WorldViewMatrix);
 input.faceNormal1 = mul(input.faceNormal1, WorldViewMatrix);
 input.faceNormal2 = mul(input.faceNormal2, WorldViewMatrix);

 // compute the cosine of the angles between
 // the eyeToVertex vector and the face normals.
 float dot0 = dot(eyeToVertex, input.faceNormal1);
 float dot1 = dot(eyeToVertex, input.faceNormal2);

 // if cosines are different signs (positive/negative)
 // than we are on a silhouette edge.  Do the signs
 // differ?
 if( (dot0 * dot1) < 0.0f )
 {
 // yes, then this vertex is on a silhouette edge,
 // offset the vertex position by some scalar in the
 // direction of the vertex normal.
 input.position += 0.05f * input.normal;
 }

 // transform to homogeneous clip space
 output.position = mul(input.position, ProjMatrix);

 // set outline color
 output.diffuse = Black;

 return output;
}
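
The same sign test the shader performs can be checked on the CPU. This is a sketch in view space (eye at the origin) using plain arrays instead of D3DX types; `isSilhouetteEdge` and `outlinePosition` are hypothetical names, and the thickness value is whatever the caller chooses (the shader above uses 0.05).

```cpp
#include <array>
#include <cassert>

using Vec3 = std::array<float, 3>;

inline float dot3(const Vec3& a, const Vec3& b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// An edge is on the silhouette when one adjacent face points toward the eye
// and the other points away, i.e. the two dot products differ in sign.
inline bool isSilhouetteEdge(const Vec3& eyeToVertex,
                             const Vec3& faceNormal1,
                             const Vec3& faceNormal2) {
    return dot3(eyeToVertex, faceNormal1) * dot3(eyeToVertex, faceNormal2) < 0.0f;
}

// Push the vertex out along its vertex normal only on silhouette edges;
// elsewhere the quad stays degenerate and produces no visible triangles.
inline Vec3 outlinePosition(const Vec3& viewPos, const Vec3& vertexNormal,
                            const Vec3& faceNormal1, const Vec3& faceNormal2,
                            float thickness) {
    assert(thickness >= 0.0f);
    if (!isSilhouetteEdge(viewPos, faceNormal1, faceNormal2)) return viewPos;
    return Vec3{{viewPos[0] + thickness * vertexNormal[0],
                 viewPos[1] + thickness * vertexNormal[1],
                 viewPos[2] + thickness * vertexNormal[2]}};
}
```

Because only the sign of each dot product matters, the eye-to-vertex vector does not need to be normalized for this test.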

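(4). Build the vertex buffer and index buffer for the edges from the mesh and its adjacency information. The useful relationship is this: the adjacency buffer has the same size as the index buffer and corresponds to it entry by entry, so it is keyed like the index buffer (each entry stands for an edge).
The value is USHRT_MAX in DX when there is no adjacent triangle; otherwise it is the number of the triangle adjacent to the current one, which stands for the triangle at index positions 3*value, 3*value + 1, 3*value + 2.
The values in the index buffer are in turn keys into the vertex buffer, so the adjacency information leads to the vertex data of the adjacent triangle. For example: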

for(int i = 0; i < mesh->GetNumFaces(); i++)
{
    // Index positions (3*i, 3*i + 1, 3*i + 2) in the index buffer define triangle i
    WORD index0 = indices[i * 3];
    WORD index1 = indices[i * 3 + 1];
    WORD index2 = indices[i * 3 + 2];

    // Now extract the triangle's vertex positions.
    // Index buffer values are subscripts into the vertex buffer.
    D3DXVECTOR3 v0 = vertices[index0].position;
    D3DXVECTOR3 v1 = vertices[index1].position;
    D3DXVECTOR3 v2 = vertices[index2].position;

    // The adjacency buffer parallels the index buffer: entries
    // (3*i, 3*i + 1, 3*i + 2) belong to triangle i, and each value is the
    // number of the triangle adjacent across that edge.
    WORD faceIndex0 = adj[i * 3];
    WORD faceIndex1 = adj[i * 3 + 1];
    WORD faceIndex2 = adj[i * 3 + 2];

    // With the adjacent triangle's number we can read its indices,
    // and through them its vertices.
    if( faceIndex0 != USHRT_MAX ) // is there an adjacent triangle?
    {
        WORD i0 = indices[faceIndex0 * 3];
        WORD i1 = indices[faceIndex0 * 3 + 1];
        WORD i2 = indices[faceIndex0 * 3 + 2];

        // These shadow the outer v0..v2: they are the neighbour's vertices.
        D3DXVECTOR3 v0 = vertices[i0].position;
        D3DXVECTOR3 v1 = vertices[i1].position;
        D3DXVECTOR3 v2 = vertices[i2].position;

        D3DXVECTOR3 edge0 = v1 - v0;
        D3DXVECTOR3 edge1 = v2 - v0;
        D3DXVECTOR3 faceNormal0;
        D3DXVec3Cross(&faceNormal0, &edge0, &edge1);
        D3DXVec3Normalize(&faceNormal0, &faceNormal0);
    }
}
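
The lookup chain described above (adjacency entry, then index buffer, then vertex buffer) can be exercised without D3DX. The sketch below uses plain containers and assumes that adjacency entry 3*face + k refers to the edge from the face's k-th vertex to its next vertex, with 0xFFFF (USHRT_MAX) meaning "no neighbour". All names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

using Vec3 = std::array<float, 3>;
const std::uint16_t kNoNeighbour = 0xFFFF;  // same value as USHRT_MAX

inline Vec3 sub(const Vec3& a, const Vec3& b) {
    return Vec3{{a[0] - b[0], a[1] - b[1], a[2] - b[2]}};
}

inline Vec3 cross(const Vec3& a, const Vec3& b) {
    return Vec3{{a[1] * b[2] - a[2] * b[1],
                 a[2] * b[0] - a[0] * b[2],
                 a[0] * b[1] - a[1] * b[0]}};
}

// Unit normal of triangle `face`, from indices (3*face, 3*face+1, 3*face+2).
inline Vec3 faceNormal(const std::vector<Vec3>& vertices,
                       const std::vector<std::uint16_t>& indices,
                       std::size_t face) {
    assert(face * 3 + 2 < indices.size());
    const Vec3& v0 = vertices[indices[face * 3]];
    const Vec3& v1 = vertices[indices[face * 3 + 1]];
    const Vec3& v2 = vertices[indices[face * 3 + 2]];
    const Vec3 n = cross(sub(v1, v0), sub(v2, v0));
    const float len = std::sqrt(n[0] * n[0] + n[1] * n[1] + n[2] * n[2]);
    assert(len > 0.0f);  // degenerate triangles have no normal
    return Vec3{{n[0] / len, n[1] / len, n[2] / len}};
}

// Writes the normal of the triangle adjacent across edge `edge` (0..2) of
// `face` into `out`. Returns false when the edge has no neighbour.
inline bool neighbourNormal(const std::vector<Vec3>& vertices,
                            const std::vector<std::uint16_t>& indices,
                            const std::vector<std::uint16_t>& adjacency,
                            std::size_t face, std::size_t edge, Vec3& out) {
    assert(edge < 3 && face * 3 + edge < adjacency.size());
    const std::uint16_t neighbour = adjacency[face * 3 + edge];
    if (neighbour == kNoNeighbour) return false;
    out = faceNormal(vertices, indices, neighbour);
    return true;
}
```

For a boundary edge, a caller building the EdgeVertex data would typically fall back to some default for the second face normal; which default to use is a design choice not covered here.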

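(5). The index buffer follows from the layout of the vertex buffer: the vertex buffer holds four vertices per edge, so each edge needs 6 indices, in a fixed
order, for example: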

void SilhouetteEdges::genEdgeIndices(ID3DXMesh* mesh)
{
 DWORD numEdges = mesh->GetNumFaces() * 3;

 _numFaces = numEdges * 2;

 _device->CreateIndexBuffer(
 numEdges * 6 * sizeof(WORD), // 2 triangles per edge
 D3DUSAGE_WRITEONLY,
 D3DFMT_INDEX16,
 D3DPOOL_MANAGED,
 &_ib,
 0);

 WORD* indices = 0;

 _ib->Lock(0, 0, (void**)&indices, 0);

 // 0        1
 // *--------*
 // |  edge  |
 // *--------*
 // 2        3

 for(UINT i = 0; i < numEdges; i++)
 {
 // Six indices to define the triangles of the edge,
 // so every edge we skip six entries in the
 // index buffer.  Four vertices to define the edge,
 // so every edge we skip four entries in the
 // vertex buffer.
 indices[i * 6]     = i * 4 + 0;
 indices[i * 6 + 1] = i * 4 + 1;
 indices[i * 6 + 2] = i * 4 + 2;
 indices[i * 6 + 3] = i * 4 + 1;
 indices[i * 6 + 4] = i * 4 + 3;
 indices[i * 6 + 5] = i * 4 + 2;
 }

 _ib->Unlock();
}
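
The index pattern in genEdgeIndices can be generated and inspected without a device. This sketch fills a std::vector instead of a locked index buffer; `buildEdgeIndices` is a hypothetical name. It also makes explicit a limit that 16-bit indices impose: with four vertices per edge, at most 16384 edges fit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Two triangles per edge quad: (0, 1, 2) and (1, 3, 2), where 0..3 are the
// quad's four vertices in the order they appear in the vertex buffer.
inline std::vector<std::uint16_t> buildEdgeIndices(std::size_t numEdges) {
    // 16-bit indices address at most 65536 vertices, i.e. 16384 edge quads.
    assert(numEdges * 4 <= 65536);
    static const int pattern[6] = {0, 1, 2, 1, 3, 2};
    std::vector<std::uint16_t> indices;
    indices.reserve(numEdges * 6);
    for (std::size_t e = 0; e < numEdges; ++e) {
        for (int k = 0; k < 6; ++k) {
            // Skip four vertices per edge in the vertex buffer.
            indices.push_back(static_cast<std::uint16_t>(e * 4 + pattern[k]));
        }
    }
    return indices;
}
```

For two edges this produces 0 1 2 1 3 2 followed by 4 5 6 5 7 6. Meshes with more edges than the 16-bit limit allows would need 32-bit indices instead.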

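(6). Enable the vertex shader and draw the silhouette edges: the regular mesh is drawn first as usual, and the silhouette edges are drawn afterwards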

Device->SetVertexShader(ToonShader);
Device->SetTexture(0, ShadeTex);
Meshes[i]->DrawSubset(0);

Device->SetVertexShader(OutlineShader);
Device->SetTexture(0, 0);
OutlineConstTable->SetMatrix(
 Device,
 OutlineWorldViewHandle,
 &worldView);

OutlineConstTable->SetMatrix(
 Device,
 OutlineProjHandle,
 &ProjMatrix);

MeshOutlines[i]->render();

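7) Reference information for effect files is available in the SDK documentation, for example the built-in types, intrinsic functions, and state types used in effect files.
8) When drawing an object by attribute subsets, the DrawSubset index relates to the mesh's materials and has nothing to do with the number of passes in the .fx file.
The pass count only says that the same object must be drawn several times under different render states and different vertex and pixel shaders.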

UINT numPasses = 0;
LightTexEffect->Begin(&numPasses, 0);

for(UINT i = 0; i < numPasses; i++)
{
    // Calling BeginPass here, outside the subset loop, left some vertex
    // transforms unapplied and rendering failed:
    //LightTexEffect->BeginPass(i);
    for(int j = 0; j < int(Mtrls.size()); j++)
    {
        // Begin pass i here instead, so that every object or attribute
        // subset is drawn under pass i and its vertices and pixels
        // render correctly.
        LightTexEffect->BeginPass(i);
        LightTexEffect->CommitChanges();

        Device->SetMaterial(&(Mtrls[j]));
        Device->SetTexture(0, Textures[j]);
        Mesh->DrawSubset(j);

        LightTexEffect->EndPass();
    }
}
LightTexEffect->End();

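8) Fog can raise the realism of a scene to a new level and can simulate certain kinds of weather.
Fog also hides the far clipping plane, which would otherwise look unnatural to the player.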

technique Fog
{
    pass P0
    {
        //
        // Set Misc render states.

        pixelshader      = null;
        vertexshader     = null;
        fvf              = XYZ | Normal;
        Lighting         = true;
        NormalizeNormals = true;
        SpecularEnable   = false;

        //
        // Fog States

        FogVertexMode = LINEAR; // linear fog function
        FogStart      = 50.0f;  // fog starts 50 units away from viewpoint
        FogEnd        = 300.0f; // fog ends 300 units away from viewpoint

        FogColor      = 0x00CCCCCC; // gray
        FogEnable     = true;       // enable
    }
}
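
What the LINEAR fog mode with FogStart and FogEnd computes can be sketched as follows. The formula is the usual linear fog factor, f = (end - d) / (end - start), clamped to [0, 1], where f = 1 means no fog; treating the distance d as a plain eye-space distance is a simplifying assumption here. `linearFogFactor` and `applyFog` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>

// Linear fog factor: 1 at or before fogStart (no fog), 0 at or beyond fogEnd.
inline float linearFogFactor(float distance, float fogStart, float fogEnd) {
    assert(fogEnd > fogStart);
    const float f = (fogEnd - distance) / (fogEnd - fogStart);
    return std::min(1.0f, std::max(0.0f, f));
}

// Blend one color channel of the object with the fog color.
inline float applyFog(float objectChannel, float fogChannel, float fogFactor) {
    assert(fogFactor >= 0.0f && fogFactor <= 1.0f);
    return fogFactor * objectChannel + (1.0f - fogFactor) * fogChannel;
}
```

With the values from the technique above (start 50, end 300), an object 175 units away gets a factor of 0.5, so its color is an even mix of its own color and the gray fog color; anything beyond 300 units is pure fog color, which is what hides the far clipping plane.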

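9) After a resource created in the program has been set on an effect parameter handle, the program can release its own copy; the effect keeps hold of the data itself.
For example: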

IDirect3DTexture9* tex = 0;
 D3DXCreateTextureFromFile(Device, "toonshade.bmp", &tex);
 ToonEffect->SetTexture(ShadeTexHandle, tex);
 d3d::Release<IDirect3DTexture9*>(tex);

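10) An effect file can hold vertex shaders and pixel shaders. Besides the render state and texture settings in a pass, shaders are used by
compiling them directly in the script and assigning the result to the built-in shader types; the built-in sampler types work the same way, for example: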

technique Toon
{
    pass P0
    {
        vertexShader = compile vs_3_0 Main();

        Sampler[0] = (ShadeSampler);
    }
}

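----------------
Copyright notice: this is an original article by CSDN blogger "Sam-Cen", licensed under CC 4.0 BY-SA. Please include the original source link and this notice when reposting.
Original link: https://blog.csdn.net/Blues1021/article/details/47093487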

HLSL API

Intrinsic Functions (DirectX HLSL)
The following table lists the intrinsic functions available in HLSL. Each function has a brief description, and a link to a reference page that has more detail about the input argument and return type.

Name Syntax Description
abs abs(x) Absolute value (per component).
acos acos(x) Returns the arccosine of each component of x.
all all(x) Test if all components of x are nonzero.
any any(x) Test if any component of x is nonzero.
asfloat asfloat(x) Convert the input type to a float.
asin asin(x) Returns the arcsine of each component of x.
asint asint(x) Convert the input type to an integer.
asuint asuint(x) Convert the input type to an unsigned integer.
atan atan(x) Returns the arctangent of x.
atan2 atan2(y, x) Returns the arctangent of of two values (x,y).
ceil ceil(x) Returns the smallest integer which is greater than or equal to x.
clamp clamp(x, min, max) Clamps x to the range [min, max].
clip clip(x) Discards the current pixel, if any component of x is less than zero.
cos cos(x) Returns the cosine of x.
cosh cosh(x) Returns the hyperbolic cosine of x.
cross cross(x, y) Returns the cross product of two 3D vectors.
D3DCOLORtoUBYTE4 D3DCOLORtoUBYTE4(x) Swizzles and scales components of the 4D vector x to compensate for the lack of UBYTE4 support in some hardware.
ddx ddx(x) Returns the partial derivative of x with respect to the screen-space x-coordinate.
ddy ddy(x) Returns the partial derivative of x with respect to the screen-space y-coordinate.
degrees degrees(x) Converts x from radians to degrees.
determinant determinant(m) Returns the determinant of the square matrix m.
distance distance(x, y) Returns the distance between two points.
dot dot(x, y) Returns the dot product of two vectors.
exp exp(x) Returns the base-e exponent.
exp2 exp2(x) Base 2 exponent (per component).
faceforward faceforward(n, i, ng) Returns -n * sign(•(i, ng)).
floor floor(x) Returns the greatest integer which is less than or equal to x.
fmod fmod(x, y) Returns the floating point remainder of x/y.
frac frac(x) Returns the fractional part of x.
frexp frexp(x, exp) Returns the mantissa and exponent of x.
fwidth fwidth(x) Returns abs(ddx(x)) + abs(ddy(x))
GetRenderTargetSampleCount GetRenderTargetSampleCount() Returns the number of render-target samples.
GetRenderTargetSamplePosition GetRenderTargetSamplePosition(x) Returns a sample position (x,y) for a given sample index.
isfinite isfinite(x) Returns true if x is finite, false otherwise.
isinf isinf(x) Returns true if x is +INF or -INF, false otherwise.
isnan isnan(x) Returns true if x is NAN or QNAN, false otherwise.
ldexp ldexp(x, exp) Returns x * 2exp
length length(v) Returns the length of the vector v.
lerp lerp(x, y, s) Returns x + s(y - x).
lit lit(n • l, n • h, m) Returns a lighting vector (ambient, diffuse, specular, 1)
log log(x) Returns the base-e logarithm of x.
log10 log10(x) Returns the base-10 logarithm of x.
log2 log2(x) Returns the base-2 logarithm of x.
max max(x, y) Selects the greater of x and y.
min min(x, y) Selects the lesser of x and y.
modf modf(x, out ip) Splits the value x into fractional and integer parts.
mul mul(x, y) Performs matrix multiplication using x and y.
noise noise(x) Generates a random value using the Perlin-noise algorithm.
normalize normalize(x) Returns a normalized vector.
pow pow(x, y) Returns xy.
radians radians(x) Converts x from degrees to radians.
reflect reflect(i, n) Returns a reflection vector.
refract refract(i, n, R) Returns the refraction vector.
round round(x) Rounds x to the nearest integer
rsqrt rsqrt(x) Returns 1 / sqrt(x)
saturate saturate(x) Clamps x to the range [0, 1]
sign sign(x) Computes the sign of x.
sin sin(x) Returns the sine of x
sincos sincos(x, out s, out c) Returns the sine and cosine of x.
sinh sinh(x) Returns the hyperbolic sine of x
smoothstep smoothstep(min, max, x) Returns a smooth Hermite interpolation between 0 and 1.
sqrt sqrt(x) Square root (per component)
step step(a, x) Returns (x >= a) ? 1 : 0
tan tan(x) Returns the tangent of x
tanh tanh(x) Returns the hyperbolic tangent of x
tex1D tex1D(s, t) 1D texture lookup.
tex1Dbias tex1Dbias(s, t) 1D texture lookup with bias.
tex1Dgrad tex1Dgrad(s, t, ddx, ddy) 1D texture lookup with a gradient.
tex1Dlod tex1Dlod(s, t) 1D texture lookup with LOD.
tex1Dproj tex1Dproj(s, t) 1D texture lookup with projective divide.
tex2D tex2D(s, t) 2D texture lookup.
tex2Dbias tex2Dbias(s, t) 2D texture lookup with bias.
tex2Dgrad tex2Dgrad(s, t, ddx, ddy) 2D texture lookup with a gradient.
tex2Dlod tex2Dlod(s, t) 2D texture lookup with LOD.
tex2Dproj tex2Dproj(s, t) 2D texture lookup with projective divide.
tex3D tex3D(s, t) 3D texture lookup.
tex3Dbias tex3Dbias(s, t) 3D texture lookup with bias.
tex3Dgrad tex3Dgrad(s, t, ddx, ddy) 3D texture lookup with a gradient.
tex3Dlod tex3Dlod(s, t) 3D texture lookup with LOD.
tex3Dproj tex3Dproj(s, t) 3D texture lookup with projective divide.
texCUBE texCUBE(s, t) Cube texture lookup.
texCUBEbias texCUBEbias(s, t) Cube texture lookup with bias.
texCUBEgrad texCUBEgrad(s, t, ddx, ddy) Cube texture lookup with a gradient.
texCUBElod texCUBElod(s, t) Cube texture lookup with LOD.
texCUBEproj texCUBEproj(s, t) Cube texture lookup with projective divide.
transpose transpose(m) Returns the transpose of the matrix m.
trunc trunc(x) Truncates floating-point value(s) to integer value(s)
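
Several entries in the table above are defined by a short formula. The following is a minimal sketch that rewrites four of them by hand, only to make those formulas concrete. The `My*` function names are hypothetical and not part of HLSL; real shaders should call the intrinsics directly.

```hlsl
// Hand-written equivalents of four intrinsics, following the formulas
// in the table. The My* names are hypothetical, for illustration only.

// lerp(x, y, s) = x + s * (y - x)
float MyLerp(float x, float y, float s)
{
    return x + s * (y - x);
}

// step(a, x) = (x >= a) ? 1 : 0
float MyStep(float a, float x)
{
    return (x >= a) ? 1.0f : 0.0f;
}

// saturate(x) clamps x to the range [0, 1]
float MySaturate(float x)
{
    return min(max(x, 0.0f), 1.0f);
}

// smoothstep(lo, hi, x): Hermite interpolation 3t^2 - 2t^3 applied to
// the normalized, clamped parameter t. Assumes lo < hi.
float MySmoothstep(float lo, float hi, float x)
{
    float t = MySaturate((x - lo) / (hi - lo));
    return t * t * (3.0f - 2.0f * t);
}
```

For example, `MyLerp(0.0f, 10.0f, 0.25f)` evaluates to 2.5, and `MySmoothstep(0.0f, 1.0f, 0.5f)` evaluates to 0.5.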

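Table 3-1 HLSL intrinsic functions

Function Description

abs Computes the absolute value of the input.

acos Returns the arccosine of the input.

all Tests whether all components of the input are nonzero.

any Tests whether any component of the input is nonzero.

asin Returns the arcsine of the input.

atan Returns the arctangent of the input.

atan2 Returns the arctangent of y/x.

ceil Returns the smallest integer greater than or equal to the input.

clamp Clamps the input to the range [min, max].

clip Discards the current pixel if any component of the input vector is less than 0.

cos Returns the cosine of the input.

cosh Returns the hyperbolic cosine of the input.

cross Returns the cross product of two 3D vectors.

ddx Returns the partial derivative with respect to the screen-space x-coordinate.

ddy Returns the partial derivative with respect to the screen-space y-coordinate.

degrees Converts radians to degrees.

determinant Returns the determinant of the input matrix.

distance Returns the distance between two input points.

dot Returns the dot product of two vectors.

exp Returns e raised to the power of the input.

exp2 Returns 2 raised to the power of the input.

faceforward Returns -n * sign(dot(i, ng)), which orients the surface normal n to face against the incident vector i.

floor Returns the largest integer less than or equal to x.

fmod Returns the floating-point remainder of a / b.

frac Returns the fractional part of the input.

frexp Returns the mantissa and exponent of the input.

fwidth Returns abs(ddx(x)) + abs(ddy(x)).

isfinite Returns true if the input is finite, false otherwise.

isinf Returns true if the input is infinite.

isnan Returns true if the input is NAN or QNAN.

ldexp The inverse of frexp; returns x * 2 ^ exp.

length Returns the length of the input vector.

lerp Linearly interpolates between the input values.

lit Returns a lighting vector (ambient, diffuse, specular, 1).

log Returns the base-e logarithm.

log10 Returns the base-10 logarithm.

log2 Returns the base-2 logarithm.

max Returns the greater of the two inputs.

min Returns the lesser of the two inputs.

modf Splits the input into integer and fractional parts.

mul Returns the matrix product of the inputs.

normalize Returns the normalized vector, defined as x / length(x).

pow Returns the input raised to the specified power.

radians Converts degrees to radians.

reflect Returns the reflection of the incident ray i about the surface normal n.

refract Returns the refraction vector v for incident ray i, surface normal n, and refraction index eta.

round Returns the integer nearest to the input.

rsqrt Returns the reciprocal of the square root of the input.

saturate Clamps the input to the range [0, 1].

sign Computes the sign of the input.

sin Computes the sine of the input.

sincos Returns the sine and cosine of the input.

sinh Returns the hyperbolic sine of x.

smoothstep Returns a smooth interpolation between the input values.

sqrt Returns the square root of the input.

step Returns (x >= a) ? 1 : 0.

tan Returns the tangent of the input.

tanh Returns the hyperbolic tangent of the input.

transpose Returns the transpose of the input matrix.

tex1D* 1D texture lookup.

tex2D* 2D texture lookup.

tex3D* 3D texture lookup.

texCUBE* Cube texture lookup.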
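
As an illustration of how several of these intrinsics combine in practice, here is a hedged sketch of a Phong-style lighting term using `normalize`, `dot`, `saturate`, `reflect`, `pow`, and `step`. The function name `PhongTerm` and all parameter names are hypothetical. It assumes `lightDir` and `viewDir` point away from the surface and that `shininess` is greater than zero.

```hlsl
// Sketch of a Phong-style lighting term built from intrinsics.
// PhongTerm and its parameter names are hypothetical.
// Assumptions: lightDir and viewDir point away from the surface,
// and shininess > 0.
float3 PhongTerm(float3 normal, float3 lightDir, float3 viewDir,
                 float3 diffuseColor, float3 specularColor,
                 float shininess)
{
    float3 n = normalize(normal);
    float3 l = normalize(lightDir);
    float3 v = normalize(viewDir);

    // Diffuse: cosine of the angle between n and l, clamped to [0, 1].
    float nDotL = saturate(dot(n, l));

    // reflect expects the incident vector (light -> surface), hence -l.
    float3 r = reflect(-l, n);

    // Specular: clamped cosine between the reflected ray and the view
    // vector, raised to the shininess exponent.
    float spec = pow(saturate(dot(r, v)), shininess);

    // step(a, x) returns (x >= a) ? 1 : 0, used here to suppress
    // highlights on surfaces that face away from the light.
    float facing = step(0.0001f, nDotL);

    return diffuseColor * nDotL + specularColor * spec * facing;
}
```

The `saturate` call before `pow` keeps the base non-negative, which avoids undefined results when a negative value is raised to a fractional power.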

Name Description Minimum shader model
abs Absolute value (per component). 1¹
acos Returns the arccosine of each component of x. 1¹
all Test if all components of x are nonzero. 1¹
AllMemoryBarrier Blocks execution of all threads in a group until all memory accesses have been completed. 5
AllMemoryBarrierWithGroupSync Blocks execution of all threads in a group until all memory accesses have been completed and all threads in the group have reached this call. 5
any Test if any component of x is nonzero. 1¹
asdouble Reinterprets a cast value into a double. 5
asfloat Convert the input type to a float. 4
asin Returns the arcsine of each component of x. 1¹
asint Convert the input type to an integer. 4
asuint Reinterprets the bit pattern of a 64-bit type to a uint. 5
asuint Convert the input type to an unsigned integer. 4
atan Returns the arctangent of x. 1¹
atan2 Returns the arctangent of two values (x,y). 1¹
ceil Returns the smallest integer which is greater than or equal to x. 1¹
clamp Clamps x to the range [min, max]. 1¹
clip Discards the current pixel, if any component of x is less than zero. 1¹
cos Returns the cosine of x. 1¹
cosh Returns the hyperbolic cosine of x. 1¹
countbits Counts the number of bits (per component) in the input integer. 5
cross Returns the cross product of two 3D vectors. 1¹
D3DCOLORtoUBYTE4 Swizzles and scales components of the 4D vector x to compensate for the lack of UBYTE4 support in some hardware. 1¹
ddx Returns the partial derivative of x with respect to the screen-space x-coordinate. 2¹
ddx_coarse Computes a low precision partial derivative with respect to the screen-space x-coordinate. 5
ddx_fine Computes a high precision partial derivative with respect to the screen-space x-coordinate. 5
ddy Returns the partial derivative of x with respect to the screen-space y-coordinate. 2¹
ddy_coarse Computes a low precision partial derivative with respect to the screen-space y-coordinate. 5
ddy_fine Computes a high precision partial derivative with respect to the screen-space y-coordinate. 5
degrees Converts x from radians to degrees. 1¹
determinant Returns the determinant of the square matrix m. 1¹
DeviceMemoryBarrier Blocks execution of all threads in a group until all device memory accesses have been completed. 5
DeviceMemoryBarrierWithGroupSync Blocks execution of all threads in a group until all device memory accesses have been completed and all threads in the group have reached this call. 5
distance Returns the distance between two points. 1¹
dot Returns the dot product of two vectors. 1
dst Calculates a distance vector. 5
EvaluateAttributeAtCentroid Evaluates at the pixel centroid. 5
EvaluateAttributeAtSample Evaluates at the indexed sample location. 5
EvaluateAttributeSnapped Evaluates at the pixel centroid with an offset. 5
exp Returns the base-e exponent. 1¹
exp2 Base 2 exponent (per component). 1¹
f16tof32 Converts the float16 stored in the low-half of the uint to a float. 5
f32tof16 Converts an input into a float16 type. 5
faceforward Returns -n * sign(dot(i, ng)). 1¹
firstbithigh Gets the location of the first set bit starting from the highest order bit and working downward, per component. 5
firstbitlow Returns the location of the first set bit starting from the lowest order bit and working upward, per component. 5
floor Returns the greatest integer which is less than or equal to x. 1¹
fmod Returns the floating point remainder of x/y. 1¹
frac Returns the fractional part of x. 1¹
frexp Returns the mantissa and exponent of x. 2¹
fwidth Returns abs(ddx(x)) + abs(ddy(x)). 2¹
GetRenderTargetSampleCount Returns the number of render-target samples. 4
GetRenderTargetSamplePosition Returns a sample position (x,y) for a given sample index. 4
GroupMemoryBarrier Blocks execution of all threads in a group until all group shared accesses have been completed. 5
GroupMemoryBarrierWithGroupSync Blocks execution of all threads in a group until all group shared accesses have been completed and all threads in the group have reached this call. 5
InterlockedAdd Performs a guaranteed atomic add of value to the dest resource variable. 5
InterlockedAnd Performs a guaranteed atomic and. 5
InterlockedCompareExchange Atomically compares the input to the comparison value and exchanges the result. 5
InterlockedCompareStore Atomically compares the input to the comparison value. 5
InterlockedExchange Assigns value to dest and returns the original value. 5
InterlockedMax Performs a guaranteed atomic max. 5
InterlockedMin Performs a guaranteed atomic min. 5
InterlockedOr Performs a guaranteed atomic or. 5
InterlockedXor Performs a guaranteed atomic xor. 5
isfinite Returns true if x is finite, false otherwise. 1¹
isinf Returns true if x is +INF or -INF, false otherwise. 1¹
isnan Returns true if x is NAN or QNAN, false otherwise. 1¹
ldexp Returns x * 2^exp. 1¹
length Returns the length of the vector v. 1¹
lerp Returns x + s(y - x). 1¹
lit Returns a lighting vector (ambient, diffuse, specular, 1). 1¹
log Returns the base-e logarithm of x. 1¹
log10 Returns the base-10 logarithm of x. 1¹
log2 Returns the base-2 logarithm of x. 1¹
mad Performs an arithmetic multiply/add operation on three values. 5
max Selects the greater of x and y. 1¹
min Selects the lesser of x and y. 1¹
modf Splits the value x into fractional and integer parts. 1¹
mul Performs matrix multiplication using x and y. 1
noise Generates a random value using the Perlin-noise algorithm. 1¹
normalize Returns a normalized vector. 1¹
pow Returns x^y. 1¹
Process2DQuadTessFactorsAvg Generates the corrected tessellation factors for a quad patch. 5
Process2DQuadTessFactorsMax Generates the corrected tessellation factors for a quad patch. 5
Process2DQuadTessFactorsMin Generates the corrected tessellation factors for a quad patch. 5
ProcessIsolineTessFactors Generates the rounded tessellation factors for an isoline. 5
ProcessQuadTessFactorsAvg Generates the corrected tessellation factors for a quad patch. 5
ProcessQuadTessFactorsMax Generates the corrected tessellation factors for a quad patch. 5
ProcessQuadTessFactorsMin Generates the corrected tessellation factors for a quad patch. 5
ProcessTriTessFactorsAvg Generates the corrected tessellation factors for a tri patch. 5
ProcessTriTessFactorsMax Generates the corrected tessellation factors for a tri patch. 5
ProcessTriTessFactorsMin Generates the corrected tessellation factors for a tri patch. 5
radians Converts x from degrees to radians. 1
rcp Calculates a fast, approximate, per-component reciprocal. 5
reflect Returns a reflection vector. 1
refract Returns the refraction vector. 1¹
reversebits Reverses the order of the bits, per component. 5
round Rounds x to the nearest integer. 1¹
rsqrt Returns 1 / sqrt(x). 1¹
saturate Clamps x to the range [0, 1]. 1
sign Computes the sign of x. 1¹
sin Returns the sine of x. 1¹
sincos Returns the sine and cosine of x. 1¹
sinh Returns the hyperbolic sine of x. 1¹
smoothstep Returns a smooth Hermite interpolation between 0 and 1. 1¹
sqrt Square root (per component). 1¹
step Returns (x >= a) ? 1 : 0. 1¹
tan Returns the tangent of x. 1¹
tanh Returns the hyperbolic tangent of x. 1¹
tex1D(s, t) 1D texture lookup. 1
tex1D(s, t, ddx, ddy) 1D texture lookup. 2¹
tex1Dbias 1D texture lookup with bias. 2¹
tex1Dgrad 1D texture lookup with a gradient. 2¹
tex1Dlod 1D texture lookup with LOD. 3¹
tex1Dproj 1D texture lookup with projective divide. 2¹
tex2D(s, t) 2D texture lookup. 1¹
tex2D(s, t, ddx, ddy) 2D texture lookup. 2¹
tex2Dbias 2D texture lookup with bias. 2¹
tex2Dgrad 2D texture lookup with a gradient. 2¹
tex2Dlod 2D texture lookup with LOD. 3
tex2Dproj 2D texture lookup with projective divide. 2¹
tex3D(s, t) 3D texture lookup. 1¹
tex3D(s, t, ddx, ddy) 3D texture lookup. 2¹
tex3Dbias 3D texture lookup with bias. 2¹
tex3Dgrad 3D texture lookup with a gradient. 2¹
tex3Dlod 3D texture lookup with LOD. 3¹
tex3Dproj 3D texture lookup with projective divide. 2¹
texCUBE(s, t) Cube texture lookup. 1¹
texCUBE(s, t, ddx, ddy) Cube texture lookup. 2¹
texCUBEbias Cube texture lookup with bias. 2¹
texCUBEgrad Cube texture lookup with a gradient. 2¹
texCUBElod Cube texture lookup with LOD. 3¹
texCUBEproj Cube texture lookup with projective divide. 2¹
transpose Returns the transpose of the matrix m. 1
trunc Truncates floating-point value(s) to integer value(s). 1
¹ See the function's reference page for restrictions.
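
The minimum shader model column matters when choosing a compile target. As a hedged sketch, the alpha-tested pixel shader below uses only `tex2D(s, t)` and `clip`, both listed with a minimum shader model of 1, so it should be usable with a D3D9-style target such as ps_2_0. The sampler name `g_DiffuseSampler`, the entry point name `PSMain`, and the 0.5 threshold are assumptions for illustration.

```hlsl
// Hypothetical sampler; the application is assumed to bind a texture to it.
sampler2D g_DiffuseSampler;

// Alpha-tested pixel shader sketch (D3D9-style semantics).
// PSMain and the 0.5 alpha threshold are illustrative choices.
float4 PSMain(float2 uv : TEXCOORD0) : COLOR
{
    // Plain 2D texture lookup.
    float4 color = tex2D(g_DiffuseSampler, uv);

    // clip discards the current pixel if its argument is less than
    // zero, so pixels with alpha below 0.5 are dropped here.
    clip(color.a - 0.5f);

    return color;
}
```

Switching the lookup to `tex2Dlod`, which takes a float4 coordinate whose w component selects the mip level, would raise the requirement to shader model 3 according to the table.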
————————————————
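Copyright notice: This article is an original work by the CSDN blogger "博赢天下" and is licensed under CC 4.0 BY-SA. When reposting, please include the link to the original source and this notice.
Original link: https://blog.csdn.net/plaxbsga/article/details/52787860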