3.4 TensorFlow 2.x automatic differentiation: principles and functions in detail

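1.1 Automatic differentiation in TensorFlow 2.x

1.1.1 The GradientTape class for automatic differentiation

GradientTape performs automatic differentiation. Given an independent variable x and a dependent variable y computed from it, calling gradient(y, x) returns the derivative of y with respect to x. Inside the context defined by a GradientTape, trainable variables (tf.Variable) are watched and recorded by default, while plain tensors such as tf.constant must be watched explicitly with watch(). Recording consumes memory and other resources, so you can specify exactly which variables to watch in order to reduce the cost. Several independent variables can be passed at once: gradient(y, [x, p, g]) returns a list with one gradient tensor per variable. Higher-order derivatives can be computed by nesting tapes.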
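
As a minimal sketch of the behaviour described above (the names x, p and c and the function y are made up for illustration), the following shows a single gradient call with several sources, and what happens to a plain tensor that is never watched:

```python
import tensorflow as tf

x = tf.Variable(3.0)   # trainable variable: watched automatically
p = tf.Variable(2.0)   # trainable variable: watched automatically
c = tf.constant(5.0)   # plain tensor: not watched unless tape.watch(c) is called

with tf.GradientTape() as tape:
    y = x * x * p + c

# One gradient call with several sources returns one result per source.
dy_dx, dy_dp, dy_dc = tape.gradient(y, [x, p, c])

print(dy_dx)  # 2 * x * p = 12.0
print(dy_dp)  # x * x = 9.0
print(dy_dc)  # None, because c was never watched
```

Calling tape.watch(c) inside the context would make the third result 1.0 instead of None.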

1.1.2 Definition of the GradientTape class

class GradientTape(object):

(1) Constructor

def __init__(self, persistent=False, watch_accessed_variables=True):

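# persistent: defaults to False. By default, the resources held by a GradientTape are released after gradient() has been called once, so the tape cannot be used again. If persistent=True is specified, gradient() can be called multiple times on the same tape.

# watch_accessed_variables: whether the tape automatically watches the variables accessed in its context. It defaults to True, in which case trainable variables accessed inside the context are watched and recorded automatically, so g.watch(x) can be omitted for them. For fine-grained control over what is watched, pass watch_accessed_variables=False to the tape and specify the watched variables manually; this avoids recording everything and reduces unnecessary resource consumption.

Example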

import tensorflow as tf

x = tf.Variable(2.0)
w = tf.Variable(5.0)

with tf.GradientTape(watch_accessed_variables=False, persistent=True) as tape:
    tape.watch(x)
    y = x ** 2  # Gradients will be available for `x`.
    z = w ** 3  # No gradients will be available as `w` isn't being watched.

dy_dx = tape.gradient(y, x)
print(dy_dx)  # tf.Tensor(4.0, shape=(), dtype=float32)

# No gradients will be available as `w` isn't being watched.
dz_dw = tape.gradient(z, w)
print(dz_dw)  # None
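
The effect of the persistent flag can be sketched as follows. This is an illustrative snippet rather than part of the original example; the variable names are made up, and it relies on a non-persistent tape raising RuntimeError on a second gradient call:

```python
import tensorflow as tf

x = tf.Variable(3.0)

with tf.GradientTape() as tape:  # persistent defaults to False
    y = x * x
    z = y * y

dy_dx = tape.gradient(y, x)  # first call succeeds: 2 * x = 6.0
try:
    tape.gradient(z, x)      # second call on a non-persistent tape
    second_call_failed = False
except RuntimeError as err:
    second_call_failed = True
    print("second call failed:", err)

with tf.GradientTape(persistent=True) as ptape:
    y = x * x
    z = y * y

dy_dx_p = ptape.gradient(y, x)  # 2 * x = 6.0
dz_dx_p = ptape.gradient(z, x)  # 4 * x**3 = 108.0
del ptape  # drop the reference so the tape's resources can be released
```

Deleting a persistent tape once it is no longer needed is a common habit, since it keeps its recorded operations alive until it is garbage collected.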

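(2) Enter and exit functions

Enters a context in which operations are recorded on this tape.

def __enter__(self):

Exits the recording context; no further operations are traced.

def __exit__(self, typ, value, traceback):

(3) Watch function

Specifies the tensor or variable to watch. Trainable variables are watched automatically by default, so a manual call is needed only when watch_accessed_variables=False or when the source is a plain tensor such as tf.constant.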

def watch(self, tensor):

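(4) Stop recording operations on the tape

Temporarily stops recording operations on this tape. Operations executed while this context manager is active are not recorded on the tape. This is useful for reducing the memory used by tracing all computations.

def stop_recording(self):

Example: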

import tensorflow as tf

x = tf.Variable(4.0)

with tf.GradientTape(persistent=True) as tape:
    y = tf.pow(x, 2)
    z = tape.gradient(y, x)
    # with tape.stop_recording():  # left commented out in this first run
    y += x
    dy_dx = tape.gradient(y, x)

print(z)
print(dy_dx)

Output:

tf.Tensor(8.0, shape=(), dtype=float32)

tf.Tensor(9.0, shape=(), dtype=float32)

With stop_recording added:

import tensorflow as tf

x = tf.Variable(4.0)

with tf.GradientTape(persistent=True) as tape:
    y = tf.pow(x, 2)
    z = tape.gradient(y, x)
    with tape.stop_recording():  # Operations below are not recorded on the tape.
        y += x
        dy_dx = tape.gradient(y, x)

print(z)
print(dy_dx)

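Output:

tf.Tensor(8.0, shape=(), dtype=float32)

None

As shown, stop_recording pauses recording on the tape: y += x is not recorded, so the updated y has no recorded path to x and the gradient is None.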

(5) Reset function

Clears all information stored in this tape. Equivalent to exiting and re-entering the tape context manager.

def reset(self):

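import tensorflow as tf

x = tf.Variable(initial_value=[1., 2., 3.])

# Two separate tapes: the second tape records only `loss += x`.
with tf.GradientTape() as t:
    loss = tf.pow(x, 2)
with tf.GradientTape() as t:
    loss += x
z = t.gradient(loss, x)  # Only `loss += x` is differentiated; x*x is treated as a constant.
print(z)

# With t.reset() uncommented, the following is equivalent to the above.
with tf.GradientTape() as t:
    loss = tf.pow(x, 2)  # If reset() is called, this record is cleared.
    # t.reset()  # Commented out: x*x + x is differentiated. Uncommented: the x*x record is cleared and only `loss += x` is differentiated.
    loss += x
z = t.gradient(loss, x)  # reset() is commented out here, so this differentiates x*x + x.
print(z)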

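Output:

tf.Tensor([1. 1. 1.], shape=(3,), dtype=float32)  # two tapes (same effect as calling reset): the x*x record is not included

tf.Tensor([3. 5. 7.], shape=(3,), dtype=float32)  # reset not called: the x*x record is included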

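(6) Return the watched variables

Returns the variables watched by this tape, in order of construction. With watch_accessed_variables=True (the default), the trainable variables accessed inside the context are included automatically; otherwise only the variables passed to watch() are included.

def watched_variables(self):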
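
A small sketch of what watched_variables reports, using two made-up variables a and b. Only a is read inside the context, so only a is expected to be watched:

```python
import tensorflow as tf

a = tf.Variable(1.0)
b = tf.Variable(2.0)  # never read inside the tape context

with tf.GradientTape() as tape:
    y = a * a

watched = tape.watched_variables()
print(len(watched))       # 1
print(watched[0] is a)    # True: only the accessed variable is watched
```

This can be handy for checking which variables a model actually touched during a forward pass.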

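(7) Gradient function

def gradient(self,
             target,  # the function to differentiate, i.e. the dependent variable y
             sources,  # the independent variable(s) x to differentiate with respect to; may be a list of several
             output_gradients=None,  # coefficients multiplied into the result: one weight per element of target
             unconnected_gradients=UnconnectedGradients.NONE):  # optional; accepts "none" or "zero". The default "none" returns None when target (y) and sources (x) are not connected; "zero" returns zeros instead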
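
To illustrate the unconnected_gradients parameter, here is a hedged sketch with two made-up variables, where y depends on x but not on w:

```python
import tensorflow as tf

x = tf.Variable(2.0)
w = tf.Variable(5.0)

with tf.GradientTape(persistent=True) as tape:
    y = x * x  # y depends on x only

# Default behaviour: the unconnected source yields None.
default = tape.gradient(y, [x, w])

# With ZERO, the unconnected source yields a zero tensor shaped like w.
zeros = tape.gradient(
    y, [x, w], unconnected_gradients=tf.UnconnectedGradients.ZERO)

print(default)  # gradient for x is 4.0, gradient for w is None
print(zeros)    # gradient for x is 4.0, gradient for w is 0.0
```

Requesting zeros is convenient when the results are fed into code that cannot handle None, such as elementwise arithmetic on the gradient list.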

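(8) Jacobian matrix

In vector calculus, the Jacobian matrix is the matrix of all first-order partial derivatives of a vector-valued function.

def jacobian(self,
             target,  # the function to differentiate, i.e. the dependent variable y
             sources,  # the independent variable(s) x; a tensor or variable, or a list of them
             unconnected_gradients=UnconnectedGradients.NONE,
             parallel_iterations=None,  # controls how many iterations are dispatched in parallel; can be used to limit total memory usage
             experimental_use_pfor=True):

Example (this one uses the related batch_jacobian method, which computes one Jacobian per batch element and requires target and source to have rank 2 or higher)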

import tensorflow as tf

with tf.GradientTape() as g:
    x = tf.constant([[1., 2.], [3., 4.]], dtype=tf.float32)
    g.watch(x)
    y = x * x  # the first derivative is 2 * x

batch_jacobian = g.batch_jacobian(y, x)
# batch_jacobian is [[[2, 0], [0, 4]], [[6, 0], [0, 8]]]
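
The example above calls batch_jacobian. For comparison, a minimal sketch of jacobian itself on a rank-1 input (the values are chosen only for illustration) could look like this:

```python
import tensorflow as tf

x = tf.constant([1.0, 2.0, 3.0])

with tf.GradientTape() as g:
    g.watch(x)  # x is a constant tensor, so it must be watched explicitly
    y = x * x

# Entry [i, j] is the derivative of y[i] with respect to x[j].
jac = g.jacobian(y, x)
print(jac.numpy())
# [[2. 0. 0.]
#  [0. 4. 0.]
#  [0. 0. 6.]]
```

Because each y[i] depends only on x[i], the Jacobian is diagonal with entries 2 * x[i].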

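1.1.3 Steps of automatic differentiation

(1) Create a GradientTape object: g = tf.GradientTape()

(2) Watch the variable to differentiate with respect to: g.watch(x)

(3) Differentiate the function, which must have been computed inside the tape context: g.gradient(y, x)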
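
The three steps can be sketched end to end as follows. The function y = x ** 3 and the value x = 2 are assumptions chosen for illustration:

```python
import tensorflow as tf

x = tf.constant(2.0)

# Step 1: create the tape, here as a context manager.
with tf.GradientTape() as g:
    # Step 2: watch x (required here because x is a constant tensor).
    g.watch(x)
    # The function is computed inside the context so that it gets recorded.
    y = x ** 3

# Step 3: differentiate y with respect to x.
dy_dx = g.gradient(y, x)
print(dy_dx)  # 3 * x**2 = 12.0
```

If x were a tf.Variable, step 2 could be skipped because trainable variables are watched automatically.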

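1.1.4 Automatic differentiation examples

(1) Nested higher-order derivatives

First initialize a variable, then define functions of it, and then use tf.GradientTape() to compute the first and second derivatives.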

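import tensorflow as tf

# Define and initialize the variable.
x = tf.Variable(initial_value=[[1., 2., 3.], [4., 5., 6.]])

# Create the GradientTape objects.
with tf.GradientTape() as g1:
    # Watching is optional here because x is a trainable variable.
    # g1.watch(x)
    with tf.GradientTape(persistent=True) as g2:  # nested tape for higher-order derivatives
        # g2.watch(x)
        y = x * x
        z = tf.sqrt(y + 1)
    y1 = g2.gradient(y, x)  # first derivative of y
    z1x = g2.gradient(z, x)  # derivative of z with respect to x; the second call works because g2 is persistent, otherwise it would raise an error

y2 = g1.gradient(y1, x)  # second derivative of y

print(y)
print(y1)
print(y2)
print(z1x)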

Output:

tf.Tensor(
[[ 1.  4.  9.]
 [16. 25. 36.]], shape=(2, 3), dtype=float32)
tf.Tensor(
[[ 2.  4.  6.]
 [ 8. 10. 12.]], shape=(2, 3), dtype=float32)
tf.Tensor(
[[2. 2. 2.]
 [2. 2. 2.]], shape=(2, 3), dtype=float32)
tf.Tensor(
[[0.70710677 0.8944272  0.94868326]
 [0.97014254 0.9805807  0.9863939 ]], shape=(2, 3), dtype=float32)
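
The last tensor in the output can be cross-checked without TensorFlow. Since z = sqrt(x*x + 1), the chain rule gives dz/dx = x / sqrt(x*x + 1). The sketch below compares that closed form with a central finite difference; the step size h is an arbitrary choice:

```python
import math

xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]


def z(x):
    """Same function as in the TensorFlow example: z = sqrt(x*x + 1)."""
    return math.sqrt(x * x + 1.0)


# Closed form from the chain rule.
analytic = [x / math.sqrt(x * x + 1.0) for x in xs]

# Central finite difference as an independent numerical check.
h = 1e-6
numeric = [(z(x + h) - z(x - h)) / (2 * h) for x in xs]

for x, a, n in zip(xs, analytic, numeric):
    print(f"x={x}: analytic={a:.7f}, finite difference={n:.7f}")
```

The analytic values agree with the printed z1x tensor up to float32 rounding, for example 0.7071068 for x = 1.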

(2) Multiple variables with output weights

import tensorflow as tf

x = tf.Variable(initial_value=[1.0, 2.0, 3.0])
y = tf.Variable(initial_value=[2.0, 4.0, 6.0])

with tf.GradientTape(persistent=True) as g:
    g.watch(x)
    g.watch(y)
    z = tf.pow(x, 2) + tf.pow(y, 2)

output_gradients = tf.Variable([0.2, 0.6, 0.2])  # weights multiplied into the result
dz = g.gradient(z, [x, y], output_gradients=output_gradients)  # apply a different weight to each element
print(dz)

'''Output:
[<tf.Tensor: id=64, shape=(3,), dtype=float32, numpy=array([0.4, 2.4, 1.2],
  dtype=float32)>,
 <tf.Tensor: id=40, shape=(3,), dtype=float32, numpy=array([0.8, 4.8, 2.4],
  dtype=float32)>]
'''

'''
The unweighted result is dz_dx = [2, 4, 6]; multiplied elementwise by the weights [0.2, 0.6, 0.2], it becomes [0.4, 2.4, 1.2].
The unweighted result is dz_dy = [4, 8, 12]; multiplied elementwise by the weights [0.2, 0.6, 0.2], it becomes [0.8, 4.8, 2.4].
'''
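
One way to read output_gradients is as the weights of a weighted sum: passing weights w should give the same result as differentiating sum(w * z). The sketch below checks this with the same x, y and weights; the names weights and weighted_sum are introduced only for this illustration:

```python
import tensorflow as tf

x = tf.Variable([1.0, 2.0, 3.0])
y = tf.Variable([2.0, 4.0, 6.0])
weights = tf.constant([0.2, 0.6, 0.2])

with tf.GradientTape(persistent=True) as g:
    z = tf.pow(x, 2) + tf.pow(y, 2)
    weighted_sum = tf.reduce_sum(weights * z)

# Route 1: weight each element of z through output_gradients.
via_output_gradients = g.gradient(z, [x, y], output_gradients=weights)

# Route 2: differentiate the explicitly weighted scalar.
via_weighted_sum = g.gradient(weighted_sum, [x, y])

for a, b in zip(via_output_gradients, via_weighted_sum):
    print(a.numpy(), b.numpy())
del g
```

Both routes are expected to print [0.4, 2.4, 1.2] for x and [0.8, 4.8, 2.4] for y, up to float32 rounding.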

1.1.5 References

https://dengbocong.blog.csdn.net/article/details/108044938

https://blog.csdn.net/qq_27825451/article/details/89556703

https://tensorflow.google.cn/api_docs/python/tf/GradientTape#reset