tensorfolw学习笔记——张量、微分、自定义训练、keras

1张量

张量可以使用GPU加速，可以自动将python内置数据类型转换为张量。张量有形状和数据类型。张量与numpy主要区别为：1张量可以用GPU加速2张量不可变。

Tensors和Numpy ndarrays可以自动相互转换。Tensors使用.numpy()方法可以显示转换为ndarray。这种转换让Tensors和ndarray共享了底层内存。Tensors可能在GPU内存可是Numpy总是在cpu内存，转换涉及从GPU到主机内存的复制。

import numpy as np

ndarray = np.ones([3, 3])

print("TensorFlow operations convert numpy arrays to Tensors automatically")
tensor = tf.multiply(ndarray, 42)
print(tensor)


print("And NumPy operations convert Tensors to numpy arrays automatically")
print(np.add(tensor, 1))

print("The .numpy() method explicitly converts a Tensor to a numpy array")
print(tensor.numpy())

TensorFlow operations convert numpy arrays to Tensors automatically
tf.Tensor(
[[42. 42. 42.]
 [42. 42. 42.]
 [42. 42. 42.]], shape=(3, 3), dtype=float64)
And NumPy operations convert Tensors to numpy arrays automatically
[[43. 43. 43.]
 [43. 43. 43.]
 [43. 43. 43.]]
The .numpy() method explicitly converts a Tensor to a numpy array
[[42. 42. 42.]
 [42. 42. 42.]
 [42. 42. 42.]]

Tensorflow自动决定是否使用GPU来加速运算。Tensor.device提供了设备信息。可以显示设置执行计算的设备。

Dateset：

tf.data.Dataset将数据喂给算法，迭代简单。Dataset.from_tensors,Dataset.from_tensor_slices,TextLineDataset,TFRecordDataset可以创建dataset。map,batch,shuffle用来对dataset转换处理。dataset可以直接用于迭代（如for循环）。

ds_tensors = tf.data.Dataset.from_tensor_slices([1, 2, 3, 4, 5, 6])

# Create a CSV file
import tempfile
_, filename = tempfile.mkstemp()

with open(filename, 'w') as f:
  f.write("""Line 1
Line 2
Line 3
  """)

ds_file = tf.data.TextLineDataset(filename)

ds_tensors = ds_tensors.map(tf.square).shuffle(2).batch(2)

ds_file = ds_file.batch(2)


print('Elements of ds_tensors:')
for x in ds_tensors:
  print(x)

print('
Elements in ds_file:')
for x in ds_file:
  print(x)


Elements of ds_tensors:
tf.Tensor([4 1], shape=(2,), dtype=int32)
tf.Tensor([ 9 16], shape=(2,), dtype=int32)
tf.Tensor([25 36], shape=(2,), dtype=int32)

Elements in ds_file:
tf.Tensor([b'Line 1' b'Line 2'], shape=(2,), dtype=string)
tf.Tensor([b'Line 3' b'  '], shape=(2,), dtype=string)

2自动微分

利用张量类型在描述清楚变量之间关系，TensorFlow自动计算变量的微分和梯度。tf.GradientTape用来计算自动微分。

x = tf.ones((2, 2))

with tf.GradientTape() as t:
  t.watch(x)
  y = tf.reduce_sum(x)
  z = tf.multiply(y, y)

# Derivative of z with respect to the original input tensor x
dz_dx = t.gradient(z, x)
for i in [0, 1]:
  for j in [0, 1]:
    assert dz_dx[i][j].numpy() == 8.0

也可对中间变量进行微分：

x = tf.ones((2, 2))

with tf.GradientTape() as t:
  t.watch(x)
  y = tf.reduce_sum(x)
  z = tf.multiply(y, y)

# Use the tape to compute the derivative of z with respect to the
# intermediate value y.
dz_dy = t.gradient(z, y)
assert dz_dy.numpy() == 8.0

默认只要调用gradient方法，GradientTape持有的资源就会被释放。为了对多个变量进行微分，需要进行tf.GradientTape(persistent=True)设置。如：

x = tf.constant(3.0)
with tf.GradientTape(persistent=True) as t:
  t.watch(x)
  y = x * x
  z = y * y
dz_dx = t.gradient(z, x)  # 108.0 (4*x^3 at x = 3)
dy_dx = t.gradient(y, x)  # 6.0
del t  # Drop the reference to the tape

可以对导数再进行求导（二阶导数）：

x = tf.Variable(1.0)  # Create a Tensorflow variable initialized to 1.0

with tf.GradientTape() as t:
  with tf.GradientTape() as t2:
    y = x * x * x
  # Compute the gradient inside the 't' context manager
  # which means the gradient computation is differentiable as well.
  dy_dx = t2.gradient(y, x)
d2y_dx2 = t.gradient(dy_dx, x)

assert dy_dx.numpy() == 3.0
assert d2y_dx2.numpy() == 6.0

3自定义训练

线性拟合实例。

class Model(object):
  def __init__(self):
    # Initialize variable to (5.0, 0.0)
    # In practice, these should be initialized to random values.
    self.W = tf.Variable(5.0)
    self.b = tf.Variable(0.0)

  def __call__(self, x):
    return self.W * x + self.b

model = Model()

assert model(3.0).numpy() == 15.0

def loss(predicted_y, desired_y):  #定义损失函数
  return tf.reduce_mean(tf.square(predicted_y - desired_y))

TRUE_W = 3.0
TRUE_b = 2.0
NUM_EXAMPLES = 1000

inputs  = tf.random_normal(shape=[NUM_EXAMPLES])
noise   = tf.random_normal(shape=[NUM_EXAMPLES])
outputs = inputs * TRUE_W + TRUE_b + noise #有噪声的y值

import matplotlib.pyplot as plt  #画出预测值和真实值图

plt.scatter(inputs, outputs, c='b')
plt.scatter(inputs, model(inputs), c='r')
plt.show()

print('Current loss: '),
print(loss(model(inputs), outputs).numpy())

def train(model, inputs, outputs, learning_rate):  #更新权重
  with tf.GradientTape() as t:
    current_loss = loss(model(inputs), outputs)
  dW, db = t.gradient(current_loss, [model.W, model.b])
  model.W.assign_sub(learning_rate * dW)
  model.b.assign_sub(learning_rate * db)

model = Model()

# Collect the history of W-values and b-values to plot later
Ws, bs = [], []
epochs = range(10)
for epoch in epochs:     #迭代10次
  Ws.append(model.W.numpy())
  bs.append(model.b.numpy())
  current_loss = loss(model(inputs), outputs)

  train(model, inputs, outputs, learning_rate=0.1)
  print('Epoch %2d: W=%1.2f b=%1.2f, loss=%2.5f' %
        (epoch, Ws[-1], bs[-1], current_loss))

# Let's plot it all
plt.plot(epochs, Ws, 'r',
         epochs, bs, 'b')
plt.plot([TRUE_W] * len(epochs), 'r--',
         [TRUE_b] * len(epochs), 'b--')
plt.legend(['W', 'b', 'true W', 'true_b'])
plt.show()


#随着迭代次数增加，loss越来越少

Epoch  0: W=5.00 b=0.00, loss=9.36370
Epoch  1: W=4.59 b=0.42, loss=6.28548
Epoch  2: W=4.26 b=0.75, loss=4.34777
Epoch  3: W=4.00 b=1.01, loss=3.12800
Epoch  4: W=3.80 b=1.22, loss=2.36017
Epoch  5: W=3.64 b=1.39, loss=1.87683
Epoch  6: W=3.51 b=1.52, loss=1.57257
Epoch  7: W=3.40 b=1.62, loss=1.38104
Epoch  8: W=3.32 b=1.70, loss=1.26047
Epoch  9: W=3.26 b=1.77, loss=1.18457

4keras

Keras是一种高级抽象的API。

# In the tf.keras.layers package, layers are objects. To construct a layer,
# simply construct the object. Most layers take as a first argument the number
# of output dimensions / channels.
layer = tf.keras.layers.Dense(100)
# The number of input dimensions is often unnecessary, as it can be inferred
# the first time the layer is used, but it can be provided if you want to 输入可选
# specify it manually, which is useful in some complex models.
layer = tf.keras.layers.Dense(10, input_shape=(None, 5))

层的类型包括Dense（全连接层），Conv2D，LSTM，BatchNormalization，Dropout。

# To use a layer, simply call it.
layer(tf.zeros([10, 5]))  #提供输入，开始计算

<tf.Tensor: id=29, shape=(10, 10), dtype=float32, numpy=
array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]], dtype=float32)>

# Layers have many useful methods. For example, you can inspect all variables
# in a layer using `layer.variables` and trainable variables using
# `layer.trainable_variables`. In this case a fully-connected layer
# will have variables for weights and biases.
layer.variables  #权重包括w和b

[<tf.Variable 'dense_1/kernel:0' shape=(5, 10) dtype=float32, numpy=
 array([[ 0.24547583, -0.18174013, -0.517627  , -0.28625917,  0.46390074,
         -0.11792481,  0.25204027, -0.10177726, -0.6321482 ,  0.11025655],
        [-0.04801774,  0.05520332, -0.61134326, -0.380753  , -0.20825565,
          0.44676226,  0.34310788, -0.18177858, -0.0399543 , -0.6257955 ],
        [-0.09570092,  0.2136836 ,  0.11427677,  0.45249623,  0.02414119,
          0.2739644 , -0.5701976 , -0.28433737,  0.1094352 , -0.26321137],
        [ 0.6225632 ,  0.56247157,  0.14319342, -0.27484533,  0.06545639,
         -0.14055312,  0.02605182, -0.17947513, -0.43184835, -0.13298517],
        [ 0.21471226, -0.34008306,  0.13555825, -0.20253879, -0.14976257,
          0.24820238,  0.4052704 , -0.42966282,  0.46730322,  0.5801386 ]],
       dtype=float32)>,
 <tf.Variable 'dense_1/bias:0' shape=(10,) dtype=float32, numpy=array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.], dtype=float32)>]

class MyDenseLayer(tf.keras.layers.Layer):  #封装
  def __init__(self, num_outputs):
    super(MyDenseLayer, self).__init__()  #调用基类构造函数
    self.num_outputs = num_outputs

  def build(self, input_shape):
    self.kernel = self.add_variable("kernel",
                                    shape=[int(input_shape[-1]),
                                           self.num_outputs])

  def call(self, input):
    return tf.matmul(input, self.kernel)

layer = MyDenseLayer(10)
print(layer(tf.zeros([10, 5])))
print(layer.trainable_variables)

tf.Tensor(
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]], shape=(10, 10), dtype=float32)
[<tf.Variable 'my_dense_layer/kernel:0' shape=(5, 10) dtype=float32, numpy=
array([[ 0.40808827,  0.31189376, -0.6253119 , -0.5369232 , -0.6137004 ,
        -0.5453413 , -0.27606115, -0.60044795,  0.42764622,  0.25756943],
       [ 0.11746502,  0.40135378,  0.27446055,  0.12815869,  0.19679207,
         0.118303  , -0.5602601 , -0.3910997 , -0.50229526, -0.4392987 ],
       [-0.11431473,  0.30567902, -0.42285785,  0.41714746,  0.54528743,
         0.20401049, -0.0829075 ,  0.4709614 , -0.60372585, -0.45935804],
       [-0.51994437, -0.40799683,  0.306705  ,  0.588075  ,  0.12873381,
        -0.12829626,  0.03449196,  0.5080891 , -0.5090939 ,  0.1574735 ],
       [-0.30528757, -0.3296884 ,  0.168805  ,  0.40543085,  0.46509403,
        -0.52575713, -0.181254  , -0.24681184, -0.37497327, -0.37618726]],
      dtype=float32)>]

将tf.keras.Model 封装，自定义类

class ResnetIdentityBlock(tf.keras.Model):
  def __init__(self, kernel_size, filters):
    super(ResnetIdentityBlock, self).__init__(name='')
    filters1, filters2, filters3 = filters

    self.conv2a = tf.keras.layers.Conv2D(filters1, (1, 1))
    self.bn2a = tf.keras.layers.BatchNormalization()

    self.conv2b = tf.keras.layers.Conv2D(filters2, kernel_size, padding='same')
    self.bn2b = tf.keras.layers.BatchNormalization()

    self.conv2c = tf.keras.layers.Conv2D(filters3, (1, 1))
    self.bn2c = tf.keras.layers.BatchNormalization()

  def call(self, input_tensor, training=False):  #重载调用运算符
    x = self.conv2a(input_tensor)
    x = self.bn2a(x, training=training)
    x = tf.nn.relu(x)

    x = self.conv2b(x)
    x = self.bn2b(x, training=training)
    x = tf.nn.relu(x)

    x = self.conv2c(x)
    x = self.bn2c(x, training=training)

    x += input_tensor
    return tf.nn.relu(x)


block = ResnetIdentityBlock(1, [1, 2, 3])
print(block(tf.zeros([1, 2, 3, 3])))
print([x.name for x in block.trainable_variables])


tf.Tensor(
[[[[0. 0. 0.]
   [0. 0. 0.]
   [0. 0. 0.]]

  [[0. 0. 0.]
   [0. 0. 0.]
   [0. 0. 0.]]]], shape=(1, 2, 3, 3), dtype=float32)
['resnet_identity_block/conv2d/kernel:0', 'resnet_identity_block/conv2d/bias:0', 'resnet_identity_block/batch_normalization/gamma:0', 'resnet_identity_block/batch_normalization/beta:0', 'resnet_identity_block/conv2d_1/kernel:0', 'resnet_identity_block/conv2d_1/bias:0', 'resnet_identity_block/batch_normalization_1/gamma:0', 'resnet_identity_block/batch_normalization_1/beta:0', 'resnet_identity_block/conv2d_2/kernel:0', 'resnet_identity_block/conv2d_2/bias:0', 'resnet_identity_block/batch_normalization_2/gamma:0', 'resnet_identity_block/batch_normalization_2/beta:0']

更常用的是tf.keras.Sequential函数，将每层逐一写出：

 my_seq = tf.keras.Sequential([tf.keras.layers.Conv2D(1, (1, 1)),
                               tf.keras.layers.BatchNormalization(),
                               tf.keras.layers.Conv2D(2, 1,
                                                      padding='same'),
                               tf.keras.layers.BatchNormalization(),
                               tf.keras.layers.Conv2D(3, (1, 1)),
                               tf.keras.layers.BatchNormalization()])
my_seq(tf.zeros([1, 2, 3, 3]))

<tf.Tensor: id=514, shape=(1, 2, 3, 3), dtype=float32, numpy=
array([[[[0., 0., 0.],
         [0., 0., 0.],
         [0., 0., 0.]],

        [[0., 0., 0.],
         [0., 0., 0.],
         [0., 0., 0.]]]], dtype=float32)>

6按品种对鸢尾花进行分类