TensorFlow 2 Study Notes: Modules, Layers, and Models

Reference: the official TensorFlow website

1. Defining models and layers

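The high-level implementations of models and layers in TensorFlow, such as Keras and Sonnet, are all built on tf.Module.

In deep learning code, layers and models are both objects. They are stateful, and the methods they expose mainly update or read that state.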

import tensorflow as tf

# Subclass tf.Module
class SimpleModule(tf.Module):

  # Override the constructor
  def __init__(self, name=None):
    super().__init__(name=name)
    # Create the module's variables
    self.a_variable = tf.Variable(5.0, name="train_me")
    self.non_trainable_variable = tf.Variable(5.0, trainable=False, name="do_not_train_me")

  # Define __call__ so that instances can be invoked like a function
  def __call__(self, x):
    return self.a_variable * x + self.non_trainable_variable

# Instantiate the module
simple_module = SimpleModule(name="simple")

# Invoke __call__
simple_module(tf.constant(5.0))

When a class subclasses tf.Module, every tf.Variable and tf.Module instance assigned to one of its attributes is collected automatically.
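As a quick check of this automatic collection, the sketch below rebuilds the SimpleModule from this section and reads its `variables` and `trainable_variables` properties. It is only an illustration; the local names `all_names`, `trainable_names` and `result` are introduced here and are not part of the original notes.

```python
import tensorflow as tf

class SimpleModule(tf.Module):
  def __init__(self, name=None):
    super().__init__(name=name)
    self.a_variable = tf.Variable(5.0, name="train_me")
    self.non_trainable_variable = tf.Variable(
        5.0, trainable=False, name="do_not_train_me")

  def __call__(self, x):
    return self.a_variable * x + self.non_trainable_variable

simple_module = SimpleModule(name="simple")

# Names of every variable the module has collected
all_names = sorted(v.name for v in simple_module.variables)
# Names of the subset that an optimizer would update
trainable_names = [v.name for v in simple_module.trainable_variables]

print(all_names)
print(trainable_names)

# 5.0 * 5.0 + 5.0
result = simple_module(tf.constant(5.0))
print(float(result))
```

Both variables appear in `variables`, while only the one created without `trainable=False` appears in `trainable_variables`.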

class Dense(tf.Module):
  def __init__(self, in_features, out_features, name=None):
    super().__init__(name=name)
    self.w = tf.Variable(
      tf.random.normal([in_features, out_features]), name='w')
    self.b = tf.Variable(tf.zeros([out_features]), name='b')
  def __call__(self, x):
    y = tf.matmul(x, self.w) + self.b
    return tf.nn.relu(y)

class SequentialModule(tf.Module):
  def __init__(self, name=None):
    super().__init__(name=name)

    self.dense_1 = Dense(in_features=3, out_features=3)
    self.dense_2 = Dense(in_features=3, out_features=2)

  def __call__(self, x):
    x = self.dense_1(x)
    return self.dense_2(x)

# You have made a model!
my_model = SequentialModule(name="the_model")

# Call it, with random results
print("Model results:", my_model(tf.constant([[2.0, 2.0, 2.0]])))

# All submodules are collected
print("Submodules:", my_model.submodules)
# So are all variables inside the submodules
for var in my_model.variables:
  print(var, "\n")

For historical compatibility reasons, Keras layers do not collect variables from modules, so a model should use only modules or only Keras layers and avoid mixing the two. The methods for inspecting variables, however, are the same in either case.

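Deferring variable creation

In the example above, the dense layer needs both the number of input features and the number of outputs when it is constructed. Written as follows, the input size no longer has to be specified: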

class FlexibleDenseModule(tf.Module):

  # The constructor does not take the number of input features
  def __init__(self, out_features, name=None):
    super().__init__(name=name)
    self.is_built = False
    self.out_features = out_features

  def __call__(self, x):
  
    # On the first call, create the weights using the shape of the input x
    if not self.is_built:
      self.w = tf.Variable(
        tf.random.normal([x.shape[-1], self.out_features]), name='w')
      self.b = tf.Variable(tf.zeros([self.out_features]), name='b')
      self.is_built = True

    y = tf.matmul(x, self.w) + self.b
    return tf.nn.relu(y)
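To make the deferred creation concrete, the following sketch repeats the FlexibleDenseModule class and counts its variables before and after the first call. The names `layer`, `vars_before`, `vars_after` and `out` are illustrative only.

```python
import tensorflow as tf

class FlexibleDenseModule(tf.Module):
  def __init__(self, out_features, name=None):
    super().__init__(name=name)
    self.is_built = False
    self.out_features = out_features

  def __call__(self, x):
    # Weights are created lazily, from the last dimension of the first input
    if not self.is_built:
      self.w = tf.Variable(
          tf.random.normal([x.shape[-1], self.out_features]), name='w')
      self.b = tf.Variable(tf.zeros([self.out_features]), name='b')
      self.is_built = True
    return tf.nn.relu(tf.matmul(x, self.w) + self.b)

layer = FlexibleDenseModule(out_features=2)
vars_before = len(layer.variables)  # nothing has been created yet

out = layer(tf.constant([[1.0, 2.0, 3.0]]))  # an input with 3 features
vars_after = len(layer.variables)  # w and b now exist

print(vars_before, vars_after)
print(layer.w.shape, out.shape)
```

One consequence of this design is that the module has no variables to save or inspect until it has been called at least once.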

2. Saving weights

A model can be persisted in two ways: as a checkpoint or as a SavedModel.

checkpoint

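A checkpoint stores only the values of the model's variables. On disk it consists of an index file and one or more data files, and tf.train.list_variables lists the variables it contains.

During distributed training a checkpoint can be sharded, which is why the data file names carry a shard number (for example 00000-of-00001).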

chkp_path = "my_checkpoint"
checkpoint = tf.train.Checkpoint(model=my_model)
checkpoint.write(chkp_path)

# Files produced:
# my_checkpoint.data-00000-of-00001  my_checkpoint.index

# List the variables stored in the checkpoint
tf.train.list_variables(chkp_path)
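The sketch below shows what these two steps produce on disk and what tf.train.list_variables returns. It uses a small stand-in module named `Affine` (hypothetical, not from the original notes) and writes to a temporary directory so that it can run on its own.

```python
import os
import tempfile
import tensorflow as tf

class Affine(tf.Module):  # hypothetical stand-in for the model above
  def __init__(self, name=None):
    super().__init__(name=name)
    self.w = tf.Variable(tf.ones([3, 2]), name="w")
    self.b = tf.Variable(tf.zeros([2]), name="b")

module = Affine()
chkp_path = os.path.join(tempfile.mkdtemp(), "my_checkpoint")
tf.train.Checkpoint(model=module).write(chkp_path)

# The checkpoint prefix expands to an index file plus sharded data files
files = sorted(os.listdir(os.path.dirname(chkp_path)))
print(files)

# list_variables yields (name, shape) pairs keyed by the attribute path
stored = dict(tf.train.list_variables(chkp_path))
for name, shape in stored.items():
  print(name, shape)
```

The stored names follow the attribute path from the Checkpoint object (here `model`, then `w` or `b`), not the `name` argument given to tf.Variable.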

Restoring a checkpoint overwrites the variable values in the current model with the values stored in the checkpoint.

new_model = SequentialModule()
new_checkpoint = tf.train.Checkpoint(model=new_model)
new_checkpoint.restore("my_checkpoint")

# Should be the same result as above
new_model(tf.constant([[2.0, 2.0, 2.0]]))
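A self-contained way to see the overwrite is to save one randomly initialized module, restore into a second one, and compare their outputs. The `Affine` class and the variable names below are assumptions made for this sketch.

```python
import os
import tempfile
import tensorflow as tf

class Affine(tf.Module):  # hypothetical stand-in module
  def __init__(self, name=None):
    super().__init__(name=name)
    self.w = tf.Variable(tf.random.normal([3, 2]), name="w")
    self.b = tf.Variable(tf.zeros([2]), name="b")

  def __call__(self, x):
    return tf.matmul(x, self.w) + self.b

x = tf.constant([[2.0, 2.0, 2.0]])

original = Affine()
expected = original(x).numpy()

path = os.path.join(tempfile.mkdtemp(), "my_checkpoint")
tf.train.Checkpoint(model=original).write(path)

# A fresh instance starts with different random weights
restored = Affine()
tf.train.Checkpoint(model=restored).restore(path)
restored_out = restored(x).numpy()

print(expected, restored_out)
```

Because the variables already exist when restore is called, their values are replaced immediately, and both modules then produce the same output.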

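3. Saving functions

Functions are saved in the form of computation graphs.

Decorating a method with @tf.function makes TensorFlow trace it into a graph. Recording that trace with tf.summary writes a log containing the graph structure, which can be visualized in TensorBoard.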

class MySequentialModule(tf.Module):
  def __init__(self, name=None):
    super().__init__(name=name)

    self.dense_1 = Dense(in_features=3, out_features=3)
    self.dense_2 = Dense(in_features=3, out_features=2)

  @tf.function
  def __call__(self, x):
    x = self.dense_1(x)
    return self.dense_2(x)

# You have made a model with a graph!
my_model = MySequentialModule(name="the_model")

# Write the graph to a log
from datetime import datetime

# Set up logging.
stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
logdir = "logs/func/%s" % stamp
writer = tf.summary.create_file_writer(logdir)

# Create a new model so that a fresh graph is traced
new_model = MySequentialModule()

# Turn on trace recording before the function is traced
tf.summary.trace_on(graph=True)
tf.profiler.experimental.start(logdir)

# The first call to the model triggers tracing
print(new_model(tf.constant([[2.0, 2.0, 2.0]])))

# Export the recorded trace to the log directory
with writer.as_default():
  tf.summary.trace_export(
      name="my_func_trace",
      step=0,
      profiler_outdir=logdir)
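Tracing itself can be observed without TensorBoard. In the sketch below, a Python side effect inside a @tf.function runs only while a graph is being traced, so the length of the list shows how many graphs were built. The function `double` and the list `trace_log` are illustrative names.

```python
import tensorflow as tf

trace_log = []

@tf.function
def double(x):
  # Python side effect: executes during tracing, not on every call
  trace_log.append(x.shape)
  return x * 2

double(tf.constant(1.0))         # first call: traces a graph for float32 scalars
double(tf.constant(3.0))         # same dtype and shape: the graph is reused
double(tf.constant([1.0, 2.0]))  # new shape: a second graph is traced

print(len(trace_log))
```

Each distinct input signature gets its own graph, which is why a decorated function can end up backed by several graphs.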

Visualizing the graph

%tensorboard --logdir logs/func

Creating a SavedModel

To share a trained model, SavedModel is the recommended format. It stores both the functions and the weights.

tf.saved_model.save(my_model, "the_saved_model")

# Files produced:
drwxr-sr-x 2 kbuilder kokoro  4096 Feb 23 02:23 assets
# Protocol buffer file describing the functions
-rw-rw-r-- 1 kbuilder kokoro 14140 Feb 23 02:23 saved_model.pb
# Checkpoint data for the variables
drwxr-sr-x 2 kbuilder kokoro  4096 Feb 23 02:23 variables

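A model can be loaded directly from these files. The returned object is an internal TensorFlow object that carries none of the original class information; the output of the code below shows that it is not an instance of the model class.

The loaded model also accepts only the input signatures that were called before saving, because the set of graphs that overload a Function is fixed at export time.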

new_model = tf.saved_model.load("the_saved_model")

isinstance(new_model, SequentialModule)
===>
False
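The signature restriction can be reproduced with a very small module. The sketch below assumes a hypothetical `Scale` module, saves it after a single traced call, and then tries one matching and one unmatched call on the loaded object. The exact exception type for the unmatched call is an assumption, so the code catches both ValueError and TypeError.

```python
import os
import tempfile
import tensorflow as tf

class Scale(tf.Module):  # hypothetical module used only for this sketch
  def __init__(self, name=None):
    super().__init__(name=name)
    self.factor = tf.Variable(3.0, name="factor")

  @tf.function
  def __call__(self, x):
    return self.factor * x

module = Scale()
module(tf.constant([1.0, 2.0]))  # traces one graph: float32 vector of length 2

export_dir = os.path.join(tempfile.mkdtemp(), "the_saved_model")
tf.saved_model.save(module, export_dir)
loaded = tf.saved_model.load(export_dir)

# The loaded object does not know about the Python class
is_scale = isinstance(loaded, Scale)

# A call matching the traced signature works
same_signature = loaded(tf.constant([4.0, 5.0])).numpy().tolist()

# An int32 matrix was never traced, so no saved graph matches it
try:
  loaded(tf.constant([[1, 2], [3, 4]]))
  untraced_call_failed = False
except (ValueError, TypeError):
  untraced_call_failed = True

print(is_scale, same_signature, untraced_call_failed)
```

If a loaded model must accept several kinds of input, each of them has to be traced before tf.saved_model.save is called.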

4. Keras models and layers

Keras is a high-level API built on top of tf.Module.

Some of the features Keras adds:

Optional losses
Support for metrics
Built-in support for an optional training argument to differentiate between training and inference use
get_config and from_config methods that allow you to accurately store configurations to allow model cloning in Python
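Two of the features listed above, the training argument and the get_config/from_config pair, can be sketched with a toy layer. `ScaleInTraining` and its `factor` parameter are hypothetical and exist only for this illustration.

```python
import tensorflow as tf

class ScaleInTraining(tf.keras.layers.Layer):  # hypothetical example layer
  def __init__(self, factor=2.0, **kwargs):
    super().__init__(**kwargs)
    self.factor = factor

  def call(self, inputs, training=False):
    # Scale only in training mode; pass inputs through unchanged at inference
    return inputs * self.factor if training else inputs

  def get_config(self):
    # Add the constructor argument to the base configuration
    config = super().get_config()
    config.update({"factor": self.factor})
    return config

layer = ScaleInTraining(factor=3.0, name="scaler")
x = tf.constant([1.0, 2.0])

train_out = layer(x, training=True).numpy().tolist()
infer_out = layer(x, training=False).numpy().tolist()

# Rebuild an equivalent layer from the stored configuration
clone = ScaleInTraining.from_config(layer.get_config())

print(train_out, infer_out, clone.factor, clone.name)
```

The configuration holds only constructor arguments, so the clone has the same settings but not the same weights.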

Keras Layers

tf.keras.layers.Layer is the base class of all Keras layers, and it in turn inherits from tf.Module. To implement a custom Keras layer, subclass Layer and override the call method.

class MyDense(tf.keras.layers.Layer):
  # **kwargs is accepted so that arguments can be forwarded to the parent constructor
  def __init__(self, in_features, out_features, **kwargs):
    super().__init__(**kwargs)

    # This will soon move to the build step; see below
    self.w = tf.Variable(
      tf.random.normal([in_features, out_features]), name='w')
    self.b = tf.Variable(tf.zeros([out_features]), name='b')

  def call(self, x):
    y = tf.matmul(x, self.w) + self.b
    return tf.nn.relu(y)

simple_layer = MyDense(name="simple", in_features=3, out_features=3)

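The build method

It is often more convenient to create variables only once the input is actually known.

Keras layers have a few lifecycle methods, and build is one of them. build is called exactly once and is typically used to create variables. Once build has created the weights, calling the layer with an input whose dimensions do not match the weights raises an error.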

class FlexibleDense(tf.keras.layers.Layer):
  # Note the added `**kwargs`, as Keras supports many arguments
  def __init__(self, out_features, **kwargs):
    super().__init__(**kwargs)
    self.out_features = out_features

  def build(self, input_shape):  # Create the state of the layer (weights)
    self.w = tf.Variable(
      tf.random.normal([input_shape[-1], self.out_features]), name='w')
    self.b = tf.Variable(tf.zeros([self.out_features]), name='b')

  def call(self, inputs):  # Defines the computation from inputs to outputs
    return tf.matmul(inputs, self.w) + self.b

# Create the instance of the layer
flexible_dense = FlexibleDense(out_features=3)
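The following sketch illustrates the build lifecycle: it counts how often build runs and shows what happens when a later input has a different feature size. It is a variation on the layer above, not the same code: the class name `BuildOnceDense`, the `build_calls` counter, and the use of `add_weight` in place of tf.Variable are choices made for this example, and the exception types caught are an assumption.

```python
import tensorflow as tf

class BuildOnceDense(tf.keras.layers.Layer):  # hypothetical example layer
  def __init__(self, out_features, **kwargs):
    super().__init__(**kwargs)
    self.out_features = out_features
    self.build_calls = 0

  def build(self, input_shape):
    # Runs on the first call, when the input shape is known
    self.build_calls += 1
    self.w = self.add_weight(
        name="w", shape=(input_shape[-1], self.out_features),
        initializer="random_normal")
    self.b = self.add_weight(
        name="b", shape=(self.out_features,), initializer="zeros")

  def call(self, inputs):
    return tf.matmul(inputs, self.w) + self.b

layer = BuildOnceDense(out_features=3)
layer(tf.constant([[2.0, 2.0, 2.0]]))                   # triggers build
layer(tf.constant([[1.0, 1.0, 1.0], [0.0, 0.0, 0.0]]))  # reuses the weights

# An input with 2 features no longer fits the (3, 3) weight matrix
try:
  layer(tf.constant([[1.0, 2.0]]))
  mismatch_failed = False
except (tf.errors.InvalidArgumentError, ValueError):
  mismatch_failed = True

print(layer.build_calls, layer.w.shape, mismatch_failed)
```

The batch dimension may change between calls; only the feature dimension that determined the weight shape is fixed after build.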

Keras models

A Keras Model subclasses tf.keras.layers.Layer and adds simpler APIs for training, evaluation, distributed training, and more.

class MySequentialModel(tf.keras.Model):
  def __init__(self, name=None, **kwargs):
    super().__init__(name=name, **kwargs)

    self.dense_1 = FlexibleDense(out_features=3)
    self.dense_2 = FlexibleDense(out_features=2)
  def call(self, x):
    x = self.dense_1(x)
    return self.dense_2(x)

# You have made a Keras model!
my_sequential_model = MySequentialModel(name="the_model")

# Call it on a tensor, with random results
print("Model results:", my_sequential_model(tf.constant([[2.0, 2.0, 2.0]])))

If a model is built from a fixed set of layers and inputs, the Functional API makes the model code more concise:

inputs = tf.keras.Input(shape=[3,])

x = FlexibleDense(3)(inputs)
x = FlexibleDense(2)(x)

my_functional_model = tf.keras.Model(inputs=inputs, outputs=x)

my_functional_model.summary()
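What summary() reports can also be checked programmatically. The sketch below builds the same two-layer shape with the built-in tf.keras.layers.Dense (a substitution made here so the example stands on its own) and reads the parameter count and output shape.

```python
import tensorflow as tf

inputs = tf.keras.Input(shape=(3,))
x = tf.keras.layers.Dense(3, activation="relu")(inputs)
outputs = tf.keras.layers.Dense(2)(x)
model = tf.keras.Model(inputs=inputs, outputs=outputs)

# Weights plus biases of both layers: (3*3 + 3) + (3*2 + 2)
n_params = model.count_params()

# The batch dimension is left open, so any batch size works
out_shape = tuple(model(tf.zeros([4, 3])).shape)

print(n_params, out_shape)
```

Because the input shape is declared up front, the weights exist as soon as the model is constructed, without a first call.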

5. Saving Keras models

A Keras model can be saved like a tf.Module, using checkpoints or tf.saved_model.save(). It also offers a more convenient method of its own:

# Save the model
# Note: recent Keras versions expect the path to end in .keras or .h5
my_sequential_model.save("exname_of_file")
# Load the model back
reconstructed_model = tf.keras.models.load_model("exname_of_file")
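A round trip can be verified by comparing the outputs of the original and the reloaded model. The sketch below makes several assumptions: it uses a small functional model built from the built-in Dense layer, a temporary directory, and a file name ending in .keras, which is what recent Keras versions expect. Behavior may differ on older TensorFlow releases.

```python
import os
import tempfile
import tensorflow as tf

inputs = tf.keras.Input(shape=(3,))
outputs = tf.keras.layers.Dense(2)(inputs)
model = tf.keras.Model(inputs=inputs, outputs=outputs)

# Save to a temporary location and load the model back
path = os.path.join(tempfile.mkdtemp(), "my_model.keras")
model.save(path)
reloaded = tf.keras.models.load_model(path)

# Both models should give the same output for the same input
x = tf.constant([[2.0, 2.0, 2.0]])
max_diff = float(tf.reduce_max(tf.abs(model(x) - reloaded(x))))

print(max_diff)
```

A model built from custom layer classes, such as the subclassed model above, generally also needs those classes to be available when loading, for example through the custom_objects argument of load_model.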