(Part 2, continued) How to handle non-continuous computation in the deep learning framework MindSpore: train on a dataset for a number of epochs, do other work, then come back and train for more epochs (single-step computation)

This post continues from the previous ones:

https://www.cnblogs.com/devilmaycry812839668/p/14988686.html

https://www.cnblogs.com/devilmaycry812839668/p/14990021.html

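The earlier posts implemented single-step computation by hand, based on our own understanding. After getting to know MindSpore better, we found that the framework officially provides classes for single-step training.

The relevant classes: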

from mindspore.nn import TrainOneStepCell, WithLossCell

See the official documentation:

https://www.mindspore.cn/doc/programming_guide/zh-CN/master/network_component.html?highlight=%E5%8D%95%E6%AD%A5%E8%AE%AD%E7%BB%83
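To make the idea concrete before the MindSpore code, here is a minimal plain-Python sketch of what "one training step" bundles together: a forward pass, the loss, the gradients, and the optimizer update. It is only a conceptual stand-in, not MindSpore's implementation; `LinearNet` and `make_train_step` are hypothetical names invented for this illustration, and the toy model, learning rate, and step counts are assumptions. It also shows the pattern from the title: train for a while, do other work, then resume, with the state carried by the parameters.

```python
# Conceptual sketch only: plain Python, not MindSpore code.
# LinearNet and make_train_step are hypothetical names for illustration.


class LinearNet:
    """Toy model y = w * x + b with two scalar parameters."""

    def __init__(self):
        self.w = 0.5
        self.b = 0.0

    def __call__(self, x):
        return self.w * x + self.b


def make_train_step(net, learning_rate):
    """Bundle forward pass, squared-error loss, gradients and an SGD update
    into one callable, similar in spirit to
    TrainOneStepCell(WithLossCell(net, loss), optim)."""

    def train_step(x, y):
        pred = net(x)
        loss = (pred - y) ** 2
        grad_pred = 2.0 * (pred - y)            # d(loss)/d(pred)
        net.w -= learning_rate * grad_pred * x  # d(pred)/d(w) = x
        net.b -= learning_rate * grad_pred      # d(pred)/d(b) = 1
        return loss                             # loss before the update

    return train_step


net = LinearNet()
train_step = make_train_step(net, learning_rate=0.1)

# First block of training steps.
losses_a = [train_step(0.1, 0.1) for _ in range(50)]

# Any other work can happen here; the parameters stay in `net`.
other_work = sum(range(10))

# Resume training exactly where it stopped.
losses_b = [train_step(0.1, 0.1) for _ in range(50)]

print(losses_a[0], losses_a[-1], losses_b[-1])
```

Because each call is a complete step, nothing special is needed to pause and resume: the loop can simply stop calling `train_step` and start again later.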

Using the officially provided classes, the code is as follows:

import mindspore
import numpy as np  # NumPy for numerical computing
import matplotlib.pyplot as plt  # plotting library

np.random.seed(123)  # seed for the random number generator

import mindspore.nn as nn
import mindspore.ops as ops
from mindspore import Tensor
from mindspore import ParameterTuple, Parameter
from mindspore import dtype as mstype
from mindspore import Model
import mindspore.dataset as ds
from mindspore.train.callback import ModelCheckpoint, CheckpointConfig
from mindspore.train.callback import LossMonitor
from mindspore.nn import TrainOneStepCell, WithLossCell


class Net(nn.Cell):
    def __init__(self, input_dims, output_dims):
        super(Net, self).__init__()
        self.matmul = ops.MatMul()

        self.weight_1 = Parameter(Tensor(np.random.randn(input_dims, 128), dtype=mstype.float32), name='weight_1')
        self.bias_1 = Parameter(Tensor(np.zeros(128), dtype=mstype.float32), name='bias_1')
        self.weight_2 = Parameter(Tensor(np.random.randn(128, 64), dtype=mstype.float32), name='weight_2')
        self.bias_2 = Parameter(Tensor(np.zeros(64), dtype=mstype.float32), name='bias_2')
        self.weight_3 = Parameter(Tensor(np.random.randn(64, output_dims), dtype=mstype.float32), name='weight_3')
        self.bias_3 = Parameter(Tensor(np.zeros(output_dims), dtype=mstype.float32), name='bias_3')

    def construct(self, x):
        x1 = self.matmul(x, self.weight_1) + self.bias_1
        x2 = self.matmul(x1, self.weight_2) + self.bias_2
        x3 = self.matmul(x2, self.weight_3) + self.bias_3
        return x3


def main():
    net = Net(1, 1)
    # loss function
    loss = nn.MSELoss()
    # optimizer
    optim = nn.SGD(params=net.trainable_params(), learning_rate=0.000001)
    # make net model
    # model = Model(net, loss, optim, metrics={'loss': nn.Loss()})
    net_with_criterion = WithLossCell(net, loss)
    train_network = TrainOneStepCell(net_with_criterion, optim)


    # dataset: a single (x, y) sample
    x, y = np.array([[0.1]], dtype=np.float32), np.array([[0.1]], dtype=np.float32)
    x = Tensor(x)
    y = Tensor(y)


    for i in range(20000*100):
        #print(i, '	', '*' * 100)
        train_network.set_train()
        res = train_network(x, y)

    # right
    # False, False
    # False, True
    # True, True  xxx

    # not right
    # True, False


if __name__ == '__main__':
    """Set up the execution context."""
    from mindspore import context

    # Set the MindSpore execution context
    #context.set_context(mode=context.PYNATIVE_MODE, device_target='GPU')
    context.set_context(mode=context.GRAPH_MODE, device_target='GPU')

    import time

    a = time.time()
    main()
    b = time.time()
    print(b-a)

Running time (three runs):

1158.24s

1154.29s

1152.69s
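For this kind of measurement, a small timing helper can make the per-step cost easier to see. The sketch below is an assumption-laden illustration rather than part of the original experiment: `time_steps` is a hypothetical helper name, the dummy step function stands in for a real call such as `train_network(x, y)`, and it uses `time.perf_counter()`, which is intended for measuring intervals, instead of `time.time()`.

```python
import time


def time_steps(step_fn, num_steps):
    """Call step_fn num_steps times and return
    (total_seconds, seconds_per_step)."""
    start = time.perf_counter()
    for _ in range(num_steps):
        step_fn()
    total = time.perf_counter() - start
    return total, total / num_steps


# Dummy step used only for illustration; in the script above one would
# pass something like `lambda: train_network(x, y)` instead.
calls = []
total, per_step = time_steps(lambda: calls.append(1), 1000)
print(f"total {total:.6f} s, per step {per_step * 1e6:.3f} us")
```

Reporting the time per step makes runs with different step counts comparable.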

=====================================================

For comparison, the single-step code from the previous post, which uses model.train, is modified as follows:

import mindspore
import numpy as np  # NumPy for numerical computing
import matplotlib.pyplot as plt  # plotting library

np.random.seed(123)  # seed for the random number generator

import mindspore.nn as nn
import mindspore.ops as ops
from mindspore import Tensor
from mindspore import ParameterTuple, Parameter
from mindspore import dtype as mstype
from mindspore import Model
import mindspore.dataset as ds
from mindspore.train.callback import ModelCheckpoint, CheckpointConfig
from mindspore.train.callback import LossMonitor


class Net(nn.Cell):
    def __init__(self, input_dims, output_dims):
        super(Net, self).__init__()
        self.matmul = ops.MatMul()

        self.weight_1 = Parameter(Tensor(np.random.randn(input_dims, 128), dtype=mstype.float32), name='weight_1')
        self.bias_1 = Parameter(Tensor(np.zeros(128), dtype=mstype.float32), name='bias_1')
        self.weight_2 = Parameter(Tensor(np.random.randn(128, 64), dtype=mstype.float32), name='weight_2')
        self.bias_2 = Parameter(Tensor(np.zeros(64), dtype=mstype.float32), name='bias_2')
        self.weight_3 = Parameter(Tensor(np.random.randn(64, output_dims), dtype=mstype.float32), name='weight_3')
        self.bias_3 = Parameter(Tensor(np.zeros(output_dims), dtype=mstype.float32), name='bias_3')

    def construct(self, x):
        x1 = self.matmul(x, self.weight_1) + self.bias_1
        x2 = self.matmul(x1, self.weight_2) + self.bias_2
        x3 = self.matmul(x2, self.weight_3) + self.bias_3
        return x3


def main():
    net = Net(1, 1)
    # loss function
    loss = nn.MSELoss()
    # optimizer
    optim = nn.SGD(params=net.trainable_params(), learning_rate=0.000001)
    # make net model
    model = Model(net, loss, optim, metrics={'loss': nn.Loss()})

    # dataset: a generator that yields a single sample
    x, y = np.array([[0.1]], dtype=np.float32), np.array([[0.1]], dtype=np.float32)

    def generator_multidimensional():
        for i in range(1):
            a = x*i
            b = y*i
            #print(a, b)
            yield (a, b)

    dataset = ds.GeneratorDataset(source=generator_multidimensional, column_names=["input", "output"])

    for i in range(20000*100):
        #print(i, '	', '*' * 100)
        model.train(1, dataset, dataset_sink_mode=False)

    # right
    # False, False
    # False, True
    # True, True  xxx

    # not right
    # True, False


if __name__ == '__main__':
    """Set up the execution context."""
    from mindspore import context

    # Set the MindSpore execution context
    #context.set_context(mode=context.PYNATIVE_MODE, device_target='GPU')
    context.set_context(mode=context.GRAPH_MODE, device_target='GPU')

    import time

    a = time.time()
    main()
    b = time.time()
    print(b-a)

Running time (two runs):

2173.19s

2181.61s

==================================================================

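As the timings show, for single-step computation the framework's own single-step training classes are considerably more efficient: about 1155 s on average, versus about 2177 s for repeated model.train calls over the same 2,000,000 steps, which is roughly 1.9 times faster. So when training one step at a time, or when training on data that does not arrive continuously, the framework's single-step training classes are the first choice.

The single-step training classes: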
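The comparison can be checked with a few lines of arithmetic. The sketch below only uses the running times reported above and the step count from the loops (20000*100); the variable names are made up for this illustration.

```python
from statistics import mean

NUM_STEPS = 20000 * 100  # loop length used in both scripts

# Running times in seconds, copied from the measurements above.
one_step_cell_runs = [1158.24, 1154.29, 1152.69]  # TrainOneStepCell version
model_train_runs = [2173.19, 2181.61]             # model.train version

t_cell = mean(one_step_cell_runs)
t_model = mean(model_train_runs)
speedup = t_model / t_cell

print(f"TrainOneStepCell: {t_cell:.2f} s total, "
      f"{t_cell / NUM_STEPS * 1e3:.3f} ms per step")
print(f"model.train:      {t_model:.2f} s total, "
      f"{t_model / NUM_STEPS * 1e3:.3f} ms per step")
print(f"speedup: {speedup:.2f}x")
```

On these numbers the difference is about half a millisecond per step, which adds up over two million steps.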

from mindspore.nn import TrainOneStepCell, WithLossCell

=====================================================================

Experiment environment: MindSpore 1.1, Docker version

Host: Ubuntu 18.04

CPU: i7-8700

GPU: NVIDIA 1060ti
