PyTorch Summary

Basic Units and Operations
Gradient Computation
Defining a Model
Training a Model
Using the GPU

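Basic Units and Operations

How a tensor differs from a NumPy ndarray:
- a tensor can be placed on a GPU to accelerate computation
Creating a tensor from a shape: torch.empty(), torch.rand(), torch.zeros()
Creating a tensor from data or from another tensor: torch.tensor(), torch.randn_like()
Basic operations: torch.add(), ...
- for more operations, see: pytorch doc - torch
NumPy bridge: [tensor].numpy(), torch.from_numpy()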
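The calls listed above can be tried together in a short sketch. The variable names and the sample values are illustrative assumptions, not part of the original notes.

```python
import numpy as np
import torch

# Shape-based constructors
zeros = torch.zeros(2, 3)               # 2x3 tensor filled with zeros
uniform = torch.rand(2, 3)              # 2x3 tensor, uniform samples in [0, 1)

# Data-based constructors
from_data = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
like = torch.randn_like(from_data)      # same shape and dtype, normal samples

# Function form and operator form of addition give the same result
total = torch.add(from_data, from_data)
same_total = from_data + from_data

# NumPy bridge: a CPU tensor and its array view share memory
ones = torch.ones(3)
as_array = ones.numpy()
ones.add_(1)                            # in-place update is visible in as_array
back = torch.from_numpy(np.array([1.0, 2.0]))   # float64 array -> float64 tensor
```

Because the bridge shares memory on the CPU, an in-place change to the tensor also changes the array, so copy first if the two need to diverge.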

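Gradient Computation

torch.Tensor is the basic unit of gradient computation

Tracking gradients: set .requires_grad=True
Getting gradients: after calling .backward(), read the accumulated gradient from .grad
Stopping gradient tracking: .detach() or with torch.no_grad():
Every tensor has a .grad_fn attribute, which references the Function that created the tensor
- if the tensor was created by the user, its grad_fn is None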
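A minimal sketch of the attributes above, using a hypothetical three-element tensor and the function y = sum(x^2), whose gradient is 2x:

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)  # user-created leaf tensor
y = (x ** 2).sum()              # built by operations, so it records a grad_fn

leaf_fn = x.grad_fn             # None: x was created directly by the user
result_fn = y.grad_fn           # the backward node for the sum

y.backward()                    # computes dy/dx = 2x
first_grad = x.grad.clone()     # 2 * [1, 2, 3]

# A second backward pass adds to .grad instead of overwriting it
(x ** 2).sum().backward()
accumulated = x.grad.clone()    # twice the first gradient

# Two ways to stop tracking
with torch.no_grad():
    untracked = x * 2           # computed without recording history
detached = x.detach()           # shares data with x but is cut from the graph
```

The accumulation in the second backward pass is the reason training code usually clears gradients before each step.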

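Defining a Model

Neural networks are built with the torch.nn package
nn relies on autograd to define models and differentiate them
An nn.Module contains: 1. the network's layers; 2. a forward(input) method that returns the output
net.parameters() returns the model's learnable parameters
A simple network demo: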

import torch
import torch.nn.functional as F


class Net(torch.nn.Module):
    def __init__(self, n_feature, n_hidden, n_output):
        super().__init__()
        self.hidden = torch.nn.Linear(n_feature, n_hidden)   # hidden layer
        self.predict = torch.nn.Linear(n_hidden, n_output)   # output layer

    def forward(self, x):
        x = F.relu(self.hidden(x))      # activation function for hidden layer
        x = self.predict(x)             # linear output
        return x

net = Net(n_feature=1, n_hidden=10, n_output=1)     # define the network
print(net)  # net architecture
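To see what net.parameters() actually holds, the sketch below rebuilds the same 1-10-1 architecture with torch.nn.Sequential (a substitution made here only to keep the block self-contained) and lists each parameter. The batch of five random inputs is an illustrative assumption.

```python
import torch

# Same layer sizes as the Net demo: 1 feature -> 10 hidden units -> 1 output
net = torch.nn.Sequential(
    torch.nn.Linear(1, 10),
    torch.nn.ReLU(),
    torch.nn.Linear(10, 1),
)

# Each Linear layer contributes a weight matrix and a bias vector
shapes = {name: tuple(p.shape) for name, p in net.named_parameters()}
n_params = sum(p.numel() for p in net.parameters())

# Calling the module runs forward() on a batch of shape (batch, n_feature)
batch = torch.rand(5, 1)
output = net(batch)

for name, shape in shapes.items():
    print(name, shape)
print("total parameters:", n_params)
```

Calling net(batch) rather than net.forward(batch) is the usual form, since the call also runs any registered hooks.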

Training a Model

Training a model requires: 1. computing the loss 2. updating the parameters from the gradients
Computing the loss:

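# x (input samples) and y (targets) are assumed to be defined already
loss_func = torch.nn.MSELoss()  # choose the loss function
prediction = net(x)  # the network's prediction for the samples x
loss = loss_func(prediction, y)  # loss between the prediction and the targets y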

Updating the parameters:

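optimizer = torch.optim.SGD(net.parameters(), lr=0.2)  # choose the optimizer
optimizer.zero_grad()   # clear old gradients (otherwise they accumulate with the previous step's)
loss.backward()         # backpropagation: compute the gradients
optimizer.step()        # apply the optimizer: update the model parameters (weights)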
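Put together, the loss and update steps form a loop. The sketch below is one way to assemble them; the toy data (a noisy quadratic), the fixed seed, and the 200 iterations are assumptions made for illustration and do not come from the notes above.

```python
import torch

torch.manual_seed(0)  # fixed seed so repeated runs behave the same

# Hypothetical toy data: y = x^2 plus a little uniform noise
x = torch.unsqueeze(torch.linspace(-1, 1, 100), dim=1)   # shape (100, 1)
y = x.pow(2) + 0.2 * torch.rand(x.size())

# Same 1-10-1 architecture as the Net demo, written with Sequential
net = torch.nn.Sequential(
    torch.nn.Linear(1, 10),
    torch.nn.ReLU(),
    torch.nn.Linear(10, 1),
)
loss_func = torch.nn.MSELoss()
optimizer = torch.optim.SGD(net.parameters(), lr=0.2)

initial_loss = loss_func(net(x), y).item()

for step in range(200):
    prediction = net(x)                 # forward pass
    loss = loss_func(prediction, y)     # compute the loss
    optimizer.zero_grad()               # clear gradients from the previous step
    loss.backward()                     # backpropagate
    optimizer.step()                    # update the weights

final_loss = loss_func(net(x), y).item()
print(f"loss before: {initial_loss:.4f}, after: {final_loss:.4f}")
```

The optimizer is created once, outside the loop; only zero_grad(), backward(), and step() repeat on every iteration.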

Using the GPU

See: Training on GPU
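As a rough sketch of the usual pattern (not a summary of the linked tutorial): pick a device, move the model to it, and create or move the inputs there as well. The single Linear layer and the random batch are placeholders chosen for illustration.

```python
import torch

# Fall back to the CPU when no CUDA device is present
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

net = torch.nn.Linear(1, 1).to(device)    # moves the module's parameters to the device
x = torch.rand(4, 1, device=device)       # inputs must live on the same device

out = net(x)                              # the computation runs on the chosen device
result = out.detach().cpu()               # bring the result back, e.g. before calling .numpy()
print(device, result.shape)
```

Mixing a CPU tensor with a GPU model raises an error, so the model and every input need to be on the same device.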

References: