PyTorch Study Notes, Week 1: Basic Concepts

I plan to spend a month or two getting familiar with PyTorch. These are my study notes, posted here as a backup.

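1. The five elements of model training in PyTorch

Data: reading and cleaning the data, splitting it into subsets, and preprocessing it

Model: building model modules, assembling complex networks, initializing network parameters, and defining network layers

Loss function: creating the loss function, setting its hyperparameters, and choosing a loss that suits the task

Optimizer: updating parameters with a chosen optimizer, managing the model parameters, managing multiple parameter groups with different learning rates, and adjusting the learning rate

Training loop: combining the four modules above to train, monitoring training, plotting loss/accuracy curves, and visualizing with TensorBoard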
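
To make the five elements concrete, here is a minimal sketch of how they might fit together. The synthetic dataset (points near y = 2x + 1), the single linear layer, and the hyperparameters are all made up for illustration and are not part of the original notes.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# 1. Data: synthetic points near y = 2x + 1 (illustrative only)
x = torch.linspace(-1, 1, 64).unsqueeze(1)            # shape (64, 1)
y = 2 * x + 1 + 0.05 * torch.randn_like(x)
loader = DataLoader(TensorDataset(x, y), batch_size=16, shuffle=True)

# 2. Model: a single linear layer
model = nn.Linear(1, 1)

# 3. Loss function: mean squared error for regression
criterion = nn.MSELoss()

# 4. Optimizer: plain SGD over the model parameters
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# 5. Training loop: tie the four pieces together and record the loss
losses = []
for epoch in range(50):
    epoch_loss = 0.0
    for xb, yb in loader:
        optimizer.zero_grad()              # clear gradients from the previous step
        loss = criterion(model(xb), yb)    # forward pass
        loss.backward()                    # backward pass
        optimizer.step()                   # parameter update
        epoch_loss += loss.item() * xb.size(0)
    losses.append(epoch_loss / len(x))     # mean loss over the epoch

print(f"first epoch loss: {losses[0]:.4f}, last epoch loss: {losses[-1]:.4f}")
```

The list losses is what one would plot as a loss curve or log to TensorBoard.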

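2. Tensors

A tensor is a multi-dimensional array.

The 8 attributes of torch.Tensor:

data: the wrapped tensor
grad: the gradient of data
grad_fn: the function that created the tensor; this is the key to automatic differentiation
requires_grad: whether a gradient is needed; not every tensor needs one
is_leaf: whether the tensor is a leaf node
dtype: the data type of the tensor

The commonly used dtypes fall into 9 types in three categories: float (16-bit, 32-bit, 64-bit), integer (unsigned 8-bit, 8-bit, 16-bit, 32-bit, 64-bit), and boolean

Parameters and data mostly use 32-bit float; labels usually use 64-bit integer
shape: the shape of the tensor
device: the device the tensor lives on (CPU/GPU); the GPU is the key to fast computation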
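
A small sketch of how these attributes look in practice. The tensors w, y, and labels are made-up examples.

```python
import torch

w = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)  # leaf tensor created by the user
y = (w * w).sum()                                       # produced by operations on w
y.backward()                                            # fills w.grad with dy/dw = 2 * w

print(w.data)           # the underlying values
print(w.grad)           # tensor([2., 4., 6.])
print(w.grad_fn)        # None, because a leaf is not the result of an operation
print(y.grad_fn)        # the backward node of the sum that produced y
print(w.requires_grad)  # True
print(w.is_leaf, y.is_leaf)  # True False
print(w.dtype)          # torch.float32, the default for Python floats
print(w.shape)          # torch.Size([3])
print(w.device)         # cpu

labels = torch.tensor([0, 2, 1])
print(labels.dtype)     # torch.int64, the default for Python ints
```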

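(1) Creating tensors

Creating directly:

torch.tensor()

torch.tensor(data, dtype=None, device=None, requires_grad=False, pin_memory=False)

data: the data; can be a list or a NumPy array
dtype: the data type; by default it is inferred from data
device: the device to place the tensor on, cuda/cpu
requires_grad: whether a gradient is needed
pin_memory: whether to store the tensor in pinned (page-locked) memory

torch.from_numpy(ndarray): the resulting tensor shares memory with the original ndarray, so modifying one also modifies the other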
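
The difference between copying and sharing memory is easy to check. A minimal sketch, with a made-up array arr:

```python
import numpy as np
import torch

arr = np.array([[1, 2, 3], [4, 5, 6]])

copied = torch.tensor(arr)      # torch.tensor copies the data
shared = torch.from_numpy(arr)  # from_numpy shares memory with arr

arr[0, 0] = 100
print(copied[0, 0].item())  # 1, the copy is unaffected
print(shared[0, 0].item())  # 100, the shared tensor sees the change

shared[1, 2] = -1
print(arr[1, 2])            # -1, the change flows back to the ndarray too
```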

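Creating tensors from values:

torch.zeros()

torch.zeros(*size, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False)
# Create an all-zeros tensor with shape size

size: the shape of the tensor
out: the output tensor; if out is given, the tensor returned by torch.zeros() and out refer to the same memory
layout: the memory layout, e.g. strided or sparse_coo. For sparse matrices, sparse_coo can reduce memory usage.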

torch.zeros_like()

torch.zeros_like(input, dtype=None, layout=None, device=None, requires_grad=False)
# Create an all-zeros tensor with the same shape as input

torch.ones() and torch.ones_like()

torch.full() and torch.full_like()

torch.full(size, fill_value, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False)
# Create a tensor filled with the value fill_value
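
A short sketch of the value-based constructors, including the out argument. The buffer buf and the shapes are arbitrary choices for illustration.

```python
import torch

buf = torch.ones(2, 3)
z = torch.zeros(2, 3, out=buf)          # writes zeros into buf and returns it
print(z.data_ptr() == buf.data_ptr())   # True: same underlying memory
print(buf)                              # buf itself is now all zeros

x = torch.rand(2, 2)
zl = torch.zeros_like(x)                # same shape and dtype as x, filled with 0
ol = torch.ones_like(x)                 # same shape and dtype as x, filled with 1

f = torch.full((2, 2), 7.0)             # 2x2 tensor filled with 7.0
fl = torch.full_like(x, 3.0)            # shape of x, filled with 3.0
print(f)
```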

torch.arange()

torch.arange(start=0, end, step=1, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False)
# Create a 1-D tensor (a vector) of evenly spaced values over [start, end), where step is the common difference

torch.linspace()

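torch.linspace(start, end, steps, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False)
# Create a 1-D tensor of evenly spaced values over [start, end] (both ends included), where steps is the number of elements
# Older releases defaulted to steps=100; recent releases require steps to be passed explicitly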

torch.logspace()

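torch.logspace(start, end, steps, base=10.0, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False)
# Create a 1-D tensor of logarithmically spaced values from base**start to base**end (both ends included), where steps is the number of elements and base is the base of the logarithm
# Older releases defaulted to steps=100; recent releases require steps to be passed explicitly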

torch.eye()

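torch.eye(n, m=None, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False)
# Create an identity matrix (ones on the main diagonal, zeros elsewhere); n is the number of rows and m is the number of columns, which defaults to n (a square matrix)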
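
The interval conventions are easy to mix up, so here is a small sketch comparing them. The concrete numbers are arbitrary.

```python
import torch

a = torch.arange(2, 10, 2)                    # [start, end): 10 is excluded
print(a)                                      # tensor([2, 4, 6, 8])

l = torch.linspace(2, 10, steps=5)            # [start, end]: 10 is included
print(l)                                      # tensor([ 2.,  4.,  6.,  8., 10.])

g = torch.logspace(0, 2, steps=3, base=10.0)  # 10**0, 10**1, 10**2
print(g)                                      # tensor([  1.,  10., 100.])

e = torch.eye(2, 3)                           # 2 rows, 3 columns, ones on the main diagonal
print(e)
```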

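Creating tensors from probability distributions:

torch.normal()

torch.normal(mean, std, *, generator=None, out=None)
'''
Sample from a normal distribution; mean is the mean and std is the standard deviation
There are 4 modes:
1. mean is a scalar, std is a scalar: size must also be passed, e.g. torch.normal(0.0, 1.0, size=(4,))
2. mean is a scalar, std is a tensor
3. mean is a tensor, std is a scalar
4. mean is a tensor, std is a tensor
In modes 2 to 4, each output element is drawn using the corresponding element of mean and/or std
'''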
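
The four modes can be sketched as follows. The values in mean_t and std_t are made up; only the output shapes are deterministic.

```python
import torch

torch.manual_seed(0)

mean_t = torch.arange(1.0, 5.0)               # tensor([1., 2., 3., 4.])
std_t = torch.tensor([0.1, 0.2, 0.3, 0.4])

s1 = torch.normal(0.0, 1.0, size=(4,))  # mode 1: scalar mean, scalar std, explicit size
s2 = torch.normal(0.0, std_t)           # mode 2: scalar mean, per-element std
s3 = torch.normal(mean_t, 1.0)          # mode 3: per-element mean, scalar std
s4 = torch.normal(mean_t, std_t)        # mode 4: per-element mean and std

for name, s in [("s1", s1), ("s2", s2), ("s3", s3), ("s4", s4)]:
    print(name, tuple(s.shape), s)
```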

torch.randn() and torch.randn_like()

torch.randn(*size, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False)
# Sample from the standard normal distribution (mean 0, std 1); size is the shape of the tensor

torch.rand() and torch.rand_like()

torch.rand(*size, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False)
# Sample from the uniform distribution on [0, 1)

torch.randint() and torch.randint_like()

torch.randint(low=0, high, size, *, generator=None, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False)
# Sample integers uniformly from [low, high); size is the shape of the tensor

torch.randperm()

torch.randperm(n, out=None, dtype=torch.int64, layout=torch.strided, device=None, requires_grad=False)
# Generate a random permutation of the integers 0 to n-1, often used to build shuffled indices; n is the length of the tensor

torch.bernoulli()

torch.bernoulli(input, *, generator=None, out=None)
# Draw 0/1 samples from a Bernoulli distribution, using each element of input as the probability of drawing 1
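
A quick sketch of the random constructors above. The shapes, the range of randint, and the probabilities in p are arbitrary, and the sampled values depend on the seed.

```python
import torch

torch.manual_seed(0)

n = torch.randn(2, 3)              # standard normal samples
u = torch.rand(2, 3)               # uniform samples in [0, 1)
i = torch.randint(0, 10, (2, 3))   # integers in [0, 10)

perm = torch.randperm(5)           # a shuffled copy of 0..4
data = torch.arange(10, 15)        # tensor([10, 11, 12, 13, 14])
shuffled = data[perm]              # use the permutation as shuffled indices

p = torch.tensor([0.0, 1.0, 0.5])  # per-element probability of drawing 1
b = torch.bernoulli(p)             # first element is always 0, second always 1

print(perm, shuffled, b)
```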

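(2) Tensor operations

Joining:

torch.cat()

torch.cat(tensors, dim=0, out=None)
# Concatenate tensors along the existing dimension dim
# tensors: a sequence of tensors
# dim: the dimension to concatenate along

torch.stack()

torch.stack(tensors, dim=0, out=None)
# Join tensors along a newly created dimension dim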
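
The difference between the two shows up in the result shapes. A minimal sketch with two made-up 2x3 tensors:

```python
import torch

a = torch.zeros(2, 3)
b = torch.ones(2, 3)

c0 = torch.cat([a, b], dim=0)    # existing dim 0 grows: 2 + 2 rows
c1 = torch.cat([a, b], dim=1)    # existing dim 1 grows: 3 + 3 columns
s0 = torch.stack([a, b], dim=0)  # new leading dim of size 2
s2 = torch.stack([a, b], dim=2)  # new trailing dim of size 2

print(c0.shape)  # torch.Size([4, 3])
print(c1.shape)  # torch.Size([2, 6])
print(s0.shape)  # torch.Size([2, 2, 3])
print(s2.shape)  # torch.Size([2, 3, 2])
```

cat keeps the number of dimensions the same, while stack always adds one.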

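Splitting:

torch.chunk()

torch.chunk(input, chunks, dim=0)
# Split the tensor into chunks along dim; if the size is not evenly divisible, the last chunk is smaller than the others (and fewer than chunks pieces may be returned)
# chunks is the number of pieces to split into, and dim is the dimension to split along

torch.split()

torch.split(tensor, split_size_or_sections, dim=0)
# Split the tensor along dim, either into equal-sized pieces or into pieces of explicitly given lengths
# split_size_or_sections: if an int, it is the length of each piece, and if the size is not evenly divisible the last piece is smaller than the others; if a list, its elements are the lengths of the pieces. If the list elements do not sum to the size of dimension dim, an error is raised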
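
A small sketch of both functions on a made-up 2x5 tensor, including the error case for a list that does not sum to the dimension size:

```python
import torch

t = torch.arange(10).reshape(2, 5)

parts = torch.chunk(t, chunks=2, dim=1)   # 5 columns into 2 chunks: widths 3 and 2
print([p.shape[1] for p in parts])        # [3, 2]

by_size = torch.split(t, 2, dim=1)        # pieces of width 2, last one smaller
print([p.shape[1] for p in by_size])      # [2, 2, 1]

by_list = torch.split(t, [1, 4], dim=1)   # explicit widths, 1 + 4 == 5
print([p.shape[1] for p in by_list])      # [1, 4]

try:
    torch.split(t, [1, 3], dim=1)         # 1 + 3 != 5
    split_failed = False
except RuntimeError as err:
    split_failed = True
    print("split failed:", err)
```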

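Indexing:

torch.index_select()

torch.index_select(input, dim, index, out=None)
# Along dimension dim, pick out the entries at the positions in index and return them joined into a tensor
# index holds the positions of the entries to select (an integer tensor)

torch.masked_select()

torch.masked_select(input, mask, out=None)
# Pick out the elements where mask is True and return them as a 1-D tensor
# mask is a boolean tensor with the same shape as input (or broadcastable to it)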
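
A minimal sketch of both indexing functions on a made-up 3x3 tensor:

```python
import torch

t = torch.arange(9).reshape(3, 3)   # [[0, 1, 2], [3, 4, 5], [6, 7, 8]]

idx = torch.tensor([0, 2])          # integer (int64) index tensor
rows = torch.index_select(t, dim=0, index=idx)
print(rows)                         # rows 0 and 2

mask = t.ge(5)                      # boolean mask: True where t >= 5
picked = torch.masked_select(t, mask)
print(picked)                       # tensor([5, 6, 7, 8]), always 1-D
```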

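Transforming:

torch.reshape()

torch.reshape(input, shape)
# Change the shape of a tensor
# When the tensor is contiguous in memory, the returned tensor shares memory with the original, so changing one also changes the other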
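
The memory-sharing behaviour can be checked directly. In this sketch the tensors are made up, and a transposed tensor is used as an example of a non-contiguous input for which reshape has to copy.

```python
import torch

t = torch.arange(6)
r = torch.reshape(t, (2, 3))     # t is contiguous, so r is a view of t
t[0] = 100
print(r[0, 0].item())            # 100: the change is visible through r

auto = torch.reshape(t, (-1, 2)) # -1 lets PyTorch infer that dimension
print(auto.shape)                # torch.Size([3, 2])

nc = torch.arange(6).reshape(2, 3).t()    # transposing makes it non-contiguous
flat = torch.reshape(nc, (6,))            # cannot be expressed as a view here, so data is copied
print(nc.is_contiguous())                 # False
print(flat.data_ptr() == nc.data_ptr())   # False: separate memory
```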

torch.transpose()

torch.transpose(input, dim0, dim1)
# Swap two dimensions of a tensor

torch.t()

torch.t(input)
# Transpose a 2-D tensor
# For a 2-D matrix, this is equivalent to torch.transpose(input, 0, 1)

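torch.squeeze()

torch.squeeze(input, dim=None)
# Remove dimensions of size 1
# If dim is None, all dimensions of size 1 are removed; if dim is given, that dimension is removed only if its size is 1

torch.unsqueeze()

torch.unsqueeze(input, dim)
# Insert a new dimension of size 1 at position dim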
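
A shape-only sketch of transpose, t, squeeze, and unsqueeze. The tensor sizes are arbitrary.

```python
import torch

x = torch.rand(2, 3, 4)
xt = torch.transpose(x, 0, 2)       # swap dim 0 and dim 2
print(xt.shape)                     # torch.Size([4, 3, 2])

m = torch.rand(2, 3)
print(torch.equal(torch.t(m), torch.transpose(m, 0, 1)))  # True

s = torch.rand(1, 2, 1, 3)
sq_all = torch.squeeze(s)           # drop every size-1 dimension
sq_0 = torch.squeeze(s, 0)          # dim 0 has size 1, so it is dropped
sq_1 = torch.squeeze(s, 1)          # dim 1 has size 2, so nothing changes
print(sq_all.shape, sq_0.shape, sq_1.shape)

u = torch.unsqueeze(m, 0)           # add a leading dimension, e.g. a batch axis
print(u.shape)                      # torch.Size([1, 2, 3])
```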

Math operations:

torch.add()

torch.add(input, other, *, alpha=1, out=None)
# Compute input + alpha * other element-wise
# This multiply-then-add pattern comes up often in deep learning

torch.addcdiv()

torch.addcdiv(input, tensor1, tensor2, *, value=1, out=None)
# OUTi = INPUTi + value * (TENSOR1i / TENSOR2i)

torch.addcmul()

torch.addcmul(input, tensor1, tensor2, *, value=1, out=None)
# OUTi = INPUTi + value * TENSOR1i * TENSOR2i
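
A small worked sketch of the three fused operations. The numbers are made up, and the gradient-descent step at the end is just one illustrative use of the multiply-then-add pattern, with a hypothetical weight w, gradient grad, and learning rate lr.

```python
import torch

x = torch.tensor([1.0, 2.0])
a = torch.tensor([4.0, 9.0])
b = torch.tensor([2.0, 3.0])

r_add = torch.add(x, a, alpha=0.5)          # x + 0.5 * a
print(r_add)                                # tensor([3.0000, 6.5000])

r_div = torch.addcdiv(x, a, b, value=2.0)   # x + 2 * (a / b)
print(r_div)                                # tensor([5., 8.])

r_mul = torch.addcmul(x, a, b, value=0.5)   # x + 0.5 * a * b
print(r_mul)                                # tensor([ 5.0000, 15.5000])

# Illustrative use: one gradient-descent step, w_new = w - lr * grad
w = torch.tensor([1.0, 2.0])
grad = torch.tensor([0.5, -1.0])
lr = 0.1
w_new = torch.add(w, grad, alpha=-lr)
print(w_new)                                # tensor([0.9500, 2.1000])
```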