PyTorch Usage, Part 1

1. First steps with PyTorch

Two kinds of objects must be moved with to(device) (here, to CUDA): (1) the input_tensor that will be fed to the model, and (2) the model itself, e.g.

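import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# One problem I hit with the line above: selecting GPU 0, i.e. "cuda:0", works, but selecting "cuda:1", "cuda:2", etc. raises an error
# I have not yet figured out how to select a specific GPU; setting
# os.environ["CUDA_VISIBLE_DEVICES"] = "1" did not seem to work either...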

model = ConvLSTMAE().to(device)  # push the model to the device (i.e. CUDA)
X, Y = X.to(device), Y.to(device)  # X and Y will be fed to the model shortly
# The tensors returned by init_hidden() are all fed to the LSTMCell, so they must also be moved with to(device) or .cuda()
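
A minimal sketch of choosing a specific GPU while keeping the model and its inputs on the same device. The pick_device helper and the nn.Linear stand-in for ConvLSTMAE are hypothetical names used only for illustration. Two things that may explain the "cuda:1" error, though I cannot confirm either was the cause here: a tensor or module was left on a different device, or the process sees fewer GPUs than expected. Also note that CUDA_VISIBLE_DEVICES only takes effect if it is set before CUDA is initialized in the process, and the GPUs it leaves visible are renumbered starting from cuda:0.

```python
import torch
import torch.nn as nn


def pick_device(gpu_index: int = 0) -> torch.device:
    """Return cuda:<gpu_index> if that GPU is visible, otherwise fall back to the CPU."""
    if torch.cuda.is_available() and gpu_index < torch.cuda.device_count():
        return torch.device(f"cuda:{gpu_index}")
    return torch.device("cpu")


device = pick_device(1)              # ask for the second GPU, fall back to CPU if absent
model = nn.Linear(4, 2).to(device)   # hypothetical stand-in for ConvLSTMAE
x = torch.randn(8, 4).to(device)     # inputs must live on the same device as the model
y = model(x)
```

Every tensor created inside the model (for example the initial hidden state) has to end up on this same device, otherwise the forward pass fails with a device-mismatch error.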

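To keep the computation graph, tensors, and Autograd working properly, prefer the NumPy-like API that PyTorch itself provides. Calling native NumPy functions directly can, in situations whose mechanics I do not yet fully understand, raise very odd errors, e.g.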

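y_hat = torch.stack(y_list, 0)  # stack a list of tensors [t1, t2, ...] into one tensor with a new leading dimension
# [[8,3,225,225], ...] => [19,8,3,225,225]; the former is a list (len == 19) whose elements each have shape [8,3,225,225]
# Earlier I used NumPy's own append and concatenate, and both raised the very odd error below: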

Can’t call numpy() on Variable that requires grad
https://discuss.pytorch.org/t/cant-call-numpy-on-variable-that-requires-grad/20763
https://discuss.pytorch.org/t/cant-call-numpy-on-variable-that-requires-grad-use-var-detach-numpy-instead/28281 
https://stackoverflow.com/questions/44340848/how-to-convert-pytorch-autograd-variable-to-numpy
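
A small sketch of the two safe options, with assumed shapes (4x4 frames instead of 225x225, to keep it light). The error most likely appears because NumPy functions convert each tensor with numpy(), which PyTorch refuses for tensors that require grad. Staying in PyTorch with torch.stack keeps the result in the autograd graph; if a NumPy array is really needed, detach first.

```python
import torch

# 19 time steps, each a tensor of shape [8, 3, 4, 4] that requires grad
y_list = [torch.randn(8, 3, 4, 4, requires_grad=True) for _ in range(19)]

# Option 1: stay in PyTorch, the result is still tracked by autograd
y_hat = torch.stack(y_list, 0)            # shape [19, 8, 3, 4, 4]

# Option 2: leave the graph explicitly before converting to NumPy
y_np = y_hat.detach().cpu().numpy()       # plain ndarray, no gradient information
```

Use option 1 for anything that feeds the loss, and option 2 only for logging, plotting, or saving results.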

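On using a PyTorch ConvLSTM:
- First search GitHub for "pytorch ConvLSTM", pick the repository with the most stars, then read its source code and its interface
- My impression is that this code is very well written and fits 5-D tensors, i.e. video data, very well
- Once you understand the source, you may still need to tune hyperparameters
Some discussions in the PyTorch community about loading video datasets (TODO: read closely)
A scheme for handling a video dataset that worked in my own tests this time:
- TODO (will tidy up and add when I have time)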
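What I learned this time about using 5-D tensors with Conv2d
- Note that for tensors with more than 3 dimensions (4-D, 5-D, etc.), the convolution itself only acts on the three dimensions c, w, h. batch_size and seq_len merely add more rows to the matrix, which raises the parallelism of the matrix multiplication that convolution boils down to. So how do you feed a 5-D tensor into Conv2d? Answer: (1) treat seq_len * batch_size as n and push the 4-D tensor (n, c, w, h) through Conv2d; (2) then view(t, b, c, w, h) the result back into a 5-D tensor (c is now the number of output channels)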
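
The two steps above can be sketched as follows. All sizes are assumed values for illustration, and the sketch follows the (N, C, H, W) order that nn.Conv2d documents for its input.

```python
import torch
import torch.nn as nn

t, b, c, h, w = 5, 2, 3, 16, 16          # seq_len, batch_size, channels, height, width
out_channels = 8
x = torch.randn(t, b, c, h, w)           # 5-D video tensor

# padding=1 with a 3x3 kernel keeps the spatial size unchanged
conv = nn.Conv2d(c, out_channels, kernel_size=3, padding=1)

# (1) merge seq_len and batch_size into one leading dimension n = t * b
y = conv(x.reshape(t * b, c, h, w))      # shape [t * b, out_channels, h, w]

# (2) split the leading dimension back into (t, b)
y = y.view(t, b, out_channels, h, w)     # shape [t, b, out_channels, h, w]
```

Because the convolution treats every sample along the leading dimension independently, y[k] matches what conv(x[k]) would produce for time step k on its own.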
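What I learned this time about using LSTM and ConvLSTM
- LSTM was designed for text sequences in NLP, and input_size can only be an int, so a tensor of any shape must be flattened before it is fed to LSTMCell. In that respect, calling LSTMCell is actually quite simple
- ConvLSTM has to convolve its weights w with the input_tensor, so the shape of the input_tensor must not be destroyed. The input_tensor of the ConvLSTM I used this time is a 5-D tensor, [t, b, c, w, h], where t is seq_len and b is batch_size
- Jiawei also taught me a trick: when a whole sequence is fed to an LSTM, the computation is in essence a loop over seq_len that runs the LSTMCell at each time step and then collects the results. So you can write the loop over seq_len yourself, call LSTMCell at each time step, and do whatever else you want there. This makes it easier to define many custom kinds of LSTM, and the code becomes more flexible!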
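
A minimal sketch of the manual loop described above, using nn.LSTMCell on flattened frames. The sizes and variable names are assumptions for illustration, not the code from my experiment.

```python
import torch
import torch.nn as nn

t, b, c, h, w = 5, 2, 3, 4, 4            # seq_len, batch_size, channels, height, width
hidden_size = 16
x = torch.randn(t, b, c, h, w)

# input_size is a single int, so each frame is flattened to c * h * w features
cell = nn.LSTMCell(input_size=c * h * w, hidden_size=hidden_size)

# initial hidden and cell states, created on the same device as the input
h_t = torch.zeros(b, hidden_size, device=x.device)
c_t = torch.zeros(b, hidden_size, device=x.device)

outputs = []
for step in range(t):
    x_t = x[step].reshape(b, -1)         # [b, c * h * w]
    h_t, c_t = cell(x_t, (h_t, c_t))     # one LSTM step
    # any custom per-step logic (extra layers, skip connections, ...) can go here
    outputs.append(h_t)

out = torch.stack(outputs, 0)            # [t, b, hidden_size]
```

The same loop structure applies to a ConvLSTM cell, except that the frame keeps its [b, c, h, w] shape instead of being flattened.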
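The experimental details in a paper may be unreliable; if the authors released code, rebuild the model by following the code
A decent PyTorch blog
- I have not had time to read it closely yet; will look at it later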
- https://blog.csdn.net/u011276025/article/details/76098185
- https://blog.csdn.net/u011276025/article/details/71244507
TensorFlow code for a CVPR 2016 ConvAE paper (incomplete; the data-processing part is missing)
- https://github.com/iwyoo/TemporalRegularityDetector-tensorflow
- Implementation of "Learning Temporal Regularity in Video Sequences" with TensorFlow for hexathon2016.
A first look at LSTM:

https://www.pytorchtutorial.com/pytorch-sequence-model-and-lstm-networks/

Zhihu: HaoZhang's Zhihu

GitHub: HaoZhang's GitHub

Gmail: njuhaozhang@gmail.com