Pitfalls, pitfalls, pitfalls

Pitfall 1

The error came from sequence_loss and complained that the batch sizes did not match, with a name like sequence_loss/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits

Investigation log:

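1. I searched Stack Overflow for the error. Someone had hit a similar error because their sequence lengths differed. So I added logging to the code: the batch size printed as 32, but the mismatch reported in the error was 100 vs. 132, so the batch size was clearly not the cause.

It had to be a sequence length mismatch. I printed the lengths of the targets, the loss, and the input: all the same, no problem there. Then I printed the decoder output, and it was indeed one step short. That pinned down the cause: the decoded sequence and the target had different lengths.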

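2. Why did they differ? The only option was to read the source code. Decoding turns out to be controlled by a flag called finished. Tracing it back to its origin: a length is passed in when the helper is created, and during decoding, once the step goes past that length, decoding ends.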

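3. So the problem was the length. Printing it confirmed that this length was indeed too short. Why was it too short? The padding code computes the real length from the pad positions. I had set both unk and pad to 0, so every unk was treated as pad, and the computed real length came out too short. That was the root cause.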
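The length bug from step 3 can be reproduced without TensorFlow. This is only a minimal sketch: it assumes the real length is obtained by counting non-pad ids, which may differ from the actual padding code, and the vocabulary, the helper names (lookup, pad_to, real_length) and the id values are all made up for illustration.

```python
PAD_ID = 0

def lookup(tokens, vocab, unk_id):
    """Map tokens to ids, falling back to unk_id for unknown words."""
    return [vocab.get(tok, unk_id) for tok in tokens]

def pad_to(ids, max_len, pad_id=PAD_ID):
    """Right-pad a list of ids to max_len."""
    return ids + [pad_id] * (max_len - len(ids))

def real_length(ids, pad_id=PAD_ID):
    """Assumed length rule: count every id that is not the pad id."""
    return sum(1 for i in ids if i != pad_id)

vocab = {"hello": 5, "world": 6}      # hypothetical vocabulary
tokens = ["hello", "world", "zzz"]    # "zzz" is out of vocabulary

# Buggy setup: unk shares id 0 with pad, so the unk token looks like padding.
buggy_ids = pad_to(lookup(tokens, vocab, unk_id=0), max_len=5)
buggy_length = real_length(buggy_ids)

# Fixed setup: unk gets its own id, so it is counted as a real token.
fixed_ids = pad_to(lookup(tokens, vocab, unk_id=1), max_len=5)
fixed_length = real_length(fixed_ids)

print(buggy_ids, buggy_length)
print(fixed_ids, fixed_length)
```

With the shared id the sentence is counted as 2 tokens instead of 3. A helper that is given this shorter length stops decoding one step early, which matches the missing step seen in the decoder output.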

Pitfall 2

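This one involved float data. When writing, I converted the array to a string with np's tostring and stored it. When reading, I used decode_raw with float32. It complained about a dimension mismatch: it expected 73 dimensions but received 146. I added all kinds of logging and the dimensions looked right. Then it suddenly hit me: I switched decode_raw to float64 and everything worked. 146 is exactly 2 x 73: the values had been stored as 8-byte float64, so reading them back as 4-byte float32 doubles the element count.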
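The doubling can be checked with NumPy alone. This is a sketch, not the original pipeline: np.frombuffer stands in for decode_raw here, the 73 values are placeholder data, and tobytes is used because tostring was an older alias for it.

```python
import numpy as np

# NumPy creates float64 arrays by default when given Python floats.
values = np.arange(73, dtype=np.float64)   # placeholder data, 73 values
raw = values.tobytes()                     # 73 * 8 = 584 bytes

# Reading the same bytes with the wrong dtype doubles the element count.
as_f32 = np.frombuffer(raw, dtype=np.float32)
# Reading with the dtype used for writing recovers the original array.
as_f64 = np.frombuffer(raw, dtype=np.float64)

# Alternative fix: cast before writing, then float32 on the read side is correct.
raw32 = values.astype(np.float32).tobytes()
roundtrip_f32 = np.frombuffer(raw32, dtype=np.float32)

print(len(raw), as_f32.shape, as_f64.shape, roundtrip_f32.shape)
```

Either side can be changed, as long as the dtype used when writing and the dtype used when reading agree.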

Pitfall 3:

When applying AttentionWrapper attention_wrapper_1: Non-matching batch sizes between the memory (encoder output) and the query (decoder output). Are you using the BeamSearchDecoder? You may need to tile your memory input via the tf.contrib.seq2seq.tile_batch function with argument multiple=beam_width.] [Condition x == y did not hold element-wise:] [x (read/decode/seq_decode/seq_decode/attention_wrapper/strided_slice_1:0) = ] [7] [y (read/decode/seq_decode/seq_decode/attention_wrapper/assert_equal_1/y:0) = ] [1]

I hit this error while using out-of-graph decoding.

First, locate the code: self._batch_size_checks(cell_batch_size, error_message)):

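According to the implementation, the check fails when the batch size of the decoder input does not match that of the attention_mechanism. I tried to print the batch size, but it is produced inside the step function, and because this is a static graph the value cannot be printed. So the only option was to read through the code logic. When the input is the start id, the batch size really is 1. But in my beam_search file I feed one whole beam at a time, so at that point the batch size is 7. The attention_mechanism was still built from an encoder run with batch size 1, so decoding with batch size 7 is bound to fail. There are two fixes: either decode with batch size 1, or encode with batch size 7, that is, the beam size. I do have to complain, though: out-of-graph beam search is a real pit.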
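The second fix, making the encoder side match the beam size, amounts to repeating each encoder batch entry beam_width times, which is what the error message suggests doing with tf.contrib.seq2seq.tile_batch. The sketch below only mimics that tiling with NumPy to show the shapes. The function name tile_batch_sketch and the time and unit sizes are invented for illustration, and this is not the TensorFlow implementation.

```python
import numpy as np

def tile_batch_sketch(t, multiple):
    """Repeat each batch entry `multiple` times along axis 0:
    [b0, b1] -> [b0, b0, ..., b1, b1, ...]."""
    return np.repeat(t, multiple, axis=0)

beam_width = 7

# Encoder output (the attention memory) from a single sentence:
# [batch=1, time=10, units=16]; the time and units sizes are hypothetical.
encoder_outputs = np.zeros((1, 10, 16), dtype=np.float32)

# Decoder query when a whole beam is fed at once: [batch=7, units=16].
decoder_query = np.zeros((beam_width, 16), dtype=np.float32)

# Before tiling the batch dimensions disagree (1 vs. 7), as in the error.
mismatch = encoder_outputs.shape[0] != decoder_query.shape[0]

# After tiling the memory, both sides have batch size 7.
tiled_memory = tile_batch_sketch(encoder_outputs, beam_width)

print(mismatch, tiled_memory.shape, decoder_query.shape)
```

The other fix goes the opposite way: keep the memory at batch size 1 and run the decoder step once per hypothesis instead of once per beam.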