Relationship between embedding size and vocabulary size (rule of thumb): e = v**0.25, where e is the embedding size and v is the vocabulary size. https://stackoverflow.com/questions/50747947/embedding-in-pytorch
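
A minimal sketch of how the formula could be applied in Python. The helper name `embedding_size`, rounding to the nearest integer, and the lower bound of 1 are assumptions made for illustration; the note itself only gives the fourth-root relation.

```python
def embedding_size(vocab_size: int) -> int:
    """Rule-of-thumb embedding size: the fourth root of the vocabulary size.

    Hypothetical helper; rounding and the minimum of 1 are assumptions,
    not part of the original formula.
    """
    if vocab_size < 1:
        raise ValueError("vocab_size must be a positive integer")
    # e = v ** 0.25, rounded to the nearest integer and kept at least 1
    return max(1, round(vocab_size ** 0.25))


if __name__ == "__main__":
    # Print the suggested embedding size for a few vocabulary sizes
    for v in (1_000, 10_000, 100_000):
        print(v, embedding_size(v))
```

The result could then be passed as `embedding_dim` when constructing `torch.nn.Embedding(num_embeddings, embedding_dim)`; treat it as a starting point to tune, not a fixed value.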