Seq2seq attention in PyTorch: what values should we initialize the attn layer with?