
Why can the LSTM be discarded during inference? #43

Open
MXuer opened this issue May 12, 2023 · 1 comment

Comments


MXuer commented May 12, 2023

I am confused about this sentence in your paper "GPT Understands, Too":

Moreover, in the inference, we only need the output embedding h and can discard the LSTM head.

If the LSTM encoder is used during training, and the final prompt embeddings are produced by combining the LSTM encoder's outputs with the original embeddings, then discarding it during inference would mean the final embeddings are just the outputs of the two embedding layers. Wouldn't that change the performance?

So why can the LSTM be discarded during inference?

Thanks a lot.
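
For reference, here is my understanding of the training-time setup as a minimal PyTorch sketch (the class and parameter names are hypothetical, not your actual code): the learnable pseudo-token embeddings are re-parameterized through a bidirectional LSTM plus an MLP head to produce the output embedding h.

```python
# Hypothetical sketch of a P-tuning-style prompt encoder (not the repo's code):
# pseudo-token embeddings -> bidirectional LSTM -> MLP -> output embedding h.
import torch
import torch.nn as nn

class PromptEncoder(nn.Module):
    def __init__(self, num_prompt_tokens: int, hidden_size: int):
        super().__init__()
        # Learnable input embeddings for the pseudo (prompt) tokens.
        self.embedding = nn.Embedding(num_prompt_tokens, hidden_size)
        # LSTM head that re-parameterizes the prompt embeddings during training.
        self.lstm = nn.LSTM(
            input_size=hidden_size,
            hidden_size=hidden_size // 2,
            num_layers=2,
            bidirectional=True,
            batch_first=True,
        )
        self.mlp = nn.Sequential(
            nn.Linear(hidden_size, hidden_size),
            nn.ReLU(),
            nn.Linear(hidden_size, hidden_size),
        )
        self.register_buffer("ids", torch.arange(num_prompt_tokens))

    def forward(self) -> torch.Tensor:
        x = self.embedding(self.ids).unsqueeze(0)  # (1, T, H)
        h, _ = self.lstm(x)                        # (1, T, H) bidirectional output
        return self.mlp(h).squeeze(0)              # (T, H): the output embedding h
```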


Deerkangkang commented May 19, 2023

I am not a native English speaker. During inference, the output for the template part is fixed: since the template fed into the encoder never changes, the LSTM's output does not change either. You only need to take the LSTM's output once, and that same output can be reused for the entire inference stage. I hope this answer helps you.
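
In code terms, here is a minimal sketch of what "discard the LSTM head" means, reusing the hypothetical PromptEncoder above (not the repo's actual code): run the encoder once, cache the resulting tensor, and use only that tensor afterwards.

```python
# Because the template is fixed, the prompt encoder is run exactly once;
# afterwards only the cached tensor is needed, so the LSTM/MLP can be dropped.
encoder = PromptEncoder(num_prompt_tokens=20, hidden_size=768)
encoder.eval()

with torch.no_grad():
    prompt_embeds = encoder()  # (T, H): compute the output embedding h once

# Every inference call just concatenates the cached prompt embeddings with the
# real token embeddings, e.g. (batch and token_embeds are assumed to exist):
# inputs_embeds = torch.cat(
#     [prompt_embeds.unsqueeze(0).expand(batch, -1, -1), token_embeds], dim=1)
```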
