
Bug while loading t5 base model #53

Open
Sahajtomar opened this issue Apr 25, 2023 · 1 comment

@Sahajtomar

I am trying to load the t5-base model using the t5_ppo config, but strangely this error pops up. It works fine for t5-small.

	size mismatch for decoder.final_layer_norm.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for lm_head.weight: copying a param with shape torch.Size([32128, 512]) from checkpoint, the shape in current model is torch.Size([32128, 768]).
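
A guess at what is going on, plus a minimal sketch that reproduces the same class of error (this is not the project's loading code; the model names below are just the stock Hugging Face checkpoints): 512 is the hidden size (d_model) of t5-small and 768 that of t5-base, so it looks like a t5-small checkpoint is being loaded into a model instantiated from the t5-base config, i.e. some entry in the t5_ppo config probably still points at t5-small.

```python
# Sketch only: loading a t5-small state dict into a model built from the
# t5-base config produces the same 512-vs-768 size mismatches as above.
from transformers import T5Config, T5ForConditionalGeneration

small = T5ForConditionalGeneration.from_pretrained("t5-small")  # d_model = 512
base_config = T5Config.from_pretrained("t5-base")               # d_model = 768
model = T5ForConditionalGeneration(base_config)

# Raises "size mismatch ... torch.Size([512]) ... torch.Size([768])".
# Pointing every model/checkpoint entry in the config at the same size
# (all t5-base, or all t5-small) should make the shapes line up.
model.load_state_dict(small.state_dict())
```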
@Runingtime

I got the same error. Are there any workarounds?

RuntimeError: Error(s) in loading state_dict for T5ForConditionalGeneration:
size mismatch for shared.weight: copying a param with shape torch.Size([32100, 768]) from checkpoint, the shape in current model is torch.Size([32128, 768]).
size mismatch for encoder.embed_tokens.weight: copying a param with shape torch.Size([32100, 768]) from checkpoint, the shape in current model is torch.Size([32128, 768]).
size mismatch for decoder.embed_tokens.weight: copying a param with shape torch.Size([32100, 768]) from checkpoint, the shape in current model is torch.Size([32128, 768]).
size mismatch for lm_head.weight: copying a param with shape torch.Size([32100, 768]) from checkpoint, the shape in current model is torch.Size([32128, 768]).
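
Note that this trace is a slightly different mismatch: the checkpoint has 32100 embedding rows, while stock t5-base pads its vocabulary to 32128. That usually means the checkpoint was saved after calling resize_token_embeddings(len(tokenizer)), since the T5 tokenizer reports 32100 tokens. A hedged sketch of a workaround, assuming that is the cause (the checkpoint path is hypothetical):

```python
# Sketch, assuming the checkpoint was saved with a 32100-row vocabulary:
# resize the freshly built model to the same size before loading, so every
# embedding / lm_head tensor also has 32100 rows.
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")             # 32100 tokens
model = T5ForConditionalGeneration.from_pretrained("t5-base")  # vocab 32128

model.resize_token_embeddings(len(tokenizer))                  # 32128 -> 32100
state_dict = torch.load("path/to/checkpoint.pt", map_location="cpu")  # hypothetical path
model.load_state_dict(state_dict)
```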
