support qwen2 1.5b #1782

Merged: 9 commits into InternLM:main on Jun 17, 2024

Conversation

lvhan028 (Collaborator)

Qwen2-1.5B sets tie_word_embeddings=True.
In this case, the output layer and the token embedding layer share the same weight.
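For reference, a minimal PyTorch sketch of what tie_word_embeddings=True means; this is illustrative only (not lmdeploy converter code), and the sizes are assumed from the public Qwen2-1.5B config:

```python
# Illustrative only: weight tying in plain PyTorch, not lmdeploy internals.
# vocab_size / hidden_size are assumed from the public Qwen2-1.5B config.
import torch.nn as nn

vocab_size, hidden_size = 151936, 1536

embed_tokens = nn.Embedding(vocab_size, hidden_size)
lm_head = nn.Linear(hidden_size, vocab_size, bias=False)

# tie_word_embeddings=True: the output projection reuses the embedding matrix,
# so a converter must export one shared weight instead of two separate tensors.
lm_head.weight = embed_tokens.weight
assert lm_head.weight.data_ptr() == embed_tokens.weight.data_ptr()
```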

@lvhan028 lvhan028 requested review from lzhangzz and irexyc June 14, 2024 14:06
Comment on lines 172 to +173

  ret = self.Reader(new_params, unused_params,
-                   i == self.nmgrs - 1, self.model_info())
+                   i == self.nmgrs - 1, self.model_config)
Collaborator

This will affect many models, like internlm2 and internvl.

lvhan028 (Collaborator, Author) commented on Jun 14, 2024

Yes, I understand. However, I believe it is necessary to ensure that the original model configuration is accessible to all source models. Otherwise, the model_info() function should be capable of handling all edge cases.

lvhan028 (Collaborator, Author)

TODO: fully test all supported models

zhyncs (Contributor) commented on Jun 15, 2024

ref https://qwenlm.github.io/blog/qwen2/#model-information

| Models | Qwen2-0.5B | Qwen2-1.5B | Qwen2-7B | Qwen2-57B-A14B | Qwen2-72B |
|---|---|---|---|---|---|
| Params | 0.49B | 1.54B | 7.07B | 57.41B | 72.71B |
| Non-Emb Params | 0.35B | 1.31B | 5.98B | 56.32B | 70.21B |
| GQA | True | True | True | True | True |
| Tie Embedding | True | True | False | False | False |
| Context Length | 32K | 32K | 128K | 64K | 128K |

For small models, we prefer the application of tying embedding as the large sparse embeddings take up a large proportion of the total model parameters.
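As a rough sanity check of that claim (a sketch; the vocab and hidden sizes are assumed from the public Qwen2 configs, not taken from this PR):

```python
# Back-of-the-envelope: how much of each small Qwen2 model is the embedding matrix.
# vocab_size and hidden_size are assumed from the public Qwen2 configs.
vocab_size = 151936
for name, hidden_size, total in [("Qwen2-0.5B", 896, 0.49e9),
                                 ("Qwen2-1.5B", 1536, 1.54e9)]:
    emb = vocab_size * hidden_size
    print(f"{name}: embedding ~{emb / 1e9:.2f}B of {total / 1e9:.2f}B total "
          f"(~{emb / total:.0%}); untying would add a second matrix of this size")
```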

@dawnranger

any plan to support qwen2-0.5b? @lvhan028

lvhan028 merged commit 9dcae9b into InternLM:main on Jun 17, 2024
5 checks passed
lvhan028 (Collaborator, Author)

> any plan to support qwen2-0.5b? @lvhan028

Qwen2-0.5B defines head_dim=64, but the lmdeploy TurboMind engine currently requires head_dim=128.
So TurboMind does not support Qwen2-0.5B for now. You may use the PyTorch engine for the Qwen2-0.5B model.
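For example, something along these lines should work with the PyTorch engine (the model id and prompt below are illustrative, not from this PR):

```python
# Run Qwen2-0.5B with lmdeploy's PyTorch engine instead of TurboMind.
# The model id and prompt are illustrative; adjust to your own setup.
from lmdeploy import pipeline, PytorchEngineConfig

pipe = pipeline("Qwen/Qwen2-0.5B-Instruct",
                backend_config=PytorchEngineConfig())
responses = pipe(["Hello, who are you?"])
print(responses[0].text)
```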
