Bug description
When llama3.mojo is compiled in AOT mode (using 'mojo build'), it produces incorrect output, even with --no-optimization. Most importantly, the incorrect output tokens differ from run to run (most commonly, a single wrong token is repeated). In JIT mode (using 'mojo run'), no issues occur.
Steps to reproduce
For JIT mode, follow the steps in "Run inference". Here is the normal output:
$ mojo run llama3q.mojo llama3_8b_instruct_q80.bin -z tokenizer.bin -i "The planets of the solar system are" -n 128 -t 0
num parallel workers: 8 SIMD width: float32: 64 int32: 64 int8: 256
Reading weights...
header done, bytes read: 256
rms_att_weight done, bytes read: 524544
rms_ffn_weight done, bytes read: 1048832
rms_final_weight done, bytes read: 1065216
q_token_embedding_table done, bytes read: 559235328
token_embedding_table done, bytes read: 559235328
dequantize token_embedding_table done, bytes read: 559235328
wq, wk, wv, wo done, bytes read: 1985298688
w1, w2, w3 done, bytes read: 7974764800
wcls done, bytes read: 8532934912
n layers: 32 | vocab size: 128256
The planets of the solar system are the eight celestial bodies that orbit around the Sun. They are:
1. Mercury: The smallest planet in our solar system, Mercury is a rocky, barren world with a highly elliptical orbit.
2. Venus: The second planet from the Sun, Venus is a scorching hot world with a thick atmosphere that traps heat.
3. Earth: The third planet from the Sun, Earth is a terrestrial planet with a diverse range of environments and life forms.
4. Mars: The fourth planet from the Sun, Mars is a rocky, barren world with a thin atmosphere and a potential for life.
For AOT mode, build the binary with mojo build --no-optimization llama3q.mojo and run it with the same parameters:
$ mojo build --no-optimization llama3q.mojo
$ ./llama3q llama3_8b_instruct_q80.bin -z tokenizer.bin -i "The planets of the solar system are" -n 128 -t 0
num parallel workers: 8 SIMD width: float32: 64 int32: 64 int8: 256
Reading weights...
header done, bytes read: 256
rms_att_weight done, bytes read: 524544
rms_ffn_weight done, bytes read: 1048832
rms_final_weight done, bytes read: 1065216
q_token_embedding_table done, bytes read: 559235328
token_embedding_table done, bytes read: 559235328
dequantize token_embedding_table done, bytes read: 559235328
wq, wk, wv, wo done, bytes read: 1985298688
w1, w2, w3 done, bytes read: 7974764800
wcls done, bytes read: 8532934912
n layers: 32 | vocab size: 128256
The planets of the solar system areinerinerinerinerineriner
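Since the wrong tokens differ from run to run even at temperature 0 (-t 0), one way to quantify the problem is to run the same command several times and count how many distinct outputs appear. The following is only a minimal sketch, assuming a POSIX shell with mktemp, cksum and awk available; the helper name distinct_outputs is hypothetical and is not part of llama3.mojo.

```shell
#!/bin/sh
# Hypothetical helper (not part of llama3.mojo): run a command N times and
# print the number of distinct stdout outputs that were observed.
# A deterministic program should always yield 1.
distinct_outputs() {
    runs=$1
    shift
    tmpdir=$(mktemp -d)
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Capture stdout of each run in its own file; stderr is discarded.
        "$@" > "$tmpdir/out.$i" 2>/dev/null
        i=$((i + 1))
    done
    # cksum prints "checksum size filename" per file; keep checksum and size,
    # then count the unique pairs.
    cksum "$tmpdir"/out.* | awk '{print $1, $2}' | sort -u | wc -l | tr -d ' '
    rm -rf "$tmpdir"
}
```

With the binary from this report, it could be called as distinct_outputs 5 ./llama3q llama3_8b_instruct_q80.bin -z tokenizer.bin -i "The planets of the solar system are" -n 128 -t 0. A result greater than 1 would confirm the run-to-run variation. If the program also prints per-run values such as timing on stdout (an assumption, since the transcripts above do not show any), those lines would need to be filtered out before comparing.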
System information
mojo -v
modular -v
mojo build --sanitize address llama3q.mojo
OnlinePaste