1. Generate a large garbage string. In JavaScript: const content = '.'.repeat(1e6)
2. Call the /tokenize endpoint with that text about 20 times (I used the ember_v1 model).
3. Notice that the text-embeddings-inference process consumes all available CPU and RAM.
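The steps above can be sketched as a single script. This is illustrative only: the server URL and port are assumptions (adjust to your deployment), and the request body shape follows the usual { inputs: ... } JSON convention, which you should verify against your TEI version.

```javascript
// Reproduction sketch. Assumptions: a text-embeddings-inference server is
// listening on http://localhost:8080 with the ember_v1 model loaded; the URL
// and request shape are illustrative, not taken from a specific deployment.
const content = '.'.repeat(1e6); // 1,000,000-character garbage string

async function hammerTokenize(url = 'http://localhost:8080/tokenize', rounds = 20) {
  // Fire ~20 requests concurrently, each carrying the full megabyte of dots.
  const requests = Array.from({ length: rounds }, () =>
    fetch(url, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ inputs: content }),
    })
  );
  // allSettled so one failed request does not mask the others.
  return Promise.allSettled(requests);
}
```

Running this against a local instance is enough to drive CPU and memory usage to the machine's limits.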
Expected behavior
Options:
- /tokenize enforces a timeout (hardcoded, set by an env var, or passed as an argument)
- A validator sits in front of the model to reject nonsensical/nefarious inputs
- /tokenize returns a 413 when the input exceeds some bound, e.g. max_batch_tokens * 5 (or similar). This is the least preferred option, since a complete token count is a useful starting point for chunking strategies.
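As a sketch of the third option, a pre-flight length guard could look like the following. The function name, the MAX_BATCH_TOKENS value, and the *5 multiplier are all hypothetical here: the multiplier is the heuristic proposed above, not an existing TEI setting on this path.

```javascript
// Hypothetical pre-flight guard: reject oversized /tokenize payloads before
// they reach the tokenizer. MAX_BATCH_TOKENS is an example value, not TEI's
// default; the *5 character-per-token multiplier is the heuristic from the
// option above.
const MAX_BATCH_TOKENS = 16384;
const MAX_TOKENIZE_CHARS = MAX_BATCH_TOKENS * 5;

function checkTokenizePayload(inputs) {
  if (typeof inputs === 'string' && inputs.length > MAX_TOKENIZE_CHARS) {
    // Caller maps this to an HTTP 413 Payload Too Large response.
    return { ok: false, status: 413, error: 'Input too large for /tokenize' };
  }
  return { ok: true };
}
```

A guard like this would reject the 1e6-character reproduction payload immediately while still allowing normal-sized inputs through to the tokenizer.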
System Info
Apple M2 Pro, macOS 14.2.1 (23C71)
cargo 1.75.0 (1d8b05cdd 2023-11-20)