
[Feature]: Length-based pricing for Google models #4229

Closed
areibman opened this issue Jun 16, 2024 · 2 comments · Fixed by #4243
Labels: enhancement (New feature or request)

Comments

@areibman

The Feature

Some models price tokens differently based on the length of the prompt. It would be helpful to restructure the model price dictionary, or add fields to it, to account for this.

This could look something like:

```json
{
    "gemini-1.5-flash-latest": {
        "max_tokens": 8192,
        "max_input_tokens": 1000000,
        "max_input_tokens_short": 128000,
        "max_output_tokens": 8192,
        "input_cost_per_token_short": 3.5e-07,
        "input_cost_per_token_long": 7e-07,
        "output_cost_per_token_short": 1.05e-06,
        "output_cost_per_token_long": 2.1e-06,
        "litellm_provider": "vertex_ai-language-models",
        "mode": "chat",
        "supports_function_calling": true,
        "supports_vision": true,
        "source": "https://ai.google.dev/pricing"
    }
}
```
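
For context, a minimal sketch of how a cost function could consume these fields. This is an illustration, not LiteLLM's actual cost-tracking code; the `completion_cost` name and the trimmed dictionary are assumptions for the example, and it assumes (per Google's pricing page) that the whole request is billed at the long rates once the prompt crosses the short-context limit:

```python
# Sketch: pick the input/output rates based on whether the prompt exceeds
# the short-context limit, then bill the whole request at that tier.
MODEL_PRICES = {
    "gemini-1.5-flash-latest": {
        "max_input_tokens_short": 128_000,
        "input_cost_per_token_short": 3.5e-07,
        "input_cost_per_token_long": 7e-07,
        "output_cost_per_token_short": 1.05e-06,
        "output_cost_per_token_long": 2.1e-06,
    }
}

def completion_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return USD cost for one request, switching tiers above the short limit."""
    p = MODEL_PRICES[model]
    tier = "long" if input_tokens > p["max_input_tokens_short"] else "short"
    return (input_tokens * p[f"input_cost_per_token_{tier}"]
            + output_tokens * p[f"output_cost_per_token_{tier}"])

# A 200k-token prompt is billed entirely at the long rates:
print(completion_cost("gemini-1.5-flash-latest", 200_000, 1_000))  # 0.1421
```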

Motivation, pitch

Google models price tokens differently for prompts >128k tokens. According to https://ai.google.dev/pricing: [screenshot of the Gemini pricing tiers]

This was brought up in AgentOps-AI/tokencost#53, which relies on the LiteLLM cost tracker.

Twitter / LinkedIn details

https://www.twitter.com/alexreibman

@krrishdholakia (Contributor)

We can be more specific here @areibman - since it's not yet standard what 'long' and 'short' mean.

This seems similar to how tgai pricing works based on token params - what if we do `input_cost_per_token_up_to_128k` and `input_cost_per_token_above_128k`?
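
For illustration, that naming could look like this in the price map. This is a hypothetical sketch reusing the values from the example entry above, not a merged schema; spelling the 128k threshold out in the key removes any ambiguity about where 'long' starts:

```python
# Hypothetical key naming per the suggestion above; values copied from the
# issue's example entry, not from LiteLLM's shipped price map.
gemini_entry = {
    "input_cost_per_token_up_to_128k": 3.5e-07,
    "input_cost_per_token_above_128k": 7e-07,
    "output_cost_per_token_up_to_128k": 1.05e-06,
    "output_cost_per_token_above_128k": 2.1e-06,
}
```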

@areibman (Author)

> We can be more specific here @areibman - since it's not yet standard what 'long' and 'short' mean.
>
> This seems similar to how tgai pricing works based on token params - what if we do `input_cost_per_token_up_to_128k` and `input_cost_per_token_above_128k`?

That would probably work! The only precaution I can think of is if some providers start offering multi-tier pricing per model, e.g. <128k, 128k-256k, 256k-512k, etc.

This should be as easy as updating the proxy's JSON, no?
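
If providers do introduce more than two tiers, flat per-tier keys stop scaling. A hedged sketch of a more general representation follows; the `TIERS` list and `pick_rate` helper are hypothetical, not part of LiteLLM's schema, and the rates beyond the first tier are made up for illustration:

```python
# Each tier: (largest prompt size billed at this rate, USD per input token).
# float("inf") marks the open-ended top tier.
TIERS = [
    (128_000, 3.5e-07),
    (256_000, 7e-07),          # hypothetical middle tier
    (float("inf"), 1.4e-06),   # hypothetical top tier
]

def pick_rate(input_tokens: int, tiers: list[tuple[float, float]]) -> float:
    """Return the per-token rate for the tier the prompt length falls into."""
    for limit, rate in tiers:
        if input_tokens <= limit:
            return rate
    raise ValueError("tiers must end with an open-ended entry")

assert pick_rate(100_000, TIERS) == 3.5e-07
assert pick_rate(300_000, TIERS) == 1.4e-06
```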
