Docs request: where does histogram come from? #738

jamesbraza · 2024-04-30T06:58:26Z

My config has one Python assertion with three possible scores (0, 0.1, and 1), plus two basic assertions:

providers:
  - openai:chat:gpt-4-0613
  - openai:chat:gpt-4-turbo-2024-04-09
  - anthropic:messages:claude-3-sonnet-20240229
defaultTest:
  assert:
    - description: was answered
      type: not-icontains
      value: cannot answer
    - description: has sentences
      type: javascript
      value: output.length > 20
    - description: check value
      type: python
      value: file://assert.py

At the top of my promptfoo view, I see histogram bins around 0.6 and 0.7, which doesn't quite make sense to me.

The request: can we add a short description so that this figure is easy to understand?

I have three different model providers. Is that where Prompt 1 (red), Prompt 2 (blue), and Prompt 3 (green) come from?
Why does the histogram show scores of 0.6 and 0.7? Is that like a sum of multiple assertions' scores?

jamesbraza · 2024-04-30T18:18:00Z

I now understand that I have three assertions:

Two binary ones: score can be 0 or 1
One custom assertion: score can be 0, 0.1, or 1

I realized the histogram plots the mean score across assertions: 0.7 = (1 + 1 + 0.1) / 3
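
For anyone else landing here, the arithmetic above can be sketched in a few lines of Python. This is only an illustration of an unweighted mean over the three assertions in my config, not promptfoo's actual implementation; the names `mean_score` and `possible_means` are made up for this example, and it ignores any per-assertion weighting. My guess (an assumption, not something I verified) is that the 0.6 bin corresponds to a mean of about 0.667, i.e. both binary assertions passing and the custom one scoring 0.

```python
from itertools import product

# Possible scores per assertion, taken from the config in this issue.
BINARY_SCORES = (0.0, 1.0)        # not-icontains and javascript assertions
CUSTOM_SCORES = (0.0, 0.1, 1.0)   # python assertion (assert.py)


def mean_score(scores):
    """Unweighted mean of per-assertion scores (hypothetical helper)."""
    return sum(scores) / len(scores)


def possible_means():
    """Every distinct mean reachable with two binary and one custom assertion."""
    combos = product(BINARY_SCORES, BINARY_SCORES, CUSTOM_SCORES)
    return sorted({round(mean_score(combo), 3) for combo in combos})


if __name__ == "__main__":
    # Both binary assertions pass, custom assertion scores 0.1.
    print(round(mean_score([1, 1, 0.1]), 3))
    # All means that could show up on the histogram's x-axis.
    print(possible_means())
```

With only three assertions, the set of reachable means is small, which is why the histogram shows just a few populated bins.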

That being said, I still think promptfoo could add a small info bubble or hover-over tooltip that explains this.

Feel free to close this out if uninterested.
