feat: update evaluation flow sample for abstractive summarization with g-eval method to enable GPT-4-Turbo #3317

Open
wants to merge 7 commits into
base: main


fujikosu
Member

Description

This PR updates an evaluation flow example that was introduced by #2037. Previously, the example supported only GPT-4, because GPT-4-Turbo performed poorly with the previous approach. This update adds GPT-4-Turbo as an evaluator, meta-evaluates it, and changes the implementation from a sampling-based approach to a weighted average over probabilities. According to the meta-evaluation results, the new implementation outperforms the previous one. In addition, the new approach reduces the estimated cost of evaluation from $6.19 to $1.32 per 100 documents.

The previous approach is kept under the sampling_based directory to provide backward compatibility with the GPT-4 evaluator and to serve as a reference for meta-evaluation.
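
To illustrate the idea behind a weighted average over probabilities, here is a minimal sketch. It is not the code from this PR: the function name `weighted_score`, the 1 to 5 score range, and the input shape (a mapping from candidate tokens to log-probabilities at the position where the evaluator emits its score) are all assumptions made for illustration.

```python
import math
from typing import Mapping


def weighted_score(
    top_logprobs: Mapping[str, float],
    min_score: int = 1,
    max_score: int = 5,
) -> float:
    """Hypothetical helper: expected score from candidate-token log-probabilities.

    Assumes `top_logprobs` maps each candidate token at the score position
    to its log-probability. Tokens that are not valid scores are ignored,
    and the remaining probability mass is renormalized.
    """
    valid = {str(s): s for s in range(min_score, max_score + 1)}
    probs: dict[int, float] = {}
    for token, logprob in top_logprobs.items():
        score = valid.get(token.strip())
        if score is not None:
            # Merge variants such as "4" and " 4" into one score bucket.
            probs[score] = probs.get(score, 0.0) + math.exp(logprob)

    total = sum(probs.values())
    if total == 0.0:
        raise ValueError("no valid score token among the candidates")

    # Probability-weighted average of the valid scores.
    return sum(score * p for score, p in probs.items()) / total


# Illustrative input only; these numbers are made up, not results from the PR.
example = {
    "4": math.log(0.6),
    "5": math.log(0.3),
    "3": math.log(0.1),
    "The": -10.0,  # non-score token, ignored
}
print(round(weighted_score(example), 3))
```

A sampling-based evaluator would instead request many completions and average the parsed scores. Reading the probabilities from a single completion is one plausible reason for a lower estimated cost, though the PR description does not break the cost down.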

All Promptflow Contribution checklist:

  • The pull request does not introduce [breaking changes].
  • CHANGELOG is updated for new features, bug fixes or other significant changes.
  • I have read the contribution guidelines.
  • Create an issue and link to the pull request to get dedicated review from promptflow team. Learn more: suggested workflow.

General Guidelines and Best Practices

  • Title of the pull request is clear and informative.
  • There are a small number of commits, each of which has an informative message. This means that previously merged commits do not appear in the history of the PR. For more information on cleaning up the commits in your PR, see this page.

Testing Guidelines

  • Pull request includes test coverage for the included changes.

@fujikosu fujikosu requested a review from a team as a code owner May 21, 2024 09:24
@github-actions github-actions bot added the examples Improvements on examples label May 21, 2024

github-actions bot commented Jun 4, 2024

Hi, thank you for your interest in helping to improve the prompt flow experience and for your contribution. We've noticed that there hasn't been recent engagement on this pull request. If this is still an active work stream, please let us know by pushing some changes or leaving a comment.

@github-actions github-actions bot added the no-recent-activity There has been no recent activity on this issue/pull request label Jun 4, 2024
@github-actions github-actions bot removed the no-recent-activity There has been no recent activity on this issue/pull request label Jun 7, 2024
Labels
examples Improvements on examples

1 participant