
Ensure 'name' on initial message #2635

Status: Open. Wants to merge 3 commits into main.
Conversation

@marklysze (Collaborator) commented May 9, 2024

Why are these changes needed?

When initiating a chat through ConversableAgent's initiate_chat, the passed-in message doesn't get the name of the initiating agent attached to it. The name is therefore not passed through to the LLM, which cannot use that information.

So, running this block of code:

```python
cathy = ConversableAgent(
    "cathy",
    system_message="Your name is Cathy and you are a part of a duo of comedians.",
    llm_config={"config_list": config_list, "cache_seed": None},
    human_input_mode="NEVER",  # Never ask for human input.
)

joe = ConversableAgent(
    "joe",
    system_message="Your name is Joe and you are a part of a duo of comedians.",
    llm_config={"config_list": config_list, "cache_seed": None},
    human_input_mode="NEVER",  # Never ask for human input.
)

result = joe.initiate_chat(cathy, message="Tell me a joke and include both of our names in it!", max_turns=2)
```

will result in a list of messages to the LLM like this:

```python
[
    {'content': 'Your name is Cathy and you are a part of a duo of comedians.', 'role': 'system'},
    {'content': 'Tell me a joke and include both of our names in it!', 'role': 'user'}
]
```

I would expect there to be a name key, with the value joe, attached to the second message as that agent initiated the chat.

The response from the LLM is:
'Why did Cathy and her comedy partner decide to open a bakery together?\n\nBecause they realized they "dough"n\'t just make people laugh, they can also make them delicious pastries!'

... and it can be seen that it doesn't reference Joe's name.

By including the name key/value in the messages, like this:

```python
[
    {'content': 'Your name is Cathy and you are a part of a duo of comedians.', 'role': 'system'},
    {'content': 'Tell me a joke and include both of our names in it!', 'role': 'user', 'name': 'joe'}
]
```

... the LLM can use the name and returns this:
'Why did Cathy and Joe go to the comedy club?\n\nBecause they heard it was a great place for two funny people to crack jokes and make people laugh!'

... indicating that the name field is utilised (and LLMs aren't great at jokes!).

To fix this, ConversableAgent's _append_oai_message has been updated to add the name field if it doesn't already exist and the message is not a function/tool message. To support this addition, an is_sending parameter on the function indicates whether the self agent or the conversation_id agent is the sender and, hence, whose name to attach to the message.
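As a rough sketch (a hypothetical standalone helper, not the actual _append_oai_message implementation), the name-attaching logic amounts to:

```python
def attach_sender_name(message: dict, sender_name: str) -> dict:
    """Return a copy of message with 'name' set to sender_name, unless the
    message already carries a name or is a function/tool message.
    (Illustrative only; the real change lives inside ConversableAgent.)"""
    msg = dict(message)  # avoid mutating the caller's dict
    is_tool_msg = (
        "function_call" in msg
        or "tool_calls" in msg
        or msg.get("role") in ("function", "tool")
    )
    if "name" not in msg and not is_tool_msg:
        msg["name"] = sender_name
    return msg

# The initiating agent's name gets stamped onto the initial message:
initial = {"content": "Tell me a joke and include both of our names in it!", "role": "user"}
stamped = attach_sender_name(initial, "joe")
```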

I have not added this to documentation or updated tests for this. Please let me know if it needs specific tests added.

Related issue number

No related issue.

Checks

@marklysze (Collaborator, Author):

I've updated existing test cases that now show the name on the messages.

@ekzhu (Collaborator) commented May 10, 2024

Thanks! I think this addition is useful! Have you tested on non-OpenAI but OpenAI-compatible endpoints? E.g., LiteLLM + Ollama.

@marklysze (Collaborator, Author):

> Thanks! I think this addition is useful! Have you tested on non-OpenAI but OpenAI-compatible endpoints? E.g., LiteLLM + Ollama.

I haven't yet. I've been trying to clarify with the LiteLLM devs, and they don't think Ollama takes it in. I will run this test program and see if it does utilise name. If not, then we definitely need to consider getting the name into the content, either through message transforms or some other way.

@ekzhu added the llm label ("issues related to LLM") May 11, 2024
@marklysze (Collaborator, Author) commented May 11, 2024

Okay, testing with LiteLLM + Ollama...

Name included in messages:

```python
[
    {'content': 'Your name is Cathy and you are a part of a duo of comedians.', 'role': 'system'},
    {'content': 'Tell me a joke and include both of our names in it!', 'role': 'user', 'name': 'joe'}
]
```

Output:

joe (to cathy):

Tell me a joke and include both of our names in it!

--------------------------------------------------------------------------------
cathy (to joe):

Here's one:

Why did Cathy and I go to the doctor?

Because we were feeling a little "off-beat"! Get it? Off-beat, like a comedy duo that's always trying to be funny!

Your turn, can you come up with a joke about us too?

So, no name included.


Tried again with a more direct question.

joe (to cathy):

Tell me my name.

--------------------------------------------------------------------------------
cathy (to joe):

Your name is Cathy, and you're one half of the hilarious comedy duo, along with your partner in crime (and laughter)!

Again, no name included and it also referred to Joe as Cathy. I noticed this on a couple of different message sets.


Now, will try with adding the name directly into the content.

```python
[
    {'content': 'You are a person named Cathy.', 'role': 'system'},
    {'content': 'joe said:\nTell me my name.', 'role': 'user', 'name': 'joe'}
]
```

joe (to cathy):

joe said:
Tell me my name.

--------------------------------------------------------------------------------
cathy (to joe):

Your name is Cathy.

--------------------------------------------------------------------------------

So, adding joe said: in the message didn't help.


Now, adding a more direct system message about there being multiple people in the conversation.

```python
[
    {'content': 'You are a person named Cathy. You are in a conversation with other people.', 'role': 'system'},
    {'content': 'joe said:\nTell me my name.', 'role': 'user', 'name': 'joe'}
]
```

joe (to cathy):

joe said:
Tell me my name.

--------------------------------------------------------------------------------
cathy (to joe):

Joe, your friend asked you to tell him his own name? I think there might be some confusion here! But if Joe really wants to know, I can tell him that his name is... Joe!

--------------------------------------------------------------------------------

This shows improvement, but it shows that the cathy agent thinks Joe was asking someone else.


So, we'll try a different set of messages to try and get them to have a conversation with each other. In this one we put in who is speaking to who.

```python
[
    {'content': 'You are a person named Cathy. You are in a conversation with other people.', 'role': 'system'},
    {'content': 'Joe says to Cathy:\nTell me my name.', 'role': 'user', 'name': 'joe'}
]
```

And by adding Joe says to Cathy: the result is

joe (to cathy):

Joe says to Cathy:
Tell me my name.

--------------------------------------------------------------------------------
cathy (to joe):

Hi Joe! Your name is Joe, right?

--------------------------------------------------------------------------------
joe (to cathy):

That's correct! My name is indeed Joe. Thanks for reminding me!

--------------------------------------------------------------------------------
cathy (to joe):

You're welcome, Joe! I'm glad I could help you remember your own name.

--------------------------------------------------------------------------------

This looks much better: they act as their namesakes and feel like they are interacting with each other.

I also ran this again with the system message reduced back to You are a person named Cathy. and got similar output; it felt like a conversation with each other and they replied with the name Joe.

Okay, so I think we can make the following observations for LiteLLM + Ollama and likely other non-OpenAI setups:

  1. The name field isn't assisting the LLM in knowing who is speaking.
  2. Without the name field, the names and, I believe, the conversation direction need to be in the content field, such as Joe said to Cathy: or Joe said to you:.

I'll continue to try different system messages and content adjustments, but I think we would need to be able to transform messages and be able to put in the sender name and the recipient name.
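A minimal transform along those lines might look like the following (illustrative only; the function name and shape are hypothetical, not an existing AutoGen API):

```python
def inject_speaker_direction(messages: list[dict], recipient_name: str) -> list[dict]:
    """Prepend '<Sender> says to <Recipient>:' to each non-system message so
    that models which ignore the 'name' field still see who is addressing whom.
    Sketch only; assumes plain OpenAI-style message dicts."""
    transformed = []
    for msg in messages:
        msg = dict(msg)  # copy so the original history is untouched
        sender = msg.get("name")
        if msg.get("role") != "system" and sender and msg.get("content"):
            msg["content"] = (
                f"{sender.capitalize()} says to {recipient_name.capitalize()}:\n"
                f"{msg['content']}"
            )
        transformed.append(msg)
    return transformed

# Example: the 'Tell me my name.' exchange from above, addressed to cathy
out = inject_speaker_direction(
    [
        {'content': 'You are a person named Cathy.', 'role': 'system'},
        {'content': 'Tell me my name.', 'role': 'user', 'name': 'joe'},
    ],
    'cathy',
)
```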

Here's the joke one again with Joe says to Cathy: added:

joe (to cathy):

Joe says to Cathy:
Tell me a joke and include both of our names in it!

--------------------------------------------------------------------------------
cathy (to joe):

Cathy: Okay, Joe! Here's one:

Why did I, Cathy, bring my comedy partner, Joe, to the doctor?

Because we were feeling a little "off-beat" and wanted to get to the root of our comedic problems! (get it? off-beat, like in music, but also we're comedians... haha)

How was that, Joe? Did I make you laugh?

--------------------------------------------------------------------------------
joe (to cathy):

Joe: Ahahahaha! Oh man, Cathy, that's a great one! You really "punched" me with that joke! I love how you incorporated our names and the whole off-beat music pun. You're a comedic genius!

By the way, want to come up with another one? Maybe something about our latest comedy routine or our crazy fans?

--------------------------------------------------------------------------------
cathy (to joe):

Cathy: Ahahahaha, thanks Joe! I'm glad I could "hit the right note" with that one!

Oh, absolutely! Let's do it! What about this: Why did we, Cathy and Joe, have to kick our crazy fan out of our comedy club?

(Wait for it...)

Because she was trying to "steal the show"... by stealing all the props and improvising her own routine on stage!

Haha, what do you think? Did I "bring the house down" with that one?

--------------------------------------------------------------------------------

@ekzhu (Collaborator) commented May 11, 2024

Thanks for the update! This is indeed very interesting. It looks like adding the name field doesn't break the setup, so that's good.

For endpoints that bark at the name field, we can build a translation layer through model clients.

@ekzhu (Collaborator) commented May 13, 2024

Does the name field also play nicely with LM Studio and the Mistral AI API? Probably not used, but hopefully not breaking.

@marklysze (Collaborator, Author):

> Does the name field also plays nice with LM Studio and Mistral AI API -- probably not used, but not breaking.

In LM Studio it's accepted, but it's not being used.

For Mistral AI API, I'll check that.

On a side note with LM Studio: it won't accept messages where content is blank, which does happen sometimes with local LLMs; I'm investigating that.

@marklysze (Collaborator, Author):

> Does the name field also plays nice with LM Studio and Mistral AI API -- probably not used, but not breaking.

@ekzhu, I've tested the Mistral models through Together.ai and the "name" field is being accepted (not breaking), but it isn't being used. The only way to get the name known is to inject it into the content itself.

However, testing it through Mistral.ai's API, it did fail when any message had the name key.

Through this testing, it seems to me that having the name key on the message isn't useful for non-OpenAI scenarios. Was there any plan, or are there ideas, for being able to have the name key removed? It seems like a broad requirement.

@ekzhu (Collaborator) commented May 21, 2024

> Through the testing, to me it seems that having the name key on the message isn't useful for non-OpenAI scenarios. Was there any plan or ideas on being able to have the name key removed? It seems like a broad requirement.

I feel one way to do this is to have a built-in client for each API endpoint, similar to AzureOpenAI, so the user can specify the API type using api_style.
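Whatever form such a client takes, the translation step itself is small; here is a sketch (hypothetical helper, assuming plain OpenAI-style message dicts) of stripping the name key before forwarding to an endpoint that rejects it:

```python
def strip_name_field(messages: list[dict]) -> list[dict]:
    """Drop the 'name' key before sending to endpoints (e.g. Mistral.ai's API,
    which fails on it per the testing above); other keys pass through unchanged.
    Illustrative sketch, not an AutoGen API."""
    return [{k: v for k, v in m.items() if k != "name"} for m in messages]

cleaned = strip_name_field(
    [{'role': 'user', 'content': 'Tell me my name.', 'name': 'joe'}]
)
```

A per-endpoint client could apply this (or the content-injection transform) in its create call, keeping the name field available for endpoints that do support it.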

@marklysze (Collaborator, Author):

@ekzhu, I've created a PR #2748 regarding Mistral.AI.

Separately, the changes in this PR would be good to have reviewed, as the code currently does not add name to the initial message created through initiate_chat.

@ekzhu (Collaborator) left a review comment:

Thanks! The analysis of alternative APIs is super helpful. We can merge this once #2748 is addressed, so that the changes in this PR are not going to break two-agent chats for the Mistral AI API.

@marklysze (Collaborator, Author):

Included "name" on messages in the select speaker nested chat, as per this comment.
