Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sync history to stream #2640

Draft
wants to merge 4 commits into
base: main
Choose a base branch
from
Draft

Sync history to stream #2640

wants to merge 4 commits into from

Conversation

enyst
Copy link
Collaborator

@enyst enyst commented Jun 26, 2024

In a recent PR #2595 we have fixed an issue with errors when the budget check is done in the controller, and it seems that now the flow isn't quite as expected: an event is added to history, a pending action may be stored (waiting for an obs), then the budget check is done and if it's not right (error) then we return, before adding that event to the stream.

I think that if an event is in history it should be because it's in the stream. Otherwise, I don't know, it's... a hallucination. 😅 This PR proposes to do it the other way around, add the event, then check budget or what else we need to check.

In addition, adding it will create the observation if suitable, and thus complete the step. Also, the pending action doesn't seem reset in any other way than doing this, currently. An error afterwards will still stop the next step.

@enyst enyst requested a review from li-boxuan June 26, 2024 00:16
@@ -319,13 +319,13 @@ async def _step(self):
else:
await self.add_history(action, NullObservation(''))

if not isinstance(action, NullAction):
await self.event_stream.add_event(action, EventSource.AGENT)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would cause some trouble for delegation at least. When you add an event to the stream, its callback is called. In delegation scenario, the control is then handed over to child agent (controller). Even if we just exceeded the budget during this step, the child agent would still continue because we haven't computed and checked budget yet.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this scenario you're thinking of at the beginning of delegation, so when the event is a delegate action?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, when event is AgentDelegateAction

Copy link
Collaborator Author

@enyst enyst Jun 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, we could stop it at some point ( perhaps on_event or follow up), though I'm not sure it will be pretty. Or when the controller is about to start and initializes the budget (oops error). I guess we could even, if it seems better, set the agent to ERROR, then add the event; to check for that state and stop the process. I'm not anymore at the computer, will give it some thought. If we can't, IMHO an alternative is to cancel, that is, this action "doesn't exist". I still don't see it in history unless it's in the stream...

(On a side note, the funny thing here is that I literally couldn't merge the current logic. Because histories are copies of the event stream in main, and the copies will vanish when we refactor to have the event stream as the source of truth. There's no 'space' between add_history and add_event anymore 😅 )

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the worst case, we could merge this PR as it is, and don't care about the budget thing I mentioned. It won't be ideal but not too bad.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would agree, but maybe the simplest is to just move the budget check before the call to the agent step. It seems like it would work, to stop the next step... Will check.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but maybe the simplest is to just move the budget check before the call to the agent step

Oh yeah sounds like a plan.

@enyst
Copy link
Collaborator Author

enyst commented Jun 27, 2024

@li-boxuan I think we can't merge this, it doesn't work with delegates errors.

First, the issue we talked about is on main too, and there's a bit more, the behavior I see is like:

  • it doesn't seem to reset the task/state, so the accumulated cost continues to apply. On main, one more action over budget goes through every step, it just happens starting from the next step, not the one with the error
  • it blocks the ability of the user to ask for some other task, until they refresh or press stop (wiping history); even though the sandbox was restarted after the error, so the controller was re-initialized. (the reason for the latter might be that it's saved in state, and the state is restored implicitly if it can)

On main, with 0.1 budget per task:

Image 2024-06-26 at 23 34

23:31:42 - opendevin:INFO: llm.py:239 - Cost: 0.02 USD | Accumulated Cost: 0.13 USD
23:31:42 - ACTION
**IPythonRunCellAction**
THOUGHT: I will first print the number 7, and then I will find out the current Vice President of the USA.
CODE:
print(7)
23:31:42 - OBSERVATION
ErrorObservation(content='Task budget exceeded. Current cost: 0.13, Max budget: 0.10', observation='error')
23:31:42 - opendevin:INFO: agent_controller.py:196 - [Agent Controller 65d1bfe0-16c7-4690-a534-5f8928d2163b] Setting agent(CodeActAgent) state from AgentState.RUNNING to AgentState.ERROR

But on the PR branch the delegates errors just don't work at all:

  • when it gets the error, it gets "frozen". I want to say what happens has to do with the timing of the async. I think it leads to some deadlock where this error code got executed, the delegate got closed, but the parent still keeps a reference to it (not sure why we don't set it to None), so the parent still forwards to it here. The delegate is in STOPPED state (iirc) here, in which case this just goes back looping. So the two controllers just invite each other to step up and no one dares. 😅
  • it's not clear to me what exactly is missing there. Maybe we could consider ERROR state a sub-case of delegate_done along with FINISHED and REJECTED and try to finish it off in that block? After all, the delegate really won't continue, I suppose.

Either way, I think we can't merge this, because it breaks delegates errors worse than minor issues on main. FWIW I played a bit with it at the end of the step, and at the start of the step, same freeze. Gonna have to try to see if I broke it too when I merged... 😅

@li-boxuan
Copy link
Collaborator

not sure why we don't set it to None

I think we could - would that solve the problem?

@neubig neubig marked this pull request as draft June 27, 2024 08:17
Copy link
Contributor

@neubig neubig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I converted this to a draft until we've resolved the issue with delegation!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants