Add GPTSwarm (Graph-based Workflow) #2460
Just about your last question, have you looked at Perplexity.ai? They also have online search, an API, etc.
Can you explain more about this? What do you mean by "the agentskill function currently returns nothing"?
For example, which file readers are needed?
Will take a look at this, but I may check your paper first. I find you didn't implement the
Actually, I am wondering: if agent skills and web navigation are currently not well done, what does OpenDevin actually contribute here? I am surprised that OpenDevin can achieve that performance given the problems above 🤔
Will help soon, when I am free.
@mczhuge If you have no time in the future, maybe I will push directly to your PR. Or I can only add some review comments. WDYT?
This looks interesting! I hope we can get this agent incorporated.
However, I am a bit worried about all the dependencies that we would inherit from gptswarm:
https://github.com/metauto-ai/GPTSwarm/blob/ac058b80158027da8ffad8ed0bd464dc952eaa49/pyproject.toml#L23-L66
Maybe we can separate out gptswarm into a different group in Poetry that is optionally installed only if we want to use gptswarm.
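If gptswarm does become an optional group, the agent code would need to fail gracefully when the package is absent. A minimal sketch of such a guard follows. The helper name `require_optional` and the group name passed to it are hypothetical, and the module name GPTSwarm actually installs under is not confirmed here.

```python
import importlib
import importlib.util


def require_optional(module_name: str, extra: str):
    """Import a top-level module, or raise an error naming the optional group.

    `module_name` and `extra` are illustrative; pass whatever top-level
    module and dependency group the project ends up using.
    """
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(
            f"'{module_name}' is not installed; install the optional "
            f"'{extra}' dependency group to use this agent."
        )
    return importlib.import_module(module_name)
```

The agent module could call this at the point where it first needs the package, so that users who never select the agent are not forced to install its dependencies.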
@mczhuge or @yufansong, would either of you be able to see this to completion now?
Thanks, is there a simple code demo I can use?
I rewrote the comments a bit: #2460 (comment)
Regarding the agentskill return value: I think the mechanism of OD is to output information in the terminal, and then CodeActAgent gets that information as an observation from the terminal. Right? Currently, I do not follow this and just want to get a return value directly from a function. However, I think OD GPTSwarm could also follow the CodeActAgent observation style.
Regarding file readers: In the original GPTSwarm, many different readers are used, but in #1914 and the following #2016, some of them were deleted. This likely affects more than the 2-5 potentially solvable questions, making them unsolvable in GAIA.
Regarding the agent design: The current OD GPTSwarm follows a predefined human SOP or workflow in a graph-based design. I think OD GPTSwarm could fully follow the agenthub style (it may not be hard), but I am personally not 100% familiar with this, so I think help is needed here.
Regarding web navigation: I think the current web navigation is good but specific to some benchmarks. For GAIA or many applications, just returning compressed but informative observations is enough.
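To make "compressed but informative observations" concrete, here is one very simple way to bound the size of a web observation: collapse whitespace and keep only the head and tail of the text. This is only an illustrative sketch. The function name, the marker string, and the default limit are assumptions, and it is not how OpenDevin or GPTSwarm actually compress browser output.

```python
def compress_observation(text: str, max_chars: int = 2000) -> str:
    """Collapse whitespace and keep the head and tail of a long observation."""
    collapsed = " ".join(text.split())
    if len(collapsed) <= max_chars:
        return collapsed

    marker = " ...[truncated]... "
    keep = max_chars - len(marker)
    if keep < 2:
        # Limit is too small to fit the marker; fall back to a plain cut.
        return collapsed[:max_chars]

    head = keep // 2 + keep % 2
    tail = keep // 2
    return collapsed[:head] + marker + collapsed[-tail:]
```

A smarter version would summarize or select relevant passages, but even a hard cap like this keeps a single page from filling the context window.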
Of course you can!! 🥰 The first several days as an intern may require some time for onboarding and settling in. I also have time to discuss more and may also contribute some code. If you could help push this PR forward, that would be super great! Two potential new features may come to OpenDevin if we finish this PR:
Hi @neubig, some of them have already been merged into OD. See #1914 and #2016. Currently, we only need to improve the usage.
There is an actual Python package that would make it easier. Their sample code on that page reads like this:
    client = PerplexityClient()
    print(client.query('What is the meaning of 42?'))
    for result in client.queryStreamable('List of all US presidents'):
        print(result)
Hope this helps. 😃
If @mczhuge has no time, I am glad to take this task.
Great Yufan! Please go ahead, I am in intern orientation.
NOTE: GPTSwarm need manually export...
TODO: Fix it
export SANDBOX_ENV_OPENAI_API_KEY="sk-***"
"""
OPENAI_API_KEY = os.getenv(
    'OPENAI_API_KEY', os.getenv('SANDBOX_ENV_OPENAI_API_KEY', '')
Within the sandbox, this would be exported again without the prefix, i.e. as "OPENAI_API_KEY".
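The lookup in the excerpt above can be written as a small helper that prefers the unprefixed name and falls back to the prefixed one. This is a sketch only. The helper name `resolve_env` and the `env` parameter are hypothetical, added so the behaviour can be checked without touching the real environment.

```python
import os
from typing import Mapping, Optional

SANDBOX_PREFIX = "SANDBOX_ENV_"


def resolve_env(name: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return env[name], falling back to the SANDBOX_ENV_-prefixed variable.

    Mirrors the nested os.getenv lookup in the excerpt: the unprefixed
    variable wins whenever it is set, and the result defaults to ''.
    """
    env = os.environ if env is None else env
    return env.get(name, env.get(SANDBOX_PREFIX + name, ""))
```

If the sandbox already re-exports the variable without the prefix, as noted above, the fallback branch only matters when the code runs outside the sandbox.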
Sorry for the late response. I have been on vacation for the last 10 days.
I have merged GPTSwarm (ICML 2024 Oral Presentation) into this codebase. More details can be found in the Paper and the Code.
Despite this progress, there are five areas that require further enhancement to improve users' experiences with OpenDevin:
1. agentskill Functionality: The agentskill function currently returns nothing. This issue arises because OpenDevin receives observations from the terminal. GAIA requires numerous file readers to retrieve return information. Consequently, the current OD GPTSwarm v1.0 is impacted. For instance, the original GPTSwarm can easily achieve 32.07% in GAIA, but due to the lack of multiple file readers, the best performance of OD GPTSwarm is also 32.07%.
2. Stateless to Stateful Transition: The original GPTSwarm is stateless, whereas the current OD GPTSwarm could be improved by making it stateful. This improvement requires collaboration with others. It is possible that Q1 could be resolved by this transition, as OD receives observations from terminal output rather than a definitive return from a function. Does this assessment seem accurate?
3. Web Navigation Skills: My initial use of web navigation skills resulted in very long context returns, which negatively impacted GAIA's performance. Consequently, I refrained from using it. However, as stated in the original paper and our previous paper (which also emphasized the importance of web navigation for performance improvement but did not include it), web navigation can significantly enhance performance.
4. Web Searching Solutions: In line with the original GPTSwarm paper, I spent $40 to acquire searchapi for web searching. Could you recommend any free and high-quality web searching alternatives?
5. Autonomous Pattern and SOP Integration: The current agent operates in a fully autonomous manner, but at times, an SOP or a human-predefined workflow (similar to MetaGPT) proves beneficial. For example, CodeActAgent resolves a GAIA problem using $5, whereas GPTSwarm solved 53 problems for under $5. Further improvements to OD GPTSwarm could enable OpenDevin to incorporate graph-based SOP and related functionalities.
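On the first point in the list above (skills that print to the terminal but return nothing), one possible bridge is a decorator that captures what a skill prints and also hands it back as the return value. The sketch below is hypothetical. `returns_observation` and `numbered_lines` are made-up names, not part of OpenDevin's agentskills.

```python
import contextlib
import io
from functools import wraps


def returns_observation(skill):
    """Make a print-only skill also return what it printed."""

    @wraps(skill)
    def wrapper(*args, **kwargs):
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            result = skill(*args, **kwargs)
        printed = buffer.getvalue()
        # Re-emit the text so terminal-based observation still works.
        print(printed, end="")
        # Prefer a real return value; otherwise fall back to the printed text.
        return result if result is not None else printed

    return wrapper


@returns_observation
def numbered_lines(text: str) -> None:
    """Hypothetical print-only skill: show text with line numbers."""
    for number, line in enumerate(text.splitlines(), start=1):
        print(f"{number}|{line}")
```

With this wrapper, a caller that reads the terminal and a caller that wants a function result would both receive the same content.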
I think improving OD GPTSwarm further could give OpenDevin graph-based SOPs and related functions.
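For readers unfamiliar with the idea, a graph-based SOP can be pictured as a set of named steps with dependencies, run in topological order, where each step sees the outputs of its predecessors. The following is a generic sketch using the standard library (Python 3.9+). It is not GPTSwarm's API, and `run_workflow` and the step names are invented for illustration.

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, List

Step = Callable[[str, Dict[str, str]], str]


def run_workflow(
    nodes: Dict[str, Step], edges: Dict[str, List[str]], task: str
) -> Dict[str, str]:
    """Run each step after its predecessors and collect all outputs.

    `edges` maps a step name to the names of the steps it depends on.
    """
    graph = {name: edges.get(name, []) for name in nodes}
    outputs: Dict[str, str] = {}
    for name in TopologicalSorter(graph).static_order():
        inputs = {dep: outputs[dep] for dep in edges.get(name, [])}
        outputs[name] = nodes[name](task, inputs)
    return outputs


# Toy steps standing in for LLM or tool calls.
nodes = {
    "search": lambda task, inputs: f"results for {task}",
    "read": lambda task, inputs: inputs["search"].upper(),
    "answer": lambda task, inputs: f"{inputs['read']} | {inputs['search']}",
}
edges = {"read": ["search"], "answer": ["search", "read"]}
outputs = run_workflow(nodes, edges, "q")
```

A predefined workflow of this shape fixes the order of steps up front, in contrast to a fully autonomous agent that decides its next action at every turn.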