[FEATURE] [💎 $400 bounty] Integrate SearchApi as a WebSearchEngine and as a Tool #1132
Comments
💎 SearchApi is offering a $400 bounty for this issue |
@mark-searchapi can i get this assigned? |
Given the speed at which we're moving, we don't assign issues or "give" issues to anyone. @abhishek818 In the event of multiple attempts, the one that is going to be merged by the maintainers of the repository will pick the bounty. |
Hello @mark-searchapi, I just started work on the SearchApi integration as a search engine and tool. |
Please adhere to these two implementations: |
Yes of course. |
@jemiluv8 |
@ayewo. Just wanted to know so I may halt my attempt since I just started. |
Hi, was setting this up. Just confused as to how I am supposed to set up env like OPENAI_API_KEY. |
I have raised a draft PR with my current work. Need help with this to continue. Also pls assign this to me if satisfactory |
The comment by @mark-searchapi above clearly says:
|
@abhishek818 why do you need open ai key? In any way, you can register with openai and get it there: https://github.com/langchain4j/langchain4j?tab=readme-ov-file#how-to-get-an-api-key |
Hi @mark-searchapi The docs itself lists about 34 API examples which I pulled out into the list of JSON elements below:
Since the task here is to "Integrate SearchApi as a The
Can you clarify which of the JSON elements are to scope and which ones are to be skipped for this issue? |
Hey @mark-searchapi, looks like the |
how many tests should one write for this? |
@ayewo I have reviewed the Langchain4j codebase and can provide the following insights:
WebSearchEngine as ContentRetriever Integration (for RAG use case)
After PR #642, the core implementation seems to provide:
Cons:
I think the most important part of WebSearchResults is WebSearchOrganicResult. To construct it, we iterate through the
{
"organic_results": [
{
"title": "String",
"link": "String",
"snippet": "String",
...
},
...
]
}
If you need error handling, rely on the
After implementing SearchApi as a core WebSearchEngine, the basic tool should work out of the box:
WebSearchEngine googleSearchEngine = SearchApiWebSearchEngine.builder()
.apiKey(System.getenv("SEARCHAPI_API_KEY"))
.engine("google")
.build();
WebSearchTool webSearchTool = WebSearchTool.from(googleSearchEngine);
...
Integration as a Tool (Function calling)
However, it does not seem to be very useful when used as a function since it constructs the String for LLMs only based on
I think we should create independent tool definitions and start with a couple of engines for this issue:
This way we can control the final string that is being built. For instance, in the youtube_transcripts engine, we are interested only in the transcript text. You can also check the sample Google Search string construction and the Google News example in our other recent integration. You can also add extra fields to the tools that you think might be important for LLMs.
Documentation and Examples
Throwing some ideas:
Most of the stuff (apart from custom tools) is already implemented in the langchain4j Google Custom Search directory. The tests contain hints on how to implement examples for documentation (how to use the web search engine, content retriever, tool, etc.). Hope the above information helps! |
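To make the organic_results mapping and the tool string construction discussed above more concrete, here is a minimal sketch. It assumes Java 17+ and that the JSON response has already been deserialized into a Map by whatever JSON library the module uses. The OrganicResult record, the class name, and the "Title / Source / Content" layout are hypothetical and only for illustration; the actual langchain4j type is WebSearchOrganicResult, and its constructor is not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class OrganicResultsSketch {

    // Hypothetical value type used only in this sketch;
    // it stands in for langchain4j's WebSearchOrganicResult.
    public record OrganicResult(String title, String link, String snippet) {}

    // Maps the "organic_results" array of an already-deserialized response
    // to a list of records, skipping entries that lack a title or a link.
    public static List<OrganicResult> parse(Map<String, Object> response) {
        List<OrganicResult> results = new ArrayList<>();
        Object raw = response.get("organic_results");
        if (!(raw instanceof List<?> items)) {
            return results; // key missing or not an array
        }
        for (Object item : items) {
            if (!(item instanceof Map<?, ?> entry)) {
                continue; // ignore anything that is not a JSON object
            }
            Object title = entry.get("title");
            Object link = entry.get("link");
            Object snippet = entry.get("snippet");
            if (title == null || link == null) {
                continue;
            }
            results.add(new OrganicResult(
                    title.toString(),
                    link.toString(),
                    snippet == null ? "" : snippet.toString()));
        }
        return results;
    }

    // Builds the text handed to the LLM; the layout is an assumption,
    // chosen so that each engine-specific tool can control its own output.
    public static String format(List<OrganicResult> results) {
        StringBuilder sb = new StringBuilder();
        for (OrganicResult r : results) {
            sb.append("Title: ").append(r.title()).append('\n')
              .append("Source: ").append(r.link()).append('\n')
              .append("Content: ").append(r.snippet()).append("\n\n");
        }
        return sb.toString().trim();
    }
}
```

A tool for another engine could reuse the same shape and swap only the format step, for example keeping just the transcript text for youtube_transcripts.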
@mark-searchapi I opened a PR with the idea of adding new search api engines in the future using an interface that handles the requests and responses |
@zambrinf, you can try using the default num parameter and verify that the total amount of organic results is greater than 0. This issue happens with the "LangChain4j" query because Google counts different elements toward the num parameter. WebSearchEngine results only consider organic results, so elements like inline videos are not included as organic results. |
@mark-searchapi I've already implemented it as you described in my PR. Please have a look at #1215. |
Hi @mark-searchapi and everybody,
Speaking only of the organic contents. If the integration supports URL scraping and is capable of retrieving the complete content of the website in
For now, the
You can check in this post the flow included in the v1: https://x.com/c_zela/status/1785522559791808650
I hope it gave a little more context, and I welcome your ideas to include in v2. I'll put some comments on your PRs :) Thank you! |
@czelabueno I will then assign both PRs to you? 🙏 |
Feature Overview
Integrate SearchApi as a WebSearchEngine and as a Tool for function calling.
Requirements
Adhere to langchain4j contribution guidelines
Related work
Related issues:
Existing SearchApi integrations
Design considerations
SearchApi supports not only Google Search but also 30+ other APIs, such as YouTube Search, Transcripts, and Bing Search (similar JSON response keys). In a statically typed language, it is probably better to have separate tools for separate APIs, but the implementation could be flexible enough to extend SearchApi and easily adapt to other engines. All engines use the same HTTP GET request; the only differences are the parameters they accept and how you want the response to be parsed.
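One way to read the point about a shared GET request is that request building can live in a single place, with the engine name and its parameters passed in as data. The sketch below illustrates that idea only; the class name, the buildUri helper, and the endpoint constant are assumptions that should be checked against the SearchApi documentation, and authentication is deliberately left out.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class SearchApiRequestSketch {

    // Assumed endpoint; verify against the SearchApi docs before relying on it.
    static final String BASE_URL = "https://www.searchapi.io/api/v1/search";

    // Builds the GET URI for any engine: the "engine" parameter comes first,
    // followed by the engine-specific parameters in the order they were given.
    public static URI buildUri(String engine, Map<String, String> params) {
        Map<String, String> all = new LinkedHashMap<>();
        all.put("engine", engine);
        all.putAll(params);
        String query = all.entrySet().stream()
                .map(e -> encode(e.getKey()) + "=" + encode(e.getValue()))
                .collect(Collectors.joining("&"));
        return URI.create(BASE_URL + "?" + query);
    }

    // Form-style URL encoding (spaces become '+').
    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }
}
```

With this shape, a Google search and a transcript lookup would differ only in the arguments, for example buildUri("google", Map.of("q", "langchain4j")) versus buildUri("youtube_transcripts", ...) with that engine's own parameters, while response parsing stays engine-specific.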
Other notes
Bounty
There is a $400 bounty on this awarded by SearchApi to the community.
In the event of multiple attempts, the one that is going to be merged by the maintainers of the repository will pick the bounty.