Fetch surrounding chunks notebook #257

Open
sunilemanjee opened this issue Jun 2, 2024 · 3 comments

Comments

@sunilemanjee
Contributor

sunilemanjee commented Jun 2, 2024

I have built a notebook that demonstrates how to fetch surrounding chunks in Elasticsearch. It is not the only way to do this, but it is a valid approach. I would like to contribute this notebook to our examples.

Notebook: Google Colab

The example uses text from a Harry Potter book, splitting it by chapter and then into chunks. Each chunk is stored as a nested passage containing the chunk's text along with its dense and sparse vector representations.
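As a rough illustration of the structure described above, here is a minimal sketch that splits a chapter into chunks and builds one document per chapter with a list of passages. It is not taken from the notebook: the field names (`chapter`, `passages`, `chunk_number`, `text`), the word-count chunking, and the one-document-per-chapter layout are all assumptions, and the dense and sparse vectors are left out because they would come from whichever embedding models are deployed.

```python
def split_into_chunks(text: str, words_per_chunk: int = 100) -> list[str]:
    """Split text into consecutive chunks of at most words_per_chunk words."""
    words = text.split()
    return [
        " ".join(words[i:i + words_per_chunk])
        for i in range(0, len(words), words_per_chunk)
    ]


def build_chapter_doc(chapter_number: int, chapter_text: str,
                      words_per_chunk: int = 100) -> dict:
    """Build one document per chapter, with each chunk as a passage entry.

    Field names are hypothetical. Dense and sparse representations would be
    added to each passage at ingest time and are omitted here.
    """
    chunks = split_into_chunks(chapter_text, words_per_chunk)
    return {
        "chapter": chapter_number,
        "passages": [
            {"chunk_number": i, "text": chunk}
            for i, chunk in enumerate(chunks)
        ],
    }
```

Storing a sequential `chunk_number` on every passage is what would make it possible to look up neighbouring chunks later.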

When searching, the demo fetches the matching passage chunk n along with its surrounding chunks. If n is the first chunk in the chapter, it fetches n, n+1, and n+2. If n is the last chunk, it fetches n-2, n-1, and n. Otherwise it fetches n-1, n, and n+1.
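The selection rule above can be sketched as a small helper. This is only an illustration of the rule as described, not the notebook's code: the function name `surrounding_chunk_ids`, the 0-based indexing, and the handling of chapters with fewer than three chunks are assumptions.

```python
def surrounding_chunk_ids(n: int, total: int) -> list[int]:
    """Return the chunk indices to fetch around matching chunk n.

    n is the 0-based index of the matching chunk and total is the number
    of chunks in the chapter (both hypothetical conventions).
    """
    if not 0 <= n < total:
        raise ValueError("n must be a valid chunk index for this chapter")

    if n == 0:
        # First chunk in the chapter: take the two following chunks.
        window = [n, n + 1, n + 2]
    elif n == total - 1:
        # Last chunk in the chapter: take the two preceding chunks.
        window = [n - 2, n - 1, n]
    else:
        # Any other chunk: take one chunk on each side.
        window = [n - 1, n, n + 1]

    # Assumption: drop indices that do not exist in very short chapters.
    return [i for i in window if 0 <= i < total]
```

The returned indices could then be used to filter the passages of the matching chapter, for example by comparing them against a stored chunk number.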

@joemcelroy
Member

Hey Sunile, this is worth creating a Search Labs article on. I would be hesitant to make it something we advise, though; in the future, we hope that semantic_text will have this option of expanding hit passages.

@sunilemanjee
Contributor Author

sunilemanjee commented Jun 5, 2024 via email

@joemcelroy
Member

joemcelroy commented Jun 6, 2024

Oh, I'm fine with advising this approach given the situation today. It makes sense for it to live in the supporting-blog-content folder, however, with a Search Labs article giving visibility that users can do this.

When we have this baked into Elasticsearch, we can add a new example to notebooks that shows a more formalised and supported way.

/supporting-blog-content = Search Labs article + notebook example
/notebooks = product feature of Elasticsearch that's well supported
