Support lightweight ingesting optional index #9036

JaySon-Huang · 2024-05-11T09:33:16Z

Enhancement

In some scenarios, we need to create new indexes on existing data to speed up queries. Typically this is done by triggering a major compaction (Segment DeltaMerge), which builds the new index data as part of the compaction. However, this rewrites all the columns, which makes indexing slow and puts heavy load on system IO and CPU.

To mitigate the impact of index creation on the system, we can create new indexes only for the stable layer data, in a more lightweight way. Because the stable layer contains 95% of the whole dataset, indexing the stable layer alone usually brings a sufficient performance boost.

The overall procedure:

Prepare - create a background task that reads only the columns relevant to index generation, then stores the generated index file in persistent storage.
Ingest - ingest the index file into the stable layer's DMFile. In a DMFile that uses meta v2, this only requires modifying the information in the ExtendColumnStat block of meta v2. DMFile::ingestIndex creates a new meta v2 file and atomically replaces the old one.
Apply - update the epoch value of the StableValueSpace in the Segment. Under storage-compute separation, the ComputeNode uses this value to decide whether it needs to clean up its cache of the old DMFile meta file and re-download the file from S3.
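
The Ingest and Apply steps can be illustrated with a minimal Python sketch. This is not TiFlash code: the JSON meta layout, the `extend_column_stats` key, `ingest_index`, and `MetaCache` are all hypothetical stand-ins (the real meta v2 file is a binary format and the real cache lives in the ComputeNode). It only shows the two ideas from the procedure: swap in a new meta file atomically, and let a changed epoch force a re-fetch of the cached meta.

```python
import json
import os
import tempfile
from dataclasses import dataclass, field


def ingest_index(meta_path: str, column_id: int, index_file: str) -> None:
    """Hypothetical Ingest step: record an index file in the meta, then swap atomically."""
    with open(meta_path, "r", encoding="utf-8") as f:
        meta = json.load(f)

    # Only the per-column stat entry changes; column data files are untouched.
    stats = meta.setdefault("extend_column_stats", {})
    stats.setdefault(str(column_id), {})["index_file"] = index_file

    # Write the new meta next to the old one so the rename stays on one filesystem.
    dir_name = os.path.dirname(os.path.abspath(meta_path))
    fd, tmp_path = tempfile.mkstemp(dir=dir_name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(meta, f)
            f.flush()
            os.fsync(f.fileno())
        # Readers see either the old meta or the new one, never a partial file.
        os.replace(tmp_path, meta_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


@dataclass
class MetaCache:
    """Hypothetical compute-node cache: file id -> (epoch, meta)."""

    entries: dict = field(default_factory=dict)
    downloads: int = 0  # counts simulated re-downloads

    def get(self, file_id: int, epoch: int, meta_path: str) -> dict:
        cached = self.entries.get(file_id)
        if cached is not None and cached[0] == epoch:
            return cached[1]
        # First access or epoch changed: drop the stale meta and fetch it again.
        # A local file read stands in for the download from S3.
        with open(meta_path, "r", encoding="utf-8") as f:
            meta = json.load(f)
        self.downloads += 1
        self.entries[file_id] = (epoch, meta)
        return meta
```

In this sketch, a reader that still holds the old epoch keeps using its cached meta after `ingest_index` runs; it only sees the new index entry once it calls `get` with the bumped epoch, which mirrors why the Apply step updates the epoch after the meta file has been replaced.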

This index ingestion mechanism can help us handle the indexes needed for Vector Search (#9032) and Full Text Search, as well as other optional indexes.


JaySon-Huang added the type/enhancement Issue or PR for enhancement label May 11, 2024

JaySon-Huang assigned Lloyd-Pottiger May 11, 2024

Lloyd-Pottiger mentioned this issue May 16, 2024

Storage: let stable meta using protobuf format #9054

Merged

12 tasks

ti-chi-bot bot pushed a commit that referenced this issue May 17, 2024

Storage: let stable meta using protobuf format (#9054)

d33545d

ref #9036

JaySon-Huang mentioned this issue May 28, 2024

*: fix Exception #9094

Merged

12 tasks

ti-chi-bot bot pushed a commit that referenced this issue May 28, 2024

*: fix Exception (#9094)

38ab3f9

ref #6233, ref #9036 Signed-off-by: Lloyd-Pottiger <[email protected]>
