Improve the memory usage of our own HLL implementation #8098
base: sliding_hyperloglog
Conversation
3222 tests run: 3079 passed, 1 failed, 142 skipped (full report)

Failures on Postgres 14

Test coverage report is not available. The comment gets automatically updated with the latest test results.

2faa0df at 2024-06-18T17:46:27.550Z
We don't use sliding windows, so we can just drop the historical snapshot requirement from the implementation, thus removing some tracking overhead.
Force-pushed from b89e6fe to 2faa0df
{
    cState->regs[index].fpm[j++] = cState->regs[index].fpm[i];
}
cState->regs[index][i] = now;
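For context, here is a minimal self-contained sketch of the add path this diff belongs to, assuming each register keeps one timestamp per possible rank. cState, regs, and HLL_C_BITS are taken from the fragments quoted in this thread; the constants, the type definitions, and rho are assumptions, not the PR's actual code.

#include <stdint.h>
#include <time.h>

#define HLL_BIT_WIDTH   10                    /* assumed: 2^10 registers */
#define HLL_N_REGISTERS (1 << HLL_BIT_WIDTH)
#define HLL_C_BITS      21                    /* assumed: 22 timestamp slots per register */

typedef time_t TimestampTz;                   /* stand-in for the real timestamp type */

typedef struct HLLState
{
    /* regs[index][rank]: last time register `index` observed rank `rank` */
    TimestampTz regs[HLL_N_REGISTERS][HLL_C_BITS + 1];
} HLLState;

/* rank = position of the lowest set bit of the hash remainder, 0-based */
static int
rho(uint64_t w)
{
    int r = 0;

    while ((w & 1) == 0 && r < HLL_C_BITS)
    {
        r++;
        w >>= 1;
    }
    return r;
}

static void
addHLL(HLLState *cState, uint64_t hash, TimestampTz now)
{
    size_t index = hash & (HLL_N_REGISTERS - 1);   /* low bits pick the register */
    int    count = rho(hash >> HLL_BIT_WIDTH);     /* rank of the remaining bits */

    /*
     * Stamp every rank <= count, not just regs[index][count]: this keeps
     * each register's timestamp array non-increasing, which the summary
     * scan discussed below depends on.
     */
    for (int i = 0; i <= count; i++)
        cState->regs[index][i] = now;
}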
I wonder why we need to assign the current timestamp to all elements with index less than or equal to count. Why is it not enough to just assign:
cState->regs[index][count] = now;
This way we don't have to start at the end of the array of timestamps when we try to get the summary.
Your suggestion would work as well, but it would require a scan from high counts rather than low counts. Because higher counts are set less often, a top-down scan would pass many stale entries whenever the true distinct count is low, so generating a summary could be more expensive on average.
I don't mind it going either way though.
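For contrast, a hedged sketch of what the summary scan would have to look like under the single-assignment scheme (reusing reg, since, and HLL_C_BITS from the fragments in this thread; an illustration, not the PR's code): the array is no longer sorted, so the scan starts at the top and stops at the first fresh timestamp.

int max = -1;                       /* -1: no rank fresh within the window */

for (int i = HLL_C_BITS; i >= 0; i--)
{
    if (reg[i] >= since)
    {
        max = i;                    /* first hit from the top is the maximum */
        break;
    }
}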
Sorry, I do not understand it.
We need to find the maximal count, right?
So your loop:
for (size_t i = 0; i < HLL_C_BITS + 1; i++)
{
if (reg[i] >= since)
{
max = i;
}
}
is actually locating the largest R with a timestamp within the specified period.
Correct, but by also updating all registers < count, you can do a binary search, or a linear search with an early break, because all values in the array are sorted (with potential duplicates, but still sorted).
for (size_t i = 0; i < HLL_C_BITS + 1; i++)
{
if (reg[i] >= since)
max = i;
else
break;
}
Thus saving a few compares for this bucket.
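Since the prefix updates keep each timestamp array non-increasing, the binary-search variant would look roughly like this (a hedged sketch; max_fresh_rank and the TimestampTz stand-in are illustrative, not code from this PR):

/* Binary search over the non-increasing timestamp array for the largest
 * rank whose timestamp is still inside the window; -1 if none is fresh. */
static int
max_fresh_rank(const TimestampTz *reg, TimestampTz since)
{
    int lo = 0, hi = HLL_C_BITS, max = -1;

    while (lo <= hi)
    {
        int mid = lo + (hi - lo) / 2;

        if (reg[mid] >= since)
        {
            max = mid;      /* mid is fresh; a larger rank might be too */
            lo = mid + 1;
        }
        else
            hi = mid - 1;   /* mid is stale, so everything above it is too */
    }
    return max;
}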
- Binary search in an array of size 22 makes little sense.
- The time to calculate the estimation is not critical: it is needed only rarely, when the autoscaler agent updates its working set estimation, once per second or even once per minute.
- The time to update the hashes is much more critical, because it happens on each SMGR read operation.

So we should optimise addHLL rather than getMaximum.
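One hedged idea in that direction, an assumption on my part rather than anything in the PR: because every add stamps the whole prefix [0..count] with the same timestamp, the entries already carrying `now` form a prefix too, so the refresh loop can walk downward and stop early.

for (int i = count; i >= 0; i--)
{
    if (cState->regs[index][i] >= now)
        break;              /* everything below is already stamped `now` */
    cState->regs[index][i] = now;
}

How much this saves depends on timestamp granularity: with coarse timestamps, repeated adds within the same tick can skip most of the writes.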
That makes sense.
We don't use sliding windows, so we can just drop the historical snapshot requirement from the implementation, thus removing the 50%+ tracking overhead.
A nice benefit is that this allows working set cardinality estimation across the full lifetime of the instance, rather than just a recent window. A drawback is that you can't take an arbitrary window: the end of any estimation window must be now or in the future.
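To make the window semantics concrete, a hedged sketch of how a full estimate could be computed under this scheme, building on the HLLState layout sketched earlier in this thread (estimateSHLL, the alpha constant, and the rank-to-weight mapping are illustrative, not this PR's code):

#include <math.h>

static double
estimateSHLL(HLLState *cState, TimestampTz since)
{
    const int m = HLL_N_REGISTERS;
    double    sum = 0.0;

    for (int r = 0; r < m; r++)
    {
        int max = 0;

        /* early-break scan from the thread above: timestamps are sorted */
        for (int i = 0; i < HLL_C_BITS + 1; i++)
        {
            if (cState->regs[r][i] >= since)
                max = i;
            else
                break;
        }
        sum += pow(2.0, -(double) max);    /* harmonic-mean accumulator */
    }

    /* alpha_m ~= 0.7213 / (1 + 1.079 / m) for large m */
    return (0.7213 / (1.0 + 1.079 / m)) * m * m / sum;
}

Because each register's timestamps only move forward, `since` can be any point up to the present, which is exactly the "window must end now or later" restriction described above.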