storage: fallocate at least one chunk to avoid appender hangs #20131
base: dev
Conversation
In the `segment_appender::do_append` method we call `do_next_adaptive_fallocation` repeatedly until there is enough space in the file for the incoming operation. If `segment_fallocation_step` is set to 0, this results in an infinite loop. The code comments hint at another case where the fallocation step can be 0: when the disk is full. However, `storage_resources::calc_falloc_step` also clamps the minimum to the append chunk size:

```cpp
// At the minimum, falloc one chunk's worth of space.
if (step < min_falloc_step) {
    // If we have less than the minimum step, don't both falloc'ing at all.
    step = _append_chunk_size;
}
```
A potential alternative is to return the fallocated byte count and, if it is 0, avoid looping at redpanda/src/v/storage/segment_appender.cc Line 164 in f007c20:
```cpp
vassert(
  step != 0, "falloc step must be non-zero for appender to make progress");
```
Should this be a post-condition on `get_falloc_step`?
```cpp
// Don't fallocate. This happens if we're low on disk, or if
// the user has configured a 0 max falloc step.
```
Ahh yeah, this was probably present prior to the adaptive falloc step work.
```cpp
return step;
```

```cpp
// Allocate at least one chunk's worth of space.
return std::max(step, _append_chunk_size);
```
Should we look to see if there are any clusters out there with a configured chunk size that would be too large? I see the max in configuration.cc is 32 MB, which seems ridiculous. But even smaller values like 1 MB, much larger than what I would expect in practice (like 32 KB), could make this dynamic fallocation optimization fail for cases where we have huge numbers of partitions.
Backports Required
Release Notes