LibCompress(+AK+LibHTTP): Implement streamable asynchronous deflate and zlib decompression #24567
Nothing looks particularly out of place (other than what I marked using inline comments), but effectively reviewing an entire second deflate implementation is somewhat hard.
What I'd be most interested in by now is actually seeing how existing things (like our existing Deflate and Zlib implementations) are gradually converted. We won't realistically be able to duplicate everything before starting to delete the non-async stuff.
// These are defined just to replace some 4s and 8s with meaningful expressions.
using WordType = u32;
using DoubleWordType = u64;
Apart from the fact that Word and DoubleWord are Microsoft/Intel terminology that doesn't actually make sense in practice, why are we splitting here in the first place?
No particular reason; these are defined just to replace some 4s and 8s with meaningful expressions.
It is a bit nicer to think of reading in words and, therefore, in chunks of sizeof(Word) bytes. As for DoubleWordType, I need to represent two consecutive words somewhere too.
In fact, BufferBitView works correctly if one replaces WordType with u64 and DoubleWordType with unsigned __int128 (I haven't done this only because of the performance cost and the fact that we read at most 15 bits at a time in deflate).
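To illustrate the word / double-word idea discussed above, here is a minimal sketch in standard C++. It is not the PR's BufferBitView: the class name BitReaderSketch and its members are hypothetical, and std::uint32_t / std::uint64_t stand in for AK's u32 / u64. The buffer is a double word that is refilled one word at a time, so a read of up to one word's worth of bits never has to straddle a refill.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using WordType = std::uint32_t;
using DoubleWordType = std::uint64_t;

// Hypothetical illustration only; reads bits LSB-first, as deflate does.
class BitReaderSketch {
public:
    explicit BitReaderSketch(std::vector<std::uint8_t> bytes)
        : m_bytes(std::move(bytes))
    {
    }

    // Reads `count` bits (at most one word's worth) into `out`.
    // Returns false if the input does not contain that many bits.
    bool read_bits(unsigned count, WordType& out)
    {
        assert(count <= 8 * sizeof(WordType));
        refill();
        if (m_bits_buffered < count)
            return false;
        auto mask = (DoubleWordType { 1 } << count) - 1;
        out = static_cast<WordType>(m_buffer & mask);
        m_buffer >>= count;
        m_bits_buffered -= count;
        return true;
    }

private:
    // Tops the double-word buffer up with whole words while one still fits.
    void refill()
    {
        while (m_bits_buffered <= 8 * sizeof(WordType) && m_offset < m_bytes.size()) {
            std::size_t n = std::min(sizeof(WordType), m_bytes.size() - m_offset);
            WordType word = 0;
            // Assemble the word byte by byte so the result does not depend on host endianness.
            for (std::size_t i = 0; i < n; ++i)
                word |= static_cast<WordType>(m_bytes[m_offset + i]) << (8 * i);
            m_buffer |= static_cast<DoubleWordType>(word) << m_bits_buffered;
            m_bits_buffered += static_cast<unsigned>(8 * n);
            m_offset += n;
        }
    }

    std::vector<std::uint8_t> m_bytes;
    std::size_t m_offset { 0 };
    DoubleWordType m_buffer { 0 };
    unsigned m_bits_buffered { 0 };
};
```

For the two input bytes 0xB5 0x01, reading 3, then 5, then 8 bits yields 5, 22, and 1. Swapping the two aliases for a 64-bit and a 128-bit type would leave the logic unchanged, which is the point made above.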
auto ptr = reinterpret_cast<FlatPtr>(bytes.data());
auto buffer_offset_in_bytes = ptr % alignof(WordType);
auto bytes_in_current_word_to_fill = sizeof(WordType) - buffer_offset_in_bytes;

m_bit_position = buffer_offset_in_bytes * 8 + bit_position;
m_bits_left = bytes.size() * 8 - bit_position;
Does aligned vs unaligned really make that much difference, especially since we just memcpy it over anyways?
I was mostly thinking about sanitizer warnings and avoiding theoretical UB.
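For context on the UB concern: dereferencing a pointer obtained by casting a misaligned byte address to a word pointer is undefined behavior in C++ and is what UBSan's alignment check reports, whereas copying the bytes with memcpy is well defined at any address. The sketch below uses standard C++ types rather than AK's, and both helper names are hypothetical, not taken from the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: loads a 32-bit word from any address, aligned or not.
// memcpy has no alignment requirement, so this is well defined everywhere.
inline std::uint32_t load_word_unaligned(void const* source)
{
    assert(source != nullptr);
    std::uint32_t word;
    std::memcpy(&word, source, sizeof(word));
    return word;
}

// Hypothetical helper: how many bytes `pointer` lies past the previous word boundary.
// This mirrors the `ptr % alignof(WordType)` computation in the snippet under review.
inline std::size_t offset_within_word(void const* pointer)
{
    return reinterpret_cast<std::uintptr_t>(pointer) % alignof(std::uint32_t);
}
```

Compilers typically lower a fixed-size memcpy like this to a single load on targets that allow unaligned access, so the well-defined version usually costs nothing extra there.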
- DeflateDecompressor::DeflateDecompressor(NonnullOwnPtr<AsyncInputStream>&& input)
+ DeflateDecompressor::DeflateDecompressor(MaybeOwned<AsyncInputStream>&& input)
Looks to have ended up in the wrong commit (same for the header file).
Well, this was deliberate: non-owning inflate is not particularly useful by itself and is only used in zlib decompression. I can fix up this change, though, if you wish.
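The owning versus non-owning distinction behind that signature change can be sketched in standard C++ as below. This is not AK's MaybeOwned; the class name MaybeOwnedSketch and its interface are hypothetical and only show the general shape: a handle that either owns its target or merely borrows one that outlives it, which is what lets an outer decompressor lend its input stream to an inner one.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <variant>

// Hypothetical illustration of a handle that may or may not own its target.
template<typename T>
class MaybeOwnedSketch {
public:
    // Owning: the handle destroys the object when it goes away.
    MaybeOwnedSketch(std::unique_ptr<T> owned)
        : m_handle(std::move(owned))
    {
        assert(ptr() != nullptr);
    }

    // Borrowing: the caller guarantees `borrowed` outlives this handle.
    MaybeOwnedSketch(T& borrowed)
        : m_handle(&borrowed)
    {
    }

    T& operator*() { return *ptr(); }
    T* operator->() { return ptr(); }

    bool is_owned() const { return std::holds_alternative<std::unique_ptr<T>>(m_handle); }

private:
    T* ptr()
    {
        if (auto* owned = std::get_if<std::unique_ptr<T>>(&m_handle))
            return owned->get();
        return std::get<T*>(m_handle);
    }

    std::variant<std::unique_ptr<T>, T*> m_handle;
};
```

A constructor taking such a handle accepts both cases, so callers that want to hand over ownership and callers that only lend a stream can share one code path.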
0177526 to 9692e29
There is not nearly enough async-specific stuff in AsyncStreamBuffer for it to carry the "Async" prefix.
9692e29 to fadc4ae
fadc4ae to 927c396
This PR builds upon previous work in the foundational coroutines PR and implements streamable, asynchronous, error-safe, and EOF-correct decompression. Incidentally, the new asynchronous implementation is about twice as fast as our previous synchronous one.
In the future, to address the code duplication this PR introduces, I plan to create an AsyncStream -> Stream translation mechanism and use it to reroute the old classes to the new implementation.
(The deflate algorithm itself is pretty much directly copied from the old implementation, so it probably doesn't require as much attention as the scaffolding.)
@timschumi, I'm sorry for yet another gigantic PR :).