Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LibCompress(+AK+LibHTTP): Implement streamable asynchronous deflate and zlib decompression #24567

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

DanShaders
Copy link
Contributor

@DanShaders DanShaders commented Jun 14, 2024

This PR builds upon previous work in the foundational coroutines PR and implements streamable asynchronous, error-safe, and EOF-correct decompression. Incidentally, new asynchronous implementation is about 2 times faster than our previous synchronous one.

In the future, to address code duplication the PR introduces, I plan to create a AsyncStream -> Stream translation mechanism and use it to reroute old classes to the new implementation.

(Deflate algorithm itself is pretty much directly copied from the old implementation, so it probably doesn't require as much attention as scaffolding.)

@timschumi, I'm sorry for yet another gigantic PR :).

@github-actions github-actions bot added the 👀 pr-needs-review PR needs review from a maintainer or community member label Jun 14, 2024
Copy link
Member

@timschumi timschumi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing looks particularly out of place (other than what I marked using inline comments), but effectively reviewing an entire second deflate implementation is somewhat hard.

What I'd be most interested in by now is actually seeing how existing things (like our existing Deflate and Zlib implementations) are gradually converted. We won't realistically be able to duplicate everything before starting to delete the non-async stuff.

Comment on lines +21 to +23
// These are defined just to replace some 4s and 8s with meaningful expressions.
using WordType = u32;
using DoubleWordType = u64;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart from the fact that Word and DoubleWord are Microsoft/Intel terminology that don't actually make sense in practice, why are we splitting here in the first place?

Copy link
Contributor Author

@DanShaders DanShaders Jun 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No particular reason, these are defined just to replace some 4s and 8s with meaningful expressions. It is a bit nicer to think of reading in words and, therefore, chunks of sizeof(Word) bytes. As for DoubleWordType, I need to represent two consecutive words somewhere too.

In fact, BufferBitView works correctly if one replaces WordType with u64 and DoubleWordType with unsigned __int128 (I haven't done this only because of the performance and the fact that we only read at most 15 bits in deflate).

Comment on lines +29 to +34
auto ptr = reinterpret_cast<FlatPtr>(bytes.data());
auto buffer_offset_in_bytes = ptr % alignof(WordType);
auto bytes_in_current_word_to_fill = sizeof(WordType) - buffer_offset_in_bytes;

m_bit_position = buffer_offset_in_bytes * 8 + bit_position;
m_bits_left = bytes.size() * 8 - bit_position;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does aligned vs unaligned really make that much difference, especially since we just memcpy it over anyways?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was mostly thinking about sanitizer warnings and avoiding theoretical UB

AK/AsyncBitStream.h Outdated Show resolved Hide resolved
AK/AsyncStreamBuffer.h Outdated Show resolved Hide resolved
AK/AsyncStreamBuffer.h Outdated Show resolved Hide resolved
DeflateDecompressor::DeflateDecompressor(NonnullOwnPtr<AsyncInputStream>&& input)
DeflateDecompressor::DeflateDecompressor(MaybeOwned<AsyncInputStream>&& input)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks to have ended up in the wrong commit (same for the header file).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, this was deliberate: non-owning inflate is not particularly useful by itself and is only used in zlib decompression. I can fix up this change, though, if you wish

@DanShaders DanShaders added ⏳ pr-waiting-for-author PR is blocked by feedback / code changes from the author and removed 👀 pr-needs-review PR needs review from a maintainer or community member labels Jun 24, 2024
@github-actions github-actions bot added 👀 pr-needs-review PR needs review from a maintainer or community member and removed ⏳ pr-waiting-for-author PR is blocked by feedback / code changes from the author labels Jun 25, 2024
There is nearly not enough async-specific stuff in AsyncStreamBuffer for
it to carry "Async" prefix.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
👀 pr-needs-review PR needs review from a maintainer or community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants