LibCompress(+AK+LibHTTP): Implement streamable asynchronous deflate and zlib decompression #24567

DanShaders · 2024-06-14T03:45:51Z

This PR builds upon previous work in the foundational coroutines PR and implements streamable asynchronous, error-safe, and EOF-correct decompression. Incidentally, the new asynchronous implementation is about 2 times faster than our previous synchronous one.

In the future, to address the code duplication this PR introduces, I plan to create an AsyncStream -> Stream translation mechanism and use it to reroute the old classes to the new implementation.

(The deflate algorithm itself is pretty much directly copied from the old implementation, so it probably doesn't require as much attention as the scaffolding.)

@timschumi, I'm sorry for yet another gigantic PR :).

timschumi

Nothing looks particularly out of place (other than what I marked using inline comments), but effectively reviewing an entire second deflate implementation is somewhat hard.

What I'd be most interested in by now is actually seeing how existing things (like our existing Deflate and Zlib implementations) are gradually converted. We won't realistically be able to duplicate everything before starting to delete the non-async stuff.

timschumi · 2024-06-22T14:43:37Z

AK/AsyncBitStream.h

+    // These are defined just to replace some 4s and 8s with meaningful expressions.
+    using WordType = u32;
+    using DoubleWordType = u64;


Apart from the fact that Word and DoubleWord are Microsoft/Intel terminology that doesn't actually make sense in practice, why are we splitting here in the first place?

No particular reason, these are defined just to replace some 4s and 8s with meaningful expressions. It is a bit nicer to think of reading in words and, therefore, chunks of sizeof(Word) bytes. As for DoubleWordType, I need to represent two consecutive words somewhere too.

In fact, BufferBitView works correctly if one replaces WordType with u64 and DoubleWordType with unsigned __int128 (I haven't done this only because of the performance and the fact that we only read at most 15 bits in deflate).
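To illustrate the word/double-word split discussed above, here is a minimal sketch of a bit reader that refills a DoubleWordType accumulator in WordType-sized chunks. It is not the PR's BufferBitView: the class name SketchBitReader and its members are hypothetical, and it assembles each word from bytes explicitly so that the LSB-first bit order deflate expects does not depend on host endianness.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Same aliases as in the diff; everything else here is a hypothetical sketch.
using WordType = std::uint32_t;
using DoubleWordType = std::uint64_t;

class SketchBitReader {
public:
    SketchBitReader(std::uint8_t const* data, std::size_t size)
        : m_data(data)
        , m_size(size)
    {
    }

    // Reads `count` bits (at most one word), least significant bit first.
    // Returns false if the input does not contain that many bits.
    bool read_bits(unsigned count, WordType& out)
    {
        assert(count <= 8 * sizeof(WordType));
        refill();
        if (m_bit_count < count)
            return false;
        DoubleWordType mask = (DoubleWordType { 1 } << count) - 1;
        out = static_cast<WordType>(m_accumulator & mask);
        m_accumulator >>= count;
        m_bit_count -= count;
        return true;
    }

private:
    void refill()
    {
        // Pull in whole words while the accumulator has room for one.
        while (m_bit_count <= 8 * sizeof(WordType) && m_size - m_offset >= sizeof(WordType)) {
            WordType word = 0;
            for (std::size_t i = 0; i < sizeof(WordType); ++i)
                word |= static_cast<WordType>(m_data[m_offset + i]) << (8 * i);
            m_accumulator |= static_cast<DoubleWordType>(word) << m_bit_count;
            m_bit_count += 8 * sizeof(WordType);
            m_offset += sizeof(WordType);
        }
        // Fewer than sizeof(WordType) bytes are left: finish byte by byte.
        while (m_bit_count <= 8 * (sizeof(DoubleWordType) - 1) && m_offset < m_size) {
            m_accumulator |= static_cast<DoubleWordType>(m_data[m_offset++]) << m_bit_count;
            m_bit_count += 8;
        }
    }

    std::uint8_t const* m_data;
    std::size_t m_size;
    std::size_t m_offset { 0 };
    DoubleWordType m_accumulator { 0 };
    unsigned m_bit_count { 0 };
};
```

Because the accumulator is twice as wide as a word, a full word can always be appended while up to one word of unread bits is still pending, which is the role the comment above assigns to DoubleWordType.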

timschumi · 2024-06-22T15:51:36Z

AK/AsyncBitStream.h

+        auto ptr = reinterpret_cast<FlatPtr>(bytes.data());
+        auto buffer_offset_in_bytes = ptr % alignof(WordType);
+        auto bytes_in_current_word_to_fill = sizeof(WordType) - buffer_offset_in_bytes;
+
+        m_bit_position = buffer_offset_in_bytes * 8 + bit_position;
+        m_bits_left = bytes.size() * 8 - bit_position;


Does aligned vs unaligned really make that much difference, especially since we just memcpy it over anyways?

I was mostly thinking about sanitizer warnings and avoiding theoretical UB.
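For context on the exchange above, the following sketch mirrors the two offset expressions from the diff and shows the memcpy-based load the reviewer refers to. The function names are hypothetical and this is not code from the PR. Dereferencing a misaligned WordType pointer is undefined behavior and is what UBSan's alignment check reports, whereas memcpy into a local word has no alignment requirement.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

using WordType = std::uint32_t;

// How many bytes `data` sits past the previous WordType-aligned address
// (the diff's buffer_offset_in_bytes).
inline std::size_t offset_within_word(void const* data)
{
    return reinterpret_cast<std::uintptr_t>(data) % alignof(WordType);
}

// How many bytes remain in the word that `data` points into
// (the diff's bytes_in_current_word_to_fill).
// Assumes sizeof(WordType) == alignof(WordType), as on common platforms.
inline std::size_t bytes_left_in_word(void const* data)
{
    return sizeof(WordType) - offset_within_word(data);
}

// Well-defined for any address: the bytes are copied into an aligned local.
inline WordType load_word_unaligned(std::uint8_t const* data)
{
    WordType word;
    std::memcpy(&word, data, sizeof(word));
    return word;
}
```

Compilers typically lower the memcpy to a single load on targets that allow unaligned access, so the remaining cost of handling alignment is mostly the bookkeeping rather than the load itself.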

timschumi · 2024-06-22T17:15:34Z

Userland/Libraries/LibCompress/AsyncDeflate.cpp

-DeflateDecompressor::DeflateDecompressor(NonnullOwnPtr<AsyncInputStream>&& input)
+DeflateDecompressor::DeflateDecompressor(MaybeOwned<AsyncInputStream>&& input)


Looks to have ended up in the wrong commit (same for the header file).

Well, this was deliberate: non-owning inflate is not particularly useful by itself and is only used in zlib decompression. I can fix up this change, though, if you wish.
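As a rough illustration of the owning versus non-owning distinction in this signature change, here is a sketch of a wrapper that either owns its stream or borrows one owned elsewhere. It is written against the standard library, is not AK's MaybeOwned, and the names MaybeOwnedSketch and DummyStream are hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <variant>

// Stand-in for an input stream; hypothetical, for illustration only.
struct DummyStream {
    int reads { 0 };
};

template<typename T>
class MaybeOwnedSketch {
public:
    // Owning: the wrapper destroys the object when it goes away.
    MaybeOwnedSketch(std::unique_ptr<T> owned)
        : m_storage(std::move(owned))
    {
        assert(std::get<std::unique_ptr<T>>(m_storage) != nullptr);
    }

    // Non-owning: the caller keeps the object alive and keeps using it afterwards.
    MaybeOwnedSketch(T& borrowed)
        : m_storage(&borrowed)
    {
    }

    bool is_owned() const { return std::holds_alternative<std::unique_ptr<T>>(m_storage); }

    T& operator*()
    {
        return is_owned() ? *std::get<std::unique_ptr<T>>(m_storage) : *std::get<T*>(m_storage);
    }

    T* operator->() { return &**this; }

private:
    std::variant<std::unique_ptr<T>, T*> m_storage;
};
```

In the scenario described above, a zlib decompressor would presumably hand its own input to the inner deflate decompressor by reference, since it still needs that stream afterwards to read the trailing checksum, while a standalone deflate decompressor would take ownership.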

DanShaders requested review from alimpfard and timschumi as code owners June 14, 2024 03:45

github-actions bot added the 👀 pr-needs-review PR needs review from a maintainer or community member label Jun 14, 2024

timschumi reviewed Jun 22, 2024

DanShaders added ⏳ pr-waiting-for-author PR is blocked by feedback / code changes from the author and removed 👀 pr-needs-review PR needs review from a maintainer or community member labels Jun 24, 2024

DanShaders force-pushed the coro-decompress branch from 0177526 to 9692e29 Compare June 25, 2024 23:23

github-actions bot added 👀 pr-needs-review PR needs review from a maintainer or community member and removed ⏳ pr-waiting-for-author PR is blocked by feedback / code changes from the author labels Jun 25, 2024

DanShaders added 2 commits June 25, 2024 19:27

AK: Rename AsyncStreamBuffer -> StreamBuffer

2deef28

There is not nearly enough async-specific stuff in AsyncStreamBuffer for it to carry the "Async" prefix.

AK+LibCompress: Implement streamable asynchronous deflate decompression

37b1a5e

DanShaders force-pushed the coro-decompress branch from 9692e29 to fadc4ae Compare June 25, 2024 23:35

DanShaders added 3 commits June 27, 2024 14:59

Tests/LibCompress: Add basic tests for asynchronous deflate

b14025a

LibCompress: Add asynchronous zlib decompressor

fcc5b88

LibHTTP: Support Content-Encoding: deflate in asynchronous HTTP client

927c396

DanShaders force-pushed the coro-decompress branch from fadc4ae to 927c396 Compare June 27, 2024 18:59

DanShaders mentioned this pull request Jun 27, 2024

LibCompress: Decompressing a file will result in a "Faile decompressing file [...]: Reached end-of-stream without collecting the required number of bits" #22130

Open
