Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature request: Using SIMD for byte search #1264

Open
He-Pin opened this issue Apr 6, 2024 · 4 comments
Open

Feature request: Using SIMD for byte search #1264

He-Pin opened this issue Apr 6, 2024 · 4 comments
Labels
performance Related to performance

Comments

@He-Pin
Copy link
Member

He-Pin commented Apr 6, 2024

Motivation:
When reading the code in #1247 , I recall that we can improve performance with SIDM.
related blog: https://richardstartin.github.io/posts/finding-bytes

and for reference in Netty: netty/netty#10737

Would you like to take a look at this too @JD557

int position = firstInstance(getWord(new byte[]{1, 2, 0, 3, 4, 10, (byte)'\n', 5}, 0), compilePattern((byte)'\n');
...

private static long compilePattern(byte byteToFind) {
    long pattern = byteToFind & 0xFFL;
    return pattern
        | (pattern << 8)
        | (pattern << 16)
        | (pattern << 24)
        | (pattern << 32)
        | (pattern << 40)
        | (pattern << 48)
        | (pattern << 56);
}

private static int firstInstance(long word, long pattern) {
    long input = word ^ pattern;
    long tmp = (input & 0x7F7F7F7F7F7F7F7FL) + 0x7F7F7F7F7F7F7F7FL;
    tmp = ~(tmp | input | 0x7F7F7F7F7F7F7F7FL);
    return Long.numberOfLeadingZeros(tmp) >>> 3;
}
@He-Pin He-Pin added the performance Related to performance label Apr 6, 2024
@samueleresca
Copy link
Member

@He-Pin Thanks for sharing the blog post 😄. Interesting stuff

@He-Pin
Copy link
Member Author

He-Pin commented Apr 7, 2024

@He-Pin Thanks for sharing the blog post 😄. Interesting stuff

Yes, but have no time to test it locally. with it, a single loop will test 8 bytes.

But without the vector api that what we can do now.

@laglangyue
Copy link
Contributor

so cool

@JD557
Copy link
Contributor

JD557 commented Apr 8, 2024

Would you like to take a look at this too @JD557

Unfortunately, I don't think I'll have capacity to look into this in the near future. Also, my knowledge of SIMD is pretty limited.

Having said that:

But without the vector api that what we can do now.

I think there's value in first trying something that is not using the Vector API (like the code suggested above). That way Pekko can keep supporting old Java versions, as well as being easier to cross compile to JS and Native.

Hopefully that would hit the autovectorizaton logic in some of those platforms, but I haven't tested that 😅

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Related to performance
Projects
None yet
Development

No branches or pull requests

4 participants