-
-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LibCrypto: Implement AES by using x86 intrinsics #24538
base: master
Are you sure you want to change the base?
Conversation
Co-Authored-By: Leon Albrecht <[email protected]>
Co-Authored-By: Leon Albrecht <[email protected]>
The key expansion algorithms were inspired by (copy & pasted from) this paper: https://www.intel.com/content/dam/doc/white-paper/advanced-encryption-standard-new-instructions-set-paper.pdf |
AK/SIMDExtras.h
Outdated
} | ||
|
||
template<SIMDVector TDst, SIMDVector TSrc> | ||
ALWAYS_INLINE static TDst as(TSrc const& v) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is essentially a bit_cast
please prefer that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, reverted, used bit_cast
instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Side note: Please run clang-format over your code
namespace | ||
{ | ||
u32* round_key; | ||
static AK::SIMD::i64x2 AESCipherKey_expand_encrypt_key_128_assist(AK::SIMD::i64x2 const& a, AK::SIMD::i64x2 const& b) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's usually either static
or namespace {}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, removed anonymous namespace
, using only static
instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK clang-formatted.
#endif | ||
|
||
expand_encrypt_key(user_key, bits); | ||
#if (ARCH(I386) || ARCH(X86_64)) | ||
namespace | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One big #if
should be enough
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, merged to single #if
block.
td = AK::SIMD::i32x4 { 0, ta[0], ta[1], ta[2] }; | ||
ta ^= td; | ||
td = AK::SIMD::i32x4 { 0, td[0], td[1], td[2] }; | ||
ta ^= td; | ||
td = AK::SIMD::i32x4 { 0, td[0], td[1], td[2] }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to be fundamentally different to the 128 version where it redistributes ta
, is this intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It works correctly, unit tests are passing, it is based on that white paper I linked above. If you want, I could rename the variables to be more consistent.
union aligned_buf | ||
{ | ||
unsigned char m_buf[16]; | ||
AK::SIMD::i32x4 m_vec; | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like it could be done better,
will send a diff on discord
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, used the diff.
tc = __builtin_ia32_aeskeygenassist128(tb, 0x08); AESCipherKey_expand_encrypt_key_256_assist_a(ta, tc); AK::SIMD::store_unaligned(&round_key[ 8 * 4], ta); AESCipherKey_expand_encrypt_key_256_assist_b(ta, tb); AK::SIMD::store_unaligned(&round_key[ 9 * 4], tb); | ||
tc = __builtin_ia32_aeskeygenassist128(tb, 0x10); AESCipherKey_expand_encrypt_key_256_assist_a(ta, tc); AK::SIMD::store_unaligned(&round_key[10 * 4], ta); AESCipherKey_expand_encrypt_key_256_assist_b(ta, tb); AK::SIMD::store_unaligned(&round_key[11 * 4], tb); | ||
tc = __builtin_ia32_aeskeygenassist128(tb, 0x20); AESCipherKey_expand_encrypt_key_256_assist_a(ta, tc); AK::SIMD::store_unaligned(&round_key[12 * 4], ta); AESCipherKey_expand_encrypt_key_256_assist_b(ta, tb); AK::SIMD::store_unaligned(&round_key[13 * 4], tb); | ||
tc = __builtin_ia32_aeskeygenassist128(tb, 0x40); AESCipherKey_expand_encrypt_key_256_assist_a(ta, tc); AK::SIMD::store_unaligned(&round_key[14 * 4], ta); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Last round does not get an assist_b
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It works correctly, unit tests are passing, it is based on that white paper I linked above.
#if (ARCH(I386) || ARCH(X86_64)) | ||
static void __attribute__((target("aes"), used)) AESCipher_decrypt_block(AESCipherKey const& key, AESCipherBlock const& in, AESCipherBlock& out) | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe move this up so that this is in the same big #if X86
block
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to keep the encrypt-portable and encrypt-accelerated functions close together, same for decrypt-portable and decrypt-accelerated, key-expand-portable and key-expand-accelerated and so on. Other ordering would be to do all the portable functions followed by all the accelerated functions. Which ordering are you preffering?
Hello! One or more of the commit messages in this PR do not match the SerenityOS code submission policy, please check the |
Related to: