Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update test and benchmark generation for blockwise kr>2 #6557

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

GregoryComer
Copy link
Contributor

@GregoryComer GregoryComer commented Jun 12, 2024

This PR updates test generation for blockwise (qb4w) kernels in preparation for ISA-specific kernels with kr > 2. Blockwise kernels currently enforce several constraints:

  1. Kc is divisible by block size (no partial blocks in a column).
  2. Block size is divisible by 2*kr*sr (inner loop is 2x unrolled for 2 planes, no partial epilogue).

The majority of the existing gemm tests cover edge cases that are incompatible with the above constraints, so this PR conditionally disables the irrelevant tests for blockwise kernels. Blockwise kernels still have tests for strides, M subtiles, N subtiles, and varying block/k sizes.

Blockwise tests also benefit from linear stepping of nc and kc. Existing test logic always calls NextPrime after incrementing nc and kc values. To allow for linear stepping in certain blockwise tests while also maintaining the existing behavior, I've tentatively added a step_type parameter to LoopParams, allowing for either linear or prime steps. The default value for step_type is set in GemmTestParams to maintain the existing behavior of prime steps for nc and kc.

Additionally, benchmark logic for blockwise kernels now rounds kc up to a multiple of 2*kr*sr.

All tests were regenerated via scripts/generate-tests.sh.

struct LoopParams {
LoopParams() = default;
explicit LoopParams(size_t from, size_t to, size_t step)
: is_set(true), from(from), to(to), step(step) {}
explicit LoopParams(size_t from, size_t to, size_t step, LoopStepType step_type)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If preferred, I could change this to a boolean step_prime variable or similar. I figured the enum class approach is slightly more readable, but does feel less idiomatic to the codebase. Feel free to weigh in on this if you prefer it differently.

@@ -549,23 +583,3 @@ struct GemmTestParams {
};

using GemmTest = testing::TestWithParam<GemmTestParams>;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above comment - methods are moved to the top of the header.

@@ -236,93 +236,52 @@ def split_ukernel_name(name):
, test_func, isa_check)
.loop_n(1, nr)
.loop_m(1, mr));
if (k_block > 1) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The diff looks messy, but all that actually changed here is that block of tests that are incompatible with the blockwise kernel assumptions are wrapped in the $if KERNELTYPE not in ['qb4w']. I did re-generate all tests via scripts/generate-tests.sh and there are no functional changes to existing tests.

@GregoryComer GregoryComer marked this pull request as ready for review June 12, 2024 09:25
@GregoryComer
Copy link
Contributor Author

@alankelly Hey Alan, could you take a look at this PR when you have time? Once this is merged, we can start posting the ISA-specific qb4w kernels. Thanks.

@alankelly
Copy link
Collaborator

@gonnet Can you please review?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants