BLAS compatibility library #7
I've just added a unit test (currently failing) for SYRK, although we probably don't want to run it with the regular unit tests because it requires a hardware kernel. Do you have any suggestions here?
You can see what I did for GEMM: it's a test, but not a unit test.
There's a nice routine to search for the kernel at
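To illustrate the idea of keeping a hardware-dependent test out of the regular unit test run, one option is to look for the kernel binary first and skip the test when nothing is found. The following is only a sketch under stated assumptions: `FindKernel`, `ShouldRunHardwareTest`, and the candidate paths are hypothetical names chosen for illustration, not the routine referred to above.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper (not the repository's routine): returns the first
// candidate path that can be opened for reading, or an empty string if
// none of the candidates exists.
std::string FindKernel(const std::vector<std::string> &candidates) {
  for (const auto &path : candidates) {
    std::ifstream file(path, std::ios::binary);
    if (file.good()) {
      return path;
    }
  }
  return "";
}

// Hypothetical gate for a hardware test: run only when a kernel was found.
bool ShouldRunHardwareTest(const std::vector<std::string> &candidates) {
  return !FindKernel(candidates).empty();
}
```

A hardware test could call `ShouldRunHardwareTest` at the top and return early (or report a skip) when it yields `false`, so the regular unit test target never depends on a built kernel.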
There are still several unaddressed comments from last review. Did you miss them, or is it stuff you want to postpone?
Co-authored-by: definelicht <definelicht@inf.ethz.ch>
                const int k) {
    #pragma HLS INLINE
  ReadA_N:
    for (int n1 = 0; n1 < kTileSizeN; ++n1) {
      #pragma HLS PIPELINE II = 1
      #pragma HLS LOOP_FLATTEN
      DramLine num[1];
-     num[0] = mem[(n0 * kTileSizeN + n1) * size_k + k];
+     num[0] = mem[((n0 * kTileSizeN + n1) + k * size_n) * kLinesPerNumber];
Suggested change:
- num[0] = mem[((n0 * kTileSizeN + n1) + k * size_n) * kLinesPerNumber];
+ num[0] = mem[k * size_n + n0 * kTileSizeN + n1];
@@ -19,7 +24,7 @@ void ReadAInner(DramLine const *const mem, hlslib::Stream<PackedFloat> &a_to_fee
    for (int i = 0; i < kLinesPerNumber; ++i) {
      #pragma HLS PIPELINE II = 1
      #pragma HLS LOOP_FLATTEN
-     num[i] = mem[((n0 * kTileSizeN + n1) * size_k + k) * kLinesPerNumber + i];
+     num[i] = mem[((n0 * kTileSizeN + n1) + k * size_n) * kLinesPerNumber + i];
Suggested change:
- num[i] = mem[((n0 * kTileSizeN + n1) + k * size_n) * kLinesPerNumber + i];
+ num[i] = mem[(k * size_n + n0 * kTileSizeN + n1) * kLinesPerNumber + i];
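For readers following the index changes to A: the expression in the diff and the suggested one address the same line and differ only in the order of the terms, while both differ from the old expression, which made `k` the fastest-varying index. The sketch below spells out the three forms. The constant values and the function names (`IndexARowMajor`, `IndexAColMajorDiff`, `IndexAColMajorSuggested`, `FormsAgree`) are illustrative assumptions, and reading the new layout as column-major is an interpretation rather than something stated in the thread.

```cpp
// Illustrative values only; the real constants come from the build configuration.
constexpr int kTileSizeN = 4;
constexpr int kLinesPerNumber = 2;

// Old expression: for a fixed row n = n0 * kTileSizeN + n1, consecutive k
// values are adjacent in memory (row-major N x K).
constexpr int IndexARowMajor(int n0, int n1, int k, int size_k, int i) {
  return ((n0 * kTileSizeN + n1) * size_k + k) * kLinesPerNumber + i;
}

// New expression as written in the diff: for a fixed k, consecutive n values
// are adjacent in memory (column-major N x K).
constexpr int IndexAColMajorDiff(int n0, int n1, int k, int size_n, int i) {
  return ((n0 * kTileSizeN + n1) + k * size_n) * kLinesPerNumber + i;
}

// Suggested expression: same address, slowest-varying term written first.
constexpr int IndexAColMajorSuggested(int n0, int n1, int k, int size_n, int i) {
  return (k * size_n + n0 * kTileSizeN + n1) * kLinesPerNumber + i;
}

// Hypothetical check: true if the diff form and the suggested form agree for
// every line of a size_n x size_k matrix (size_n a multiple of kTileSizeN).
bool FormsAgree(int size_n, int size_k) {
  for (int k = 0; k < size_k; ++k) {
    for (int n0 = 0; n0 < size_n / kTileSizeN; ++n0) {
      for (int n1 = 0; n1 < kTileSizeN; ++n1) {
        for (int i = 0; i < kLinesPerNumber; ++i) {
          if (IndexAColMajorDiff(n0, n1, k, size_n, i) !=
              IndexAColMajorSuggested(n0, n1, k, size_n, i)) {
            return false;
          }
        }
      }
    }
  }
  return true;
}
```

Writing the slowest-varying term first, as in the suggestion, makes the layout easier to read off the expression without changing which line is fetched.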
@@ -100,7 +105,7 @@ void ReadBInner(DramLine const *const mem, hlslib::Stream<PackedFloat> &b_to_fee
    for (int i = 0; i < kLinesPerNumber; ++i) {
      #pragma HLS PIPELINE II = 1
      #pragma HLS LOOP_FLATTEN
-     num[i] = mem[(k * size_m + m0 * kTileSizeM + m1) * kLinesPerNumber + i];
+     num[i] = mem[(k + (m0 * kTileSizeM + m1) * size_k) * kLinesPerNumber + i];
Suggested change:
- num[i] = mem[(k + (m0 * kTileSizeM + m1) * size_k) * kLinesPerNumber + i];
+ num[i] = mem[((m0 * kTileSizeM + m1) * size_k + k) * kLinesPerNumber + i];
                const int k) {
    #pragma HLS INLINE
  ReadB_M:
    for (int m1 = 0; m1 < kTileSizeM; ++m1) {
      #pragma HLS PIPELINE II = 1
      #pragma HLS LOOP_FLATTEN
      DramLine num[1];
-     num[0] = mem[k * size_m + m0 * kTileSizeM + m1];
+     num[0] = mem[(k + (m0 * kTileSizeM + m1) * size_k) * kLinesPerNumber];
Suggested change:
- num[0] = mem[(k + (m0 * kTileSizeM + m1) * size_k) * kLinesPerNumber];
+ num[0] = mem[(m0 * kTileSizeM + m1) * size_k + k];
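The B reads change in the mirrored direction: the old index `k * size_m + m` walks a row-major K x M matrix, while the new index `m * size_k + k` (with `m = m0 * kTileSizeM + m1`) fetches the same element from a column-major copy. A small sketch of that relationship follows. It assumes one DRAM line per number, as the `DramLine num[1]` branch suggests, and `IndexBRowMajor`, `IndexBColMajor`, and `ToColumnMajor` are hypothetical helpers used only for illustration.

```cpp
#include <vector>

// Old index: element (k, m) of a row-major K x M matrix.
inline int IndexBRowMajor(int k, int m, int size_m) { return k * size_m + m; }

// New index (suggested form): element (k, m) of a column-major K x M matrix.
inline int IndexBColMajor(int k, int m, int size_k) { return m * size_k + k; }

// Hypothetical helper: repack a row-major K x M matrix into column-major
// order, so that the new index reads the same values the old index did.
std::vector<float> ToColumnMajor(const std::vector<float> &row_major,
                                 int size_k, int size_m) {
  std::vector<float> col_major(row_major.size());
  for (int k = 0; k < size_k; ++k) {
    for (int m = 0; m < size_m; ++m) {
      col_major[IndexBColMajor(k, m, size_k)] =
          row_major[IndexBRowMajor(k, m, size_m)];
    }
  }
  return col_major;
}
```

Host code that previously filled B in row-major order would need a repacking step along these lines, or would need to supply column-major data directly, for the kernel to see the same matrix after this change.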
43d8a88 to 3860e6e
We will want some integration tests for this, but it's probably more important not to let a huge diff pile up.