-
Notifications
You must be signed in to change notification settings - Fork 28
Enable 512-bit vector support #20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the contribution. It looks good to me. I've left one minor comment.
I'm wondering how to test both 256-bit and 512-bit implementations. Currently, only the one that is preferred by the machine where the tests run is being tested. Perhaps adding an option to force the implementation and then configuring CI to run both variants would be a good idea. We can merge your PR, and I'll create a new issue for this. Please let me know what you think about it.
void step(byte[] buffer, int offset, int blockIndex) { | ||
ByteVector chunk0 = ByteVector.fromArray(SPECIES_256, buffer, offset); | ||
ByteVector chunk1 = ByteVector.fromArray(SPECIES_256, buffer, offset + 32); | ||
public void step(byte[] buffer, int offset, int blockIndex) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method can have package-private visibility, while step1
, step2
and finishStep
should be private.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, the method visibilities are fixed.
Regarding testing with both vector widths, what you suggest sounds good.
Normally, SPECIES_PREFERRED equals the widest vector the platform supports. You can override this with a JVM option -XX:UseAVX=2 for 256-bit or -XX:UseAVX=3 for 512-bit. If -XX:UseAVX specifies a vector width not supported by the platform (e.g. UseAVX=3 on a 256-bit platform), a warning is issued and SPECIES_PREFERRED gives the widest vector supported by the platform.
As an exploration, I added the following to my gradle.build file:
task test256(type: Test) {
dependsOn downloadTestData
useJUnitPlatform()
jvmArgs += [
'--add-modules', 'jdk.incubator.vector',
'-Xmx2g',
'-XX:UseAVX=2'
]
}
check.dependsOn test256
On my 512-bit-enable laptop, the 'check' task runs the tests twice, first 'test' with SPECIES_PREFERRED giving SPECIES_512 then 'test256' with SPECIES_PREFERRED giving SPECIES_256. At the moment, however, I'm too new to gradle to know if this is anything like a good way to do things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I've created an issue to address this: #23.
* 512-bit experiment * use preferred species * nwidth_array initial commit * nwidth_array, make classifier array static * WIP comparing nwidth switch vs method handle dispatch of ::step * WIP - nwitdh * candidate PR code * Fix step methods visibility, remove unused MethodHandle imports
Adds support for use of 512-bit vectors. Vector length is determined by ByteVector.SPECIES_PREFERRED. Support for 256-bit and 512-bit vectors is created by overloading existing methods that handle ByteVector chunks. Dispatch in StructuralIndexer::step is done with a switch on the (static final) number of chunks (1 or 2). I also tried this dispatch with a static final MethodHandle; performance was the same.
Results for ParseBenchmark.simdjson on my Linux laptop (Intel i7-1185G7) going from 256-bit to 512-bit show a 12% speedup with twitter data and a 15% speedup with github_events data.