-
-
Notifications
You must be signed in to change notification settings - Fork 806
ICU-23061 LowercaseTransliterator has poor scalability due to synchronized state #3417
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
ICU-23061 LowercaseTransliterator has poor scalability due to synchronized state #3417
Conversation
95a0533
to
2af99d6
Compare
Notice: the branch changed across the force-push!
~ Your Friendly Jira-GitHub PR Checker Bot |
2af99d6
to
2197adb
Compare
Notice: the branch changed across the force-push!
~ Your Friendly Jira-GitHub PR Checker Bot |
2197adb
to
f332c3c
Compare
Hooray! The files in the branch are the same across the force-push. 😃 ~ Your Friendly Jira-GitHub PR Checker Bot |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It all looks good to me.
NOTE: I can't assess the additional dependencies; someone else should check that they look secure.
f332c3c
to
886f66f
Compare
Notice: the branch changed across the force-push!
~ Your Friendly Jira-GitHub PR Checker Bot |
I needed to add the license header to the performance test. |
886f66f
to
268eab2
Compare
Notice: the branch changed across the force-push!
~ Your Friendly Jira-GitHub PR Checker Bot |
I removed the |
The change itself looks reasonable, assuming the performance advantages are as billed. I can't speak to the POM changes or the performance test, as that's not stuff I'm really familiar with. Usually, bug fixes in ICU need to be done in both C++ and Java when appropriate. Do we have a similar problem on the C++ side? If so, I'd like to see a fix for that here too. |
These changes remind me of something. I think this Java lowercase transliterator still uses the locale. The ICU4C implementation removed that long ago because of how the registry works with sharing the singleton objects. Though the implementation in Java hard codes it to be US English casing only. So it's not much different. I thought the transliterator casing functionality in ICU4C was doing something funny with thread safety long ago, but I can't seem to find the concerning code anymore. Perhaps it was removed, or I missed it. |
The introduction of JMH is optional, and if the project maintainers don't want to introduce it, I can remove these changes. |
We really need one or more people from the ICU4J side of things to comment here... |
3463237
to
94ec9ff
Compare
Notice: the branch changed across the force-push!
~ Your Friendly Jira-GitHub PR Checker Bot |
I rebased and resolved conflicts. |
Anything I should do to move this PR forward? If the JMH benchmark is acceptable, I can squash the commits together. I didn't want to do that until we confirmed it would be accepted. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some comments.
Thank you very much!
Mihai
icu4j/main/translit/src/main/java/com/ibm/icu/text/LowercaseTransliterator.java
Show resolved
Hide resolved
icu4j/perf-tests/src/main/java/com/ibm/icu/dev/test/perf/LowercaseTransliteratorPerf.java
Outdated
Show resolved
Hide resolved
icu4j/perf-tests/src/main/java/com/ibm/icu/dev/test/perf/LowercaseTransliteratorPerf.java
Outdated
Show resolved
Hide resolved
94ec9ff
to
c76e791
Compare
Notice: the branch changed across the force-push!
~ Your Friendly Jira-GitHub PR Checker Bot |
icu4j/main/translit/src/main/java/com/ibm/icu/text/BreakTransliterator.java
Show resolved
Hide resolved
icu4j/main/translit/src/main/java/com/ibm/icu/text/BreakTransliterator.java
Show resolved
Hide resolved
c76e791
to
9bb5d33
Compare
Notice: the branch changed across the force-push!
~ Your Friendly Jira-GitHub PR Checker Bot |
icu4j/main/translit/src/main/java/com/ibm/icu/text/BreakTransliterator.java
Show resolved
Hide resolved
@@ -416,4 +416,7 @@ public void addSourceTargetSet(UnicodeSet inputFilter, UnicodeSet sourceSet, Uni | |||
} | |||
} | |||
|
|||
private static class WordBreakIteratorHolder { | |||
static fin 8000 al BreakIterator BI = BreakIterator.getWordInstance(new ULocale("th_TH")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, using the Thai break iterator is generally pointless. It might load the Thai dictionary preemptively, but using ULocale.ROOT would generally be more appropriate.
int[] boundaries = new int[50]; | ||
int boundaryCount = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is being way overoptimized, and it probably exists due to ICU's long history. Now that ArrayList exists in Java, which is not synchronized when compared to the older Vector implementation, this should probably transition to ArrayList.
This change may affect the garbage collector. I think the initial capacity is 10 for an ArrayList, but we could guess an initial capacity of an ArrayList with something like ensureCapacity(Math.max(text.length()/CODEUNITS_PER_WORD, 4))
where CODEUNITS_PER_WORD could be 8 as a good starting point. CODEUNITS_PER_WORD could be 4 if it's reallocating too much, or the minimum capacity could be raised.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One concern is that ArrayList<Integer>
would also need to box the primitive values into wrapper objects, with an additional performance penalty. The existing primitive int[]
does not have this problem
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
icu4j/main/translit/src/main/java/com/ibm/icu/text/BreakTransliterator.java
Show resolved
Hide resolved
9bb5d33
to
9715cf6
Compare
Notice: the branch changed across the force-push!
~ Your Friendly Jira-GitHub PR Checker Bot |
I re-ran the benchmarks with the latest changes. synchronized, 1T:
local-alloc, 1T:
local-alloc, 8T:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much for the updates.
I only have a few comments, hopefully non-controversial.
icu4j/perf-tests/src/main/java/com/ibm/icu/dev/test/perf/BreakTransliteratorPerf.java
Outdated
Show resolved
Hide resolved
icu4j/perf-tests/src/main/java/com/ibm/icu/dev/test/perf/LowercaseTransliteratorPerf.java
Outdated
Show resolved
Hide resolved
icu4j/perf-tests/src/main/java/com/ibm/icu/dev/test/perf/TitlecaseTransliteratorPerf.java
Outdated
Show resolved
Hide resolved
icu4j/perf-tests/src/main/java/com/ibm/icu/dev/test/perf/UppercaseTransliteratorPerf.java
Outdated
Show resolved
Hide resolved
icu4j/perf-tests/src/main/java/com/ibm/icu/text/BreakTransliteratorAccess.java
Show resolved
Hide resolved
…ention Current implementation of lower, upper, title, etc synchronizes the whole operation, so it is always single-threaded Removing the shared state lets us utilize multiple CPU cores
9715cf6
to
bef0d2a
Compare
Notice: the branch changed across the force-push!
~ Your Friendly Jira-GitHub PR Checker Bot |
I made the suggested renamings. Please let me know what you would like to do with the BreakTransliterator non-public access case. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much!
LowercaseTransliterator tries to reuse buffers to avoid allocation. Historically, this may have been an important optimization.
However on modern machines you are much more likely to use multiple CPUs, and modern JVMs are very efficient at GC, especially for local temporary data. Contending on a monitor is a big deal nowadays, especially for something so low level and hidden.
I've prepared a benchmark using JMH to show the change in scalability:
The current implementation exhibits negative scalability: adding more threads hurts throughput (~5% loss going from 1T to 8T)
By removing the
synchronized
state, the implementation now scales perfectly (~8x increase going from 1T to 8T for the longer test case)There is a very minor (~1%) penalty for pure single-threaded throughput since we no longer reuse state.
I believe this is acceptable in the context of allowing multiple threads to work without coordination.
Checklist