-
Notifications
You must be signed in to change notification settings - Fork 3.1k
Changed HashMap.getOrElseUpdate to only calculate the index once #5528
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 1 commit
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
Fixes https://issues.scala-lang.org/browse/SI-10049 Since `groupBy` uses this method extensively and suffered a measurable slowdown in `2.12.0`, this modification restores (and exceeds) its original speed. --- included benchmarks: (`ns/op` → smaller is better) `before (2.12.0):` ```java Benchmark (size) Mode Cnt Score Error Units s.c.immutable.VectorMapBenchmark.groupBy 10 avgt 20 865.693 ± 7.869 ns/op s.c.immutable.VectorMapBenchmark.groupBy 100 avgt 20 3095.657 ± 56.438 ns/op s.c.immutable.VectorMapBenchmark.groupBy 1000 avgt 20 28247.005 ± 470.513 ns/op s.c.mutable.HashMapBenchmark.get 10 avgt 20 679.448 ± 11.809 ns/op s.c.mutable.HashMapBenchmark.get 100 avgt 20 7240.178 ± 61.734 ns/op s.c.mutable.HashMapBenchmark.get 1000 avgt 20 95725.127 ± 2373.458 ns/op s.c.mutable.HashMapBenchmark.getOrElseUpdate 10 avgt 20 836.561 ± 20.085 ns/op s.c.mutable.HashMapBenchmark.getOrElseUpdate 100 avgt 20 7891.368 ± 56.808 ns/op s.c.mutable.HashMapBenchmark.getOrElseUpdate 1000 avgt 20 97478.629 ± 1782.497 ns/op s.c.mutable.HashMapBenchmark.put 10 avgt 20 243.422 ± 2.915 ns/op s.c.mutable.HashMapBenchmark.put 100 avgt 20 5810.927 ± 60.054 ns/op s.c.mutable.HashMapBenchmark.put 1000 avgt 20 82175.539 ± 1690.296 ns/op ``` `after:` ```java Benchmark (size) Mode Cnt Score Error Units s.c.immutable.VectorMapBenchmark.groupBy 10 avgt 20 627.007 ± 9.718 ns/op s.c.immutable.VectorMapBenchmark.groupBy 100 avgt 20 2086.955 ± 19.042 ns/op s.c.immutable.VectorMapBenchmark.groupBy 1000 avgt 20 19515.234 ± 173.647 ns/op s.c.mutable.HashMapBenchmark.get 10 avgt 20 683.977 ± 11.843 ns/op s.c.mutable.HashMapBenchmark.get 100 avgt 20 7345.675 ± 41.092 ns/op s.c.mutable.HashMapBenchmark.get 1000 avgt 20 95085.926 ± 1702.997 ns/op s.c.mutable.HashMapBenchmark.getOrElseUpdate 10 avgt 20 503.208 ± 2.643 ns/op s.c.mutable.HashMapBenchmark.getOrElseUpdate 100 avgt 20 5526.483 ± 28.262 ns/op s.c.mutable.HashMapBenchmark.getOrElseUpdate 1000 avgt 20 69265.900 ± 674.958 ns/op s.c.mutable.HashMapBenchmark.put 10 avgt 20 252.481 ± 7.597 ns/op s.c.mutable.HashMapBenchmark.put 100 avgt 20 5708.034 ± 110.360 ns/op s.c.mutable.HashMapBenchmark.put 1000 avgt 20 82051.378 ± 1432.009 ns/op ``` i.e. for the given benchmark conditions `~40%` faster `groupBy` and `getOrElseUpdate`
- Loading branch information
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -72,6 +72,37 @@ extends AbstractMap[A, B] | |
else Some(e.value) | ||
} | ||
|
||
override def getOrElseUpdate(key: A, defaultValue: => B): B = { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
|
||
val i = index(elemHashCode(key)) | ||
val entry = findEntry(key, i) | ||
if (entry != null) entry.value | ||
else addEntry(createNewEntry(key, defaultValue), i) | ||
} | ||
|
||
/* inlined HashTable.findEntry0 to preserve its visibility */ | ||
private[this] def findEntry(key: A, h: Int): Entry = { | ||
var e = table(h).asInstanceOf[Entry] | ||
while (notFound(key, e)) | ||
e = e.next | ||
e | ||
} | ||
private[this] def notFound(key: A, e: Entry): Boolean = (e != null) && !elemEquals(e.key, key) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. extracted to make |
||
|
||
/* inlined HashTable.addEntry0 to preserve its visibility */ | ||
private[this] def addEntry(e: Entry, h: Int): B = { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
|
||
if (tableSize >= threshold) addEntry(e) | ||
else addEntry0(e, h) | ||
e.value | ||
} | ||
|
||
/* extracted to make addEntry inlinable */ | ||
private[this] def addEntry0(e: Entry, h: Int) { | ||
e.next = table(h).asInstanceOf[Entry] | ||
table(h) = e | ||
tableSize += 1 | ||
nnSizeMapAdd(h) | ||
} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This implementation looks fine, but what about using There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The idea is very good, but we don't have a The other alternative would be to call override def getOrElseUpdate(key: A, defaultValue: => B): B = {
val e = findOrAddEntry(key, defaultValue)
if (e ne null) e.value else defaultValue
} There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Correct. Also, we mustn't evaluate There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Oops, didn't read the signatures correctly. It's not evaluating twice which is a problem; it's evaluating more than zero times when the entry exists. You're right; we can't do it this way. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks! |
||
|
||
override def put(key: A, value: B): Option[B] = { | ||
val e = findOrAddEntry(key, value) | ||
if (e eq null) None | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case the element was missing from the map, the hash code (with some added variance) was calculated first for verification, and again for the
put
operation.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this is a binary-compatible change. @retronym - is this? I would not intuitively have thought it would be with the new trait encoding, but I'm not sure of the new rules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Ichoran It should be binary compatible. A class gets a forwarder method that delegates to the static implementation method of the interface. Adding an override merely replaces the implementation of the class method without any changes to the signatures. I'll add a proper test case for this to MiMa.