8000 Changed HashMap.getOrElseUpdate to only calculate the index once by l0rinc · Pull Request #5528 · scala/scala · GitHub
[go: up one dir, main page]

Skip to content

Changed HashMap.getOrElseUpdate to only calculate the index once #5528

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Nov 22, 2016
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Changed HashMap.getOrElseUpdate to only calculate the index once
Fixes https://issues.scala-lang.org/browse/SI-10049

Since `groupBy` uses this method extensively and suffered a measurable slowdown in `2.12.0`, this modification restores (and exceeds) its original speed.

---
included benchmarks:

(`ns/op` → smaller is better)

`before (2.12.0):`
```java
Benchmark                                     (size)  Mode  Cnt      Score      Error  Units
s.c.immutable.VectorMapBenchmark.groupBy          10  avgt   20    865.693 ±    7.869  ns/op
s.c.immutable.VectorMapBenchmark.groupBy         100  avgt   20   3095.657 ±   56.438  ns/op
s.c.immutable.VectorMapBenchmark.groupBy        1000  avgt   20  28247.005 ±  470.513  ns/op
s.c.mutable.HashMapBenchmark.get                  10  avgt   20    679.448 ±   11.809  ns/op
s.c.mutable.HashMapBenchmark.get                 100  avgt   20   7240.178 ±   61.734  ns/op
s.c.mutable.HashMapBenchmark.get                1000  avgt   20  95725.127 ± 2373.458  ns/op
s.c.mutable.HashMapBenchmark.getOrElseUpdate      10  avgt   20    836.561 ±   20.085  ns/op
s.c.mutable.HashMapBenchmark.getOrElseUpdate     100  avgt   20   7891.368 ±   56.808  ns/op
s.c.mutable.HashMapBenchmark.getOrElseUpdate    1000  avgt   20  97478.629 ± 1782.497  ns/op
s.c.mutable.HashMapBenchmark.put                  10  avgt   20    243.422 ±    2.915  ns/op
s.c.mutable.HashMapBenchmark.put                 100  avgt   20   5810.927 ±   60.054  ns/op
s.c.mutable.HashMapBenchmark.put                1000  avgt   20  82175.539 ± 1690.296  ns/op
```
`after:`
```java
Benchmark                                     (size)  Mode  Cnt      Score      Error  Units
s.c.immutable.VectorMapBenchmark.groupBy          10  avgt   20    627.007 ±    9.718  ns/op
s.c.immutable.VectorMapBenchmark.groupBy         100  avgt   20   2086.955 ±   19.042  ns/op
s.c.immutable.VectorMapBenchmark.groupBy        1000  avgt   20  19515.234 ±  173.647  ns/op
s.c.mutable.HashMapBenchmark.get                  10  avgt   20    683.977 ±   11.843  ns/op
s.c.mutable.HashMapBenchmark.get                 100  avgt   20   7345.675 ±   41.092  ns/op
s.c.mutable.HashMapBenchmark.get                1000  avgt   20  95085.926 ± 1702.997  ns/op
s.c.mutable.HashMapBenchmark.getOrElseUpdate      10  avgt   20    503.208 ±    2.643  ns/op
s.c.mutable.HashMapBenchmark.getOrElseUpdate     100  avgt   20   5526.483 ±   28.262  ns/op
s.c.mutable.HashMapBenchmark.getOrElseUpdate    1000  avgt   20  69265.900 ±  674.958  ns/op
s.c.mutable.HashMapBenchmark.put                  10  avgt   20    252.481 ±    7.597  ns/op
s.c.mutable.HashMapBenchmark.put                 100  avgt   20   5708.034 ±  110.360  ns/op
s.c.mutable.HashMapBenchmark.put                1000  avgt   20  82051.378 ± 1432.009  ns/op
```

i.e. for the given benchmark conditions `~40%` faster `groupBy` and `getOrElseUpdate`
  • Loading branch information
l0rinc committed Nov 18, 2016
commit b67ca7dc6bb84758f9c9f64d68b0b11c20995aa0
31 changes: 31 additions & 0 deletions src/library/scala/collection/mutable/HashMap.scala
Original file line number Diff line number Diff line change
Expand Up @@ -72,6 +72,37 @@ extends AbstractMap[A, B]
else Some(e.value)
}

override def getOrElseUpdate(key: A, defaultValue: => B): B = {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case the element was missing from the map, the hash code (with some added variance) was calculated first for verification, and again for the put operation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this is a binary-compatible change. @retronym - is this? I would not intuitively have thought it would be with the new trait encoding, but I'm not sure of the new rules.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Ichoran It should be binary compatible. A class gets a forwarder method that delegates to the static implementation method of the interface. Adding an override merely replaces the implementation of the class method without any changes to the signatures. I'll add a proper test case for this to MiMa.

Sorry, something went wrong.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

scala.collection.mutable.HashMap::getOrElseUpdate (48 bytes) inline (hot)

val i = index(elemHashCode(key))
val entry = findEntry(key, i)
if (entry != null) entry.value
else addEntry(createNewEntry(key, defaultValue), i)
}

/* inlined HashTable.findEntry0 to preserve its visibility */
private[this] def findEntry(key: A, h: Int): Entry = {
var e = table(h).asInstanceOf[Entry]
while (notFound(key, e))
e = e.next
e
}
private[this] def notFound(key: A, e: Entry): Boolean = (e != null) && !elemEquals(e.key, key)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extracted to make findEntry inlinable (and the method more readable):
scala.collection.mutable.HashMap::findEntry (32 bytes) inline (hot)


/* inlined HashTable.addEntry0 to preserve its visibility */
private[this] def addEntry(e: Entry, h: Int): B = {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

scala.collection.mutable.HashMap::addEntry (30 bytes) inline (hot)

if (tableSize >= threshold) addEntry(e)
else addEntry0(e, h)
e.value
}

/* extracted to make addEntry inlinable */
private[this] def addEntry0(e: Entry, h: Int) {
e.next = table(h).asInstanceOf[Entry]
table(h) = e
tableSize += 1
nnSizeMapAdd(h)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implementation looks fine, but what about using findOrAddEntry in HashTable? Does that still show the same speed improvement?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is very good, but we don't have a value in HashTable's Entry, only key and next.

The other alternative would be to call findOrAddEntry directly, but then defaultValue would be evaluated twice, which should probably be avoided, right?

override def getOrElseUpdate(key: A, defaultValue: => B): B = {
  val e = findOrAddEntry(key, defaultValue)
  if (e ne null) e.value else defaultValue
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

which should probably be avoided, right?

Correct. Also, we mustn't evaluate defaultValue if the key already exists.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, didn't read the signatures correctly. It's not evaluating twice which is a problem; it's evaluating more than zero times when the entry exists. You're right; we can't do it this way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!
In this particular case I think it cannot be evaluated once, only zero or two times (doesn't change the fact).


override def put(key: A, value: B): Option[B] = {
val e = findOrAddEntry(key, value)
if (e eq null) None
Expand Down
0