8000 Reduce the overhead of specialization by retronym · Pull Request #5812 · scala/scala · GitHub
[go: up one dir, main page]

Skip to content

Reduce the overhead of specialization #5812

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Apr 21, 2017

Conversation

retronym
Copy link
Member
@retronym retronym commented Mar 29, 2017

On my machine, running the benchmark sbt 'hot -psource=' (jointly compiling the sources of scalap and better-files) shows 7.5% improvement over f5ce29b (the parent of this PR).

[info] Benchmark                                   (extraArgs)  (source)    Mode  Cnt     Score   Error  Units
[info] HotScalacBenchmark.compile                                         sample  210  1465.140 ± 4.608  ms/op

[info] Benchmark                                   (extraArgs)  (source)    Mode  Cnt     Score   Error  Units
[info] HotScalacBenchmark.compile                                         sample  240  1355.127 ± 4.500  ms/op

Per-commit performance attribution is available in this spreadsheet for the corresponding commits in the larger branch in #5785.

@scala-jenkins scala-jenkins added this to the 2.12.2 milestone Mar 29, 2017
@retronym retronym modified the milestones: 2.12.3, 2.12.2 Mar 29, 2017
@retronym
Copy link
Member Author

(Assigning to 2.12.3, as I don't want to rush these changes into 2.12.2)

@retronym retronym requested a review from lrytz March 29, 2017 06:40
Any types that needed to be specialized to support callsites in the current
run would have already been info transformed during the specalization
tree transform of those call sites.

The backend requires further type information, e.g, to know about
inner/enclosing class relationships. This involves calls to `sym.info`
for classes on the classpath that haven't yet been info transformed.

During that process, all base classes of such types are also info
transformed. The specialization info transformer for classes then
looks at the members of the classes to add specialialized variants.

This is undesirable on grounds of performance and the risk of
encountering stub symbols (references to types absent from the
current compilation classpath) which can manifest as compiler crashes.
We know that `subst(tp1) <:< subst(tp2)` a priori (and cheaply!)
if `tp2` is `Any`, which is commonly the case.
Reworks e4b5c00 to perform the flag mutation once per run,
at the conclusion of the specialization tree transform, rather
than once per compilation unit. The old approach was O(NxM), where
N is the number of compilation units and M is the number of
specialized overloads.
Most commonly, this method will return an empty set. This commit focuses on
making that happen with a minimum of garbage, indirection, and info forcing.

  - Use mutable buffers to collect results, rather than appending sets
  - Avoid forcing the specialization info transform on all referenced
    types just to see if they have specialzied type parmeteters, we can
    phase travel back to typer to lookup this.
@retronym retronym force-pushed the faster/specialization branch from e877143 to e416a25 Compare March 30, 2017 00:23
@retronym
Copy link
Member Author
retronym commented Mar 30, 2017

##### Log file '/home/jenkins/workspace/scala-2.12.x-validate-test/test/files/jvm/future-spec-jvm.log' from failed test #####

warning: there was one deprecation warning (since 2.11.0)
warning: there were 19 deprecation warnings (since 2.12.0)
warning: there were 20 deprecation warnings in total; re-run with -deprecation for details
A future with custom ExecutionContext should:
- shouldHandleThrowables
[FAILED] java.lang.AssertionError: assertion failed: 2 is not 4
MinimalScalaTest$$anon$1.mustBe(main.scala:68)
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
java.lang.reflect.Method.invoke(Method.java:498)
FutureTests.$anonfun$new$2(FutureTests.scala:79)

image

The pattern of failures in this PR suggests this is non-deterministic. So the root cause could be a change in this PR, or this could be a latent problem that appears under load. I haven't been able to reproduce on my machine yet.

I've rebased this PR on a commit that adds a better diagnostic to that test case, let's see what happens...

Copy link
Member
@lrytz lrytz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work!!

@@ -198,6 +198,13 @@ abstract class SpecializeTypes extends InfoTransform with TypingTransformers {
override def newPhase(prev: scala.tools.nsc.Phase): StdPhase = new SpecializationPhase(prev)
class SpecializationPhase(prev: scala.tools.nsc.Phase) extends super.Phase(prev) {
override def checkable = false 8000
override def run(): Unit = {
super.run()
exitingSpecialize {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe add a line here // see comment in transformInfo

* - naked
* - as arguments to type constructors in @specialized positions
* (arrays are considered as Array[@specialized T])
*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe remove that duplicated doc

@lrytz
Copy link
Member
lrytz commented Mar 30, 2017

Not sure what to make of this test failure that you saw earlier..

@retronym
Copy link
Member Author
retronym commented Mar 30, 2017 via email

@retronym
Copy link
Member Author

The only bytecode diff I see when bootstrapping through this PR is one from a previous change (#5631) to the compiler.

⚡ git show
commit 532c7ee2d20a02fbd72aab6cd10a36d51a2698d1
Author: Jason Zaugg <jzaugg@gmail.com>
Date:   Tue Apr 18 15:42:38 2017 +1000

    Changed

diff --git a/scala/tools/nsc/interpreter/Scripted$Factory.class.asm b/scala/tools/nsc/interpreter/Scripted$Factory.class.asm
index 451824e..4750c5a 100644
--- a/scala/tools/nsc/interpreter/Scripted$Factory.class.asm
+++ b/scala/tools/nsc/interpreter/Scripted$Factory.class.asm
@@ -203,8 +203,8 @@ public class scala/tools/nsc/interpreter/Scripted$Factory implements javax/scrip
     MAXSTACK = 8
     MAXLOCALS = 4

-  // access flags 0x1001
-  public synthetic getMethodCallSyntax(Ljava/lang/String;Ljava/lang/String;[Ljava/lang/String;)Ljava/lang/String;
+  // access flags 0x1
+  public getMethodCallSyntax(Ljava/lang/String;Ljava/lang/String;[Ljava/lang/String;)Ljava/lang/String;
     // parameter final  obj
     // parameter final  m
     // parameter final  args
@@ -344,8 +344,8 @@ public class scala/tools/nsc/interpreter/Scripted$Factory implements javax/scrip
     MAXSTACK = 4
     MAXLOCALS = 2

-  // access flags 0x1001
-  public synthetic getProgram([Ljava/lang/String;)Ljava/lang/String;
+  // access flags 0x1
+  public getProgram([Ljava/lang/String;)Ljava/lang/String;
     // parameter final  statements
     ALOAD 0
     GETSTATIC scala/Predef$.MODULE$ : Lscala/Predef$;

@retronym
Copy link
Member Author

/synch

@retronym
Copy link
Member Author

Not sure what to make of this test failure that you saw earlier..

@lrytz I'm going to merge this based on the identical bytecode after bootstrap. But let's keep an eye out for that test failure afterwards.

@retronym retronym merged commit 81459e4 into scala:2.12.x Apr 21, 2017
@jvican
Copy link
Member
jvican commented Apr 24, 2017

Thank you for this excellent performance catch @retronym.

@adriaanm adriaanm added the performance the need for speed. usually compiler performance, sometimes runtime performance. label May 25, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance the need for speed. usually compiler performance, sometimes runtime performance.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants
0