8000 Improve performance of the backend by retronym · Pull Request #5820 · scala/scala · GitHub
[go: up one dir, main page]

Skip to content

Improve performance of the backend #5820

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Apr 21, 2017
Merged

Conversation

retronym
Copy link
Member
@retronym retronym commented Apr 3, 2017

In aggregate, I measure a 3.5% improvement from this PR.

retronym added 6 commits April 3, 2017 14:44
  - Record the entry for the RHS of the DefDef is a dedicated
    field to avoid immediately looking it up in a hash map after
    traversal
  - Use an AnyRefMap to avoid BoxesRuntime hashCode/equals
  - Use getOrElse rather than withDefaultValue to profit from a fast
    path in AnyRefMap.
The existing implementation pessimistically checks that all
parent directories of the about-to-be-written class file are
indeed directories.

This commit bypasses this logic for the common case of writing
to a regular directory on disk, and optimistically assumes that
the parent directory exists. If an exception is thrown during
writing, it attempts to create the parent directory.

This still avoids a compiler crash if a parent directory is
actually a file, which is tested by the existing test,
`run/t5717.scala`.
This is a hot method in the backend, and we save some cycles by
avoiding BoxesRuntime.
The backend uses this string as a key to the map of BTypes,
and as such needs to call `javaBinaryNameString` for each method
or field reference in the code. Even though we have previously
optimized the creation of this string by bypassing the Name
abstraction and by correctly sizing string builders, we can
still speed things up by caching the resulting String on its
ClassSymbol.
These assertions don't seem to pay their way anymore, the surrounding
implementation of the backend has matured, and they involve collections
operations that are too costly to be called in such a hot path in the backend.
Thread a single StringBuilder through the component's stringification
methods, rather than using string concat (ie, separates StringBuilders)
at each level.
@scala-jenkins scala-jenkins added this to the 2.12.3 milestone Apr 3, 2017
@@ -461,10 +461,10 @@ abstract class BCodeSkelBuilder extends BCodeHelpers {
locals.reset(isStaticMethod = methSymbol.isStaticMember)
jumpDest = immutable.Map.empty[ /* LabelDef */ Symbol, asm.Label ]
// populate labelDefsAtOrUnder
val ldf = new LabelDefsFinder
val ldf = new LabelDefsFinder(dd.rhs)
ldf.traverse(dd.rhs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could be .apply() now

try {
Files.write(outfile.file.toPath, jclassBytes)
} catch {
case _: java.nio.file.NoSuchFileException =>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it be too slow to check java.nio.file.Files.exists(p.getParent)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found some small but measurable improvement from this approach, as there are typically many classfiles per directory, which makes the optimism pays off. I could split this into two commits so that we could reproduce that measurement.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that there is a better solution in #5815, that will only check a given directory once, and will not call createDirectories unless it the first time that the directory is reached

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lrytz each stat call is expensive in windows particually

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the approach of caching the results of file stats, but opted not to do that in this PR because it takes a bit of care to get the lifecycle of that cache sorted out (we need to start fresh on a new compiler run, but the current design of ByteCodeWriters didn't have a hook for that part of the lifecycle.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@retronym it is hooked in this my PR as the OutputDirectories are created new for each run

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have split out an alternative as #5834

will run performance test overnight and post up is successful

@@ -108,8 +108,6 @@ class BTypesFromSymbols[G <: Global](val global: G) extends BTypes {

assert(classSym != NoSymbol, "Cannot create ClassBType from NoSymbol")
assert(classSym.isClass, s"Cannot create ClassBType from non-class symbol $classSym")
assertClassNotArrayNotPrimitive(classSym)
assert(!primitiveTypeToBType.contains(classSym) || isCompilingPrimitive, s"Cannot create ClassBType for primitive class symbol $classSym")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit sad to see those go.. They were helpful for finding bugs, and they are also useful documentation. Can we put them under a flag or elide them? Or even leave them as comments.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll resurrect under -Xdev (which is also enabled by -Ydebug)

@retronym
Copy link
Member Author
retronym commented Apr 3, 2017
Internal error when running tests: sbt.ForkMain$Run$RunAborted: java.net.SocketException: Broken pipe (Write failed)
[error] Error: Total 0, Failed 0, Errors 0, Passed 0

@retronym
Copy link
Member Author
retronym commented Apr 3, 2017

/rebuild

case ClassBType(internalName) => "L" + internalName + ";"
case ArrayBType(component) => "[" + component
case MethodBType(args, res) => "(" + args.mkString + ")" + res
final override def toString: String = {
Copy link
Contributor
@mkeskells mkeskells Apr 4, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is non optimal still

There is a better version I think in https://github.com/scala/scala/pull/5815/files#diff-61481becb11478d78c0b577ecad981aaR320

that version caches the result and shares the (java) StringBuilder when building a compound type, and avoid builder generation for primative types

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mkeskells it would be good to extract small, self-contained pieces of #5815 that can be submitted as individual PRs. This will make reviewing much easier and also avoid duplicate work.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caching all the intermediate strings is an interesting idea. Usually I'd be a bit cautious about that for a recursive structure, as it requires exponential space wrt the depth of the tree. BTypes, however, only have "deep" trees for higher dimensional arrays, so this might not be a practical concern here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@retronym should be easy to bake off as we have the profiler and this is a measurable part of the backend CPU, and we have stable allocation counters

Not sure if the allocation counters are in the version that you have

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lrytz I will be working on that but I also have a day job. I was trying to get more tests to pass

I have seperately opened a discussion with @adriaanm about this coordination

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lrytz split out several PRs for this #5831 #5834 #5833 #5832

haven't test them or got he performance figures - will run overnight and post up if successful

@adriaanm
Copy link
Contributor
adriaanm commented Apr 5, 2017 via email

Copy link
Member
@lrytz lrytz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM, but it overlaps with @mkeskells' #5831 (BType descriptor generation) and #5834 (Classfile writing IO). These PRs are more comprehensive than the corresponding changes here. We can either remove the two commits here or rebase the other PRs.

@retronym
Copy link
Member Author

My impression is that both of those PRs are worth pursuing, but that the initial versions herein will be a good starting point (simpler code changes / deliver much of the performance improvement), so it would make sense to merge this and rebase the other changes. That way, if we have to revert the more involved changes due to a regression or undesirable performance consequence, we'd be back to square 2, instead of square 1.

@lrytz
Copy link
Member
lrytz commented Apr 19, 2017

Sounds good to me.

@retronym retronym merged commit 3009880 into scala:2.12.x Apr 21, 2017
@adriaanm adriaanm added the performance the need for speed. usually compiler performance, sometimes runtime performance. label May 25, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance the need for speed. usually compiler performance, sometimes runtime performance.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants
0