8000 optimise Btypes descriptor generation by mkeskells · Pull Request #5831 · scala/scala · GitHub
[go: up one dir, main page]

Skip to content

optimise Btypes descriptor generation #5831

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

mkeskells
Copy link
Contributor
@mkeskells mkeskells commented Apr 5, 2017

part of #5815 isolated to aid review

performance result look reasonable. The 2 results are for the 2 commits

after 10 90% jvm, no GC
                  RunName	Count	           AllWallMS	                   CPU_MS	                Allocated
              00_baseline	36	  2499.87 [+1.05% -0.95%]	  2472.66 [+1.06% -0.96%]	   602.25 [+1.00% -1.00%]
             14_cacheDesc	38	  2336.37 [+1.03% -0.96%]	  2321.96 [+1.03% -0.96%]	   618.26 [+1.00% -1.00%]
         15_cacheDesc_ASM	37	  2325.06 [+1.03% -0.97%]	  2302.36 [+1.02% -0.97%]	   608.97 [+1.00% -1.00%]

@scala-jenkins scala-jenkins added this to the 2.12.3 milestone Apr 5, 2017
@mkeskells mkeskells force-pushed the 2.12.x_backend_BTypes branch from 96cc3f5 to 29d8466 Compare April 6, 2017 00:08
* generally used by descriptor to help build the descriptor. Allows several descriptors to be build from the same stringbuilder
* @param sb
*/
private[BTypes] def buildDescriptor(sb: java.lang.StringBuilder): Unit
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to use a pattern match here instead of virtual dispatch. With an overload override def descriptor(sb: java.lang.StringBuilder): String in Class/Array/MethodBType this should be possible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what you are getting at
As I see it you either use a pattern match or virtual dispatch
Getting run of the 'this match { case ...} ' pattern in the code aids encapsulation, safety and performance IMHO

Why an overload? They are behaviourly different?

I am turn as to whether the buildDescriptor should return the descriptor or generate i as a side effect. Either way there are 2 effects, and I could not come up with a a better name before my attention moved

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we generally prefer matching on this for sealed data types like BType

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lrytz I think that it is a very bad idea. I did see large amounts or this elsewhere, and it is on my todo list. Is this personal preference or lightbend policy?

My view -

Given that you dont have warnings as fatal, it is fragile

From a perfomance perspective I would expect it to be non-optimal. Has this been benchmarked? I know that @retronym is a dab hand with JMH :-)
I would suggest that we need some figures to compare and determine what the effective costs and benefits are

Virtual dispatch, particularly from abstract classes rather than traits is very fast. Pattern matching is a sequential operation, slowed with unapply calls. Inner classes (e.g., from the global cake) make this even more onerous

I agree that is some cases it may be appropriate - code share, cold paths, but these paths are hot

Copy link
Member
@retronym retronym Sep 21, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pattern matching is a sequential operation, slowed with unapply calls

In general, your point is valid, and in some cases the performance does make a difference (e.g. in mapOver #5906 ), but it is important to note that most pattern matching is on case class constructors or plain old instanceOfs and doesn't involve unapply calls. So we should balance readability (which is sometimes enhanced with a pattern match) vs performance by confirming that there is an improvement in benchmarks from a refactoring.

override def descriptor: String = {
if (descriptorCache eq null) {
val sb = new java.lang.StringBuilder(internalName.length + 2)
buildDescriptor(sb)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there will probably be concurrent accesses to this method once we parallelize the backend, no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it would be concurrent, but I believe that it is threadsafe

I don't believe that the cost of lock is worth the potential gain in lost CPU

the answer is the same (==) so 2 threads race one will win. Small chance that we will have 2 thread with identical strings, but this is not harmful

There is no chance of a incomplete string being visible to another thread as String constructor has a memory fence

@rorygraves rorygraves force-pushed the 2.12.x_backend_BTypes branch from 29d8466 to 1bee4de Compare May 20, 2017 08:14
@rorygraves rorygraves force-pushed the 2.12.x_backend_BTypes branch from 1bee4de to 14132b4 Compare May 24, 2017 20:38
@retronym
Copy link
Member
retronym commented May 30, 2017

I wasn't able to clearly measure an improvement from this change on our benchmark server.

That might be because we compile with an oversized heap (initial/max=2G) to try to avoid GC noise in the benchmark run, which could tilt things in favour of the status quo.

We're still getting the hang of benchmarking, so this is just a datapoint, not a conclusive answer.

@rorygraves
Copy link
Contributor

I'm planning on re-profiling this one during the week - hopefully I can add some more clarity on it.

@SethTisue SethTisue modified the milestones: 2.12.4, 2.12.3 Jun 27, 2017
@SethTisue SethTisue added WIP performance the need for speed. usually compiler performance, sometimes runtime performance. labels Jun 27, 2017
@adriaanm
Copy link
Contributor

What's the latest on the benchmarks? The window for 2.12.4 is closing soon (sign-off must happen by Wed Sep 27).

@mkeskells
Copy link
Contributor Author

I will not have time to met that cut off. Trying to work on the parallelization of the jvm first, but I doubt that will be ready in time either

@adriaanm
Copy link
Contributor

No problem, thanks for the update.

@adriaanm adriaanm modified the milestones: 2.12.4, 2.12.5 Sep 21, 2017
@SethTisue SethTisue modified the milestones: 2.12.5, 2.12.6 Feb 22, 2018
@SethTisue SethTisue modified the milestones: 2.12.6, 2.12.7 Mar 23, 2018
@adriaanm
Copy link
Contributor

I'm closing this PR because it has been dormant for a while now 95F8 . Happy to receive a resubmission!

@adriaanm adriaanm closed this Apr 30, 2018
@SethTisue SethTisue removed this from the 2.12.7 milestone Sep 10, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance the need for speed. usually compiler performance, sometimes runtime performance.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants
0