8000 reduce backend IO load by mkeskells · Pull Request #5834 · scala/scala · GitHub
[go: up one dir, main page]

Skip to content

reduce backend IO load #5834

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from
Closed

Conversation

mkeskells
Copy link
Contributor
@mkeskells mkeskells commented Apr 5, 2017

cache created directory,
faster IO and parsing of dir structure
dont use AbstractFile for files that you want to create

performance results - looks good

after 10 90% jvm, no GC
                  RunName	Count	           AllWallMS	                   CPU_MS	                Allocated
              00_baseline	36	  2499.87 [+1.05% -0.95%]	  2472.66 [+1.06% -0.96%]	   602.25 [+1.00% -1.00%]
              13_reduceIO	38	  1713.87 [+1.04% -0.96%]	  1701.48 [+1.05% -0.96%]	   600.59 [+1.00% -1.00%]

@scala-jenkins scala-jenkins added this to the 2.12.3 milestone Apr 5, 2017
@mkeskells
Copy link
Contributor Author
mkeskells commented Apr 6, 2017

will have a look at the test failure tonight

@lrytz
Copy link
Member
lrytz commented Apr 6, 2017

The two failures are:

!!  293 - run/macro-range                           [non-zero exit code]

Exception in thread "main" java.lang.ClassFormatError: Extra bytes at the end of class file Test
	at java.lang.ClassLoader.defineClass1(Native Method)
!! 1205 - run/t5717.scala                           [non-zero exit code]

/home/jenkins/workspace/scala-2.12.x-validate-test/test/files/run/t5717-run.obj/a
java.nio.file.FileAlreadyExistsException: /home/jenkins/workspace/scala-2.12.x-validate-test/test/files/run/t5717-run.obj/a
	at sun.nio.fs.UnixException.translateToIOException(UnixException.java:88)
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
	at sun.nio.fs.UnixFileSystemProvider.createDirectory(UnixFileSystemProvider.java:384)
	at java.nio.file.Files.createDirectory(Files.java:674)
	at java.nio.file.Files.createAndCheckIsDirectory(Files.java:781)
	at java.nio.file.Files.createDirectories(Files.java:727)
	at scala.tools.nsc.backend.jvm.OutputDirectories$PathDirInfo$PathDir.dir$lzycompute(BytecodeWriters.scala:48)
	at scala.tools.nsc.backend.jvm.OutputDirectories$PathDirInfo$PathDir.dir(BytecodeWriters.scala:48)
	at scala.tools.nsc.backend.jvm.OutputDirectories$PathDirInfo$PathDir.file(BytecodeWriters.scala:49)
	at scala.tools.nsc.backend.jvm.BytecodeWriters.getFile(BytecodeWriters.scala:121)
	at scala.tools.nsc.backend.jvm.BytecodeWriters.getFile$(BytecodeWriters.scala:114)
	at scala.tools.nsc.backend.jvm.BCodeHelpers.getFile(BCodeHelpers.scala:23)
	at scala.tools.nsc.backend.jvm.GenBCode$BCodePhase.sendToDisk$1(GenBCode.scala:433)

@mkeskells
Copy link
Contributor Author
mkeskells commented Apr 6, 2017

@lrytz thanks for the diagnostic - will have a look

Is the expectation that the compiler will delete whatever files to create the directories needed, and similarly delete directories to create files, when the names conflict. The test case is not specific (t5717), though it should be easy to fix to whatever is the expectation (either of the above seems dangerous though)

Always wary about deleting trees :(

I can fix this when I get a clarification on the expectation above

For the other issue - I suspect that this is due to there being duplicate classes with the same name in the test - as I read macro-range both "Range" and "Test" are duplicated in the root package. Does this test compile all 3 files are they incrementally compiler in phases

Sorry for my ignorance of this but haven't really looked at these tests and how then run

@lrytz
Copy link
Member
lrytz commented Apr 7, 2017

Running tests is easy in sbt (https://github.com/scala/scala#using-the-sbt-build)

About t5717: I also got confused and it took me a while. The compiler in fact reports an error message. Before SI-5717 was fixed, the compiler was crashing with an exception. The test only ensures that the compiler doesn't crash. I clarified that here: #5835

About macro-range: Partest supports separate compilation based on filenames. It compiles sources with the same _n ending in groups, and in increasing order. Each stage uses a new compiler instance.

In the example, it first compiles Common_1.scala, then Expansion_Impossible_2.scala, and finally Expansion_Possible_3.scala. When compiling the second file, a warning is emitted (not visible in the .check file):

warning: some macros could not be expanded and code fell back to overridden methods;
recompiling with generated classfiles on the classpath might help.
one warning found

The third file contains the exact same definitions of Range and Test, but because it's compiled with the macro implementation on the classpath, macro expansion succeeds. Compiling the third file should just the corresponding, existing classfiles.

@som-snytt
Copy link
Contributor

Paulp's first fix for SI-5717 was destructive, but the use case of outputting to the current dir is not far-fetched. I typically have a scratch or tmp dir where I have misc test classes, and occasionally a text file that is unfortunately named and clashes with a package. Maybe there's a useful heuristic, such as specifying an output dir makes it trashable.

@mkeskells
Copy link
Contributor Author

@lrytz thanks for that detail -

I will add a catch and some appropriate diagnostic. Probably Monday night when I am back

I have add a fix for the macro-range issue in 8dd934a

I think that this was caught by luck though. It was by chance that this test ( and this test only) wrote a slightly smaller file in the second pass ( not sure why)
I think that there should be a test to ensure that the file is just the content of the bytecode written, and will add one for that.

What is the naming convention of these tests?

I was aware of how to run the tests from sbt (https://github.com/scala/scala#using-the-sbt-build), but sbt partest fails on windows, and the ++ syntax doesn't work to override the build dir (from cmd)

@lrytz
Copy link
Member
lrytz commented Apr 10, 2017

What is the naming convention of these tests?

If there's a ticket on https://github.com/scala/bug/issues, it's tNNNN, otherwise be there's none..

@lrytz
Copy link
Member
lrytz commented Apr 10, 2017

sbt partest fails on windows, and the ++ syntax doesn't work to override the build dir (from cmd)

Ah! We have a nightly build on windows, and it's passing (https://scala-ci.typesafe.com/job/scala-2.12.x-integrate-windows/). However, this runs in cygwin, not cmd. @szeiger might be able to help out. Our README just says "Windows may work if you use Cygwin"..

@szeiger
Copy link
Contributor
szeiger commented Apr 10, 2017

"Runs in Cygwin" is overstating it a bit. In Cygwin you get the sbt launcher bash script instead of the Windows cmd script but that's it. The JVM processes are not Cygwin-aware, so once you're in the JVM there shouldn't be any difference. The only exception I can think of is calling an external "diff" tool to show diffs in case of partest failures. If there's no POSIX-compatible diff on the path, this will fail. We should replace it with a JVM-based diffing library.

I'd go looking at the sbt launcher and its configuration as possible sources of errors. AFAIR configuration of JVM parameters is done differently in the bash and cmd launchers, so this could explain different results (when the parameters that you expect are not actually used).

@som-snytt
Copy link
Contributor
som-snytt commented Apr 10, 2017

I was just looking at partest to make it pass -Xlint and also scala/scala-partest#55 oh wait I guess scala/scala-partest#81 about diff order, and I noticed that it will use git diff or else difflib as available, so possibly under --update-check it should always use difflib for reproducibility. Edit: or I guess it always just logs the log output; the diff is just for human consumption, either diff suffices to pass/fail the test. I'll look again later when I have time, but that seems obvious.

@szeiger
Copy link
Contributor
szeiger commented Apr 10, 2017

@som-snytt That's probably the right explanation. We already know from the problems leading to #5733 that @mkeskells' git output is not to be trusted.

@mkeskells
Copy link
Contributor Author

@som-snytt @szeiger I ran on a clean build from the start or last week after #5733 was merged

@mkeskells mkeskells force-pushed the 2.12.x_io_reduction branch 4 times, most recently from d65c7ea to 7b32e78 Compare April 11, 2017 07:55
@szeiger
Copy link
Contributor
szeiger commented Apr 11, 2017

@mkeskells #5733 wouldn't fix this but it's the same symptom. Executing git on your machine produces output that it shouldn't produce. The correct fix here is to install git properly so that this doesn't happen. But in order to make our build more robust we shouldn't rely on the output of external commands for things that can be easily done in the sbt process.

@mkeskells
Copy link
Contributor Author

@szeiger I preumed that it fixed git usage. I suppose this was just to get some minimal set running for the mkPack?

@mkeskells mkeskells force-pushed the 2.12.x_io_reduction branch from 7b32e78 to 3a0023a Compare April 11, 2017 21:36
@mkeskells
Copy link
Contributor Author

all test finally pass - can squash, but will leave so the reviewers can see the changes, and squash when I get the nod

Copy link
Member
@lrytz lrytz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes, they look good! Also thanks for the new tests.

I implemented a few cleanups here: rorygraves#2. We're trying to keep the formatting (more or less) consistent in this repo, so it'd be nice if you could use t: T (not t:T or t :T). Some other details:

  • No empty line before the first / after the last expression in a block (class, method)
  • Empty line between methods, classes
  • No space in foo(arg
  • Space in if (..), C {, foo = {

You can squash the changes into one commit.

Finally, the comment #5832 (comment) about RunnableInPhase also applies here.

@@ -13,9 +13,19 @@ object Test extends StoreReporterDirectTest {
// Let the compiler tell us output dir
// val sc = newCompiler("-cp", classpath, "-d", testOutput.path)
// val out = sc.settings.outputDirs.getSingleOutput.get
// TODO check the reported errors
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for this, just rebase this PR on current 2.12.x, #5835 is merged

@szeiger
Copy link
Contributor
szeiger commented Apr 12, 2017

@mkeskells That fix was only for using git to get the commit SHA and date, not for any other uses (but git diff should be the only one left now).

@mkeskells
Copy link
Contributor Author

is there a intellij formatter for the code style anywhere

@lrytz
Copy link
Member
lrytz commented Apr 13, 2017

I don't think so, also we don't have our conventions written down, I just mentioned a few things that people seem to follow here. Things are not too strict either.

It would be nice to adopt scalfmt, but require a bit of work..

@lrytz
Copy link
Member
lrytz commented Apr 13, 2017

This patch looks good to me now, I think you can squash everything into one. I'd also remove the first commit (RunnableInPhase) and treat that in a separate PR.

@rorygraves rorygraves force-pushed the 2.12.x_io_reduction branch from af1c05a to ece7fe2 Compare April 18, 2017 17:55
@mkeskells mkeskells force-pushed the 2.12.x_io_reduction branch from ece7fe2 to ba8f25e Compare April 18, 2017 21:19
cache created directory,
faster IO and parsing of dir structure
dont use AbstractFile for files that you want to create
@mkeskells mkeskells force-pushed the 2.12.x_io_reduction branch from ba8f25e to c293ac1 Compare April 18, 2017 21:21
@mkeskells
Copy link
Contributor Author

Squashed and removed the first commit (profiler enabler), and rebased as requested

The rebase changed so that t5717 now has validated output so I have changed the expectation to match the modified test
was

error: error writing a/B: t5717-run.obj/a/B.class: t5717-run.obj/a is not a directory

now

error: error writing a/B: t5717-run.obj/a
error: error writing a/b/B: t5717-run.obj/a/b: Not a directory
error: error writing a: t5717-run.obj/./a.class: Is a directory

@mkeskells
Copy link
Contributor Author

will revist this after @lrytz has completed his refactor

@mkeskells mkeskells closed this May 24, 2017
@SethTisue SethTisue removed this from the 2.12.3 milestone Jun 27, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants
0