Share one Spark cluster for all tests #1290
Draft implementation of sharing a single Spark cluster across all the tests. It is implemented as two ctest fixtures: one spawns a Spark master and the other spawns a Spark worker in the background.
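For context, here is a minimal sketch (not the PR's actual code) of the kind of setup/cleanup script the two fixtures could invoke; `SPARK_HOME`, the host, and the port are assumptions:

```python
#!/usr/bin/env python3
# Hypothetical sketch: a setup/cleanup script that ctest fixtures could run
# to start and stop the shared Spark standalone cluster in the background.
# SPARK_HOME, host, and port are assumptions, not the PR's actual values.
import os
import subprocess
import sys

SPARK_HOME = os.environ["SPARK_HOME"]
MASTER_URL = "spark://localhost:7077"  # default standalone master port


def start_cluster():
    # Both sbin scripts daemonize, so this returns once the master and the
    # worker are running in the background.
    subprocess.run([os.path.join(SPARK_HOME, "sbin", "start-master.sh")], check=True)
    # The worker takes all cores of the machine by default; each test
    # application then limits itself via spark.cores.max (see below).
    subprocess.run(
        [os.path.join(SPARK_HOME, "sbin", "start-worker.sh"), MASTER_URL],
        check=True,
    )


def stop_cluster():
    # Counterpart for the cleanup fixture.
    subprocess.run([os.path.join(SPARK_HOME, "sbin", "stop-worker.sh")], check=True)
    subprocess.run([os.path.join(SPARK_HOME, "sbin", "stop-master.sh")], check=True)


if __name__ == "__main__":
    if len(sys.argv) > 1 and sys.argv[1] == "stop":
        stop_cluster()
    else:
        start_cluster()
```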
According to https://spark.apache.org/docs/latest/spark-standalone.html#resource-scheduling, the Spark standalone cluster only supports a basic FIFO scheduling method across applications. As such, the only way to benefit from running multiple Spark applications concurrently is to give the Spark worker many cores (the more the better) and have each application claim only 2 cores, via the spark.cores.max config option.
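As an illustration of that last point, a test could cap its own Spark application roughly like this (a sketch; the app name and master URL are assumptions):

```python
# Sketch of a test-side SparkContext capped at 2 cores, assuming the
# fixture-started master listens on the default standalone port.
from pyspark import SparkConf, SparkContext

conf = (
    SparkConf()
    .setAppName("distrdf-spark-test")     # hypothetical app name
    .setMaster("spark://localhost:7077")  # the shared standalone master
    .set("spark.cores.max", "2")          # FIFO scheduler: leave cores for other apps
)
sc = SparkContext(conf=conf)
```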
This is a draft PR, mainly to document the progress made during the ROOT hackathon in March 2025. Anecdotally, I see a tangible improvement in the best-case scenario on my laptop, running only ctest and otherwise idle:
- `test_all` suite in master: around 80 s.
- all `.py` test files run concurrently, each internally creating a SparkContext that connects to the system-process Spark worker: around 50 s.

The benefit is not yet completely clear, since these numbers would probably change heavily in the context of a real ROOT CI run. Specifically, creating a Spark worker process that takes all the cores of the CI machine leaves little to no room for the other ctests.