Closed
Labels
Description
Describe the bug
Spark 3.5.5, graphframes 0.9.3
Trying to set broadcastThreshold to -1 fails with error:
java.lang.IllegalArgumentException: requirement failed: Broadcast threshold must be non-negative but got -1.
  at scala.Predef$.require(Predef.scala:281)
  at org.graphframes.WithBroadcastThreshold.setBroadcastThreshold(mixins.scala:76)
  at org.graphframes.WithBroadcastThreshold.setBroadcastThreshold$(mixins.scala:75)
  at org.graphframes.lib.ConnectedComponents.setBroadcastThreshold(ConnectedComponents.scala:50)
  ... 47 elided
To Reproduce
Steps to reproduce the behavior:
- Run
spark-shell --packages io.graphframes:graphframes-spark3_2.12:0.9.3 \
--conf spark.checkpoint.dir=/tmp/example-checkpoint
import org.graphframes.GraphFrame
val g = GraphFrame(
  spark.range(5).select("id"),
  spark.range(10)
    .selectExpr(
      "id",
      "id % 5 as src",
      "(id + 1) % 5 as dst"
    )
)
val results = (
  g.connectedComponents
    .setAlgorithm("graphframes")
    .setBroadcastThreshold(-1)
    .run()
)
- java.lang.IllegalArgumentException is thrown
The same happens with the Python API
Expected behavior
According to the docs here and the example here, we should be able to set the broadcast threshold to -1.
System [please complete the following information]:
- OS: macOS 14.8.1
- Python Version (if applicable): Python 3.11.13
- Spark / PySpark version: Spark 3.5.5 / PySpark 3.5.5
- GraphFrames version: graphframes-0.9.3
Component
- Scala Core Internal
- Scala API
- Spark Connect Plugin
- PySpark Classic
- PySpark Connect
Additional context
I could not find the message 'Broadcast threshold must be non-negative but got' in the repo, which makes me wonder where it's coming from, so it might not be a bug directly in graphframes. I'm going to continue with the default value for now and investigate this further soon.
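One observation that may help narrow this down: the top frame of the trace is scala.Predef$.require, called from org.graphframes.WithBroadcastThreshold.setBroadcastThreshold (mixins.scala:76), which suggests the check lives in the published GraphFrames jar even if the exact text is not found in the branch that was searched. The "requirement failed: " prefix is added by Scala's require itself. The sketch below reproduces the same message shape with a hypothetical trait; HasThreshold, setThreshold, and the default value are made up for illustration and are not the GraphFrames source.

```scala
// Hypothetical stand-in for a setter guarded by `require`.
// This is NOT the GraphFrames implementation, only an illustration
// of how such a message could be produced.
trait HasThreshold {
  private var threshold: Int = 1000000 // arbitrary default for illustration

  def setThreshold(value: Int): this.type = {
    // Predef.require throws IllegalArgumentException and prefixes
    // the supplied message with "requirement failed: ".
    require(value >= 0, s"Broadcast threshold must be non-negative but got $value.")
    threshold = value
    this
  }

  def getThreshold: Int = threshold
}

object RequireDemo extends HasThreshold

// Capture the exception message instead of letting it propagate.
val message: String =
  try {
    RequireDemo.setThreshold(-1)
    "no exception"
  } catch {
    case e: IllegalArgumentException => e.getMessage
  }

println(message)
```

If the real check is written this way, the message would be assembled by string interpolation at runtime, so comparing the sources for the 0.9.3 tag against the branch that was searched may be worthwhile.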
Are you planning on creating a PR?
- I'm willing to make a pull-request