Commit 2ed58ab
[SPARK-53773][SQL] Recover alphabetic ordering of rules in RuleIdCollection
### What changes were proposed in this pull request?

This PR aims to recover the alphabetical ordering of rules in the `RuleIdCollection` class for Apache Spark 4.1.0.

### Why are the changes needed?

`rulesNeedingIds` was originally defined in alphabetical order, as documented in the source, so we should recover the ordering according to the original intention.

https://github.com/apache/spark/blob/e04fd595370808bbf12b4c50980a86085fd20782/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleIdCollection.scala#L43-L44

> Rules here are in alphabetical order.

Currently, (1) the list has several outliers in terms of ordering and (2) the ordering mixes full class names and simple class names. For instance, `AnsiCombinedTypeCoercionRule` would have to be placed second if simple class names were the criterion. This PR makes the ordering consistent by *full name*, fixing inconsistencies including the following.

https://github.com/apache/spark/blob/e04fd595370808bbf12b4c50980a86085fd20782/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleIdCollection.scala#L57-L59
https://github.com/apache/spark/blob/e04fd595370808bbf12b4c50980a86085fd20782/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleIdCollection.scala#L111-L116
https://github.com/apache/spark/blob/e04fd595370808bbf12b4c50980a86085fd20782/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleIdCollection.scala#L181-L182
https://github.com/apache/spark/blob/e04fd595370808bbf12b4c50980a86085fd20782/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleIdCollection.scala#L83

### Does this PR introduce _any_ user-facing change?

No behavior change.

### How was this patch tested?

Pass the CIs.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #52495 from dongjoon-hyun/SPARK-53773.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
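The invariant this commit restores is simple: every entry in `rulesNeedingIds` should appear in alphabetical order of its fully qualified class name. A minimal sketch of how such a check could be written is shown below; `RuleOrderingCheck` and `isAlphabeticallySorted` are hypothetical names used only for illustration and are not part of this commit or of Spark's test suite.

```scala
// Hypothetical illustration (not part of this commit): verify that a list of
// fully qualified rule names is alphabetically sorted, which is the invariant
// this change restores in RuleIdCollection.rulesNeedingIds.
object RuleOrderingCheck {
  def isAlphabeticallySorted(ruleNames: Seq[String]): Boolean =
    ruleNames.sliding(2).forall {
      case Seq(prev, next) => prev <= next // compare full names, case-sensitive
      case _               => true         // windows of size 0 or 1 are trivially sorted
    }

  def main(args: Array[String]): Unit = {
    // A few entries in the order the patch leaves them in.
    val sample = Seq(
      "org.apache.spark.sql.catalyst.analysis.Analyzer$ResolvePivot",
      "org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveProcedures",
      "org.apache.spark.sql.catalyst.analysis.ResolveCollationName")
    assert(isAlphabeticallySorted(sample), "rules must stay in alphabetical order")
    println("ordering check passed")
  }
}
```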
1 parent ca9c054 commit 2ed58ab

File tree: 1 file changed, +20 −20 lines changed


sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleIdCollection.scala

Lines changed: 20 additions & 20 deletions
@@ -51,20 +51,18 @@ object RuleIdCollection {
  "org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveAggregateFunctions" ::
  "org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveAliases" ::
  "org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveBinaryArithmetic" ::
- "org.apache.spark.sql.catalyst.analysis.ResolveCollationName" ::
  "org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveDeserializer" ::
  "org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveEncodersInUDF" ::
  "org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions" ::
- "org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveProcedures" ::
  "org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveGenerate" ::
  "org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveGroupingAnalytics" ::
- "org.apache.spark.sql.catalyst.analysis.ResolveHigherOrderFunctions" ::
  "org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveInsertInto" ::
  "org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveNaturalAndUsingJoin" ::
  "org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveNewInstance" ::
  "org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveOrdinalInOrderByAndGroupBy" ::
  "org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveOutputRelation" ::
  "org.apache.spark.sql.catalyst.analysis.Analyzer$ResolvePivot" ::
+ "org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveProcedures" ::
  "org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRandomSeed" ::
  "org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences" ::
  "org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations" ::
@@ -86,34 +84,36 @@ object RuleIdCollection {
  "org.apache.spark.sql.catalyst.analysis.DeduplicateRelations" ::
  "org.apache.spark.sql.catalyst.analysis.EliminateSubqueryAliases" ::
  "org.apache.spark.sql.catalyst.analysis.EliminateUnions" ::
+ "org.apache.spark.sql.catalyst.analysis.ResolveCollationName" ::
  "org.apache.spark.sql.catalyst.analysis.ResolveDefaultColumns" ::
+ "org.apache.spark.sql.catalyst.analysis.ResolveExecuteImmediate" ::
  "org.apache.spark.sql.catalyst.analysis.ResolveExpressionsWithNamePlaceholders" ::
+ "org.apache.spark.sql.catalyst.analysis.ResolveGroupByAll" ::
+ "org.apache.spark.sql.catalyst.analysis.ResolveHigherOrderFunctions" ::
  "org.apache.spark.sql.catalyst.analysis.ResolveHints$ResolveCoalesceHints" ::
  "org.apache.spark.sql.catalyst.analysis.ResolveHints$ResolveJoinStrategyHints" ::
- "org.apache.spark.sql.catalyst.analysis.ResolveGroupByAll" ::
  "org.apache.spark.sql.catalyst.analysis.ResolveInlineTables" ::
  "org.apache.spark.sql.catalyst.analysis.ResolveLambdaVariables" ::
  "org.apache.spark.sql.catalyst.analysis.ResolveLateralColumnAliasReference" ::
  "org.apache.spark.sql.catalyst.analysis.ResolveOrderByAll" ::
  "org.apache.spark.sql.catalyst.analysis.ResolveRowLevelCommandAssignments" ::
  "org.apache.spark.sql.catalyst.analysis.ResolveSetVariable" ::
- "org.apache.spark.sql.catalyst.analysis.ResolveExecuteImmediate" ::
+ "org.apache.spark.sql.catalyst.analysis.ResolveTableConstraints" ::
  "org.apache.spark.sql.catalyst.analysis.ResolveTableSpec" ::
  "org.apache.spark.sql.catalyst.analysis.ResolveTimeZone" ::
  "org.apache.spark.sql.catalyst.analysis.ResolveUnion" ::
+ "org.apache.spark.sql.catalyst.analysis.ResolveUnresolvedHaving" ::
+ "org.apache.spark.sql.catalyst.analysis.ResolveUpdateEventTimeWatermarkColumn" ::
  "org.apache.spark.sql.catalyst.analysis.ResolveWindowTime" ::
  "org.apache.spark.sql.catalyst.analysis.SessionWindowing" ::
  "org.apache.spark.sql.catalyst.analysis.SubstituteUnresolvedOrdinals" ::
  "org.apache.spark.sql.catalyst.analysis.TimeWindowing" ::
  "org.apache.spark.sql.catalyst.analysis.TypeCoercionBase$CombinedTypeCoercionRule" ::
- "org.apache.spark.sql.catalyst.analysis.UpdateOuterReferences" ::
  "org.apache.spark.sql.catalyst.analysis.UpdateAttributeNullability" ::
- "org.apache.spark.sql.catalyst.analysis.ResolveUpdateEventTimeWatermarkColumn" ::
+ "org.apache.spark.sql.catalyst.analysis.UpdateOuterReferences" ::
  "org.apache.spark.sql.catalyst.expressions.EliminatePipeOperators" ::
- "org.apache.spark.sql.catalyst.expressions.ValidateAndStripPipeExpressions" ::
- "org.apache.spark.sql.catalyst.analysis.ResolveUnresolvedHaving" ::
- "org.apache.spark.sql.catalyst.analysis.ResolveTableConstraints" ::
  "org.apache.spark.sql.catalyst.expressions.ExtractSemiStructuredFields" ::
+ "org.apache.spark.sql.catalyst.expressions.ValidateAndStripPipeExpressions" ::
  // Catalyst Optimizer rules
  "org.apache.spark.sql.catalyst.optimizer.BooleanSimplification" ::
  "org.apache.spark.sql.catalyst.optimizer.CollapseProject" ::
@@ -135,6 +135,8 @@ object RuleIdCollection {
  "org.apache.spark.sql.catalyst.optimizer.EliminateOuterJoin" ::
  "org.apache.spark.sql.catalyst.optimizer.EliminateSerialization" ::
  "org.apache.spark.sql.catalyst.optimizer.EliminateWindowPartitions" ::
+ "org.apache.spark.sql.catalyst.optimizer.EvalInlineTables" ::
+ "org.apache.spark.sql.catalyst.optimizer.GenerateOptimization" ::
  "org.apache.spark.sql.catalyst.optimizer.InferWindowGroupLimit" ::
  "org.apache.spark.sql.catalyst.optimizer.LikeSimplification" ::
  "org.apache.spark.sql.catalyst.optimizer.LimitPushDown" ::
@@ -145,12 +147,12 @@ object RuleIdCollection {
  "org.apache.spark.sql.catalyst.optimizer.OptimizeCsvJsonExprs" ::
  "org.apache.spark.sql.catalyst.optimizer.OptimizeIn" ::
  "org.apache.spark.sql.catalyst.optimizer.OptimizeJoinCondition" ::
- "org.apache.spark.sql.catalyst.optimizer.OptimizeRand" ::
  "org.apache.spark.sql.catalyst.optimizer.OptimizeOneRowPlan" ::
- "org.apache.spark.sql.catalyst.optimizer.Optimizer$OptimizeSubqueries" ::
+ "org.apache.spark.sql.catalyst.optimizer.OptimizeRand" ::
  "org.apache.spark.sql.catalyst.optimizer.OptimizeRepartition" ::
- "org.apache.spark.sql.catalyst.optimizer.OptimizeWindowFunctions" ::
  "org.apache.spark.sql.catalyst.optimizer.OptimizeUpdateFields"::
+ "org.apache.spark.sql.catalyst.optimizer.OptimizeWindowFunctions" ::
+ "org.apache.spark.sql.catalyst.optimizer.Optimizer$OptimizeSubqueries" ::
  "org.apache.spark.sql.catalyst.optimizer.PropagateEmptyRelation" ::
  "org.apache.spark.sql.catalyst.optimizer.PruneFilters" ::
  "org.apache.spark.sql.catalyst.optimizer.PushDownLeftSemiAntiJoin" ::
@@ -159,40 +161,39 @@ object RuleIdCollection {
  "org.apache.spark.sql.catalyst.optimizer.PushLeftSemiLeftAntiThroughJoin" ::
  "org.apache.spark.sql.catalyst.optimizer.ReassignLambdaVariableID" ::
  "org.apache.spark.sql.catalyst.optimizer.RemoveLiteralFromGroupExpressions" ::
- "org.apache.spark.sql.catalyst.optimizer.GenerateOptimization" ::
  "org.apache.spark.sql.catalyst.optimizer.RemoveNoopOperators" ::
  "org.apache.spark.sql.catalyst.optimizer.RemoveRedundantAggregates" ::
  "org.apache.spark.sql.catalyst.optimizer.RemoveRepetitionFromGroupExpressions" ::
  "org.apache.spark.sql.catalyst.optimizer.ReorderAssociativeOperator" ::
  "org.apache.spark.sql.catalyst.optimizer.ReorderJoin" ::
+ "org.apache.spark.sql.catalyst.optimizer.ReplaceDistinctWithAggregate" ::
  "org.apache.spark.sql.catalyst.optimizer.ReplaceExceptWithAntiJoin" ::
  "org.apache.spark.sql.catalyst.optimizer.ReplaceExceptWithFilter" ::
- "org.apache.spark.sql.catalyst.optimizer.ReplaceDistinctWithAggregate" ::
- "org.apache.spark.sql.catalyst.optimizer.ReplaceNullWithFalseInPredicate" ::
  "org.apache.spark.sql.catalyst.optimizer.ReplaceIntersectWithSemiJoin" ::
+ "org.apache.spark.sql.catalyst.optimizer.ReplaceNullWithFalseInPredicate" ::
+ "org.apache.spark.sql.catalyst.optimizer.RewriteAsOfJoin" ::
  "org.apache.spark.sql.catalyst.optimizer.RewriteExceptAll" ::
  "org.apache.spark.sql.catalyst.optimizer.RewriteIntersectAll" ::
- "org.apache.spark.sql.catalyst.optimizer.RewriteAsOfJoin" ::
  "org.apache.spark.sql.catalyst.optimizer.SimplifyBinaryComparison" ::
  "org.apache.spark.sql.catalyst.optimizer.SimplifyCaseConversionExpressions" ::
  "org.apache.spark.sql.catalyst.optimizer.SimplifyCasts" ::
  "org.apache.spark.sql.catalyst.optimizer.SimplifyConditionals" ::
  "org.apache.spark.sql.catalyst.optimizer.SimplifyExtractValueOps" ::
  "org.apache.spark.sql.catalyst.optimizer.TransposeWindow" ::
- "org.apache.spark.sql.catalyst.optimizer.EvalInlineTables" ::
  "org.apache.spark.sql.catalyst.optimizer.UnwrapCastInBinaryComparison" :: Nil
  }

  if (Utils.isTesting) {
  rulesNeedingIds = rulesNeedingIds ++ {
  // In the production code path, the following rules are run in CombinedTypeCoercionRule, and
  // hence we only need to add them for unit testing.
- "org.apache.spark.sql.catalyst.analysis.AnsiTypeCoercion$PromoteStringLiterals" ::
  "org.apache.spark.sql.catalyst.analysis.AnsiTypeCoercion$DateTimeOperations" ::
  "org.apache.spark.sql.catalyst.analysis.AnsiTypeCoercion$GetDateFieldOperations" ::
+ "org.apache.spark.sql.catalyst.analysis.AnsiTypeCoercion$PromoteStringLiterals" ::
  "org.apache.spark.sql.catalyst.analysis.DecimalPrecision" ::
  "org.apache.spark.sql.catalyst.analysis.TypeCoercion$BooleanEquality" ::
  "org.apache.spark.sql.catalyst.analysis.TypeCoercion$DateTimeOperations" ::
+ "org.apache.spark.sql.catalyst.analysis.TypeCoercion$PromoteStrings" ::
  "org.apache.spark.sql.catalyst.analysis.TypeCoercionBase$CaseWhenCoercion" ::
  "org.apache.spark.sql.catalyst.analysis.TypeCoercionBase$ConcatCoercion" ::
  "org.apache.spark.sql.catalyst.analysis.TypeCoercionBase$Division" ::
@@ -203,7 +204,6 @@ object RuleIdCollection {
  "org.apache.spark.sql.catalyst.analysis.TypeCoercionBase$InConversion" ::
  "org.apache.spark.sql.catalyst.analysis.TypeCoercionBase$IntegralDivision" ::
  "org.apache.spark.sql.catalyst.analysis.TypeCoercionBase$MapZipWithCoercion" ::
- "org.apache.spark.sql.catalyst.analysis.TypeCoercion$PromoteStrings" ::
  "org.apache.spark.sql.catalyst.analysis.TypeCoercionBase$StackCoercion" ::
  "org.apache.spark.sql.catalyst.analysis.TypeCoercionBase$StringLiteralCoercion" ::
  "org.apache.spark.sql.catalyst.analysis.TypeCoercionBase$WindowFrameCoercion" :: Nil

0 commit comments