[SPARK-46167][PS] Add axis implementation to DataFrame.rank #54009

devin-petersohn · 2026-01-27T15:55:14Z

What changes were proposed in this pull request?

Adds support for the axis parameter to DataFrame.rank in the pandas API on Spark.
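To illustrate what the axis argument selects, here is a minimal pure-Python sketch of average ranking applied down columns (axis=0) versus across rows (axis=1). It is not the Spark implementation in this PR: the helper names average_rank, rank_columns and rank_rows are hypothetical, the data is a plain list of rows, and missing values are not handled.

```python
from typing import List, Sequence


def average_rank(values: Sequence[float]) -> List[float]:
    """Rank values ascending (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions in the tie
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def rank_columns(rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """axis=0 analogue: rank each column independently."""
    ranked_cols = [average_rank(col) for col in zip(*rows)]
    return [list(row) for row in zip(*ranked_cols)]


def rank_rows(rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """axis=1 analogue: rank the values within each row."""
    return [average_rank(row) for row in rows]


data = [[3, 1, 2], [5, 5, 1]]
by_column = rank_columns(data)
by_row = rank_rows(data)
```

With this data, ranking by column compares 3 with 5, 1 with 5, and 2 with 1, while ranking by row compares the three values inside each row, so the two 5s in the second row tie.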

Why are the changes needed?

Implements missing API

Does this PR introduce any user-facing change?

Yes, new API support

How was this patch tested?

CI

Was this patch authored or co-authored using generative AI tooling?

Co-authored-by: Claude Sonnet 4.5

Signed-off-by: Devin Petersohn <devin.petersohn@gmail.com> Co-authored-by: Devin Petersohn <devin.petersohn@snowflake.com>

github-actions · 2026-01-27T16:10:19Z

JIRA Issue Information

=== Sub-task SPARK-46167 ===
Summary: Add axis, pct and na_option parameter to DataFrame.rank
Assignee: None
Status: Open
Affected: ["4.0.0"]

This comment was automatically generated by GitHub Actions

Signed-off-by: Devin Petersohn <devin.petersohn@gmail.com>

HyukjinKwon · 2026-01-28T05:46:28Z

cc @gaogaotiantian FYI

gaogaotiantian · 2026-01-28T06:07:54Z

python/pyspark/pandas/frame.py

    def rank(
-        self, method: str = "average", ascending: bool = True, numeric_only: bool = False
+        self,
+        method: str = "average",


The type hint for method should probably be a Literal rather than a plain str.
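A rough sketch of what a Literal-based hint could look like, assuming the five method names that pandas documents for rank. The alias RankMethod and the helper check_method are hypothetical names used only for illustration; they are not taken from the PR.

```python
from typing import Literal, get_args

# Hypothetical alias; the values are the rank methods documented by pandas.
RankMethod = Literal["average", "min", "max", "first", "dense"]


def check_method(method: RankMethod = "average") -> str:
    """Return the method unchanged, rejecting values outside the Literal at runtime."""
    allowed = get_args(RankMethod)
    if method not in allowed:
        raise ValueError(f"method must be one of {allowed}, got {method!r}")
    return method
```

A static type checker would flag a call such as check_method("median") from the annotation alone; the runtime check covers callers that do not run a type checker.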

gaogaotiantian · 2026-01-28T06:14:36Z

python/pyspark/pandas/frame.py

+        method: str = "average",
+        ascending: bool = True,
+        numeric_only: bool = False,
+        axis: Axis = 0,


We need to decide where axis should go. pandas has it as the very first parameter, while here it comes after numeric_only, so a user who passes arguments positionally would get a different result than in pandas. On the other hand, if users are already passing arguments positionally, moving axis to the beginning would break their existing code too.

On a side note, pandas is moving towards keyword-only APIs very eagerly. We could also consider doing that here so that users cannot send the wrong argument positionally.

We are already incompatible with pandas here, so this might be a good chance to fix that and take the breaking change early.

That's a good point: do we break compatibility with prior versions of Spark, or with pandas? Moving to keyword-only could break everyone, but most pandas code I've seen passes these as keyword arguments explicitly. It makes sense to me to move to strict keyword-only support for this API, and for others going forward, to make the API more explicit (and Pythonic) anyway. Thoughts?
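The trade-off discussed above can be sketched with three stand-in signatures. These are hypothetical stubs that only echo how their arguments were bound; they are not the real pandas or pandas-on-Spark methods. The pandas parameter order is assumed from the pandas documentation, and the second order follows the diff under review.

```python
def rank_pandas_order(axis=0, method="average", numeric_only=False,
                      na_option="keep", ascending=True, pct=False):
    """Stub with the parameter order pandas documents (axis first)."""
    return {"axis": axis, "method": method}


def rank_diff_order(method="average", ascending=True, numeric_only=False, axis=0):
    """Stub with the parameter order shown in the diff (axis last)."""
    return {"axis": axis, "method": method}


def rank_keyword_only(*, method="average", ascending=True, numeric_only=False, axis=0):
    """Stub where every parameter must be passed by name."""
    return {"axis": axis, "method": method}


# The same positional call binds to different parameters in each ordering.
pandas_binding = rank_pandas_order("min")   # "min" lands in axis
diff_binding = rank_diff_order("min")       # "min" lands in method

# With keyword-only parameters the ambiguous call fails loudly instead.
try:
    rank_keyword_only("min")
    keyword_only_rejected = False
except TypeError:
    keyword_only_rejected = True

explicit = rank_keyword_only(method="min", axis=1)
```

Under the keyword-only variant the position of axis in the signature no longer matters to callers, which is the property the thread is weighing against the one-time breakage for positional callers.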

gaogaotiantian

I think the test is comprehensive. Took a quick look at the implementation - not the pandas/pyspark df expert but the pattern looks familiar.

The only concern I have is the API, which I brought up inline.

Signed-off-by: Devin Petersohn <devin.petersohn@gmail.com>

[SPARK-46167][PS] Add axis implementation to DataFrame.rank

5596d61

Signed-off-by: Devin Petersohn <devin.petersohn@gmail.com> Co-authored-by: Devin Petersohn <devin.petersohn@snowflake.com>

github-actions bot added PYTHON PANDAS API ON SPARK labels Jan 27, 2026

Fix

73a69b4

Signed-off-by: Devin Petersohn <devin.petersohn@gmail.com>

gaogaotiantian reviewed Jan 28, 2026


devin-petersohn added 2 commits January 28, 2026 09:56

Fix type hint

3dcac68

Signed-off-by: Devin Petersohn <devin.petersohn@gmail.com>

Fix again

3f93f00

Signed-off-by: Devin Petersohn <devin.petersohn@gmail.com>
