8000 Move the udf module to user_defined by timsaucer · Pull Request #1112 · apache/datafusion-python · GitHub
[go: up one dir, main page]

Skip to content

Move the udf module to user_defined #1112

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
May 17, 2025

Conversation

timsaucer
Copy link
Contributor

Which issue does this PR close?

None

Rationale for this change

Right now we have some minor confusion when importing from the udf module because at the root of datafusion we also have a udf method. This is a minor inconvenience, but it has shown up a few times where a user wants to import one or the other. This removes the uncertainty by changing the udf module to user_defined. Also since this module includes scalar, aggregate, and window functions it is a cleaner organization.

What changes are included in this PR?

  • Move the datafusion.udf module to datafusion.user_defined. This is a breaking change. Fortunately most historical users have been using the helper functions which exist in the root module.
  • Correct documentation build errors. Specifically there were examples in markdown instead of rst format.
  • Update the text of the UDF documentation to be more consistent with other portions of the code base.

Are there any user-facing changes?

Users will need to update their module imports from datafusion.udf to datafusion.user_defined. The datafusion.udf function is NOT impacted.

… rest of the site and to correct some errors when building the rst files using autoapi
@timsaucer
Copy link
Contributor Author

@crystalxyz FYI this does touch on the documentation you wrote. Please let me know if you think I should change back the formatting. I did need to switch from markdown to rst format for our documentation build system.

@crystalxyz
Copy link
Contributor

@timsaucer The documentation changes look good to me! Do you think that adding a comment in python/datafusion/user_defined to explain the renaming would be helpful? People reading this file might get confused.

@timsaucer
Copy link
Contributor Author
timsaucer commented May 16, 2025

@mesejo @kosiew @chenkovsky @kevinjqliu Would any of you mind doing a review? We haven't released datafusion-python in a while and this is one of the PRs waiting to go in. 1/3

@kosiew
Copy link
Contributor
kosiew commented May 16, 2025

Looks good to me.

Considering there is a breaking change, I think it would be a good idea to also add a follow up task to add a migration guide in the ../dev/changelog/.. for the new release.

Copy link
Contributor
@kevinjqliu kevinjqliu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

I want to double check that this is fine since this is the udf function
https://grep.app/search?q=from+datafusion+import+udf

@timsaucer timsaucer merged commit e8aa671 into apache:main May 17, 2025
17 checks passed
@timsaucer timsaucer deleted the feat/udf-organization branch May 17, 2025 20:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants
0