NumPy support in torch.compile by lezcano · Pull Request #106211 · pytorch/pytorch · GitHub

Conversation

@lezcano
Collaborator
@lezcano lezcano commented Jul 28, 2023

RFC: pytorch/rfcs#54
First commit is the contents of https://github.com/Quansight-Labs/numpy_pytorch_interop/

We have already been using this in core for the last few months as an external dependency. This PR pulls all of it into core.

In the next commits, I do a number of things in this order

  • Fix a few small issues
  • Make the tests that this PR adds pass
  • Bend backwards until lintrunner passes
  • Remove the optional dependency on torch_np and simply rely on the upstreamed code
  • Fix a number of dynamo tests that were passing before (they were not testing anything, I think) and are not passing now.

Missing from this PR (but not blocking):

All the tests in tests/torch_np take about 75s to run.

This was joint work by @ev-br, @rgommers, @honno, and me. I did not create this PR via ghstack (which would have been convenient) because this is a collaboration, and ghstack doesn't allow for shared contributions.
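For context, the user-facing feature this PR works toward can be sketched as follows: a function written entirely against the NumPy API gets traced by dynamo and compiled. This is a minimal illustrative example, assuming a PyTorch build that includes this work and has NumPy installed; the function itself is made up.

```python
# Minimal sketch of NumPy support in torch.compile: plain NumPy code is
# traced and compiled, and ndarray inputs come back as ndarray outputs.
import numpy as np
import torch

def numpy_fn(x: np.ndarray, y: np.ndarray) -> np.ndarray:
    return np.sum(x * y, axis=0)

compiled_fn = torch.compile(numpy_fn)

x = np.arange(6.0).reshape(2, 3)
out = compiled_fn(x, x)  # a plain np.ndarray: array([ 9., 17., 29.])
```

Note that no explicit torch tensors appear in user code; the compat layer maps NumPy operations onto torch ops under the hood.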

cc @mruberry @rgommers @albanD @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @ngimel @yf225 @chenyang78 @kadeng @muchulee8 @aakhundov @anijain2305 @ev-br

@lezcano lezcano added the module: numpy, module: dynamo, and release notes: dynamo labels Jul 28, 2023
@pytorch-bot
pytorch-bot bot commented Jul 28, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/106211

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 4 Unrelated Failures

As of commit 237489d:

NEW FAILURE - The following job has failed:

BROKEN TRUNK - The following job failed but was already present on the merge base 22bc08d:

👉 Rebase onto the `viable/strict` branch to avoid these failures

UNSTABLE - The following jobs failed, but likely due to flakiness present on trunk, and have been marked as unstable:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@linux-foundation-easycla
linux-foundation-easycla bot commented Jul 28, 2023

CLA Signed

The committers listed above are authorized under a signed CLA.

@lezcano
Collaborator Author
lezcano commented Jul 28, 2023

@honno @rgommers @ev-br, could you please sign the CLA? Not sure why @ev-br does not show as a coauthor; I'll look into that. (nvm, he's there.)

@lezcano
Collaborator Author
lezcano commented Jul 28, 2023

For the reviewers: could you have a look at the commits you feel comfortable reviewing? All the large ones are either automated changes or attempts to make lintrunner pass.

@lezcano
Collaborator Author
lezcano commented Jul 28, 2023

Also, someone should comment on how correct it is, legally speaking, to upstream the NumPy tests. @rgommers @ezyang @malfet

@larryliu0820
Contributor

I think all authors need to sign the CLA

@larryliu0820
Contributor

Does this mean numpy becomes a dependency of pytorch?

@rgommers
Collaborator

Also, someone should comment on how correct it is, legally speaking, to upstream the NumPy tests. @rgommers @ezyang @malfet

This should be fine; we've included code from many other compatibly-licensed projects in the code base before. I suggest adding an entry in third_party/LICENSES_BUNDLED.txt pointing at the relevant directory. If it's within (part of) a single file, then also add a comment (e.g., see the top of test/test_typing.py, which already says "based on NumPy ...").

Does this mean numpy becomes a dependency of pytorch?

No, no dependency is added in this PR, the changes are self-contained. numpy is already an optional runtime dependency, and required at build time (see pyproject.toml).
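As a side note, the optional-runtime-dependency pattern looks roughly like this. This is a dependency-free sketch, not PyTorch's actual code; the `HAS_NUMPY` flag and helper name are made up for illustration.

```python
# Sketch of treating numpy as an optional runtime dependency: import it if
# present, and branch on a flag everywhere numpy-specific behavior is needed.
try:
    import numpy as np
    HAS_NUMPY = True
except ImportError:  # numpy stays optional at runtime
    np = None
    HAS_NUMPY = False

def as_plain_list(x):
    """Convert x to a plain Python list, using NumPy only when available."""
    if HAS_NUMPY and isinstance(x, np.ndarray):
        return x.tolist()
    return list(x)
```

Code written this way degrades gracefully when numpy is absent, which is what keeps the PR from adding a hard dependency.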

@janeyx99
Contributor

I notice there are lots of places where these tests rely on pytest and pytest tooling.

If these tests are to be upstreamed, their test-infra dependencies will have to be reconciled with our current test infra. We have our own nuanced equivalent of parametrize() (see class parametrize(_TestParametrizer) in our test utilities), as well as xfails and our general OpInfos. In general, we don't want to introduce a pytest dependency, and we have special handling in our test infra, so bringing consistency there will be important.

For example, I have not gotten a chance to go through the numpy op testing, but we probably want to:

  1. Reuse the OpInfo infra if there is sufficient 1:1 correspondence between torch ops and np ops, e.g., introduce a flag in OpInfo to do additional np checks when the flag is set.
  2. Write out a NumpyOpInfo if numpy ops do not match 1:1 with torch ops, and plug into a common way to parametrize over dtypes/certain levels of support.
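For readers unfamiliar with the infra being discussed, here is a dependency-free sketch of what a parametrize()-style helper does: it expands one marked test method into several concrete methods, one per parameter value. This is illustrative only, not PyTorch's actual implementation.

```python
# Toy parametrize(): tag a method with a parameter name and its values.
def parametrize(name, values):
    def deco(fn):
        fn._params = (name, values)
        return fn
    return deco

# Toy instantiator: replace each tagged method with one method per value.
def instantiate_parametrized_tests(cls):
    for attr, fn in list(vars(cls).items()):
        params = getattr(fn, "_params", None)
        if params is None:
            continue
        name, values = params
        delattr(cls, attr)
        for v in values:
            def make(fn=fn, name=name, v=v):
                def test(self):
                    return fn(self, **{name: v})
                return test
            setattr(cls, f"{attr}_{v}", make())
    return cls

class MyTests:
    @parametrize("n", [1, 2, 3])
    def test_square(self, n):
        return n * n

instantiate_parametrized_tests(MyTests)
t = MyTests()
results = [t.test_square_1(), t.test_square_2(), t.test_square_3()]
```

Both pytest's `pytest.mark.parametrize` and PyTorch's internal decorator follow this expand-at-collection-time shape; the reconciliation question above is about which of the two collection mechanisms the upstreamed tests should use.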

@lezcano
Collaborator Author
lezcano commented Jul 29, 2023

@janeyx99 these tests are copied, with minor edits, from NumPy's test suite. I don't think it'd be realistic to port them to our own suite. In fact, we tried very hard to make them pass with as few modifications as possible, so as to be sure that the compat layer we implemented replicated NumPy faithfully.

That being said, as I mentioned in the OP, I have yet to remove imports of internal NumPy objects, so as not to create dependencies on non-public NumPy APIs.

If people really feel strongly about it, I could try to make the tests not depend on NumPy at all, but that would require significantly more effort, of course.

As to the pytest dependency: pytest must already be at least an optional dependency, as there are many files like test_typing.py that use it, so this would be no different. That said, if people feel strongly about it, we could port these tests to use our internal tools.

@lezcano
Collaborator Author
lezcano commented Jul 29, 2023

Also, at the beginning of this project we discussed having our own OpInfos for this, and we discarded the idea. Getting to where the current OpInfos are now took many engineers many months of work, which is much more budget than we had for this project, so even if it were cleaner, I don't think the cost/benefit would be in our favor.

@lezcano lezcano force-pushed the torch_np branch 2 times, most recently from 0c7925f to 9cb447c on July 29, 2023 01:58
@ezyang
Contributor
ezyang commented Jul 29, 2023

Supporting @lezcano here; I also agree that it would be a lot of makework to "port" NumPy's test suite to PyTorch style. In fact, I have the opposite question: in an ideal world, we would occasionally take updates from NumPy's test suite as NumPy evolves and we evolve with it. Is the test suite identical enough that this would be possible? (Removing the dependencies on NumPy internals would make this harder! But maybe it is more important to be broadly compatible with many versions of NumPy than to be able to copy-paste in the NumPy suite?)

@lezcano
Collaborator Author
lezcano commented Jul 29, 2023

We discussed automating the process of bringing tests over from the NumPy test suite, but it's not particularly easy.

To put together this testing suite, we went through a fair amount of curation of these tests: some were skipped because they used features we were never going to support (endianness, arrays of strings...); those we would potentially like to support eventually were marked xfail; and some we edited, removing the parts we were not going to support while keeping the rest (e.g., in tests that iterate over all the dtypes, we would remove long double and int16 and so on, while keeping the others).

I think it'd be rather difficult to separate the signal from the noise if this process were automated.
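As a concrete illustration of that curation step, a trimmed dtype parametrization might look like this. The dtype lists and names here are illustrative, not the actual ones used in the PR.

```python
# An upstream NumPy test might iterate over many dtypes; the curated version
# drops the ones the compat layer will not support (e.g. longdouble, int16)
# while keeping the rest of the test intact.
import numpy as np

ALL_DTYPES = [np.float32, np.float64, np.longdouble, np.int16, np.int64]
SUPPORTED_DTYPES = [dt for dt in ALL_DTYPES
                    if dt not in (np.longdouble, np.int16)]

def check_zeros_dtype(dtype):
    """Body of the original test, unchanged by the curation."""
    x = np.zeros(3, dtype=dtype)
    assert x.dtype == np.dtype(dtype)

for dt in SUPPORTED_DTYPES:
    check_zeros_dtype(dt)
```

The hard-to-automate part is exactly this per-test judgment call about which parameters to drop, xfail, or keep.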

@ezyang
Contributor
ezyang commented Jul 29, 2023

Disappointing but understandable

@lezcano
Collaborator Author
lezcano commented Jul 29, 2023

cc @williamwen42 for all the dynamo errors in 3.11. I will try to file self-contained issues next week.

@janeyx99
Contributor

@lezcano Yeah, I don't feel strongly at all about having the test suite strictly follow PyTorch test infra (vs. sticking to numpy's), and what you and Ed said about expanding the scope of our tests makes sense. My observation is more that this change would introduce lots of tests that differ from what PyTorch is used to. If our current test infra (people on the Dev Infra team would know more) already automatically supports these tests with our features (like the disable bot, test reporting, ...), then that's great! If not, then certain things may have to change in PyTorch test infra, or there may be constraints worth bringing to light sooner rather than later.

@lezcano
Collaborator Author
lezcano commented Aug 8, 2023

oopsie daisy

@facebook-github-bot
Contributor

@ezyang has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@pytorchbot merge

(Initiating merge automatically since Phabricator Diff has merged)

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

atalman added a commit to atalman/pytorch that referenced this pull request Aug 11, 2023
@atalman
Contributor
atalman commented Aug 15, 2023

This is failing windows smoke tests: https://github.com/pytorch/pytorch/actions/runs/5830091528/job/15823773511#step:13:471
As per discussion, we don't want to keep a hard dependency on numpy for now.
