[v.2.1.0] Release Tracker · Issue #108055 · pytorch/pytorch · GitHub

[v.2.1.0] Release Tracker #108055


Closed
atalman opened this issue Aug 28, 2023 · 81 comments
Labels
triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module
Milestone

Comments

@atalman
Contributor
atalman commented Aug 28, 2023

🐛 Describe the bug

We cut a release branch for the 2.1.0 release.

Our plan from this point is roughly:

  • Phase 1 (until 09/11/23): work on finalizing the release branch
  • Phase 2 (after 09/11/23): perform extended integration/stability/performance testing based on Release Candidate builds.

This issue is for tracking cherry-picks to the release branch.

Cherry-Pick Criteria

Phase 1 (until 09/11/23):

Only low-risk changes may be cherry-picked from master:

  1. Fixes to regressions against the most recent minor release (e.g. 2.0.x for this release; see module: regression issue list)
  2. Critical fixes for: silent correctness, backwards compatibility, crashes, deadlocks, (large) memory leaks
  3. Critical fixes to new features introduced in the most recent minor release (e.g. 2.0.x for this release)
  4. Test/CI fixes
  5. Documentation improvements
  6. Compilation fixes or ifdefs required for different versions of the compilers or third-party libraries
  7. Release branch specific changes (e.g. change version identifiers)

Any other change requires special dispensation from the release managers (currently @atalman, @osalpekar, @huydhn, @malfet). If this applies to your change please write "Special Dispensation" in the "Criteria Category:" template below and explain.

Phase 2 (after 09/11/23):

Note that changes here require us to rebuild a Release Candidate and restart extended testing (likely delaying the release). Therefore, the only accepted changes are Release-blocking critical fixes for: silent correctness, backwards compatibility, crashes, deadlocks, (large) memory leaks

Changes will likely require a discussion with the larger release team over VC or Slack.
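
The two-phase criteria above can be summarized as a small lookup. This is a minimal sketch, not a tool used by the release team: the short category labels, the `PHASE_1_END` constant, and both function names are hypothetical and only mirror the lists above. It also assumes that "until 09/11/23" includes September 11 itself.

```python
from datetime import date

# Assumption: "until 09/11/23" is read as inclusive of September 11, 2023.
PHASE_1_END = date(2023, 9, 11)

# Hypothetical short labels for the seven Phase 1 categories listed above.
PHASE_1_CATEGORIES = {
    "regression fix",
    "critical fix",
    "critical fix to new feature",
    "test/ci fix",
    "documentation",
    "compilation fix",
    "release only",
}

# Phase 2 lists only release-blocking critical fixes.
PHASE_2_CATEGORIES = {"critical fix"}


def release_phase(today: date) -> int:
    """Return 1 or 2 depending on which side of the cutoff `today` falls."""
    return 1 if today <= PHASE_1_END else 2


def meets_listed_criteria(category: str, today: date) -> bool:
    """True if the category appears in the written criteria for the current phase."""
    allowed = PHASE_1_CATEGORIES if release_phase(today) == 1 else PHASE_2_CATEGORIES
    return category.strip().lower() in allowed
```

The sketch encodes only the written criteria. A change that falls outside them is not automatically rejected, since the release managers can still grant an exception after discussion.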

Cherry-Pick Process

  1. Ensure your PR has landed in master. This does not apply for release-branch specific changes (see Phase 1 criteria).

  2. Create (but do not land) a PR against the release branch.

    # Find the hash of the commit you want to cherry pick
    # (for example, abcdef12345)
    git log
    
    git fetch origin release/2.1
    git checkout release/2.1
    git cherry-pick abcdef12345
    
    # Submit a PR based against 'release/2.1' either:
    # via the GitHub UI
    git push my-fork
    
    # via the GitHub CLI
    gh pr create --base release/2.1
  3. Make a request below with the following format:

Link to landed master PR (if applicable):
* 

Link to release branch PR:
* 

Criteria Category:
* 
  4. Someone from the release team will reply with approved / denied or ask for more information.
  5. If approved, someone from the release team will merge your PR once the tests pass. Do not land the release branch PR yourself.
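
For anyone filing several requests, the three-field format from step 3 can be generated rather than typed by hand. The helper below is a hypothetical sketch (the function name and its parameters are not part of any PyTorch tooling); it only reproduces the template text shown above. The example PR numbers are taken from a request that appears later in this thread.

```python
def format_cherry_pick_request(release_pr: str, category: str, master_pr: str = "NA") -> str:
    """Render a request comment in the three-field format from step 3."""
    return (
        "Link to landed master PR (if applicable):\n"
        f"* {master_pr}\n"
        "\n"
        "Link to release branch PR:\n"
        f"* {release_pr}\n"
        "\n"
        "Criteria Category:\n"
        f"* {category}\n"
    )


# Example usage with PR numbers quoted from a later comment in this issue.
example = format_cherry_pick_request(
    "#109114",
    "Critical fixes to new features",
    master_pr="#108895",
)
print(example)
```

Release-branch-only changes have no landed master PR, so `master_pr` defaults to "NA", matching how those requests are filled in below.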

NOTE: Our normal tools (ghstack / ghimport, etc.) do not work on the release branch.

Please note HUD Link with branch CI status and link to the HUD to be provided here.
HUD

Versions

2.1.0

@atalman atalman added this to the 2.1.0 milestone Aug 28, 2023
@atalman atalman pinned this issue Aug 28, 2023
@atalman
Contributor Author
atalman commented Aug 28, 2023

Link to landed master PR (if applicable):

  • NA

Link to release branch PR:

Criteria Category:

  • Release only changes

@atalman merged

@atalman
Contributor Author
atalman commented Aug 28, 2023

Link to landed master PR (if applicable):

  • NA

Link to release branch PR:

Criteria Category:

  • Release only changes

@atalman merged

@mikaylagawarecki
Contributor
mikaylagawarecki commented Aug 28, 2023

Link to landed master PR (if applicable):

Link to release branch PR:

Criteria Category:

  • Critical fixes to new feature

@atalman merged

@jansel
Contributor
jansel commented Aug 29, 2023

Link to landed master PR (if applicable):

Link to release branch PR:

Criteria Category:

  • regression in pyhpc_turbulent_kinetic_energy versus PyTorch 2.0

@jansel could you please include a link to an issue in OSS or a failing test in torchbench for this one and the next 3 cherry-picks? It's not clear when the regression appeared. Our criteria is:

jansel: done


@atalman merged

@jansel
Contributor
jansel commented Aug 29, 2023

Link to landed master PR (if applicable):

Link to release branch PR:

Criteria Category:

  • regression in tacotron2 versus PyTorch 2.0

@atalman merged

@jansel
Contributor
jansel commented Aug 29, 2023

Link to landed master PR (if applicable):

Link to release branch PR:

Criteria Category:

  • regression in DALLE2_pytorch versus PyTorch 2.0

@atalman merged

@jansel
Contributor
jansel commented Aug 29, 2023

Link to landed master PR (if applicable):

Link to release branch PR:

Criteria Category:

  • regression in nanogpt_generate versus PyTorch 2.0

@atalman merged

@shunting314
Contributor
shunting314 commented Aug 29, 2023

Link to landed master PR (if applicable):

Link to release branch PR:

Criteria Category:

  • critical fix for new feature

cc @jansel @albanD As per our discussion will not be merging this, this change was not tested enough

jansel: What discussion are you talking about? Why wasn't the author and myself included in it? cc @atalman

albanD: The discussion you were in @jansel on Tuesday where we decided not to upgrade the triton pin beyond what was there at branch cut time.

jansel: @alban this PR doesn't change the pin, it leaves it the same. This just fixes compatibility if users manually upgrade Triton (which is common, and what we and OpenAI will be telling H100 users to do).


@atalman merged

@XiaobingSuper
Collaborator
XiaobingSuper commented Aug 29, 2023

Link to landed master PR (if applicable):

Link to release branch PR:

Criteria Category:

  • ci fix

@atalman merged

@huydhn
Contributor
huydhn commented Aug 29, 2023

Link to landed master PR (if applicable):

Link to release branch PR:

Criteria Category:

  • Critical ci fix

@atalman merged

@huydhn
Contributor
huydhn commented Aug 30, 2023

Link to landed master PR (if applicable):

Link to release branch PR:

Criteria Category:

  • Critical ci fix

@atalman merged

@jataylo
Collaborator
jataylo commented Aug 30, 2023

Link to landed master PR (if applicable):

Link to release branch PR:

Criteria Category:

  • Critical fixes for triton wheel functionality on ROCm

@atalman merged

@kshitij12345
Collaborator
kshitij12345 commented Aug 30, 2023

Link to landed master PR (if applicable):

  • NA

Link to release branch PR:

Criteria Category:

  • Release only change

cc: @zou3519


@atalman merged

@colesbury colesbury added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Aug 30, 2023
@huydhn
Contributor
huydhn commented Aug 30, 2023

Link to landed master PR (if applicable):

Link to release branch PR:

Criteria Category:

  • Critical ci fix

@atalman merged

@andrewor14
Contributor
andrewor14 commented Aug 30, 2023

Link to landed master PR (if applicable):

Link to release branch PR:

Criteria Category:

  • Silent correctness

albanD: What is the critical silent correctness fixed here? This looks like new feature work given that this is a new API that the user is expected to use? And this doesn't solve the ignore of eval() mentioned in the issue either?


andrewor14: Yeah the silent correctness issue is not that obvious. Basically today we implicitly do the "dropout subgraph rewriting" in the convert call, so the UX for quantization looks like:

model = capture_pre_autograd_graph(model)
model = prepare_qat_pt2e(model, ...)
train(model)
model = convert_pt2e(model)  # dropout subgraph rewriting happens here
inference(model)

However, we aligned on making the "dropout subgraph rewriting" explicit in separate call outside of convert, so the new UX for quantization looks like:

model = capture_pre_autograd_graph(model)
model = prepare_qat_pt2e(model, ...)
train(model)
model = convert_pt2e(model)
torch.ao.quantization.move_model_to_eval(model)  # dropout subgraph rewriting happens here
inference(model)

In other words, if the move_model_to_eval is not available in 2.1 and the user continues to use the old UX (assuming that convert will implicitly take care of the dropout issue for them), then it'll be a silent BC breaking change when they upgrade from 2.1 to 2.2, since they're now supposed to call this new API move_model_to_eval, but this didn't exist in their code so they would just get a silently incorrect graph.

albanD: Thanks for the details, sounds good as a critical fix for new feature.


@osalpekar merged

@andrewor14
Contributor
andrewor14 commented Aug 30, 2023

Link to landed master PR (if applicable):

Link to release branch PR:

Criteria Category:

  • Silent correctness

albanD: Is this actually to solve #103681 ? The issue is not marked as silent-correctness?


andrewor14: Yes exactly. I added the tag to the issue. The silent correctness issue here is that users may export and then call model.eval() or model.train() and expect the graph to change behavior accordingly, but in fact nothing happens when you call those APIs. This affects quantization users especially, since they may be coming from other modes of quantization that do support model.eval() and model.train(). This is also part (2) of #108255.


@osalpekar merged

@jerryzh168
Contributor
jerryzh168 commented Aug 30, 2023

Link to landed master PR (if applicable):

Link to release branch PR:

Criteria Category:

  • Special Dispensation

Need this for the quantization 2.1 prototype release. We'd like to ask people to try out the reference quantized model representation and give feedback. Without this, we either have to say there is one missing op in the 2.1 stable release or ask people to use nightlies instead.

Also this is a very low risk change, related to quantization only


albanD: Why is this a critical fix for this feature? What is the reference model in question (Linear sounds like it would be used by any reference model you tested before branch cut)?

@jansel
Contributor
jansel commented Aug 30, 2023

Link to landed master PR (if applicable):

Link to release branch PR:

Criteria Category:

  • Silent correctness

@jansel could you please include a link to an issue in OSS or failing test in torchbench ?

jansel: #108472


@atalman merged

@huydhn
Contributor
huydhn commented Aug 31, 2023

Link to landed master PR (if applicable):

Link to release branch PR:

Criteria Category:

  • Critical ci fix

@atalman merged

@CaoE
Collaborator
CaoE commented Aug 31, 2023

Link to landed master PR (if applicable):

Link to release branch PR:

Criteria Category:

  • Special Dispensation

Need this to support the channels last memory format for 3D models. Without this, when using channels last for 3D models, it may introduce a lot of memory format conversion, resulting in reduced performance. In addition, this PR uses is_channels_last passed in by torch for deconv to notify ideep whether to go channels last or not to align with the memory format check of torch.


@atalman merged

@Valentine233
Collaborator
Valentine233 commented Aug 31, 2023

Link to landed master PR (if applicable):

Link to release branch PR:

Criteria Category:

  • Special Dispensation

Need this for SDPA (scaled dot product attention) optimization 2.1 release (2.1 prototype feature). A recent commit (#106274) breaks the pattern matching for SDPA, which makes the SDPA optimization in TorchInductor not take effect. This is a fix for the issue.


albanD: Isn't this a critical fix for a new feature from the feature list? Also why wasn't this caught before?

Valentine233: Answer @albanD, cc @atalman @Guobing-Chen

Isn't this a critical fix for a new feature from the feature list?

Yes, this is a critical fix for the feature SDPA. Without this fix, almost all SDPA-related models do not take effect.

Also why wasn't this caught before?

Because the commit breaking the pattern matching for SDPA was merged very closely (8/24) to PT2.1 code freezing day (8/25).

albanD: But I would expect that we have test in trunk that this is happening properly? Cherry picking sounds good
Can we add test to make sure that core sdpa is actually properly detected by this pattern?

Valentine233: We have tests for current SDPA pattern matchers. However, when the model graph changes, new pattern matchers are required. So I suppose that the test on the whole model is needed to make sure one model would have SDPA. cc @eellison @jansel


@atalman merged

@atalman
Contributor Author
atalman commented Sep 12, 2023

Link to landed master PR (if applicable):

Link to release branch PR:

Criteria Category:

  • Critical fix, including some headers in C++ extension raises compilation error

@atalman merged

@thiagocrepaldi
Collaborator
thiagocrepaldi commented Sep 12, 2023

Link to landed master PR (if applicable):
#108895

Link to release branch PR:
#109114

Criteria Category:
Critical fixes to new features introduced in the most recent minor release (e.g. 2.0.x for this release)

The new ONNX exporter based on Torch Dynamo needs the latest ONNX Runtime version to run several models


@atalman merged

@atalman
Contributor Author
atalman commented Sep 14, 2023

Link to landed master PR (if applicable):

Link to release branch PR:

Criteria Category:


@malfet merged

@angelayi
Contributor
angelayi commented Sep 14, 2023

Link to landed master PR (if applicable):

Link to release branch PR:

Criteria Category:


@atalman merged

@atalman
Contributor Author
atalman commented Sep 19, 2023

@gchanan
Contributor
gchanan commented Sep 20, 2023

@atalman what about "Documentation improvements" -- presumably these don't require a new RC?

EDIT: context is it would be good to get the documentation for numpy compile updated for the release (#109710).

@gchanan yes we can still submit Documentation improvements

@ezyang
Contributor
ezyang commented Sep 21, 2023

Doc only: #109764

Link to release branch PR: #109787


@atalman merged

@lezcano
Collaborator
lezcano commented Sep 21, 2023

The PR @gchanan mentioned:

Doc only: #109710

Link to release branch PR: #109789


@atalman merged

@atalman
Contributor Author
atalman commented Sep 21, 2023

Link to landed master PR (if applicable):

Link to release branch PR:

Criteria Category:

  • Critical Docker build fix

@atalman merged

@malfet
Contributor
malfet commented Sep 21, 2023

Link to the landed main PR:

Link to the release branch PR:

Criteria Category:

  • Documentation fixes

@malfet merged

@malfet
Contributor
malfet commented Sep 21, 2023

Link to the landed main PR:

Link to the release branch PR:

Criteria Category:

  • Documentation fixes

@malfet merged

@malfet
Contributor
malfet commented Sep 21, 2023

Link to the landed main PR:

Link to the release branch PR:

Criteria Category:

  • Critical fix to a new feature (Python-3.11 support)

@malfet merged

@malfet
Contributor
malfet commented Sep 21, 2023

Link to the landed main PR:

Link to the release branch PR:

Criteria Category:

  • Critical fix to a new feature (float8 support)

@malfet merged

@atalman
Contributor Author
atalman commented Sep 21, 2023

Link to landed master PR (if applicable):

  • NA

Link to release branch PR:

Criteria Category:

  • Release only Docker build

@atalman merged

@atalman
Contributor Author
atalman commented Sep 21, 2023

Link to landed master PR (if applicable):

Link to release branch PR:

Criteria Category:

  • Critical CI test fix

@atalman merged

@atalman
Contributor Author
atalman commented Sep 21, 2023

Link to landed master PR (if applicable):

Link to release branch PR:

Criteria Category:

  • Critical CI test fix

@atalman merged

@atalman
Contributor Author
atalman commented Sep 21, 2023

Link to landed master PR (if applicable):

Link to release branch PR:

Criteria Category:


@atalman merged

@atalman
Contributor Author
atalman commented Sep 21, 2023

We are in Phase 2 (after 09/11/23):

Note that changes here require us to rebuild a Release Candidate and restart extended testing (likely delaying the release). Therefore, the only accepted changes are Release-blocking critical fixes for: silent correctness, backwards compatibility, crashes, deadlocks, (large) memory leaks

@EwoutH
EwoutH commented Sep 26, 2023

Will Python 3.12 be supported by the PyTorch 2.1.x series?

@atalman: No, patch releases contain only critical fixes. We advance Python and CUDA versions for minor or major releases.

@williamwen42
Member

Python 3.12 won't be supported in 2.1.0. We currently do not have a plan for 3.12 support, but I'll get to it some time.

@vince62s
vince62s commented Sep 28, 2023

is there a specific target date for 2.1 release ?

@atalman Please follow https://dev-discuss.pytorch.org/t/pytorch-release-2-1-0/1271 . M6: Release Day (10/04/23)

@atalman atalman unpinned this issue Oct 6, 2023
@atalman
Contributor Author
atalman commented Oct 6, 2023

@atalman
Contributor Author
atalman commented Oct 6, 2023

Link to landed master PR (if applicable):

Link to release branch PR:

Criteria Category:

  • Critical Docker build fix.

@atalman merged

@atalman
Contributor Author
atalman commented Oct 6, 2023

Closing this task. Release 2.1 is complete

@atalman atalman closed this as completed Oct 6, 2023
@huydhn
Contributor
huydhn commented Oct 11, 2023

Link to landed master PR (if applicable):

Link to release branch PR:

Criteria Category:

  • Critical CI change to release Android binaries.

@huydhn merged

@huydhn
Contributor
huydhn commented Oct 11, 2023

Link to landed master PR (if applicable):

  • NA

Link to release branch PR:

Criteria Category:

  • Release only changes

@huydhn merged

@huydhn
Contributor
huydhn commented Oct 11, 2023

Link to landed master PR (if applicable):

  • NA

Link to release branch PR:

Criteria Category:

  • Release only changes

@huydhn merged

@huydhn
Contributor
huydhn commented Oct 11, 2023

Link to landed master PR (if applicable):

Link to release branch PR:

Criteria Category:

  • Critical CI change to release Android binaries.

@huydhn merged
