Optimize LRScheduler docs #146684


Open · wants to merge 3 commits into main from opt/docs/LRScheduler

Conversation

@zeshengzong (Contributor) commented Feb 7, 2025:

Fixes part of #120735 by adding more description to the LRScheduler docs.

Changes

  1. What the constructor's last_epoch argument is.
    And does epoch start from 0 or 1?
    Also there are two terms - "epoch", "step" - are they the same?

last_epoch is explained in the Args description; the difference between "epoch" and "step" should become clear when compared with the step method description.
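A minimal sketch of this (standard torch.optim calls; the toy model, lr, and step_size are chosen only for illustration) showing how "epoch" and "step" line up when the scheduler is stepped once per epoch, and how last_epoch counts those steps:

```python
import torch

model = torch.nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=2, gamma=0.5)

print(scheduler.last_epoch)  # 0: the constructor already performed the initial step
for epoch in range(3):
    # ... training for one epoch would go here ...
    optimizer.step()
    scheduler.step()         # one scheduler "step" per "epoch" in this pattern
print(scheduler.last_epoch)  # 3
```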

  2. That the constructor relies on/creates the 'initial_lr' property on the .optimizer.
    (By the way, is the Optimizer class up with it?)

initial_lr is set when the LRScheduler is initialized, using the optimizer's lr value, which is in turn set when the optimizer is created. But these are internal implementation details that users don't need to care about.
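A sketch of that internal detail (behavior assumed from the current implementation; values are arbitrary): with last_epoch=-1 the scheduler copies each param group's current lr into an initial_lr entry on the optimizer.

```python
import torch

optimizer = torch.optim.SGD([torch.nn.Parameter(torch.zeros(1))], lr=0.05)
print("initial_lr" in optimizer.param_groups[0])   # False: the optimizer alone never sets it

scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.9)
print(optimizer.param_groups[0]["initial_lr"])     # 0.05, copied from the optimizer's lr
```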

  3. That .get_last_lr() and .get_lr() are totally different methods despite the naming.
    Wait, there are also ._get_closed_form_lr() methods, hm...

Added descriptions to get_last_lr and get_lr; the private method _get_closed_form_lr should not be exposed to users in the docs.
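A short sketch of the distinction (toy values; get_lr is deliberately not called, since it is the per-scheduler hook rather than a user-facing query):

```python
import torch

optimizer = torch.optim.SGD([torch.nn.Parameter(torch.zeros(1))], lr=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.5)

optimizer.step()
scheduler.step()
print(scheduler.get_last_lr())  # [0.05]: the rates most recently written into the optimizer
# scheduler.get_lr() is the hook step() uses internally; calling it directly
# typically warns and may not match what the optimizer is currently using.
```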

  4. That the constructor does .step() itself via ._initial_step() method.
  5. Which arguments the .step() method has.
    Or is it (the epoch argument) deprecated?

Added a note in the step method doc that the epoch argument is deprecated.
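A sketch of the deprecated form (the warning capture is only there to make the effect visible; the exact warning text is assumed, not quoted from the docs):

```python
import warnings
import torch

optimizer = torch.optim.SGD([torch.nn.Parameter(torch.zeros(1))], lr=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1)

optimizer.step()
scheduler.step()                   # preferred: no arguments
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    scheduler.step(epoch=7)        # still accepted, but emits a deprecation warning
print(len(caught) > 0)             # True: the epoch argument is on its way out
```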

  6. What does .step() do?
    Does it modify (in some way) the .optimizer? (See also p.2.)

Updated the step method doc.
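A sketch of the effect being documented (assumed behavior, toy values): step() recomputes the learning rate and writes it back into optimizer.param_groups, which is how the scheduler modifies the optimizer.

```python
import torch

optimizer = torch.optim.SGD([torch.nn.Parameter(torch.zeros(1))], lr=0.1)
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.5)

print(optimizer.param_groups[0]["lr"])  # 0.1
optimizer.step()                        # normally preceded by a backward pass
scheduler.step()
print(optimizer.param_groups[0]["lr"])  # 0.05: the scheduler overwrote the optimizer's lr
```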

  7. Are the .last_epoch, .base_lrs attributes public?
    (Don't know, maybe it is not accepted to publish such information.)
    (For .last_epoch, see also p.1a.)

These attributes seem to be handled by LRScheduler itself, and no usage example is given here; I assume users can ignore them when using LRScheduler.
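For completeness, a tiny sketch reading those attributes (reading them is harmless even if the docs stop short of calling them public API):

```python
import torch

optimizer = torch.optim.SGD([torch.nn.Parameter(torch.zeros(1))], lr=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10)

print(scheduler.base_lrs)    # [0.1], one entry per param group
print(scheduler.last_epoch)  # 0 right after construction
```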

Test Result

Before

[screenshot]

After

[screenshots]

cc @janeyx99

@pytorch-bot (bot) commented Feb 7, 2025:

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/146684

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit af38ae0 with merge base 13339ce:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@zeshengzong zeshengzong marked this pull request as ready for review February 7, 2025 09:16
@albanD albanD removed their request for review February 7, 2025 10:15
@mikaylagawarecki mikaylagawarecki added the triaged label (this issue has been looked at by a team member, and triaged and prioritized into an appropriate module) Feb 7, 2025
@janeyx99 (Contributor) left a comment:

Thanks for the attempt, but I don't think this PR addresses most of the concerns in the linked issue.

@zeshengzong zeshengzong marked this pull request as draft February 14, 2025 01:32
@zeshengzong (Contributor Author) commented:

Hi @janeyx99, please check this version; if anything needs to be added, please let me know, thanks!

@zeshengzong zeshengzong marked this pull request as ready for review February 17, 2025 07:23
@zeshengzong (Contributor Author) commented:

Hello @janeyx99, please check whether any improvements should be made when you get a chance, thanks!

@zeshengzong (Contributor Author) commented:

@pytorchbot rebase -b main

@pytorchmergebot (Collaborator) commented:

@pytorchbot started a rebase job onto refs/remotes/origin/main. Check the current status here

@pytorchmergebot (Collaborator) commented:

Successfully rebased opt/docs/LRScheduler onto refs/remotes/origin/main, please pull locally before adding more changes (for example, via git checkout opt/docs/LRScheduler && git pull --rebase)

@janeyx99 (Contributor) commented:

@zeshengzong sorry I haven't had the bandwidth to properly review this change; I want to make sure it actually addresses the concerns in the issue as clearly as possible. I'm hoping to get to it before the end of the week; feel free to ping me here as a reminder if that does not happen!

@zeshengzong zeshengzong force-pushed the opt/docs/LRScheduler branch from 9c82b7f to af38ae0 Compare April 23, 2025 02:27
@zeshengzong (Contributor Author) commented:

Hello @janeyx99, please help review this one when you're available, thanks!

@janeyx99 (Contributor) left a comment:

Thanks for taking a look. This PR would not fully fix the concerns in the issue. See my review below and #120735 (comment).

I do not expect fixing the issue to be trivial.

optimizer (Optimizer): Wrapped optimizer. (Find more about
`optimizer <https://pytorch.org/docs/stable/optim.html#base-class>`_) .
last_epoch (int, optional): The index of the last training epoch to resume.
Default: -1, starts from the optimizer lr.

Suggested change
Default: -1, starts from the optimizer lr.
Default: -1, starts from the optimizer lr.

The "starts from the optimizer lr" wording is still confusing, see #120735 (comment)

@@ -252,7 +277,7 @@ class LambdaLR(LRScheduler):
"""Sets the initial learning rate.

The learning rate of each parameter group is set to the initial lr
times a given function. When last_epoch=-1, sets initial lr as lr.
times a given function. When last_epoch=-1, use optimizer's lr as inital lr.

Suggested change
times a given function. When last_epoch=-1, use optimizer's lr as inital lr.
times a given function. When last_epoch=-1, use optimizer's lr as initial lr.

@@ -458,7 +483,7 @@ class StepLR(LRScheduler):
"""Decays the learning rate of each parameter group by gamma every step_size epochs.

Notice that such decay can happen simultaneously with other changes to the learning rate
from outside this scheduler. When last_epoch=-1, sets initial lr as lr.
from outside this scheduler. When last_epoch=-1, use optimizer's lr as inital lr.

Suggested change
from outside this scheduler. When last_epoch=-1, use optimizer's lr as inital lr.
from outside this scheduler. When last_epoch=-1, use optimizer's lr as initial lr.

What if last_epoch is not -1? Then what's the initial lr?
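For reference, a sketch of what currently appears to happen when last_epoch is not -1 (behavior assumed from the implementation, not from the docs): the scheduler does not copy the optimizer's lr but instead expects initial_lr to already be present in each param group, e.g. restored from a checkpoint.

```python
import torch

optimizer = torch.optim.SGD([torch.nn.Parameter(torch.zeros(1))], lr=0.1)
try:
    torch.optim.lr_scheduler.StepLR(optimizer, step_size=5, last_epoch=3)
except KeyError as err:
    print(err)  # complains that 'initial_lr' is not specified in param_groups[0]

optimizer.param_groups[0]["initial_lr"] = 0.1  # what a resumed run would normally provide
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=5, last_epoch=3)
print(scheduler.last_epoch)  # 4: the constructor performs one step on top of last_epoch=3
```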

@@ -1010,7 +1035,7 @@ class CosineAnnealingLR(LRScheduler):
& T_{cur} = (2k+1)T_{max}.
\end{aligned}

When last_epoch=-1, sets initial lr as lr. Notice that because the schedule
When last_epoch=-1, use optimizer's lr as inital lr. Notice that because the schedule

please spell initial correctly throughout this file

@zeshengzong (Contributor Author) commented:

Thanks for taking a look. This PR would not fully fix the concerns in the issue. See my review below and #120735 (comment).

I do not expect fixing the issue to be trivial.

Hi, thanks for your time! I was thinking about how much detail should be exposed to users in the public docs; from my point of view, something like #120735 (comment) describes so much internal detail that I may not actually need it when using the scheduler, for example:

So after the scheduler object, the "first" (i.e. <last_epoch argument>-th) step had been done.
Also the optimizer's 'lr's had been replaced).
None of this behavior is documented.

And only adding descriptions for last_epoch and the other params doesn't seem enough to explain "how it works". A better solution might be to draw a diagram of the interaction between lr_scheduler and optimizer in a training loop (also including user-passed params). WDYT? @janeyx99
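Not a diagram, but a minimal training-loop sketch of the interaction in question (standard torch.optim calls; the model, data, and hyperparameters are placeholders):

```python
import torch

model = torch.nn.Linear(8, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)             # user-chosen lr
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.5)
# constructor: records base_lrs / initial_lr and performs the initial step

for epoch in range(30):
    for x, y in [(torch.randn(4, 8), torch.randn(4, 1))]:           # stand-in for a DataLoader
        optimizer.zero_grad()
        loss = torch.nn.functional.mse_loss(model(x), y)
        loss.backward()
        optimizer.step()       # updates parameters using param_groups[...]["lr"]
    scheduler.step()           # rewrites param_groups[...]["lr"] for the next epoch
```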

@janeyx99 (Contributor) commented May 2, 2025:

I'd be curious to see the diagram, and I agree that LRScheduler is not the easiest to document with its many intricacies. We can try to land some version of this PR but not close the original issue, as I think the original issue brings up valid points that should be addressed more (not just in docs, but perhaps in future design as well).

Labels: open source · release notes: optim · triaged
5 participants