Optimize LRScheduler docs #146684
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/146684
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit af38ae0 with merge base 13339ce.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Thanks for the attempt, but I don't think this PR addresses most of the concerns in the linked issue.
Force-pushed from 99dd534 to aabf362
Hi @janeyx99, please check this version; if anything needs to be added, please let me know. Thanks!
Hello @janeyx99, please check whether any improvements should be made when you are available. Thanks!
@pytorchbot rebase -b main
@pytorchbot started a rebase job onto refs/remotes/origin/main. Check the current status here.
Successfully rebased.
Force-pushed from aabf362 to a1304ed
@zeshengzong sorry, I haven't had the bandwidth to properly review this change. I want to make sure it actually addresses the concerns in the issue as clearly as possible. I'm hoping to get to it before the end of the week; feel free to ping me here to remind me if that does not happen!
Force-pushed from 9c82b7f to af38ae0
Hello @janeyx99, please help review this one when you are available. Thanks!
Thanks for taking a look. This PR would not fully fix the concerns in the issue. See my review below and #120735 (comment).
I do not expect fixing the issue to be trivial.
optimizer (Optimizer): Wrapped optimizer. (Find more about
    `optimizer <https://pytorch.org/docs/stable/optim.html#base-class>`_).
last_epoch (int, optional): The index of the last training epoch to resume.
    Default: -1, starts from the optimizer lr.
Default: -1, starts from the optimizer lr.
The "starts from the optimizer lr" wording is still confusing; see #120735 (comment).
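For context, a minimal sketch of the last_epoch behavior being discussed, assuming current torch.optim semantics (this is an editor illustration, not part of the PR): with the default last_epoch=-1 the scheduler records the optimizer's current lr as initial_lr in each param group, while resuming with last_epoch >= 0 expects initial_lr to already be present there (e.g. restored from a checkpoint).

```python
import torch
from torch.optim.lr_scheduler import StepLR

model = torch.nn.Linear(4, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1)

# Default last_epoch=-1: the scheduler copies the optimizer's current lr into
# each param group as 'initial_lr' and starts the schedule from epoch 0.
sched = StepLR(opt, step_size=10, gamma=0.5)
print(opt.param_groups[0]["initial_lr"])  # 0.1

# Resuming with last_epoch >= 0 requires 'initial_lr' to already exist in every
# param group (normally restored from a checkpoint); otherwise a KeyError is raised.
opt2 = torch.optim.SGD(model.parameters(), lr=0.1)
try:
    StepLR(opt2, step_size=10, gamma=0.5, last_epoch=5)
except KeyError as err:
    print("resume failed:", err)
```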
@@ -252,7 +277,7 @@ class LambdaLR(LRScheduler):
     """Sets the initial learning rate.

     The learning rate of each parameter group is set to the initial lr
-    times a given function. When last_epoch=-1, sets initial lr as lr.
+    times a given function. When last_epoch=-1, use optimizer's lr as inital lr.
Suggested change:
-    times a given function. When last_epoch=-1, use optimizer's lr as inital lr.
+    times a given function. When last_epoch=-1, use optimizer's lr as initial lr.
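As a reference point for the LambdaLR docstring under review, here is a small sketch of the documented behavior (hyperparameter values are illustrative): each group's lr is the initial lr times the value of the given function at the current epoch, and with last_epoch=-1 the initial lr is taken from the optimizer.

```python
import torch
from torch.optim.lr_scheduler import LambdaLR

model = torch.nn.Linear(4, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1)

# lr at epoch t = initial_lr * lr_lambda(t); with last_epoch=-1 the
# initial lr is the optimizer's lr (0.1 here).
sched = LambdaLR(opt, lr_lambda=lambda epoch: 0.95 ** epoch)

for _ in range(3):
    opt.step()
    sched.step()

print(sched.get_last_lr())  # [0.1 * 0.95 ** 3] ~= [0.0857]
```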
@@ -458,7 +483,7 @@ class StepLR(LRScheduler):
     """Decays the learning rate of each parameter group by gamma every step_size epochs.

     Notice that such decay can happen simultaneously with other changes to the learning rate
-    from outside this scheduler. When last_epoch=-1, sets initial lr as lr.
+    from outside this scheduler. When last_epoch=-1, use optimizer's lr as inital lr.
Suggested change:
-    from outside this scheduler. When last_epoch=-1, use optimizer's lr as inital lr.
+    from outside this scheduler. When last_epoch=-1, use optimizer's lr as initial lr.
What if last_epoch is not -1? Then what's the initial lr?
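For the StepLR hunk above, a brief sketch of the decay itself with last_epoch left at its default (editor illustration, values assumed): the lr is multiplied by gamma every step_size epochs. The resume case the reviewer asks about relies on initial_lr already being stored in the param groups, as in the earlier sketch.

```python
import torch
from torch.optim.lr_scheduler import StepLR

model = torch.nn.Linear(4, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
sched = StepLR(opt, step_size=2, gamma=0.1)  # multiply lr by 0.1 every 2 epochs

for epoch in range(6):
    print(epoch, sched.get_last_lr())  # 0.1, 0.1, 0.01, 0.01, 0.001, 0.001
    opt.step()
    sched.step()
```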
@@ -1010,7 +1035,7 @@ class CosineAnnealingLR(LRScheduler):
         & T_{cur} = (2k+1)T_{max}.
     \end{aligned}

-    When last_epoch=-1, sets initial lr as lr. Notice that because the schedule
+    When last_epoch=-1, use optimizer's lr as inital lr. Notice that because the schedule
please spell initial correctly throughout this file
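For the CosineAnnealingLR hunk, a small sketch comparing the scheduler's lr against the closed-form cosine annealing formula from the docstring (T_max and eta_min are illustrative values; the two columns should agree up to floating-point error within the first cycle).

```python
import math
import torch
from torch.optim.lr_scheduler import CosineAnnealingLR

model = torch.nn.Linear(4, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1)       # eta_max = 0.1
sched = CosineAnnealingLR(opt, T_max=10, eta_min=0.001)

for t in range(11):
    # eta_t = eta_min + (eta_max - eta_min) * (1 + cos(pi * t / T_max)) / 2
    expected = 0.001 + (0.1 - 0.001) * (1 + math.cos(math.pi * t / 10)) / 2
    print(t, sched.get_last_lr()[0], expected)
    opt.step()
    sched.step()
```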
Hi, thanks for your time! I was thinking about how much detail should be exposed to users in the public docs; from my view, something like #120735 (comment) describes a lot of detail that users may not actually need in practice, like:
And only add a description about
I'd be curious to see the diagram, and I agree that LRScheduler is not the easiest to document, with its many intricacies. We can try to land some version of this PR but not close the original issue, as I think the original issue brings up valid points that should be addressed further (not just in docs, but perhaps in future design as well).
Fixes part of #120735 by adding more description about `LRScheduler`.

Changes
- `last_epoch` is explained in the Args description; the difference should become clear after comparing it with the `step` method description. `initial_lr` is set when the `LRScheduler` is initialized, using the optimizer's lr value, which is in turn set when the optimizer is created. But these are inner implementation details that users don't need to care about.
- Add descriptions to `get_last_lr` and `get_lr`; the private method `_get_closed_form_lr` should not be exposed to users in the docs.
- Add the deprecation of the `epoch` argument to the `step` method doc.
- Update the `step` doc. These params seem to be handled by `LRScheduler` itself, so no usage example is given here; I assume users can omit these params when using `LRScheduler` (see the usage sketch below).
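A minimal usage sketch for the updated step/get_last_lr docs (the scheduler and hyperparameters here are illustrative, not taken from the PR): step() is called once per epoch without the deprecated epoch argument, and get_last_lr() reports the lr most recently computed by the scheduler.

```python
import torch
from torch.optim.lr_scheduler import StepLR

model = torch.nn.Linear(4, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
sched = StepLR(opt, step_size=30, gamma=0.1)

for epoch in range(3):
    # ... forward/backward passes and opt.step() per batch go here ...
    opt.step()
    sched.step()                # preferred: no epoch argument
    print(sched.get_last_lr())  # lr most recently computed by the scheduler
# sched.step(epoch) is still accepted but emits a deprecation warning
```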
Test Result
Before
After
cc @janeyx99