GitHub - yanghu819/modded-nanogpt: Mort optimizer: momentum orthogonal optimizer

Mort optimizer

The snippet below applies one Gram-Schmidt step inside the optimizer update: it removes from the gradient g its component along the momentum buffer buf, so that what remains is orthogonal to the momentum direction:

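                    # buf: momentum buffer, g: current gradient (same shape as buf)
                    ref_norm = buf / (buf.norm() + 1e-8)  # unit-norm momentum direction
                    proj = g @ ref_norm.T @ ref_norm      # component of g along the momentum
                    g = g - proj                          # keep only the orthogonal part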

This process:

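  • First normalizes the momentum vector; the 1e-8 in the denominator guards against division by zero when the buffer is all zeros.
  • Then computes the projection of the gradient onto the momentum direction.
  • Finally subtracts this projection to obtain the orthogonal component.

The resulting gradient (g_orth, which the snippet writes back into g) is orthogonal to the momentum direction, up to the small 1e-8 term, when g and buf are 1×n row vectors. For general m×n matrices, g @ ref_norm.T @ ref_norm is not the projection under the Frobenius inner product, so orthogonality in that sense is not guaranteed.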
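The three steps can be traced on a small example. The following is a minimal sketch, not the repository's code: it uses plain Python lists instead of tensors so that it runs without PyTorch, it treats the gradient and the momentum as flat vectors, and the function name orthogonalize_against is hypothetical.

```python
import math


def orthogonalize_against(g, buf, eps=1e-8):
    """Remove from g its component along buf (both flat lists of floats).

    Hypothetical helper for illustration; eps mirrors the 1e-8 in the snippet.
    """
    # Step 1: normalize the momentum vector; eps guards against a zero buffer.
    norm = math.sqrt(sum(b * b for b in buf))
    ref = [b / (norm + eps) for b in buf]
    # Step 2: projection of g onto the momentum direction, (g . ref) * ref.
    coeff = sum(gi * ri for gi, ri in zip(g, ref))
    proj = [coeff * ri for ri in ref]
    # Step 3: subtract the projection to keep the orthogonal component.
    return [gi - pi for gi, pi in zip(g, proj)]


g = [3.0, 4.0, 0.0]
buf = [1.0, 0.0, 0.0]
g_orth = orthogonalize_against(g, buf)

# Inner product with the momentum: close to 0, up to the small bias from eps.
residual = sum(a * b for a, b in zip(g_orth, buf))
print(g_orth, residual)
```

With an all-zero buffer the normalized direction is zero as well, so the gradient passes through unchanged, which is the case the eps term is there to handle.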
