Labels: module: pipelining (Pipeline Parallelism), oncall: distributed
Description
🐛 Describe the bug
When running PyTorch pipeline parallelism (PP), the backward-step code requires every input to a stage's forward to have a gradient. This is too strict and should perhaps be relaxed so that at least one input requiring grad is sufficient.
```python
def forward(self, param1, param2, ...):
    # currently, both param1 and param2 must require grad
```
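A minimal sketch of the underlying autograd behavior (the module and tensor names here are illustrative, not from the report): when one input does not require grad, autograd leaves its `.grad` as `None`, and that `None` is the gradient the pipeline stage then tries to send upstream.

```python
import torch
import torch.nn as nn

class TwoInputBlock(nn.Module):
    # Stand-in for a pipeline stage whose forward takes two inputs.
    def forward(self, a, b):
        return (a * b).sum()

block = TwoInputBlock()
a = torch.randn(4, requires_grad=True)  # e.g. activations from the previous stage
b = torch.randn(4)                      # e.g. a mask: requires_grad=False

loss = block(a, b)
loss.backward()
print(a.grad)  # a tensor: gradient flowed to a
print(b.grad)  # None: b never required grad, so there is nothing to send back
```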
Error message:
```
RuntimeError: [7] for chunk 0 has gradients None and is expecting to send gradients to stage 6,
```
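Until the check is relaxed, one possible workaround (my own suggestion, not something the report prescribes; `force_requires_grad` is a hypothetical helper) is to force every floating-point input to require grad before it enters the schedule, so each stage boundary has a real gradient to send:

```python
import torch

def force_requires_grad(*tensors):
    # Hypothetical helper: turn every floating-point input into a
    # grad-requiring leaf so the PP backward never sees a None gradient
    # at a stage boundary. Non-float tensors are passed through as-is.
    return tuple(
        t.requires_grad_() if t.is_floating_point() and not t.requires_grad else t
        for t in tensors
    )

x = torch.randn(4)
mask = torch.ones(4)
x, mask = force_requires_grad(x, mask)  # both now require grad
```

The trade-off is extra memory for gradient buffers on tensors that do not actually need gradients, which is why relaxing the check itself seems preferable.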
Versions
PyTorch trunk
cc @H-Huang @awgu @wanchaol @fegin @fduwjj @wz337 @wconstab @d4l3k