E478 GRPO: size does not match at dimension 0 expected index [1685, 1] to be no larger than self [1684, 152064] apart from dimension 1 · Issue #6035 · modelscope/ms-swift · GitHub

GRPO: size does not match at dimension 0 expected index [1685, 1] to be no larger than self [1684, 152064] apart from dimension 1 #6035

@wizkdc

Description

Describe the bug
I hit an error similar to #3377, with one key difference: I am using the multi-turn scheduler with the "last round" option enabled, and sequence_parallel_size is not enabled.

While debugging, I traced the problem to the following code snippet:

https://github.com/modelscope/ms-swift/blob/main/swift/trainers/rlhf_trainer/grpo_trainer.py#L2924-L2940

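Specifically, when a chunk of the data is shorter than logits_to_keep (which is derived from the entire batch), the index tensor ends up one row longer than the logits along dimension 0 ([1685, 1] versus [1684, 152064] in the error above), which produces this size mismatch.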

As a workaround, I passed a padding_to argument to the data collator call. The code now runs correctly, but I am not sure whether this is the best fix.

-    chunk_inputs.update(to_device(template.data_collator(encoded_data), self.model.device))
+    chunk_inputs.update(to_device(template.data_collator(encoded_data, padding_to=chunk_inputs["logits_to_keep"] + 1), self.model.device))
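
To illustrate why the padding helps, here is a minimal sketch of the shape arithmetic in plain Python. It is not the code from grpo_trainer.py. It assumes the trainer keeps the last logits_to_keep + 1 logit positions, drops the final one, and then indexes them with the last logits_to_keep token ids, which is a common pattern but an assumption on my part. The function name sliced_lengths and the value 1700 for logits_to_keep are hypothetical; only 1685 and 1684 come from the error message.

```python
def sliced_lengths(seq_len: int, logits_to_keep: int) -> tuple[int, int]:
    """Return (logits_rows, index_rows) after the assumed slicing.

    Assumed pattern (not verified against ms-swift):
      logits = logits[-(logits_to_keep + 1):][:-1]
      index  = input_ids[-logits_to_keep:]
    """
    positions = list(range(seq_len))  # stand-in for one sequence in the chunk
    logits_rows = positions[-(logits_to_keep + 1):][:-1]
    index_rows = positions[-logits_to_keep:]
    return len(logits_rows), len(index_rows)


chunk_len = 1685        # padded length of the short chunk (from the error message)
logits_to_keep = 1700   # hypothetical value derived from the whole batch

# Without extra padding the chunk is shorter than logits_to_keep + 1,
# so the index is one row longer than the logits.
print(sliced_lengths(chunk_len, logits_to_keep))

# Padding the chunk to logits_to_keep + 1 makes both lengths equal.
padded_len = max(chunk_len, logits_to_keep + 1)
print(sliced_lengths(padded_len, logits_to_keep))
```

Under these assumptions the first call gives 1684 logit rows against 1685 index rows, matching the shapes in the error, and the second call gives equal lengths, which is consistent with padding_to=logits_to_keep + 1 resolving the mismatch.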

Your hardware and system info
Write your system info here, such as CUDA version, operating system, GPU model, and torch version.

Additional context
Add any other context about the problem here.
