GRPO: size does not match at dimension 0 expected index [1685, 1] to be no larger than self [1684, 152064] apart from dimension 1 · Issue #6035 · modelscope/ms-swift · GitHub
Describe the bug
I encountered an issue similar to #3377, but with a key difference: I was using the multiturn scheduler with the "last round" option enabled, and I did not enable sequence_parallel_size.
Through debugging, I found that the problem appears to stem from the following code snippet.
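Specifically, logits_to_keep is derived from the entire batch, so when a chunk's sequence length happens to be smaller than logits_to_keep, the index tensor ends up longer than the logits tensor along dimension 0 (1685 vs. 1684 in the error above), which produces this size mismatch.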
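To make the condition concrete, here is a minimal pure-Python sketch of how a per-chunk length can fall below a batch-wide logits_to_keep. It is not ms-swift code: the token ids, the chunk size, and the helper `chunk_lengths` are hypothetical, and it assumes each chunk is padded only to its own longest sequence.

```python
# Hypothetical token-id sequences standing in for one training batch.
batch = [[11, 12, 13, 14, 15], [21, 22, 23], [31, 32], [41, 42, 43]]

# Assumption: logits_to_keep is computed once, over the entire batch.
logits_to_keep = max(len(seq) for seq in batch)


def chunk_lengths(batch, chunk_size):
    """Return the padded length of each chunk, assuming every chunk
    is padded only up to its own longest sequence."""
    lengths = []
    for start in range(0, len(batch), chunk_size):
        chunk = batch[start:start + chunk_size]
        lengths.append(max(len(seq) for seq in chunk))
    return lengths


lengths = chunk_lengths(batch, chunk_size=2)

# Chunks shorter than logits_to_keep are the ones that would yield fewer
# logits positions than the index built from the batch-wide value.
too_short = [n for n in lengths if n < logits_to_keep]
print(logits_to_keep, lengths, too_short)
```

In this toy batch the second chunk is shorter than the batch-wide value, which is the situation described above.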
To address this, I made a modification by adding a padding_to parameter. The code now runs correctly, but I'm not certain whether this is the optimal solution.
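The idea behind the workaround can be sketched as follows. This is only an illustration under assumptions, not the actual patch: `pad_chunk`, `PAD_ID`, and the right-padding choice are hypothetical, and only the parameter name padding_to comes from the description above.

```python
PAD_ID = 0  # hypothetical pad token id


def pad_chunk(chunk, padding_to=None):
    """Right-pad every sequence in a chunk to a common length.

    Without padding_to, the target is the chunk's own longest sequence.
    With padding_to, the target is at least padding_to, so the chunk can
    never end up shorter than a length computed over the whole batch.
    """
    target = max(len(seq) for seq in chunk)
    if padding_to is not None:
        target = max(target, padding_to)
    return [seq + [PAD_ID] * (target - len(seq)) for seq in chunk]


# Hypothetical batch; logits_to_keep is assumed to be batch-wide.
batch = [[11, 12, 13, 14, 15], [21, 22, 23], [31, 32], [41, 42, 43]]
logits_to_keep = max(len(seq) for seq in batch)

second_chunk = batch[2:4]
unpadded = pad_chunk(second_chunk)
padded = pad_chunk(second_chunk, padding_to=logits_to_keep)
print(len(unpadded[0]), len(padded[0]))
```

A real fix would presumably also need to extend the attention mask and labels consistently with the added padding, which is one reason this may not be the optimal solution.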