sampling: make top_n_sigma no-op at <=0 rather than <0 #13345
Merged
This changes the behavior of the recently-added `top_n_sigma` sampler to a short-circuit no-op state at values <= 0 rather than < 0. The rationale for this change is as follows:

- `top_n_sigma == 0` is redundant, as it is a more roundabout way to achieve greedy decoding, which already has other means of being specified, e.g. `top_k == 1`
- `top_n_sigma == 0` represents a no-op rather than greedy decoding in other existing tooling (e.g. text-generation-webui, aphrodite-engine, koboldcpp, YALS), so this would keep the interface consistent for frontend developers
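
For illustration, the proposed semantics can be sketched in a few lines of Python. This is a minimal sketch rather than the actual implementation in this PR: the function name `top_n_sigma` and its list-based signature are hypothetical, and it assumes the sampler keeps tokens whose logit is within `n` standard deviations of the maximum logit, using the population standard deviation.

```python
import math

def top_n_sigma(logits, n):
    """Hypothetical helper: mask logits below max - n * sigma.

    Assumption: sigma is the population standard deviation of the logits.
    """
    # Proposed behavior: any n <= 0 short-circuits and leaves logits untouched.
    if n <= 0 or not logits:
        return list(logits)

    max_logit = max(logits)
    mean = sum(logits) / len(logits)
    sigma = math.sqrt(sum((x - mean) ** 2 for x in logits) / len(logits))

    # Tokens below the threshold are masked out with -inf.
    threshold = max_logit - n * sigma
    return [x if x >= threshold else -math.inf for x in logits]

logits = [1.0, 2.0, 3.0, 10.0]
print(top_n_sigma(logits, 0))   # no-op under the proposed behavior
print(top_n_sigma(logits, 1))   # only logits within one sigma of the max survive
```

Without the `n <= 0` guard, `n == 0` would set the threshold equal to the maximum logit and keep only the top token(s), which is the greedy-decoding behavior the first bullet describes as redundant.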