Feature Request: Add Min-P sampling layer · Issue #1154 · NVIDIA/TensorRT-LLM · GitHub
Feature Request: Add Min-P sampling layer #1154
@aikitoria

Description

It would be very nice if the library supported using Min-P sampling as an alternative to Top-P/Top-K. This became popular for local LLMs in the past few months because it provides significantly more useful results, or at least feels like it does. More info here: https://www.reddit.com/r/LocalLLaMA/comments/17vonjo/your_settings_are_probably_hurting_your_model_why/

Most other libraries already support it, examples:
turboderp-org/exllamav2@0d436d7
ggml-org/llama.cpp#3841

This only requires a single parameter: consider all tokens whose probability is greater than the probability of the most likely token scaled down by some factor.
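
To make the rule concrete, here is a minimal sketch in plain Python. It is not taken from TensorRT-LLM, exllamav2, or llama.cpp; the names `min_p_filter` and `sample_min_p` are hypothetical, and the input is assumed to be an already-normalized probability list (after softmax and temperature). It uses `>=` rather than a strict `>` so that the most likely token always survives, even with `min_p = 1.0`.

```python
import random


def min_p_filter(probs, min_p):
    """Zero out tokens whose probability is below min_p * max(probs),
    then renormalize the survivors so they sum to 1.

    Hypothetical helper for illustration only.
    """
    if not 0.0 <= min_p <= 1.0:
        raise ValueError("min_p must be in [0, 1]")
    # The cutoff scales with the top token's probability.
    threshold = min_p * max(probs)
    # Assumption: >= (not >) so the top token is always kept.
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]


def sample_min_p(probs, min_p, rng=random):
    """Draw one token index from the Min-P-filtered distribution."""
    filtered = min_p_filter(probs, min_p)
    return rng.choices(range(len(filtered)), weights=filtered, k=1)[0]


if __name__ == "__main__":
    probs = [0.5, 0.3, 0.15, 0.05]
    # With min_p = 0.2 the cutoff is 0.2 * 0.5 = 0.1, so the last token is dropped.
    print(min_p_filter(probs, 0.2))
    print(sample_min_p(probs, 0.2))
```

Because the cutoff follows the top token, a confident distribution keeps very few candidates while a flat one keeps many, which is the behavior the linked post argues for.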

Metadata

Labels

feature request: New feature or request. This includes new model, dtype, functionality support
triaged: Issue has been triaged by maintainers
