Closed
Description
I was reading a really interesting piece on Reddit about samplers, and one exchange in particular appears to have highlighted a discrepancy in where temperature gets applied: llama.cpp applies it late in the sampler chain (to probabilities), while the research literature and other implementations apply it earlier in the chain (to logits).
I thought it would be unfortunate for this discussion to die without wider visibility, so I've tossed up a GH issue.
I would normally look for historical / closed issues that are related, but I'm on my phone and that's rather complex.
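To make the ordering question concrete, here is a minimal sketch in plain Python. It is not llama.cpp's code; the function names (`temperature_then_top_p`, `top_p_then_temperature`), the toy logits, and the choice of top-p as the truncation step are all assumptions made purely for illustration. It compares applying temperature to the logits before truncation with applying it only after truncation has already picked the candidate set.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_indices(probs, p):
    # Smallest set of most-probable tokens whose cumulative probability >= p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = set(), 0.0
    for i in order:
        kept.add(i)
        cum += probs[i]
        if cum >= p:
            break
    return kept

def temperature_then_top_p(logits, temp, p):
    # Temperature scales the logits first, so it also changes what top-p keeps.
    probs = softmax([x / temp for x in logits])
    keep = top_p_indices(probs, p)
    total = sum(probs[i] for i in keep)
    return [probs[i] / total if i in keep else 0.0 for i in range(len(probs))]

def top_p_then_temperature(logits, temp, p):
    # Top-p is decided on the unscaled distribution; temperature only
    # reshapes the probabilities of the tokens that survived.
    keep = top_p_indices(softmax(logits), p)
    idx = sorted(keep)
    scaled = softmax([logits[i] / temp for i in idx])
    out = [0.0] * len(logits)
    for i, s in zip(idx, scaled):
        out[i] = s
    return out

# Hypothetical toy values, chosen only to make the difference visible.
logits = [2.0, 1.0, 0.5, 0.0]
temp_first = temperature_then_top_p(logits, temp=2.0, p=0.9)
temp_last = top_p_then_temperature(logits, temp=2.0, p=0.9)
print(temp_first)
print(temp_last)
```

With these toy values, the temperature-first path keeps all four tokens, while the temperature-last path drops the least likely one, because the flatter high-temperature distribution needs more tokens to reach the same cumulative mass. If no truncation step sits between the two points, scaling logits by 1/T and raising probabilities to the power 1/T (then renormalizing) give the same distribution, so the practical difference comes from what runs in between.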
The interesting Reddit discussion: