Logprobs incorrectly computed from logits

Hi, thanks for these very useful bindings! It's made playing with llama.cpp much easier :-) 

I noticed that the current implementation of the `logprobs` option in the completion API uses `Llama.logit_to_logprob` to convert the logits reported by llama.cpp into logprobs. This sends each logit $x$ to $\log(e^{x} + 1)$, which is the softplus function. Its output is always positive, so it can never be a valid log-probability.

```python
import math

def logit_to_logprob(x: float) -> float:
    return math.log(1.0 + math.exp(x))
```

However, because the logits produced by LLaMA parameterize a Categorical distribution, I believe we must take their log-`softmax` to get the correct logprobs: we should map each logit $x_i$ to $\log\left(\frac{e^{x_i}}{\sum_{j \in \{1, \dots, V\}} e^{x_j}}\right)$, where $V$ is the vocabulary size. That is, we should take the elementwise `exp` of the entire vector of logits, renormalize so the results sum to 1, and then take the elementwise `log`.
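
To make the proposal concrete, here is a minimal sketch of the log-softmax computation in plain Python. The name `logits_to_logprobs` is hypothetical and not part of the bindings, and I'm assuming the full logit vector for one position is available as a list of floats. It subtracts the maximum logit before exponentiating, which leaves the result unchanged but avoids overflow for large logits.

```python
import math
from typing import List


def logits_to_logprobs(logits: List[float]) -> List[float]:
    """Log-softmax over one full vector of logits.

    Hypothetical helper for illustration; not the library's API.
    """
    # Shift by the max logit so math.exp never overflows.
    max_logit = max(logits)
    # log(sum_j exp(x_j)), computed in a numerically stable way.
    log_sum_exp = max_logit + math.log(
        sum(math.exp(x - max_logit) for x in logits)
    )
    # log(exp(x_i) / sum_j exp(x_j)) = x_i - log_sum_exp
    return [x - log_sum_exp for x in logits]


logprobs = logits_to_logprobs([1.0, 2.0, 3.0])
total = sum(math.exp(lp) for lp in logprobs)  # 1.0 up to rounding
```

Unlike the per-logit function, this needs the whole vector at once, since the normalizer depends on every logit.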
