added failure to answer configuration option explanation by jennm · Pull Request #29119 · DataDog/documentation
Merged · 6 commits · May 13, 2025
2 changes: 2 additions & 0 deletions content/en/llm_observability/configuration/_index.md
@@ -102,6 +102,7 @@ Connect your Amazon Bedrock account to LLM Observability with your AWS Account.
1. Select the span names you would like your evaluation to run on (optional if **Traces** is selected).
1. Optionally, specify the tags you want this evaluation to run on and choose whether to apply the evaluation to spans that match any of the selected tags (Any of), or all of the selected tags (All of).
1. Select what percentage of spans you would like this evaluation to run on by configuring the **sampling percentage**. This number must be greater than 0 and less than or equal to 100. A sampling percentage of 100% means that the evaluation runs on all valid spans, whereas a sampling percentage of 50% means that the evaluation runs on half of valid spans.
1. (Optional) For Failure to Answer, if OpenAI or Azure OpenAI is selected, configure the evaluation by selecting what types of answers should be considered Failure to Answer. This configuration is detailed in [Failure to Answer Configuration][5].
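The sampling-percentage behavior described in the steps above can be sketched as a simple probabilistic filter. This is an illustrative assumption, not Datadog's actual implementation; the function name and logic are hypothetical:

```python
import random

def should_evaluate(sampling_percentage: float) -> bool:
    """Hypothetical sketch: decide whether a single span is included
    in an evaluation, given a sampling percentage in (0, 100]."""
    if not 0 < sampling_percentage <= 100:
        raise ValueError("sampling percentage must be > 0 and <= 100")
    # random.random() is in [0, 1), so at 100% every span passes,
    # and at 50% roughly half of valid spans pass.
    return random.random() * 100 < sampling_percentage
```

At a sampling percentage of 100, every valid span is evaluated; lower values trade evaluation coverage for reduced LLM usage on the connected account.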

After you click **Save**, LLM Observability uses the LLM account you connected to power the evaluation you enabled.

@@ -138,3 +139,4 @@ Topics can contain multiple words and should be as specific and descriptive as possible.
[2]: https://app.datadoghq.com/llm/settings/evaluations
[3]: /llm_observability/terms/#topic-relevancy
[4]: https://app.datadoghq.com/llm/applications
[5]: /llm_observability/terms/#failure-to-answer-configuration
11 changes: 11 additions & 0 deletions content/en/llm_observability/terms/_index.md
@@ -183,6 +183,17 @@ This check identifies instances where the LLM fails to deliver an appropriate response.
|---|---|---|
| Evaluated on Output | Evaluated using LLM | Failure To Answer flags whether each prompt-response pair demonstrates that the LLM application has provided a relevant and satisfactory answer to the user's question. |

##### Failure to Answer Configuration
The types of Failure to Answer are defined below and can be configured when the Failure to Answer evaluation is enabled.

| Configuration Option | Description | Example(s) |
|---|---|---|
| Empty Code Response | An empty code object like an empty list or tuple, signifying no data or results | `()`, `[]`, `{}`, `""`, `''` |
| Empty Response | No meaningful response, returning only whitespace | whitespace |
| No Content Response | An empty output accompanied by a message indicating no content is available | Not found, N/A |
| Redirection Response | Redirects the user to another source or suggests an alternative approach | If you have additional details, I’d be happy to include them |
| Refusal Response | Explicitly declines to provide an answer or to complete the request | Sorry, I can't answer this question |

#### Language Mismatch

This check identifies instances where the LLM generates responses in a different language or dialect than the one used by the user, which can lead to confusion or miscommunication. This check ensures that the LLM's responses are clear, relevant, and appropriate for the user's linguistic preferences and needs.