Improve DocChatAgent citations #477

Closed
pchalasani opened this issue May 22, 2024 · 1 comment

pchalasani (Contributor) commented May 22, 2024

This is an obvious-in-hindsight idea that should have been implemented long ago. It parallels the "sentence-numbering trick" used in the Relevance Extractor.

Currently, DocChatAgent.answer_from_docs(query, passages) (where passages are already relevant extracts from chunks, pulled using the LLM) sends this prompt to the LLM:

Answer the QUERY based on the PASSAGES, and append CITE SOURCES you have used, showing for each 
source, the SOURCE and EXTRACTS, where EXTRACTS should at most contain the first 3 and last 3 words of each extract.

PASSAGES: 
{passages}

QUERY: 
{query}

This results in an LLM response that looks like:

In the year 2050, GPT10 was released. Additionally, all countries merged into Lithuania.

SOURCE: wikipedia
EXTRACTS: In the year ... GPT10 was released.

SOURCE: almanac
EXTRACTS: In the year ... merged into Lithuania.

SOURCE: world history, 2070 edition
EXTRACTS: All countries had  ... back in 2050

There are many issues with this:

  • Having the LLM generate (even partial) extracts is wasteful (token cost), slow, and results in incomplete extracts (since we try to save tokens by generating only the first/last few words of each extract)
  • When the response is long, there may be several references used, but the above scheme results in all the references showing up at the end, rather than more granular references for different parts of the response. So we don't know which parts of the response came from which reference.

This can be much improved by instead doing this:

  • number the passages sent in the prompt: [1]..., [2]..., etc.
  • ask the LLM to cite sources only via markdown footnote notation, like [^1][^3], etc.
  • the code should then extract the full, detailed cited texts and display them (again in markdown footnote syntax) after the LLM generates its answer.

So the idea is to have the LLM generate only granular, numerical citations, and let the code fill in the detailed source text (so we don't spend LLM tokens on this).
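As a rough sketch of the prompt side (function and variable names here are illustrative, not the actual DocChatAgent code), the passages could be numbered and the footnote instruction added like this:

```python
def build_numbered_prompt(query: str, passages: list[str]) -> str:
    """Number passages so the LLM can cite them as [^i] footnotes."""
    numbered = "\n\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return f"""\
Answer the QUERY based on the numbered PASSAGES below.
Cite the passages you used with markdown footnote notation, e.g. [^1] or [^2][^5],
placed immediately after the statement they support.
Do NOT reproduce the passage text itself.

PASSAGES:
{numbered}

QUERY:
{query}
"""
```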

This will result in a response that is much more like a standard footnote or reference format:

In the year 2050, GPT10 was released [^1]. Additionally, all countries merged into Lithuania [^2][^5].

SOURCES:
[^1] wikipedia
    In the year 2050, GPT10 was released.
[^2] almanac
    In the year 2050, all countries merged into Lithuania.
[^5] world history, 2070 edition
    All countries had already become part of Lithuania, back in 2050

Note the granular citations. Also, unlike the existing approach, the cited extracts are complete, not just snippets, and are not generated by the LLM (the code looks them up from the numbered passages using the LLM's numerical citations).
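On the output side, a minimal sketch of the code-side expansion (again with hypothetical helper names, not the actual PR implementation): the numeric footnotes in the LLM answer are parsed with a regex and the full source/extract text is appended, at zero extra LLM token cost.

```python
import re


def render_citations(answer: str, sources: list[str], passages: list[str]) -> str:
    """Expand the LLM's [^i] footnotes into full SOURCE + extract text."""
    cited = sorted({int(m) for m in re.findall(r"\[\^(\d+)\]", answer)})
    footnotes = []
    for i in cited:
        if 1 <= i <= len(passages):  # ignore hallucinated citation numbers
            footnotes.append(f"[^{i}] {sources[i - 1]}\n    {passages[i - 1]}")
    return answer + "\n\nSOURCES:\n" + "\n".join(footnotes)
```

For the example above, the LLM generates only the two short sentences with [^1], [^2], [^5]; the indented extract lines under SOURCES come from the already-retrieved passages, not from the LLM.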

pchalasani added a commit that referenced this issue May 24, 2024
* doc_chat_agent: better citation mechanism

* handle LLM deviations in table_chat_agent.py

* fix rich spinner/streaming edge cases

* lance query planner tweaks to help gpt-4o

* chat_agent.py fix timing of response -> ChatDocument
pchalasani (Contributor, Author) commented:

implemented in PR #476
