
>The victim does not have to be in the public channel for the attack to work

Oh boy this is gonna be good.

>Note also that the citation [1] does not refer to the attacker’s channel. Rather, it only refers to the private channel that the user put their API key in. This is in violation of the correct citation behavior, which is that every message which contributed to an answer should be cited.

I really don't understand why anyone expects LLM citations to be correct. It has always seemed to me like they're more of a human hack, designed to trick the viewer into believing the output is more likely correct, without improving the correctness at all. If anything it seems likely to worsen the response's accuracy, as it adds processing cost/context size/etc.

This all also smells to me like it's inches away from Slack helpfully adding link expansion to the AI responses (I mean, why wouldn't they?)… and then you won't even have to click the link to exfiltrate; it'll happen automatically just by seeing it.



I do find citations helpful because I can check if the LLM just hallucinated.

It's not that seeing a citation makes me trust it, it's that I can fact check it.

Kagi's FastGPT is the first LLM I've enjoyed using because I can treat it as a summary of sources and then confirm at a primary source, rather than sifting through the increasingly irrelevant sources that pollute the internet.


> I really don't understand why anyone expects LLM citations to be correct

It can be done if you do something like:

1. Take the user’s prompt, ask the LLM to convert the prompt into an Elasticsearch query (for example)

2. Use Elasticsearch (or similar) to find sources that contain the keywords

3. Ask the LLM to limit its response to information in those sources

4. Insert the citations based on step 2 which you know are real sources

Or at least that’s my naive way of how I would design it.

The key is limiting the LLM’s knowledge to information in the source. Then the only real concerns are hallucination and the value of the information surfaced by Elasticsearch.

I realize this approach also ignores the benefits (maybe?) of allowing it full rein over the entire corpus of information, though.
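To make the four steps concrete, here's a minimal runnable sketch of that flow. The names (`extract_keywords`, `search_index`, `ask_llm`) and the tiny in-memory corpus are all hypothetical stand-ins: a real system would call an LLM for steps 1 and 3 and Elasticsearch for step 2. The point is that the citations are attached in step 4 from the retrieval results, so they always refer to documents that actually exist:

```python
# Hypothetical sketch of the retrieval-grounded citation flow.
# `search_index` stands in for Elasticsearch and `ask_llm` for the
# model call; both are stubbed so the control flow is runnable.

CORPUS = {
    "doc-101": "Elasticsearch stores documents in inverted indices.",
    "doc-202": "LLMs can hallucinate facts not present in any source.",
}

def extract_keywords(prompt: str) -> str:
    # Step 1: a real system would ask an LLM to rewrite the prompt
    # into a search query; here we just drop a few stopwords.
    stopwords = {"what", "is", "the", "a", "an", "of", "how", "does"}
    return " ".join(w for w in prompt.lower().split() if w not in stopwords)

def search_index(query: str, limit: int = 3) -> list[str]:
    # Step 2: naive keyword match against the corpus; returns doc ids.
    terms = set(query.split())
    return [doc_id for doc_id, text in CORPUS.items()
            if terms & set(text.lower().split())][:limit]

def ask_llm(prompt: str, sources: list[str]) -> str:
    # Step 3: the model would be instructed to answer *only* from
    # `sources`; stubbed as concatenation for this sketch.
    return " ".join(CORPUS[s] for s in sources)

def answer_with_citations(prompt: str) -> str:
    query = extract_keywords(prompt)
    doc_ids = search_index(query)      # real sources, found by us
    answer = ask_llm(prompt, doc_ids)
    # Step 4: citations come from the retrieval step, not the model,
    # so they can't point at documents that don't exist.
    citations = ", ".join(f"[{d}]" for d in doc_ids)
    return f"{answer} ({citations})"
```

Note what this does and doesn't buy you: the citation *list* is trustworthy by construction, but nothing in step 3 stops the model from ignoring the instruction, which is exactly the objection raised below.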


It also doesn't prevent the model from hallucinating something wholesale from the rest of the corpus it was trained on. Sometimes that's a huge source of incorrect results, due to almost-but-not-quite matching public data.

But yes, a complete list of "we fed it this" is useful and relatively trustworthy in ways that "ask the LLM to cite what it used" is absolutely not.


Why would you expect step 3 to work?


That's the neat part: it doesn't.



