I'd say that the fundamental problem is mixing command & data channels. If you remember the early days of dial-up, you could knock anyone off the internet by sending them a ping with "+++ATH0" as the payload: the victim's modem would echo it back and interpret it as a hangup command. That eventually got solved, but it was fun for a while.
We need LLMs to be "gullible," as you say, and follow commands. We don't need them to follow commands embedded in data. At the moment most implementations use the same channel (i.e. text) for both. Once that's solved, these kinds of problems will go away. It's unclear how it will be solved, though...
This is a fundamental problem with these architectures. It's like having a SQL database with no way of handling prepared statements. I have yet to see a solution offered outside of rewriting queries, but that's a whack-a-mole problem.
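To make the analogy concrete, here's a minimal sqlite3 sketch of the difference. In the first query, command and data share one string, so a hostile payload gets parsed as part of the query; in the prepared statement, the payload travels on a separate channel and can only ever be data. (Table and payload are made up for illustration.)

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('alice')")

# Untrusted input that smuggles a command into the data channel.
payload = "x' OR '1'='1"

# Vulnerable: the payload is concatenated into the query text,
# so its quotes and OR clause are parsed as SQL.
vuln = conn.execute(
    f"SELECT name FROM users WHERE name = '{payload}'"
).fetchall()
print(vuln)  # [('alice',)] -- the injected OR matched every row

# Prepared statement: the query is parsed first; the payload is
# bound afterwards as a plain string and matches nothing.
safe = conn.execute(
    "SELECT name FROM users WHERE name = ?", (payload,)
).fetchall()
print(safe)  # []
```

An LLM today has no equivalent of the `?` placeholder: everything arrives pre-concatenated into one token stream.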
Maybe simply turn every input token t into a tensor of shape 2x1, with t[0] holding the original token and t[1] set to 0 or 1 depending on whether it is data or a command. Then train the model and penalize it whenever it acts on instructions from the data channel.