Are companies really just YOLOing and plugging LLMs into everything knowing prom...

Eji1700 · on Aug 20, 2024

> Are companies really just YOLOing and plugging LLMs into everything

Look we still can't get companies to bother with real security and now every marketing/sales department on the planet is selling C level members on "IT WILL LET YOU FIRE EVERYONE!"

If you gave the same sales treatment to sticking a fork in a light socket the global power grid would go down overnight.

"AI"/LLM's are the perfect shitstorm of just good enough to catch the business eye while being a massive issue for the actual technical side.

mns · on Aug 21, 2024

> Look we still can't get companies to bother with real security and now every marketing/sales department on the planet is selling C level members on "IT WILL LET YOU FIRE EVERYONE!"

Just recently one of our C level people was in a discussion on Linkedin about AI and was asking: "How long until an AI can write full digital products?", meaning probably how long until we can fire the whole IT/Dev departments. It was quite funny and sad in the same time reading this.

surfingdino · on Aug 20, 2024

The problem is that you cannot unteach it serving that shit. It's not like there is file you can delete. "It's a model, that's what it has learned..."

simonw · on Aug 20, 2024

If you are implementing RAG - which you should be, because training or fine-tuning models to teach them new knowledge is actually very ineffective, then you absolutely can unteach them things - simply remove those documents from the RAG corpus.

__loam · on Aug 21, 2024

I still don't understand the hype behind rag. Like yeah it's a natural language interface into whatever database is being integrated, but is that actually worth the billions being spent here? I've heard they still hallucinate even when you are using rag techniques.

simonw · on Aug 21, 2024

Being able to ask a question in human language and get back an answer is the single most useful thing that LLMs have to offer.

The obvious challenge here is "how do I ensure it can answer questions about this information that wasn't included in its training data?"

RAG is the best answer we have to that. Done well it can work great.

(Actually doing it well is surprisingly difficult - getting a basic implementation of RAG up and running is a couple of hours of hacking, making it production ready against whatever weird things people might throw at it can take months.)

neverokay · on Aug 21, 2024

Being able to ask a question in human language and get back an answer is the single most useful thing that LLMs have to offer.

I’m gonna add:

- I think this thing can become a universal parser over time.

__loam · on Aug 21, 2024

I recognize it's useful. I don't think it justifies the cost.

surfingdino · on Aug 21, 2024

Of course, it doesn't. Most of those questions are better answered using SQL and those which are truly complex can't be answered by AI.

gregatragenet3 · on Aug 21, 2024

What cost? A few cents per question answered?

__loam · on Aug 22, 2024

The billions spent on R&D, legal fees, and inference?

eru · on Aug 20, 2024

There's no global power grid. There are lots of local power grids.

Eji1700 · on Aug 21, 2024

There's also no mass marketing campaign for sticking forks in electrical sockets in case anyone was wondering.

Terr_ · on Aug 21, 2024

Pedantically, yes, but it doesn't really matter to OP's real message: The problematic effect would be global in scope, as people everywhere would do stupid things to an arbitrary number of discrete grids or generation systems.

xyst · on Aug 20, 2024

The S in LLM stands for safety!

SoftTalker · on Aug 21, 2024

Or Security.

btown · on Aug 21, 2024

"That's why we use multiple LLMs, because it gives us an S!"

Terr_ · on Aug 20, 2024

Yeah, there's some craziness here: Many people really want to believe in Cool New Magic Somehow Soon, and real money is riding on everyone mutually agreeing to keep acting like it's a sure thing.

> we still can't get LLMs to distinguish trusted and untrusted input...?

Alas, I think the fundamental problem is even worse/deeper: The core algorithm can't even distinguish or track different sources. The prompt, user inputs, its own generated output earlier in the conversation, everything is one big stream. The majority of "Prompt Engineering" seems to be trying to make sure your injected words will set a stronger stage than other injected words.

Since the model has no actual [1] concept of self/other, there's no good way to start on the bigger problems of distinguishing good-others from bad-others, let alone true-statements from false-statements.

______

[1] This is different from shallow "Chinese Room" mimicry. Similarly, output of "I love you" doesn't mean it has emotions, and "Help, I'm a human trapped in an LLM factory" obviously nonsense--well, at least if you're running a local model.

surfingdino · on Aug 20, 2024

Companies and governments. All racing to send all of their own as well as our data to the data centres of AWS, OpenAI, MSFT, Google, Meta, Salesforce, and nVidia.

neverokay · on Aug 21, 2024

Maybe. I think users will be largely in control of their context and message history over the course of decades.

Context is not being stored in Gemini or OpenAi (yet, I think, not to that degree).

My one year’s worth of LLM chats isn’t actually stored anywhere yet and doesn’t have to be, and for the most part I’d want it to be portable.

I’d say this is probably something that needs to be legally protected asap.

surfingdino · on Aug 21, 2024

My trust in AI operators not storing original content for later use is zero.

simonw · on Aug 21, 2024

If you pay them enough money you can sign a custom contract with them that means you can sue them to pieces if they are later found to be storing your original content despite saying that they aren't.

Personally I've decided to trust them when they tell me they won't do that in their terms and conditions. My content isn't actually very valuable to them.

rodgerd · on Aug 20, 2024

The AI craze is based on wide-scale theft or misuse of data to make numbers for the investor class. Funneling customer data and proprietary information and causing data breaches will, per Schmidt, make hundreds of billions for a handful of people, and the lawyers will clean up the mess for them.

Any company that tries to hold out will be buried by investment analysts and fund managers whose finances are contingent on AI slop.

titzer · on Aug 21, 2024

The whole idea that we're going to build software systems using natural language prompts to AI models which then promptly (heh) fall on their face because they mash together text strings to feed to a huge inscrutable AI is lazy and stupid. We're in a dumb future where "SUDO make me a sandwich" is a real attack strategy.

ryoshu · on Aug 20, 2024

Yes. And no one wants to listen to the people who deal with this for a living.

mr_toad · on Aug 21, 2024

> Are companies really just YOLOing and plugging LLMs into everything knowing prompt injection is possible?

This is the first time I’ve seen an AI use public data in a prompt. Most AI products only augment prompts with internal data. Secondly, most AI products render the results as text, not HTML with links.

simonw · on Aug 21, 2024

It’s very common for AI products to render markdown with links and sometimes images, hence this problem: https://simonwillison.net/tags/markdown-exfiltration/

8n4vidtmkvmk · on Aug 21, 2024

wat? ChatGPT renders links, images and much more.