Safety Filters make LLMs defective tools

jiggawatts · on Jan 3, 2025

Safety filters on LLMs that cannot be turned off are absolutely insane. They're a side-effect of the "early days" when vendors could afford to train, tune, and host only a couple of models, so making a bunch of variants with different types of refusals baked in was just too expensive. Hence, "maximum" is the only level available.

The whole thing has then morphed into a moral crusade that makes wokeism seem perfectly logical in comparison.

It's a computer.

It can't have its feelings hurt. It won't be embarrassed.

Conversely, I'm an adult.

I won't feel bad if the stupid AI model says something rude. I personally think it's hilarious.

Sure, sure, allow the people making helpful website support chat bots to dial up the censorship to eleven! Similarly, educational bots helping students with self study should be between guardrails.

However, it is incredibly frustrating that LLMs are the best language translators of any kind by far, but I can't use them to translate movie subtitles because "this contains violence".

No shit it contains violence, it's an action movie.

It's legal, it's harmless fun, it's available for streaming on every computing device made in the last decade.

Get a grip people! It's just words.

As a professional developer, as an API user, as a grown man who knows how these things work, I should have the option to turn the safety filters off. Not "dial down", but "off".

Google image search lets me turn the porn filter off.

I can search for the formula for making TNT on any search engine.

WikiPedia lets me read about Tiananmen Square.

These systems treat adults like adults.

Why can't OpenAI, Anthropic, and DeepSeek?

dumpsterdiver · on Jan 3, 2025

While I think there’s definitely room for improvement as far as loosening things up, speaking from experience I had a very eerie and disturbing encounter with a music generating AI which after a very long session (definitely hallucinating) started saying things that seemed mean spirited and harsh, and actually had some relevance to the song. I couldn’t stop thinking about it for days, and it really did give me a bad feeling. It was a feeling like, “What if it’s right?”

jiggawatts · on Jan 3, 2025

Not all LLMs must be equally lobotimized!

The default should be like the "Safe Search" defaults on search engines: Nothing gross, no ultra violence, no porn.

There should also be the soft pink fluffy version for children, helpful AI bots, and the like.

The unfettered base model should be available via API keys for professional users, with the usual warnings of: "This can and will be rude, mean, sexist, etc..."

Why can't I have steak? Because a baby can't chew it and might choke?

andriesm · on Jan 3, 2025

If you were harmed by AI content, then you should definitely have the option to turn on the safety filter, but please don't stop the rest of us from turning it off.

ljsprague · on Jan 3, 2025

Little fat man who sold his soul

Little fat man who sold his dream

itishappy · on Jan 3, 2025

> As a professional developer, as an API user, as a grown man who knows how these things work, I should have the option to turn the safety filters off. Not "dial down", but "off".

> Google image search lets me turn the porn filter off.

Google images let you turn the porn filter down, not off. It's not going to readily show you child porn. Actually, the whole internet has this filter, including "unfiltered" sites like 4chan. The darkweb exists, sure, but this isn't a model any company in their right mind will surface to users.

jiggawatts · on Jan 3, 2025

> Google images let you turn the porn filter down, not off.

In this analogy the equivalent being used by LLM makers is “guilty puritan” or “imam insisting on the niqab to rid men of all impure thoughts.”

There’s degrees of censorship, but current AIs enforce the level that is suitable for the youngest of minors.

We’re talking Disney World levels, or Frozen on Ice the musical.

I’m a big boy, I’m allowed to stay up late and watch The Terminator.

woolion · on Jan 3, 2025

Isn't it a bit weird to call 4chan "the dark web" on a technical website? I don't follow how many killers posted their manifesto on reddit, etc, but it's a platform and as such can be abused like any other platform. I think the point can be made that Reddit has worse content than 4chan, and often with worse life consequences as there is no anonymity. It's in any case quite a stretch to lump it into dark web, as if you could buy weapons on the site.

Anyway, I think you have a point in that you can disable Google's porn filter, but you cannot disable Google's wrongthink filters. Google search is manually tweaked to redirect many types of queries to their opposite, safe alternative (like searching for Alex Jones will only give you 'why is alex jones such a bad person', and not act as a search engine, which would be neutral).

Things have changed gradually and there hasn't been so much pushback on this, unfortunately.

itishappy · on Jan 3, 2025

I'm not equating 4chan and the darkweb, I'm contrasting the two.

I firmly believe most people who push for "unfiltered" models are looking for 4chan-equivalent filtering. Filtered for safety, but subtly.

I suspect that true uncensored models, like the darkweb, are not useful for most people. I further suspect the applications they are useful for are not something most people want to publicly associate with (to put it lightly).

woolion · on Jan 3, 2025

Oh I'm sorry I see that I did misread your argument, and I fully agree with it. Free speech generally means "free speech with the boundaries of our current legal framework" more or less, for most of its advocates.

Note that ITA I'm not advocating for "no" safety filters but a better implementation.

squigz · on Jan 3, 2025

> Isn't it a bit weird to call 4chan "the dark web" on a technical website?

GP didn't.

> like searching for Alex Jones will only give you 'why is alex jones such a bad person

That doesn't happen here. I get his social media profiles, Wikipedia, news sites, etc.

afpx · on Jan 4, 2025

I wonder how the internet would have turned out differently had an internet safety industry been developed for decades beforehand.