I wouldn't be surprised if this ends up like Prop 65 cancer warnings, or cookie banners. The intention might be to separate believable but low-quality hallucinated AI content spam from high-quality manual content. But it will backfire like Prop 65. You'll see notices everywhere because increasingly AI will be used in all parts of the content creation pipeline.
I see YouTube's own guidelines in the article and they seem reasonable. But I think over time the line will move, become unclear, and we'll end up like Prop 65 anyway.
The Prop 65 warnings are probably unhelpful even when accurate because they don't show anything about the level of risk or how typical or atypical it is for a given context. (I'm thinking especially about warnings on buildings more than on food products, although the same problem exists to some degree for food.)
It's very possible that Prop 65 has motivated some businesses to avoid using toxic chemicals, but it doesn't often help individuals make effective health decisions.
While you may think it didn't have an effect, a recent 99pi episode covered it, and it sounds like it has definitely motivated many companies to remove chemicals from their products.
If it's something you've bought recently, the offending ingredient should be listed. Otherwise, my money would be on lead being used as a plasticizer. Either way, at least you have the tools to find out now.
Like, is it one of those things where they remove a one-in-a-billion chance of cancer and now have a product that wears out twice as fast, leading to a doubling of sales?
Prop 65 is also way too broad. It needs to be specific about what carcinogens you’re being exposed to and not just “it’s a parking garage and this is our legally mandated sign”
Seems to still be pretty pointless, considering that roads, parking lots, and garages are all to be avoided if you want to avoid exposure… just stay away from any of those.
The "sponsored content" tag on youtube seems to work very well though. Most content creators don't want to label their videos sponsored unless they are, I assume the same goes for AI generated content flags. Why would a manual content creator want to add that?
The "Sponsored Content" tag on a channel should link to a video of face / voice of the channel talking about what sponsored content means in a way that's FTC compliant.
That would be either poor understanding or poor enforcement of the rule, since they specifically list stuff like special effects, beauty filters, etc. as allowed.
A more plausible scenario would be if you aren't sure whether all your stock footage is real. Though with YouTube creators being one of the biggest groups of customers for stock footage, I expect most providers will put very clear labeling in place.
That's a much clearer line, though: it's much simpler to know whether you were paid to create a piece of content or not. Use of AI isn't, especially if it's buried deep in some tool you used.
Does blurring part of the image with Photoshop count? What if Photoshop used AI behind the scenes for whatever filter you applied? What about some video editor feature that helps with audio/video synchronization or background removal?
This is a problem of provenance (as it's known in the art world), and being certain of the provenance is a difficult thing to do - it's like converting a cowboy-coded C++ project to consistently using const... you need to dig deep into every corner and prefer dependencies that obey proper const usage. Doing that as an individual content creator would be extremely daunting - but this isn't about individuals.

If Getty has a policy against AI and guarantees no AI generation on their platform while Shutterstock doesn't[1], then creators may end up preferring Getty so that they can label their otherwise AI-free content as such on YouTube - maybe it gets incorporated into the algorithm and gets them more views, maybe it's just a moral thing... If there's market pressure, then the down-the-chain people will start getting stricter and, especially if one of those intermediary stock providers violates an agreement and gets hit with a lawsuit, we might see a more concerted movement to crack down on AI generation.
At the end of the day it's going to be drenched in contracts and obscure proofs of trust - i.e. some signing cert you can attach to an image if it was generated on an entirely controlled environment that prohibits known AI generation techniques - that technical side is going to be an arms race and I don't know if we can win it (which may just result in small creators being bullied out of the market)... but above the technical level I think we've already got all the tools we need.
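To make that concrete, here's a minimal sketch of the kind of attestation I'm imagining - it assumes the pyca/cryptography package, an Ed25519 key held by the controlled rendering environment, and verifiers who already trust the matching public key; the function names are made up for illustration, not taken from any real provenance standard:

```python
import hashlib

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey


def sign_asset(asset_bytes: bytes, key: Ed25519PrivateKey) -> bytes:
    """Sign a hash of the asset produced inside the controlled environment."""
    return key.sign(hashlib.sha256(asset_bytes).digest())


def verify_asset(asset_bytes: bytes, signature: bytes, public_key) -> bool:
    """Check that the asset still matches what the environment attested to."""
    try:
        public_key.verify(signature, hashlib.sha256(asset_bytes).digest())
        return True
    except InvalidSignature:
        return False


# Usage sketch: in reality the key would be issued to the environment, not generated ad hoc.
key = Ed25519PrivateKey.generate()
frame = b"...rendered frame bytes..."
sig = sign_asset(frame, key)
print(verify_asset(frame, sig, key.public_key()))             # True
print(verify_asset(frame + b"edit", sig, key.public_key()))   # False - chain of custody broken
```

The crypto is the easy part; the hard part is trusting that whoever holds the signing key really does prohibit AI generation in their pipeline - which is exactly where the contracts and lawsuits come back in.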
You may be interested in the Content Authenticity Initiative’s Content Credentials. The idea seems to be to keep a more-or-less-tamperproof provenance of changes to an image from the moment the light hits the camera’s sensor.
It sounds like the idea is to normalize the use of such an attribution trail in the media industry, so that eventually audiences could start to be suspicious of images lacking attribution.
Adobe in particular seems to be interested in making GenAI-enabled features of its tools automatically apply a Content Credential indicating their use, and in making it easier to keep the content attribution metadata than to strip it out.
Maybe this could motivate toolmakers to label their own products as "Uses AI" or "AI Free", allowing content creators to verify that their entire toolchain is AI Free.
As opposed to today, where companies are doing everything they can, stretching the truth, just so they can market their tools as “Using AI.”
You can't use them - other tools that match most of the functionality without including AI will emerge and take over the market, if this is an important thing to people... alternatively Adobe wises up and rolls back the AI stuff or isolates it into consumer-level-only features that mark images as tainted.
This is a great point and I don’t know. We are entering a strange and seemingly totally untrustworthy world. I wouldn’t want to have to litigate all this.
This is depressing, we’re going to intentionally use worse tools to avoid some idiotic scare label. Basically the entire GMO or “artificial flavor” debates all over again.
If you edit this image by hand you're good, but if you use a tool that "uses AI" to do it, you need to put the scare label on. Even if pixel-for-pixel both methods output the identical image! Just as GMO/non-GMO has no correlation to harmful compounds being in the food, and artificial flavors are generally more pure than those extracted from a "natural" item by some wacky and more expensive means.
> To be effective, warnings like this have to be MANDATED on the item in question, and FORBIDDEN when not present.
I think for it to be effective you'd have to require them to provide an itemized list of WHAT is AI generated. Otherwise, what if a content creator has a GenAI logo or feature in every video and just puts a lazy blanket disclaimer on all of them?
> (This post may have been generated by AI; this notice in compliance with AI notification complications.)
For something like YouTube, you could have the video's progress bar be a different color for the AI sections. Maybe three: real, unknown, AI. Without an "unknown" type tag, you wouldn't be able to safely use clips.
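Roughly something like this, as a sketch - the three labels, the field names, and the colors are all invented for the example, not anything YouTube has proposed:

```python
from dataclasses import dataclass

# Hypothetical labels and colors for a per-section provenance bar.
LABEL_COLORS = {"real": "green", "unknown": "grey", "ai": "red"}


@dataclass
class Segment:
    start_s: float  # segment start, seconds
    end_s: float    # segment end, seconds
    label: str      # "real", "unknown", or "ai"


def progress_bar_colors(segments: list[Segment], duration_s: float) -> list[tuple[float, str]]:
    """Map each labeled segment to (fraction of the bar's width, color)."""
    return [((s.end_s - s.start_s) / duration_s, LABEL_COLORS[s.label]) for s in segments]


# A 60s video: 0-40s filmed, 40-50s stock clip of unknown origin, 50-60s generated.
print(progress_bar_colors(
    [Segment(0, 40, "real"), Segment(40, 50, "unknown"), Segment(50, 60, "ai")], 60))
# [(0.666..., 'green'), (0.166..., 'grey'), (0.166..., 'red')]
```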
This will make AI the new sesame allergen [1] — if you aren't 100% certain every asset you use isn't AI-generated, then it makes sense to stick some AI-generated content in and label the video accordingly, just to stay safely compliant.
Wow. This is an awesome education on why you can’t just regulate the world into what you want it to be without regard to feasibility. I’m sure the few who are allergic are mad, but it would also be messed up to just ban all “allergens” across the board - which is the only effective and fair way to guarantee that this approach couldn’t ever be used to comply with these laws. There isn’t much out there that somebody isn’t allergic to or intolerant of.
>would also be messed up to just ban all “allergens” across the board -
Lol, this sounds like one of those fables where an idiot king bans all allergens, then a week later everyone in the kingdom is starving to death because it turns out that in a large enough population there will be enough different allergies that everything gets banned.
> To be effective, warnings like this have to be MANDATED on the item in question, and FORBIDDEN when not present.
That already happens for foods.
The solution for suppliers is to intentionally add small quantities of allergens (sesame). [1] By having that as an actual ingredient, manufacturers don't have to worry about whether or not there is cross contamination while processing.
How much AI is enough to warrant it, though? Like, is human motion-capture-based content AI or human? How about automatic touch-up makeup? At what point does touch-up become face swap?
I’ve found Prop 65 warnings to be useful. They’re not pervasively everywhere; but when I see a Prop 65 warning, I consciously try to pick a product without it.
> You'll see notices everywhere because increasingly AI will be used in all parts of the content creation pipeline.
Which would be OK with me, personally. Right now, those cookie banners do serve a valuable function for me -- when I see them, I know to treat the site with caution and skepticism. If AI warnings end up similar, they too will serve a similar purpose. It's all better than nothing.
You put Prop 65 as backfiring, but it looks to me like the original intent was reducing toxic products in, for instance, tap water, and it largely achieved that goal.
From there warnings proliferated on so many more products, but getting told that chocolate bars can cause cancer is still a reasonable tradeoff. Especially as nothing is stopping the law from getting tweaked from there.
Comparing it to Prop 65 or GDPR makes it look like a probably deeply effective, if slightly annoying, rule... I sure hope that's what we end up with.
The ePrivacy directive and GDPR don't literally require cookie banners, but the former requires disclosure of specific information and the latter requires consent for most forms of data collection and processing. Even the 2002 directive actually requires an option to refuse cookies, which many cookie banners still fail to implement properly post-GDPR.
The problem is that most websites want to start collecting, tracking and processing data that requires consent before any interaction takes place that would allow for a contextual opt-in. This means they have to get that consent somehow and the "cookie banner" or consent dialog serves that purpose.
Of course many (especially American) implementations get this hilariously wrong by:

a) collecting and processing data even before consent is established,

b) not making opt-out as trivial as opt-in despite the ePrivacy directive explicitly requiring this (e.g. hiding "refuse" behind a "more info" button or not giving it the same weight as "accept all"),

c) not actually specifying the details on what data is collected etc. to the level required by the directive,

d) not providing any way to revise/change the selections (especially withdrawing consent previously given), and

e) trying to trick users with a manual opt-out checkbox per advertiser/service labeled "legitimate interest", which is an alternative to consent and thus is not something you can opt out of because it does not require consent (but of course in these cases the use never actually qualifies as "legitimate interest" to begin with and the opt-out is a poorly constructed CYA).
In a different world, consent dialogs could work entirely like mobile app permissions: if you haven't given consent for something you'll be prompted when it becomes relevant. But apparently most sites bank on users pressing "accept all" to get rid of the annoying banner - although of course legally they probably don't even have data to determine if this gamble works for them because most analytics requires consent (i.e. your analytics will show a near 100% acceptance rate because you only see the data of users who opted into analytics and they likely just pressed "accept all").
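As a rough sketch of that "prompt when it becomes relevant" model - the purpose name, the in-memory store, and the two placeholder functions are invented for illustration, not any real consent-management API:

```python
from enum import Enum


class Consent(Enum):
    GRANTED = "granted"
    DENIED = "denied"
    UNSET = "unset"   # never asked for this purpose yet


# Per-user, per-purpose decisions; a real site would persist this server-side or in a cookie.
consent_store: dict[str, Consent] = {}


def prompt_user_for_consent(purpose: str) -> None:
    # Placeholder: show a small, purpose-specific dialog only at this moment.
    print(f"May we use your data for {purpose}? (accept / refuse)")


def send_to_analytics_backend(event: dict) -> None:
    # Placeholder for the actual analytics call.
    print("analytics event:", event)


def record_analytics_event(event: dict) -> None:
    decision = consent_store.get("analytics", Consent.UNSET)
    if decision is Consent.UNSET:
        # Ask only when analytics would actually run, not via an up-front
        # banner that bundles every purpose together.
        prompt_user_for_consent("analytics")
    elif decision is Consent.GRANTED:
        send_to_analytics_backend(event)
    # DENIED: drop the event entirely; nothing is collected.


record_analytics_event({"page": "/article", "action": "view"})
```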
Am I the only one who is bothered by calling this phenomenon "hallucinating"?
It's marketing-speak and corporate buzzwords to cover for the fact that their LLMs often produce wrong information because they aren't capable of understanding your request or its nuance, or the training data they used is wrong, or the model just plain sucks.
Would we tolerate such doublespeak if it were anything else? "Well, you ordered a side of fries with your burger but because our wait staff made a mistake... sorry, hallucinated, they brought you a peanut butter sandwich that's growing mold instead."
It gets more concerning when the stakes are raised. When LLMs (inevitably) start getting used in more important contexts, like healthcare: "I know your file says you're allergic to penicillin and you repeated it when talking to our AI doctor, but it hallucinated that you weren't."
Human beings regularly hallucinate details that aren’t real when asked to provide their memories of an event, and often don’t realize they’re doing it at all. So while AI definitely is lacking in the “can assess fact versus fiction” department, that’s an overlapping problem with “invents things that aren’t actually real”. It can, today, hallucinate accurate and inaccurate information, but it can’t determine validity at all, so it’s sometimes wrong even when not hallucinating.
I can't stand it being called "hallucinating" because it anthropomorphizes the technology. This isn't a consciousness that is "seeing" things that don't exist: it's a word generator that is generating words that don't make sense (not in a syntactic sense, but in a semantic sense).
Calling it "hallucination" implies that there are (other) moments when it is understanding the world correctly -- and that itself is not true. At those moments, it is a word generator that is generating words that DO make sense.
At no point is this a consciousness, and anthropomorphizing it gives the impression that it is one.
It isn't an error, either. It's doing exactly what it's intended to, exactly as it's intended to do it. The error is in the human assumption that the ability to construct syntactically coherent language signals self-awareness or sentience. That it should be capable of understanding the semantics correctly, because humans obviously can.
There really is no correct word to describe what's happening, because LLMs are effectively philosophical zombies. We have no metaphors for an entity that can appear to hold a coherent conversation, do useful work and respond to commands but not think. All we have is metaphors from human behavior which presume the connection between language and intellect, because that's all we know. Unfortunately we also have nearly a century of pop culture telling us "AI" is like Data from Star Trek, perfectly logical, superintelligent and always correct.
And "hallucination" is good enough. It gets the point across, that these things can't be trusted. "Confabulation" would be better, but fewer people know it, and it's more important to communicate the untrustworthy nature of LLMs to the masses than it is to be technically precise.
Calling it an error implies the model should be expected to be correct, the way a calculator should be expected to be correct. It generates syntactically correct language, and that's all it does. There is no "calculation" involved, so the concept of an "error" is meaningless - the sentences it creates either only happen to correlate to truth, or not, but it's coincidence either way.
> Calling it an error implies the model should be expected to be correct
To a degree, people do expect the output to be correct. But in my view, that's orthogonal to the use of the term "error" in this sense.
If an LLM says something that's not true, that's an erroneous statement. Whether or not the LLM is intended or expected to produce accurate output isn't relevant to that at all. It's in error nonetheless, and calling it that rather than "hallucination" is much more accurate.
After all, when people say things that are in error, we don't say they're "hallucinating". We say they're wrong.
> It generates syntactically correct language, and that's all it does.
Yes indeed. I think where we're misunderstanding each other is that I'm not talking about whether or not the LLM is functioning correctly (that's why I wouldn't call it a "bug"), I'm talking about whether or not factual statements it produces are correct.
It's a language model, trained on syntactically correct code, with a data set which presumably contains more correct examples of code than not, so it isn't surprising that it can generate syntactically correct code, or even code which correlates to valid solutions.
But if it actually had insight and knowledge about the code it generated, it would never generate random, useless (but syntactically correct) code, nor would it copy code verbatim, including comments and license text.
It's a hell of a trick, but a trick is what it is. The fact that you can adjust the randomness in a query should give it away. It's de rigueur around here to equate everything a human does with everything an LLM does, including mistakes, but human programmers don't make mistakes the way LLMs do, and human programmers don't come with temperature sliders.
It wouldn't be surprising if it generated syntactically correct code that does random things.
The fact that it instead generates syntactically correct code that, more often than not, solves - or at least tries to solve - the problem that is posited, indicates that there is a "there" there, however much one talks about stochastic parrots and such.
As for temperature sliders for humans, that's what drugs are in many ways.
> Would we tolerate such doublespeak if it were anything else?
Yes: identity theft. My identity wasn't "stolen", what really happened was a company gave a bad loan.
But calling it identity theft shifts the blame. Now it's my job to keep my data "safe", not their job to make sure they're giving the right person the loan.
I don't get this at all. "Hallucinate" to me can only mean "produce false information". I've only ever seen it used pejoratively re: AI, and I don't understand what it covers up - how else are people interpreting it? I could see the point if you were saying that it implies sentience that isn't there, but your analogy to a restaurant implies that's not what you're getting at.
I think people are much more conservative with their health than text generation. If the text looks funky, you can just try regenerating it, or write it yourself and have only lost a few minutes. If your health starts looking funky, you're kind of screwed.
To me it sounds pretty damning. "The tool hallucinates" makes me think it's completely out of touch with reality, spouting nonsense. While "It has made a mistake, it is factually incorrect" would apply to many of my comments if taken very literally.
Webster definition: "a sensory perception (such as a visual image or a sound) that occurs in the absence of an actual external stimulus and usually arises from neurological disturbance (such as that associated with delirium tremens, schizophrenia, Parkinson's disease, or narcolepsy) or in response to drugs (such as LSD or phencyclidine)".
I would fire with prejudice any marketing department that associated our product with "delirium tremens, schizophrenia, [...] LSD or phencyclidine".
Nonsense. It isn't marketing speak to cover for anything. It's a pretty good description of what is happening.
The reason models hallucinate is because we train them to produce linguistically plausible output, which usually overlaps well with factually correct output (because it wouldn't be plausible to say e.g. "Barack Obama is white"). But when there isn't much data to show that something that is totally made up is implausible then there's no penalty to the model for it.
It's nothing to do with not being able to understand your request, and it's rarely because the training data is wrong - see the toy sketch below.
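A toy sketch of that "plausible, not necessarily true" point - the candidates and logits are entirely made up, and this isn't any real model, just ordinary softmax sampling (the "temperature slider" mentioned upthread is nothing more than the divisor here):

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn arbitrary scores into a probability distribution."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical next-token candidates after "The study was published in ..."
candidates = ["Nature", "Science", "the Journal of Imaginary Results"]
logits = [2.1, 1.9, 1.4]   # all three look linguistically plausible; none encodes truth

probs = softmax(logits, temperature=0.8)
print(list(zip(candidates, [round(p, 2) for p in probs])))
print(random.choices(candidates, weights=probs, k=1)[0])  # sometimes picks the made-up journal
```

The sampler happily emits the made-up journal some of the time, not because anything went wrong mechanically, but because nothing in the process checks truth - only relative plausibility.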
So if I replied to your comment with "you are incorrect" I would be putting you in a worse light than saying "you are hallucinating"? The second is making it sound better? Doesn't feel that way to me.
My problem with "hallucination" isn't that it makes error sound better or worse, it's that it makes it sound like there's a consciousness involved when there isn't.
> Also those two statements are not mutually exclusive.
> Errors in statistical models being called hallucinations in the past does not mean that term is not marketing speak for what I said earlier.
The implicit claim was that they call this hallucination because it sounds better. In other words that some marketing people thought "what's a nicer word for 'mistakes'?" That is categorically untrue.
I don't think there's any point arguing about whether or not the marketers like the use of the word "hallucinate" because neither of us has any evidence either way. Though I would also say the null hypothesis is that they're just using the standard word for it. So the onus is on you to provide some evidence that marketers came in and said "guys, make sure you say 'hallucinate'". Which I'm 99% sure has never happened.
It's a term of art from the days of image recognition AI that would confidently report seeing a giraffe while looking at a picture of an ambulance.
It doesn't feel right to me either, to use it in the context of generative AI, and I'd support renaming this behaviour in GenAI (text and images both) — though myself I'd call this behaviour "mis-remembering".
Edit: apparently some have suggested "delusion". That also works for me.
AI can already create photo-realistic images, and the old "look at the hands" rule doesn't really work on images generated with modern models.
There may be a few tells still, but those won't last long, and the moment someone can find a new pattern you can make that a negative prompt for new images to avoid repeating the same mistake.
I think we are already there, and it only seems like we aren't because many people are using free, low-quality models with a low number of steps, since that's more accessible.
Yes. Nearly all EU regulations are going to end up like that. Over-regulate and people develop blindness to regulations. Our best hope right now is that the EU becomes more and more irrelevant as the gap between the US and the EU grows to the point where American companies can simply bankroll the EU leaders.