

The parents weren’t paying attention to their obviously disturbed kid and they left a gun lying around for him to find. But sure, it was the chatbot that was the problem. Everything would have been perfectly fine forever without it.
Yes, that’s what I said. There are no “additional restrictions” from having a GPL license on something. The GPL works by granting rights that weren’t already present under default copyright. You can reject the GPL on an open-source piece of software if you want to, but then you lose the additional rights that the GPL gives you.
I’d say it can be a problem because there have been examples of getting AIs to spit out entire copyrighted passages.
Examples that have turned out to either be a result of great effort to force the output to be a copy, a result of poor training techniques that result in overfitting, or both combined.
If this is really such a straightforward case of copyright violation, surely there are court cases where it’s been ruled to be so? People keep arguing legality without ever referencing case law, just news articles.
Furthermore, some works can have additional restrictions on their use. I couldn’t for example train an AI on Linux source code, have it spit out the exact source code, then slap my own proprietary commercial license on it to bypass GPL.
That’s literally still just copyright. There’s no “additional restrictions” at play here.
Learning what a character looks like is not a copyright violation. I’m not a great artist but I could probably draw a picture that’s recognizably Mario, does that mean my brain is a violation of copyright somehow?
Yet evidence supports it, while you have presented none to support your claims.
I presented some, you actually referenced what I presented in the very comment where you’re saying I presented none.
You can actually support your case very simply and easily. Just find the case law where AI training has been ruled a copyright violation. It’s been a couple of years now (as evidenced by the age of that news article you dug up), yet all the lawsuits are languishing or defunct.
Very basically, yes. But the result is a model that doesn’t actually contain the training data, it’s too small for it to be physically possible.
Sure. But that’s not what’s happening when an AI is trained. It’s not “stealing” the script or content of the video, it’s analyzing them.
That article is over a year old. The NYT case against OpenAI turned out to be quite flimsy, their evidence was heavily massaged. What they did was pick an article of theirs that was widely copied across the Internet (and thus likely to be “overfit”, a flaw in training that AI trainers actively avoid nowadays) and then they’d give ChatGPT the first 90% of the article and tell it to complete the rest. They tried over and over again until eventually something that closely resembled the remaining 10% came out, at which point they took a snapshot and went “aha, copyright violated!”
They had to spend a lot of effort to get that flimsy case. It likely wouldn’t work on a modern AI, training techniques are much better now. Overfitting is better avoided and synthetic data is used.
Why do you think that of all the observable patterns, the AI will specifically copy “ideas” and “styles” but never copyrighted works of art?
Because it’s literally physically impossible. The classic example is Stable Diffusion 1.5, which had a model size of around 4GB and was trained on over 5 billion images (the LAION-5B dataset). If it were actually storing the images it was trained on, it would be compressing each of them down to under one byte of data.
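The arithmetic behind that claim can be checked in a couple of lines, using the figures as cited in the comment above (a ~4 GB checkpoint, ~5 billion LAION-5B training images):

```python
# Back-of-the-envelope check using the figures cited above.
model_bytes = 4 * 1024**3        # ~4 GB Stable Diffusion 1.5 checkpoint
num_images = 5_000_000_000       # ~5 billion images in LAION-5B

bytes_per_image = model_bytes / num_images
print(f"{bytes_per_image:.3f} bytes per image")  # → 0.859 bytes per image
```

Less than one byte per image, i.e. far less than even a single pixel, so verbatim storage of the training set in the weights is arithmetically ruled out.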
AIs don’t seem to be able to distinguish between abstract ideas like “plumbers fix pipes” and specific copyright-protected works of art.
This is simply incorrect.
This is the Daenerys case, for some reason it seems to be suddenly making the rounds again. Most of the news articles I’ve seen about it leave out a bunch of significant details, so it ends up sounding more like an “ooh, scary AI!” story (baits clicks better) than a “parents not paying attention to their disturbed kid’s cries for help and instead leaving loaded weapons lying around” story (as old as time, at least in America).
Don’t make “profiteering AI companies” pay for UBI. Make all companies pay for UBI. Just tax their income and turn it around into UBI payments.
One of the major benefits of UBI is how simple it is. The simpler the system is the harder it is to game it. If you put a bunch of caveats on which companies pay more or pay less based on various factors, then there’ll be tons of faffing about to dodge those taxes.
Copyright, yes it’s a problem and should be fixed.
No, this is just playing into another of the common anti-AI fallacies.
Training an AI does not do anything that copyright is even involved with, let alone prohibited by. Copyright is solely concerned with the copying of specific expressions of ideas, not about the ideas themselves. When an AI trains on data it isn’t copying the data, the model doesn’t “contain” the training data in any meaningful sense. And the output of the AI is even further removed.
People who insist that AI training is violating copyright are advocating for ideas and styles to be covered by copyright. Or rather by some other entirely new type of IP protection, since as I said this is nothing at all like what copyright already deals with. This would be an utterly terrible thing for culture and free expression in general if it were to come to pass.
I get where this impulse comes from. Modern society has instilled a general sense that everything has to be “owned” by someone, even completely abstract things. Everyone thinks they’re owed payment for everything they can possibly demand payment for, even if it’s something that just yesterday they were doing purely for fun and releasing to the world without a care. There’s this base impulse of “mine! Therefore I must control it!” Ironically, it’s what leads to the very capitalist hellscape so many people are decrying at the same time they demand more of it.
Bots are capable of simulating offense perfectly well.
You don’t see how one leads directly to the other? Full grown adults are the users of those corporations’ products. If the corporations aren’t allowed to put certain features in those products then that’s the same as prohibiting their users from using those features.
Imagine if there was a government regulation that prohibited the sale of cars with red paint on them. They’re not prohibiting an individual person from owning a car with red paint, they’re not prohibiting individuals from painting their own cars red, but don’t you think that’ll make it a lot harder for individuals to get red cars if they want them?
You’re acting as if the bot had some sort of intention to help him.
No I’m not. I’m describing what actually happened. It doesn’t matter what the bot’s “intentions” were.
The larger picture here is that these news articles are misrepresenting the events they’re reporting on by omitting significant details.
And adults too. When you combine “the law says you can’t offer this service to children or we’ll destroy you” with “there’s no way to reliably tell if the people we’re offering this service to are children” the result is “guess we can’t offer this service to anyone.”
Be that as it may, this particular instance is much more complicated and extreme than the “average”, and so it makes a poor basis for arguing anything in particular. The details of this specific situation don’t back up a simple interpretation.
I would recommend using studies by psychologists as a better basis.
Or how about parents regulate their children, so that we don’t have government nannies telling full grown adults what they’re allowed to do with chatbots?
Ah, this is that Daenerys bot story again? It keeps making the rounds, always leaving out a lot of rather important information.
The bot actually talked him out of suicide multiple times. The kid was seriously disturbed and his parents were not paying the attention they should have been to his situation. The final chat before he committed suicide was very metaphorical, with the kid saying he wanted to “join” Daenerys in Westeros (or wherever it is she lives), and the AI missed the metaphor and roleplayed Daenerys saying “sure, come on over” (because it’s a roleplaying bot and it’s doing its job).
This is like those journalists that ask ChatGPT “if you were a scary robot how would you exterminate humanity?” and ChatGPT says “well, poisonous gases with traces of lead, I guess?” and the journalists go “gasp, scary robot!”
Horses are incredibly expensive to house and maintain. I wouldn’t bet on this being more expensive; when you’re not using it you can just stash it in your garage for as long as you want without having to worry about it.
Given how dramatically LLMs have improved over the past couple of years I think it’s pretty clear at this point that AI trainers do know something of what they’re doing and aren’t just randomly stumbling around.
I am not. The only thing I’ve been claiming is that AI training is not copyright violation, and the AI model itself is not copyright violation.
As an analogy, you can use Photoshop to draw a picture of Mario. That does not mean that Photoshop is violating copyright by existing, and Adobe is not violating copyright by having created Photoshop.
I have no idea what this means.
I’m saying that the act of training an AI does not perform any actions that are within the realm of the actions that copyright could actually say anything about. It’s like if there’s a law against walking your dog without a leash, and someone asks “but does it cover aircraft pilots’ licenses?” No, it doesn’t, because there’s absolutely no commonality between the two subjects. It’s nonsensical.
I’m pretty sure you’re misinterpreting my position.
The “copyright situation” regarding an actual literal picture of Mario doesn’t need to be fixed because it’s already quite clear. There’s nothing that needs to change to make an AI-generated image of Mario count as a copyright violation, that’s what the law already says and AI’s involvement is irrelevant.
When people talk about needing to “change copyright” they’re talking about making something that wasn’t illegal previously into something that is illegal after the change. That’s presumably the act of training or running an AI model. What else could they be talking about?