Training data can be used "regardless of whether it is for non-profit or commercial purposes, whether it is an act other than reproduction, or whether it is content obtained from illegal sites or otherwise."
The problem is that the AI can print the book word for word if you ask the right questions, and at that point it’s breaking copyright again. But that’s not a problem with the learning part; it’s a problem with how AI has no concept of context at all.
I was skeptical of this, but it checks out: I easily got ChatGPT to print out the full text of The Tell-Tale Heart, without any errors at all in the various spots I checked for accuracy.
Granted, I chose it because it’s a very short public domain work - I was more skeptical of its technical ability to recall the exact text without errors than of the ability to trick it into violating copyright law.
I still suspect it’s much easier to (accidentally) trick it into writing a fanfiction of a copyrighted work that it claims is the original than it is to get it to produce the true original, though.
Your argument that it is useful as a copyright-infringing machine is that it can reproduce a public domain work? That’s… not the argument you think it is.
My message was pretty clear about which part of their claim I was skeptical about and what I was testing for. It’s not what you described here.
You can easily get Photoshop to reproduce one of Mondrian’s paintings. That’s on the user, not the tool. I fail to see why the same doesn’t apply to the tool of generative AI.
You can’t easily tell it to replicate any painting for you. With current AI, you can do that with almost any book it was trained on.
A human being with really good/photographic memory can do the same.
Just like a person with really good memory can. So what? Nobody is actually printing 300 page books that way when we can use libgen or any other source instead.