Do they think the hands-off treatment that giant corporations that basically print money get is going to somehow “trickle down” to them, too?
Because last I checked, the guys who ran Jetflicks are facing jail time. Like, potentially longer jail time than most murder sentences.
…but letting OpenAI essentially do the same without consequences will mean Open Source AI people will somehow get the same hands-off treatment? That just reeks of bullshit to me.
I just don’t fucking buy it. Letting massive corporations skirt IP laws while everyone else gets fucked hard by those same IP laws just doesn’t seem like the best hill to die on, yet plenty of people who are anti-copyright/anti-IP laws are dying on this fucking hill.
What gives?
I am personally of the opinion that current IP/copyright laws are draconian, but that IP/copyright isn’t inherently a bad thing. I just know, based on previous history in the US, that letting the Big Guys skirt laws almost never leads to Little Guys getting similar treatment.
Also, I hope this is an okay place for this rant. Thanks for keeping this space awesome. Please remove if this is inappropriate for this forum, please and thank you.
I really want to see someone use the lack of any AI accountability as an active defense in a copyright trial for piracy. Force the court to compare the “theft” of a digital item like a book or movie that is used a few times probably, to copying something that is then effectively used internally every time the AI generates something.
I’d like to point out that the title is conflating two very different acts of “piracy”. What I think of as “the little guy” piracy is content “theft”, whereby they acquire some content they didn’t pay for to enjoy. What “the big guy” piracy looks like is licensing theft, whereby they take something “freely available on the internet” and use it to make their own product which directly affects the creator.
The world in which I watch some P&R or play some Warhammer for free in my little cave looks identical to the world in which I didn’t do that. The world in which I read all the Warhammer lore and make a game and sell it using the same setting and characters without talking to GW directly devalues their IP and the world looks different (this is effectively what AI does as it can be made to reproduce a lot of the training data).
I’d love to live in a world without DRM, “always online” requirements, and purchased™ (arbitrarily revocable) games, with convenient and affordable ways to access media so piracy isn’t necessary and the degree to which it would still happen would be so minor we wouldn’t even need laws for it. I support the kind of piracy that rallies against that shit; I do not support arbitrary license theft.
I think you’re on the money there. Copyright was originally intended as industry regulation, a way to prevent larger book publishers from just copying a smaller publisher’s book on day one and flooding the market with their copies. It’s applied to many more industries than just books (good!) but also to a wider group than actual publishers (bad!). When someone running a massive free ROMs site gets taken down, that’s probably reasonable, they’re playing the role of a publisher there and unfairly undercutting the competition (although the penalties in the US are still absurdly steep, as they usually are for individuals in this country). But when someone gets attacked for posting an image on social media, or streamers have to worry about the music playing in their games, or ISPs have to enforce against downloaders of pirated software, or modders have to be careful about linking their mod in such a way that no original code is included, that’s not what copyright should be.
It is not, in fact, bad that copyright applies to a wider group than publishers, unless you are using “publisher” extremely broadly to apply to “creators”.
If “someone gets attacked for posting an image on social media”, that rarely means “lawyers came after me because I posted a screenshot of a page from Sandman”. It often means that the poster took someone else’s art, snipped off the artist’s signature, and posted without attribution, and the artist is rightfully angry. Copyright is what enables that artist to continue to eat and make more art. The same goes for music, or software, or movies.
Sure, the system is horribly abused by uneven power structures, as every system in the world is. For music especially, we all know that the takedowns are usually issued by people who have nothing to do with the creation of the protected work, because of the way licensing and rights grants work in that industry. Automated takedown systems (which have to exist because of the scale of online content) also have no reasonable appeal mechanism, and the people making the decisions don’t (and can’t) make reasonable assessments about fair use and transformative works.
I’m not saying that everyone who participates in piracy is a bad, wicked thief–I absolutely participate in it myself. But copyright is not the villain here; that’s just trying to make us feel justified about our actions. Someone made a creative work I enjoyed, and I don’t have a moral right to the product of their effort for free.
Copyright law is full of ambiguities and gray areas, some intentional and some unintentional. The concept of “fair use” is an example of an intentional gray area, since the idea is that society as a whole will benefit from allowing people to skirt copyright law in certain circumstances, and lawmakers can’t possibly hope to enumerate every such circumstance. It then falls on courts to determine if a given circumstance falls under “fair use”. The problem is courts move very slowly when faced with a new circumstance that hasn’t been litigated before, and that’s what’s happening with AI companies training AI on copyrighted works. Once decisions have been made and stare decisis is established, then they’ll move faster. The NY Times vs OpenAI is the case to watch IMO, since that’s the biggest one challenging the idea that training AI is fair use.
Excellent comment.
Do you know if the lawsuit that involved Sarah Silverman is still going? Because I originally thought that one would have more legs, but maybe because the companies using the books3 corpus all dropped it, the case was dropped? I’m honestly unsure.
It’s just that the fact that any of them used books3 to begin with should say everything. Everyone knew books3 was the entirety of private torrent tracker Bibliotik. It was not hidden. Bibliotik isn’t just regular old ebook piracy either, they distribute the tools to remove DRM from ebooks. So they’re using a corpus literally made from pirated ebooks that have potentially had their DRM stripped. It just seemed like a big, easy admission that they were more than happy to use unscrupulous methods to profit.
As someone who was booted off of Bibliotik because it’s damn impossible to keep ratio there, it’s mind boggling to me that it wasn’t a bigger deal how widely used books3 was.
I haven’t followed the Sarah Silverman case, but I think it’s likely that’ll end in a settlement. NYTimes is less likely to settle, since they seem to be trying to set a precedent, and they’ve got the resources to do that.
They don’t think that.
I have not followed any current debate, so this is just my own thoughts. I expect any battle between Disney and Microsoft to end with a deal where consumers and independent producers are worse off.
Similar to how YouTube often hands out copyright strikes for musicians uploading their own music, in a possible future you might need an AI license to upload any work to any platform of size. I mean, you don’t technically have to, it is just that the AI-driven filter will otherwise strike you faster than Tumblr hiding images of trans women. Oh, and when you fold and get the AI license, you notice that it includes signing away your rights to not have your uploaded work be part of the AI training materials.
Maybe I am just jaded. But until AI crashes and burns, the most likely outcome of legal proceedings, in my opinion, is splitting the loot in proportion to the power of the interested parties. On the other hand, I don’t expect anything good to come out of letting AI companies run wild. So I dearly hope they destroy each other, but I expect them to embrace.
Not my fight either way tbh
Judging by my extensive arguing with people like this on reddit, they fall into one of two camps:
- Once bigcorp creates panacea AIs, they won’t need to pirate anymore because they will be able to ask ChatGPT to make all the movies and music they need.
- They don’t think that additional copyright protections will ever be able to touch Microsoft and can only ever hurt the little guy.
The differing factor for me is profit, those Jetflicks guys aren’t getting any sympathy from me. Charging or taking donations just to maintain infrastructure is one thing, but they had millions of dollars in profit.
You should be able to use anything you want for things, but once you start trying to make a profit off others’ works, that’s where I tend to draw the line. For example, I have no qualms about pirating Photoshop to make memes or fix old family photos or whatever, but if I were to ever actually get good with it I would buy it or switch to an open source competitor before trying to make money with it.
AI training is where it starts to get murky, but having a base understanding of how training generally goes, I don’t see an issue with it. The original image(s) fed into it is not really there anymore, it was processed into weights and math and then discarded in a way.
Yes, it’s possible to recreate an original image from its training, but every example I’ve seen required a very, very specific prompt to get it to do that, and even then it was a bit off. It’s akin to a person memorizing a piece of art and then re-creating it from memory: depending on the individual’s skill level it’ll get pretty close, but it’ll be just a bit off.
Now, training data taken from behind a paywall or private data like DMs or private posts can be a bit different and I’m generally against that.
So I’ve adopted this line: “If it’s posted publicly, expect it to be used publicly”, which includes just your average Joe seeing it while browsing around or a bot grabbing it for AI training.
Very akin to an old Mark Hosler (of Negativland) interview where he said (not verbatim, can’t find the old interview): “If you want to keep full control of your art, keep it in your home, maybe share it with family or a few close friends, but once that art is out in the wider world, you don’t really have control over it anymore.” Because, as you point out, an artist can recreate art that they have seen with their own eyes, but it will likely be a bit off from the original.
While I agree, let me play Devil’s Advocate for a moment: books3 was “publicly posted” but was created from all the books on private torrent tracker Bibliotik. Would you agree that this would fall under private data since it’s all pirated ebooks?
As for the Jetflicks guys, to me it’s mostly twisted that they’re up for more jail time than a lot of murderers get. Otherwise, I agree, the profits they made kind of kill any narrative that they were doing it for good reasons or that they deserve a massive amount of sympathy.
books3 was “publicly posted” but was created from all the books on private torrent tracker Bibliotik. Would you agree that this would fall under private data since it’s all pirated ebooks?
Not necessarily private data per se, but if it’s being used to train a closed source model for profit (like OpenAI using it for ChatGPT) then I would consider it in the same realm as the Jetflicks guys using pirated works for profit. If it was just a couple of people or researchers using it to train an open source model, then I see no issue with that, especially since it’s helping further the advancement of technology for all over the advancement of profits for one company.
As for the Jetflicks guys, to me it’s mostly twisted that they’re up for more jail time than a lot of murderers get.
I was unaware of that part. OK, they’ll get sympathy from me on that; that’s absolutely disgusting.