Generative artificial intelligence (GenAI) company Anthropic has claimed to a US court that using copyrighted content in large language model (LLM) training data counts as “fair use”, however.
Under US law, “fair use” permits the limited use of copyrighted material without permission, for purposes such as criticism, news reporting, teaching, and research.
In October 2023, a host of music publishers including Concord, Universal Music Group and ABKCO initiated legal action against the Amazon- and Google-backed generative AI firm Anthropic, demanding potentially millions in damages for the allegedly “systematic and widespread infringement of their copyrighted song lyrics”.
AI training should not be a privilege of the mega-corporations. We already have the ability to train open source models, and organizations like Mozilla and LAION are working to make AI accessible to everyone. We can’t allow the ultra-wealthy to monopolize a public technology by creating barriers that make it prohibitively expensive for regular people to keep up. Mega corporations already have a leg up with their own datasets and predatory terms of service that exploit our data. Don’t do their dirty work for them.
Denying regular people access to a competitive, corporate-independent tool for creativity, education, entertainment, and social mobility, we condemn them to a far worse future, with fewer rights than we started with.
How am I doing their dirty work for them? I literally will stop thinking that they’re getting away with piracy for profit when we stop haranguing people who are committing to piracy for the benefit of mankind.
I’m not saying Meta should be stopped, I’m saying the prosecution of Sci-Hub and Annas-Archive need to be stopped under the same pretenses.
If it’s okay to pirate for the purpose of making money (what we put The Pirate Bay admins in jail for), then it’s okay to pirate to benefit mankind.
There is literally no way in hell someone can convince me what Meta and others are doing is not pirating to use the data contained within to make money. What’s good for the goose is good for the gander, as they say.
I reiterate, they knew it was pirated and had DRM circumvented when they downloaded it. There was zero question of the source of this data. They knew from the beginning they intended to profit from the use of this data. How is that different than what we accused The Pirate Bay admins of?
It really feels like “Well these corporations have money to steal more prolifically than little people, so since they’re stealing is so big, we have to ignore it. They have lots of money and lawyers to fight us, The Pirate Bay didn’t, nor do Sci-Hub or Annas-Archive, so let’s just not try against those with money to fight us.”
Then I misunderstood what you were saying. Carry on.
Scraping Reddit for comments is not piracy, and that’s what most of these disputes are about.
It’s pretty disingenuous to claim otherwise, or that these ai tools are using the content differently than in the past.
This is all fearmongering as a negotiation tactic.
Whatever price creators decide they “deserve” will be entirely between organizations with a large enough lawyer pool to back it up, such as Reddit which didn’t make a damn piece of the content it’s currently trying to sell and claiming ownership of.