Generative artificial intelligence (GenAI) company Anthropic has claimed to a US court that using copyrighted content in large language model (LLM) training data counts as “fair use”.

Under US law, “fair use” permits the limited use of copyrighted material without permission, for purposes such as criticism, news reporting, teaching, and research.

In October 2023, a host of music publishers including Concord, Universal Music Group and ABKCO initiated legal action against the Amazon- and Google-backed generative AI firm Anthropic, demanding potentially millions in damages for the allegedly “systematic and widespread infringement of their copyrighted song lyrics”.

  • SuiXi3D
    64
    5 months ago

    …then maybe they shouldn’t exist. If you can’t pay the copyright holders what they’re owed for the license to use their materials for commercial use, then you can’t use ‘em that way without repercussions. Ask any YouTuber.

    • @Even_Adder@lemmy.dbzer0.com
      English
      11
      edited
      5 months ago

      You might want to read this article by Kit Walsh, a senior staff attorney at the EFF, and this one by Katherine Klosek, the director of information policy and federal relations at the Association of Research Libraries. YouTube’s one-sided strike-happy system isn’t the real world.

      Headlines like these let people assume that it’s illegal, rather than educate them on their rights.

      • Snot Flickerman
        English
        24
        5 months ago

        When Annas-Archive or Sci-Hub get treated the same as these giant corporations, I’ll start giving a shit about the “fair use” argument.

        When people pirate to better the world by increasing access to information, the whole world gets together to try to kick them off the internet.

        When giant companies with enough money to make Solomon blush pirate to make more oodles of money and not improve access to information, it’s “fAiR uSe.”

        Literally everyone knew from the start that books3 was entirely pirated, built from ebooks with the DRM circumvented and removed. When it was created, it was noted that it was basically the entirety of the private torrent tracker Bibliotik.

        • @Even_Adder@lemmy.dbzer0.com
          English
          10
          edited
          5 months ago

          AI training should not be a privilege of the mega-corporations. We already have the ability to train open source models, and organizations like Mozilla and LAION are working to make AI accessible to everyone. We can’t allow the ultra-wealthy to monopolize a public technology by creating barriers that make it prohibitively expensive for regular people to keep up. Mega corporations already have a leg up with their own datasets and predatory terms of service that exploit our data. Don’t do their dirty work for them.

          By denying regular people access to a competitive, corporate-independent tool for creativity, education, entertainment, and social mobility, we condemn them to a far worse future, with fewer rights than we started with.

          • Snot Flickerman
            English
            16
            edited
            5 months ago

            How am I doing their dirty work for them? I literally will stop thinking that they’re getting away with piracy for profit when we stop haranguing people who are committing piracy for the benefit of mankind.

            I’m not saying Meta should be stopped, I’m saying the prosecution of Sci-Hub and Annas-Archive needs to be stopped under the same pretenses.

            If it’s okay to pirate for the purpose of making money (what we put The Pirate Bay admins in jail for), then it’s okay to pirate to benefit mankind.

            There is literally no way in hell someone can convince me what Meta and others are doing is not pirating to use the data contained within to make money. What’s good for the goose is good for the gander, as they say.

            I reiterate, they knew it was pirated and had DRM circumvented when they downloaded it. There was zero question of the source of this data. They knew from the beginning they intended to profit from the use of this data. How is that different than what we accused The Pirate Bay admins of?

            It really feels like “Well these corporations have money to steal more prolifically than little people, so since their stealing is so big, we have to ignore it. They have lots of money and lawyers to fight us, The Pirate Bay didn’t, nor do Sci-Hub or Annas-Archive, so let’s just not try against those with money to fight us.”

            • RandoCalrandian
              2
              5 months ago

              Scraping Reddit for comments is not piracy, and that’s what most of these disputes are about.

              It’s pretty disingenuous to claim otherwise, or that these ai tools are using the content differently than in the past.

              This is all fearmongering as a negotiation tactic.

              Whatever price creators decide they “deserve” will be entirely between organizations with a large enough lawyer pool to back it up, such as Reddit which didn’t make a damn piece of the content it’s currently trying to sell and claiming ownership of.

        • VoterFrog
          3
          5 months ago

          You don’t see the difference between distributing someone else’s content against their will and using their content for statistical analysis? There’s a pretty clear difference between the two, especially as fair use is concerned.

        • RandoCalrandian
          1
          5 months ago

          That fair use argument also protects all of the small, independent developers, often working for free, who make FOSS models.

          These arguments about retroactively applying copyright differently are a large public negotiation between massive moneymakers on what the cost of keeping the little guy out is, not something that will benefit any actual content creator.

      • @Zaktor@sopuli.xyz
        English
        3
        5 months ago

        By and large copyright infringement is illegal. That some things aren’t infringement doesn’t change that a general stance of “if I don’t have permission, I can’t copy it” is correct. The first argument in the EFF article is effectively the title: “it can’t be copyright, because otherwise massive AI models would be impossible to build”. That doesn’t make it fair use, they just want it to become so.

    • @helenslunch@feddit.nl
      4
      5 months ago

      I love seeing Lemmy users trip over themselves to declare that copyrights don’t or shouldn’t exist when it comes to pirating, right up until it comes to AI. Then Copyrights are enshrined by The Constitution and all the corporations NEED to pay for them, even when they’re not actually copying anything.

      • @zaphod@lemmy.ca
        English
        10
        5 months ago

        You do realize that there may in fact be different, distinct groups of Lemmy users with differing, potentially non-overlapping beliefs, yeah?

        • @helenslunch@feddit.nl
          3
          5 months ago

          Sure but Lemmy also operates as a sort of hivemind. This is the top-voted post in the last 24 hours and piracy content usually makes up at least 25% of content here.

          • @zaphod@lemmy.ca
            English
            8
            5 months ago

            Oh, well, you’ve clearly done the kind of deep and thoughtful analysis that would allow you to determine the general opinions of all Lemmy users. My mistake. Carry on.

      • SuiXi3D
        7
        5 months ago

        Using copyrighted material for something you aren’t gonna make any money off of? Cool, go hog wild. If you’re gonna use some music or art that you didn’t make in something that will make you money, the folks that made whatever you used should get a cut. Not the whole cut, but a cut.

        • @Moira_Mayhem@beehaw.org
          2
          5 months ago

          If an artist falls in love with drawing and learns to draw from Jack Kirby’s work and at the beginning even imitates his style, does he owe Jack Kirby royalties for every drawing he does as he ‘learned’ on Jack’s copyrighted art?

          • SuiXi3D
            3
            5 months ago

            I think in that case, no. ‘Style’ is one thing, directly using someone’s art in your own work is something else entirely. However, we’re talking about a person here, not a program developed by a company for the express purpose of making as much money as possible in the shortest amount of time. Until AI can truly demonstrate that it is truly thinking and not simply executing commands given, I don’t think the lines are blurred nearly enough to suggest that someone learning to paint and an AI trained on hundreds of thousands of pieces of art for the purpose of making money for the company that built it are remotely the same.

      • Sneezycat
        4
        5 months ago

        And corporations want people to pay for it but they don’t want to pay for it themselves. It’s almost as if no one likes copyright, but it benefits some ppl more than others.

      • Pigeon
        1
        5 months ago

        You do realize that lemmy contains very many users, many of whom disagree on any number of things. You are randomly assigning the opinions of lemmy’s pirate users to a random commenter without evidence that they actually hold those opinions, because it’d be convenient for you if they’re contradicting themself in any way (though the degree to which that would be a contradiction is also arguable). It’s just a way of constructing a strawman instead of engaging with your interlocutor’s actual words.

        Also, part of the problem is that these LLMs very often do directly copy and spit out articles, random forum posts, etc. verbatim, or they’ll do something that’s the equivalent of a plagiarist who swaps a few words around in a sad attempt to not get caught. It becomes especially likely depending on how specific the search is, like if you look for a niche topic hardly anyone has written extensively on or for the solution to an esoteric problem that maybe just one person on a forum somewhere found an answer to. They also typically do not even give credit or link to their sources.

        Plus, copyright law, if it exists, must apply to everyone, including major corporations. That’s a separate issue from whether or not copyright law needs reform (it obviously does). If you wanna abolish copyright, fine, ok, get it abolished through the government. But while copyright law is still the law, I’m not ok with giving megacorps a pass to break it legally, especially when they’re more than happy to sue random, harmless individuals for violating their own copyrights. They want the law not to apply to them because they’re rich.

        The argument they’re making is just ridiculous on its face when you compare it to other crimes. If AI should be allowed to violate copyright because otherwise it can’t exist as it is, then anyone should be able to violate copyright because otherwise their cool projects won’t be able to exist. And I should be able to rob a bank because otherwise I won’t have all that money. You should be able to commit murder because otherwise your annoying coworker will keep bugging you. She should be able to walk out of a store with an iPhone without paying for it because otherwise she won’t have an iPhone. Etc. It’s an argument that says the criminal’s motivations are legal justification for the crime. “You should let me legally do the thing because otherwise I can’t do the thing” is just not a convincing argument in my book.

        • @helenslunch@feddit.nl
          1
          5 months ago

          You do realize that lemmy contains very many users

          Already addressed in another comment.

          part of the problem is that these LLMs very often do directly copy and spit out articles and random forum posts and etc word-for-word verbatim

          It’s a problem they’ve acknowledged and are actively working on.

          Plus, copyright law, if it exists, must apply to everyone, including major corporations.

          Well many people here would disagree. That was the entire point of my comment.