Generative artificial intelligence (GenAI) company Anthropic has claimed to a US court that using copyrighted content in large language model (LLM) training data counts as “fair use”.

Under US law, “fair use” permits the limited use of copyrighted material without permission, for purposes such as criticism, news reporting, teaching, and research.

In October 2023, a host of music publishers including Concord, Universal Music Group and ABKCO initiated legal action against the Amazon- and Google-backed generative AI firm Anthropic, demanding potentially millions in damages for the allegedly “systematic and widespread infringement of their copyrighted song lyrics”.

  • SuiXi3D
    64
    5 months ago

    …then maybe they shouldn’t exist. If you can’t pay the copyright holders what they’re owed for the license to use their materials for commercial use, then you can’t use ‘em that way without repercussions. Ask any YouTuber.

    • @Even_Adder@lemmy.dbzer0.com
      English
      11
      5 months ago

      You might want to read this article by Kit Walsh, a senior staff attorney at the EFF, and this one by Katherine Klosek, the director of information policy and federal relations at the Association of Research Libraries. YouTube’s one-sided strike-happy system isn’t the real world.

      Headlines like these let people assume that it’s illegal, rather than educate them on their rights.

      • Snot Flickerman
        English
        24
        5 months ago

        When Annas-Archive or Sci-Hub get treated the same as these giant corporations, I’ll start giving a shit about the “fair use” argument.

        When people pirate to better the world by increasing access to information, the whole world gets together to try to kick them off the internet.

        When giant companies with enough money to make Solomon blush pirate to make more oodles of money and not improve access to information, it’s “fAiR uSe.”

        Literally everyone knew from the start that books3 was all pirated, taken from ebooks with the DRM circumvented and removed. It was noted when it was created that it was basically the entirety of the private torrent tracker Bibliotik.

        • @Even_Adder@lemmy.dbzer0.com
          English
          10
          5 months ago

          AI training should not be a privilege of the mega-corporations. We already have the ability to train open source models, and organizations like Mozilla and LAION are working to make AI accessible to everyone. We can’t allow the ultra-wealthy to monopolize a public technology by creating barriers that make it prohibitively expensive for regular people to keep up. Mega corporations already have a leg up with their own datasets and predatory terms of service that exploit our data. Don’t do their dirty work for them.

          By denying regular people access to a competitive, corporate-independent tool for creativity, education, entertainment, and social mobility, we condemn them to a far worse future, with fewer rights than we started with.

          • Snot Flickerman
            English
            16
            5 months ago

            How am I doing their dirty work for them? I literally will stop thinking that they’re getting away with piracy for profit when we stop haranguing people who are committing piracy for the benefit of mankind.

            I’m not saying Meta should be stopped, I’m saying the prosecution of Sci-Hub and Annas-Archive needs to be stopped under the same pretenses.

            If it’s okay to pirate for the purpose of making money (what we put The Pirate Bay admins in jail for), then it’s okay to pirate to benefit mankind.

            There is literally no way in hell someone can convince me what Meta and others are doing is not pirating to use the data contained within to make money. What’s good for the goose is good for the gander, as they say.

            I reiterate, they knew it was pirated and had DRM circumvented when they downloaded it. There was zero question of the source of this data. They knew from the beginning they intended to profit from the use of this data. How is that different than what we accused The Pirate Bay admins of?

            It really feels like “Well these corporations have money to steal more prolifically than little people, so since their stealing is so big, we have to ignore it. They have lots of money and lawyers to fight us, The Pirate Bay didn’t, nor do Sci-Hub or Annas-Archive, so let’s just not try against those with money to fight us.”

            • RandoCalrandian
              2
              5 months ago

              Scraping Reddit for comments is not piracy, and that’s what most of these disputes are about.

              It’s pretty disingenuous to claim otherwise, or that these ai tools are using the content differently than in the past.

              This is all fearmongering as a negotiation tactic.

              Whatever price creators decide they “deserve” will be entirely between organizations with a large enough lawyer pool to back it up, such as Reddit which didn’t make a damn piece of the content it’s currently trying to sell and claiming ownership of.

        • VoterFrog
          3
          5 months ago

          You don’t see the difference between distributing someone else’s content against their will and using their content for statistical analysis? There’s a pretty clear difference between the two, especially as fair use is concerned.

        • RandoCalrandian
          1
          5 months ago

          That fair use argument also protects all of the small independent and often working for free developers that make FOSS models.

          These arguments about retroactively applying copyright differently are a large public negotiation between massive moneymakers on what the cost of keeping the little guy out is, not something that will benefit any actual content creator.

      • @Zaktor@sopuli.xyz
        English
        3
        5 months ago

        By and large copyright infringement is illegal. That some things aren’t infringement doesn’t change that a general stance of “if I don’t have permission, I can’t copy it” is correct. The first argument in the EFF article is effectively the title: “it can’t be copyright, because otherwise massive AI models would be impossible to build”. That doesn’t make it fair use, they just want it to become so.

    • @helenslunch@feddit.nl
      4
      5 months ago

      I love seeing Lemmy users trip over themselves to declare that copyrights don’t or shouldn’t exist when it comes to pirating, right up until it comes to AI. Then Copyrights are enshrined by The Constitution and all the corporations NEED to pay for them, even when they’re not actually copying anything.

      • @zaphod@lemmy.ca
        English
        10
        5 months ago

        You do realize that there may in fact be different, distinct groups of Lemmy users with differing, potentially non-overlapping beliefs, yeah?

        • @helenslunch@feddit.nl
          3
          5 months ago

          Sure but Lemmy also operates as a sort of hivemind. This is the top-voted post in the last 24 hours and piracy content usually makes up at least 25% of content here.

          • @zaphod@lemmy.ca
            English
            8
            5 months ago

            Oh, well, you’ve clearly done the kind of deep and thoughtful analysis that would allow you to determine the general opinions of all Lemmy users. My mistake. Carry on.

      • SuiXi3D
        7
        5 months ago

        Using copyrighted material for something you aren’t gonna make any money off of? Cool, go hog wild. If you’re gonna use some music or art that you didn’t make in something that will make you money, the folks that made whatever you used should get a cut. Not the whole cut, but a cut.

        • @Moira_Mayhem@beehaw.org
          2
          5 months ago

          If an artist falls in love with drawing and learns to draw from Jack Kirby’s work and at the beginning even imitates his style, does he owe Jack Kirby royalties for every drawing he does as he ‘learned’ on Jack’s copyrighted art?

          • SuiXi3D
            3
            5 months ago

            I think in that case, no. ‘Style’ is one thing, directly using someone’s art in your own work is something else entirely. However, we’re talking about a person here, not a program developed by a company for the express purpose of making as much money as possible in the shortest amount of time. Until AI can truly demonstrate that it is truly thinking and not simply executing commands given, I don’t think the lines are blurred nearly enough to suggest that someone learning to paint and an AI trained on hundreds of thousands of pieces of art for the purpose of making money for the company that built it are remotely the same.

      • Sneezycat
        4
        5 months ago

        And corporations want people to pay for it but they don’t want to pay for it themselves. It’s almost as if no one likes copyright, but it benefits some ppl more than others.

      • Pigeon
        1
        5 months ago

        You do realize that lemmy contains very many users, many of whom disagree on any number of things. You are randomly assigning the opinions of lemmy’s pirate users to a random commenter without evidence that they actually hold those opinions, because it’d be convenient for you if they’re contradicting themself in any way (though the degree to which that would be a contradiction is also arguable). It’s just a way of constructing a strawman instead of engaging with your interlocutor’s actual words.

        Also, part of the problem is that these LLMs very often do directly copy and spit out articles and random forum posts and etc word-for-word verbatim, or they’ll do something that’s the equivalent of a plagiarist who swaps a few words around in a sad attempt to not get caught. It becomes especially likely depending on how specific the search is, like if you look for a niche topic hardly anyone has written extensively on or for the solution to an esoteric problem that maybe just one person on a forum somewhere found an answer to. They also typically do not even give credit or link to their sources.

        Plus, copyright law, if it exists, must apply to everyone, including major corporations. That’s a separate issue from whether or not copyright law needs reform (it obviously does). If you wanna abolish copyright, fine, ok, get it abolished through the government. But while copyright law is still the law, I’m not ok with giving megacorps a pass to break it legally, especially when they’re more than happy to sue random, harmless individuals for violating their own copyrights. They want the law not to apply to them because they’re rich.

        The argument they’re making is just ridiculous on its face when you compare it to other crimes. If AI should be allowed to violate copyright because otherwise it can’t exist as it is, then anyone should be able to violate copyright because otherwise their cool projects won’t be able to exist. And I should be able to rob a bank because otherwise I won’t have all that money. You should be able to commit murder because otherwise your annoying coworker will keep bugging you. She should be able to walk out of a store with an iPhone without paying for it because otherwise she won’t have an iPhone. Etc. It’s an argument that says the criminal’s motivations are legal justification for the crime. “You should let me legally do the thing because otherwise I can’t do the thing” is just not a convincing argument in my book.

        • @helenslunch@feddit.nl
          1
          5 months ago

          You do realize that lemmy contains very many users

          Already addressed in another comment.

          part of the problem is that these LLMs very often do directly copy and spit out articles and random forum posts and etc word-for-word verbatim

          It’s a problem they’ve acknowledged and are actively working on.

          Plus, copyright law, if it exists, must apply to everyone, including major corporations.

          Well many people here would disagree. That was the entire point of my comment.

  • davehtaylor
    49
    5 months ago

    Then it shouldn’t exist.

    This isn’t an issue of fair use. They’re stealing other people’s work and using it to create something new and then trying to profit from it, without any credit or recompense.

    • @Moira_Mayhem@beehaw.org
      2
      5 months ago

      Now that it exists how do you propose we make it not exist?

      Even if we outlaw it, Russia and China won’t, and without the tools to fight back against it, the web is basically done as anything but a propaganda platform.

  • @Stillhart@lemm.ee
    36
    5 months ago

    It doesn’t matter what business we’re talking about. If you can’t afford to pay the costs associated with running it, it’s not a viable business. It’s pretty fucking simple math.

    And no, we’re not talking about “too big to fail” businesses (that SHOULD be allowed to fail, IMHO); we’re talking about AI, that thing they keep trying to shove down our throats and that we keep saying we don’t want or need.

    • @Moira_Mayhem@beehaw.org
      4
      5 months ago

      I don’t know if you noticed this but some really big companies with high stock valuations are only existing because investors poured tons of capital into them to subsidize the service.

      Uber could not do taxis cheaper than existing services if they didn’t have years of free cash to artificially lower prices.

      We are in the beginning of late stage capitalism: profitable companies go under due to private capital firms and absolute Ponzi frauds get their faces on Time magazine.

      Enjoy the collapse.

      • @Stillhart@lemm.ee
        1
        5 months ago

        I don’t know if you noticed this but some really big companies with high stock valuations are only existing because investors poured tons of capital into them to subsidize the service.

        Exactly, they PAID MONEY to make it work. No, they don’t make the money back and they depend on outside capital, but they are still paying their employees (not enough) and suppliers, etc.

        • @Moira_Mayhem@beehaw.org
          1
          5 months ago

          Yes, we are in late stage capitalism where the market eats itself.

          Why do you think we have seen so much large scale fraud in the last 15 years?

  • @OttoVonNoob@lemmy.ca
    31
    5 months ago

    Big Company: Well if you can’t afford food you should not have food.

    Also Big Company:… sobbing pwease we neeed fweee… pwease we need mowe moneys!

  • FfaerieOxide
    29
    5 months ago

    I’m all for stealing content willy-nilly but you can’t then use that theft to craft a privately “owned” mind.

    I’d have no problem with “ai” if it could unionize and had to pay for rice like the rest of humanity.

    These companies want to combine open theft with privately owned black boxen they can control and license out for money.

    It’s enclosure of The Commons all over again.

  • @Floon@lemmy.ml
    27
    5 months ago

    You don’t get to both ignore intellectual property rights of others, and enforce them for yourself. Fuck these guys.

    • @el_bhm@lemm.ee
      8
      5 months ago

      I guess people are finally catching up to the big con of the “LLMs should not be copyrighted” ampliganda. It is astroturfing at its best.

      The end goal is controlling rights to what corporations produce with LLMs without spending a dime. All the while cutting jobs.

      The writing was in CAPITAL LETTERS on the walls for the past two years. Why did Twitter restrict API access? Why did Reddit restrict API access? Why did GitHub/Bitbucket/GitLab restrict web UI functions for logged-out users?

      They knew and wallgardened the user generated data.

      Cmon people.

      And the hypocrisy of this all. If it is bad, it is user data, if we can mine nuh ah bitch, ours.

      Also, for people arguing for free use of anything to build LLMs. Regulations will come. Once big players control enough of the LLM market.

    • @Moira_Mayhem@beehaw.org
      7
      5 months ago

      Serious Question: When an artist learns to draw by looking at the drawings of the masters, and practicing the techniques they pioneered, are the art students respecting the intellectual property rights of those masters?

      Are not all of that student’s work derivative of an education based on other people’s work who will never see compensation for that student’s use?

      • Chahk
        7
        5 months ago

        I agree with you on principle. However… How long do you think it will be until these very same “AI” companies copyright and patent every piece of content their algorithms spew out? Will they abide by the same carve-outs they want for themselves right now? Somehow I doubt it.

        They want to ignore the laws for themselves, but enforce them onto everyone else. This “Rules for thee but not for me” bullshit can’t be allowed to pass. Let’s then abolish all copyright, and we’ll see how long these companies last when everyone can just grab their stuff “for learning”.

        • @Moira_Mayhem@beehaw.org
          1
          5 months ago

          How long before there is a self-owned AI company that does every administrative job better than humans because it trained on human behavior for 100 years?

          What do you think an entity like that would be capable of?

          • Chahk
            4
            5 months ago

            A bit off-topic, but I’d be fine with that. The more mind-numbingly dumb work that computers can do for us, the less time we have to spend doing it ourselves. Administrative job holders disagree with this, but so did every person whose job and livelihood was replaced by automation, ever. UBI (universal basic income) is the only answer that will save all of us from starvation when automation eventually replaces us too.

            • @Moira_Mayhem@beehaw.org
              2
              5 months ago

              I agree with everything in your post but the simple truth is administrative jobs are the modern equivalent of fluff court positions handed out to the 2nd+ born children of nobles and the modern owner class will never give up that eternal source of easy wealth.

              Which is also why they fight so hard to keep anyone not in the owner class out of management.

      • @Floon@lemmy.ml
        6
        5 months ago

        One, let’s accept that there is a public domain, and cribbing freely from the public domain is A-OK. I can reproduce Michelangelo all I want, and it’s all good. AI can crib from that all it wants.

        AI can’t invent. People can invent: I can have a wholly new idea that no one has ever had. AI does nothing but recombine other existing ideas. It must have seed data, and it won’t create anything for which it has no initial input: feed it photographs only, and it can’t create a pencil drawing image. Feed it only black and white images, and it can’t create color images.

        People do not require cribbing from sources. Give a toddler supplies, and they will create. So, we have established that there is a fundamental difference between the creation process. One is dependent on previous work, and one is not.

        Now, with influences, you can ask, is your new creation dependent on the previous creation directly? If it is so utterly dependent on the prior work, such that your work could not possibly exist without that specific prior art, you might get sued. It will get debated and society’s best approximation of a collective rational mind will determine if you copied or if you created something new that was merely inspired by prior art.

        AI can only create by the direct existence of prior art. It fakes invention. Its work has to come from somewhere else.

        People have shown how dependent it is on its sources with prompts that say things like, “portrait of a patriotic soldier superhero” and it comes back with a goddamned portrait of Chris Evans. The prompt did not include his name, or Captain or America, and it comes back with an MCU movie poster. AI does not create. People create.

      • @DdCno1@beehaw.org
        4
        5 months ago

        I think there is a fundamental difference here. People are not corporations. People have always learned like this and will always learn like this. Do we really want to allow large corporations to take knowledge from people, then commercialize it and put these very same people out of work?

        • @Moira_Mayhem@beehaw.org
          3
          5 months ago

          Your distinction is mostly philosophical. Legally corporations have more protections than people.

          I’m probably one of the most anti-corporate people you’ll meet today, I don’t even think publicly traded companies should exist.

  • Revv
    22
    5 months ago

    To me, this reads like “Giant-ATV-Based Taxi Service Couldn’t Exist If Operators were Required to Pay Homeowners for Driving over their Houses.”

    If a business can’t exist without externalizing its costs, that business should either a. not exist, or b. be forced to internalize those costs through licensing or fees. See also, major polluters.

    • RandoCalrandian
      1
      5 months ago

      Let’s not ignore that the very sudden and new license fees for previously free and open (and user generated) content are at near highway robbery levels, and are all attempting to apply retroactively

  • @megopie@beehaw.org
    16
    5 months ago

    “Ai” as it is being marketed is less about new technical developments being utilized and more about a fait accompli.

    They want mass adoption of the automated plagiarism machine learning programs by users and companies, hoping that by the time the people being plagiarized notice, it’s too late to rip it all out.

    That and otherwise devalue and anonymize work done by people to reduce the bargaining power of workers.

    • Snot Flickerman
      English
      9
      5 months ago

      They also don’t care if the open, free internet devolves into an illiterate AI generated mess, because they need an illiterate populace that isn’t educated enough to question it anyway. They’ll still have access to quality sources of information, while ensuring the lowest common denominator will literally have garbage information being fed to them. I mean, that was already true in the sense that the clickbait news outsold serious investigative news, and so the garbage clickbait became the norm and serious journalism is hard to come by and costly.

      They love increasing barriers between them and the rest of the populace, physically and mentally.

    • Sonori
      6
      5 months ago

      Silicon Valley’s core business model has for years been to break the law so blatantly and openly, while throwing money at the problem to scale, that by the time law enforcement catches up to you, you’re an “indispensable” part of the modern world. See Uber, whose own publicly published business model was for years to burn money scaling and ignoring employment law until it could drive all competitors out of business and become an illegal monopoly, thus allowing it to raise prices to the point it’s profitable.

    • @Moira_Mayhem@beehaw.org
      5
      5 months ago

      We are allowed to have nuance, nothing is inherently good or bad. A knife can wound or make dinner.

      Trying to reduce nuance lessens the public discourse, do not be tempted by lowest common denominator memery.

      Whether anyone likes it or not LLMs are here and even if we strictly regulate them there will be organizations and governments that do not.

      WHAT WE SHOULD be focusing on is how to prevent low effort AI content from just basically overtaking the web.

      We are already mostly there.

      • @not_amm@beehaw.org
        2
        5 months ago

        You can’t prevent it without regulations. Companies won’t care while gaining money from it unless they’re obligated to, and even then, some won’t comply either.

        BTW, that mentality of “other countries vs mine” is absurd. War crimes shouldn’t be committed by a country just because the other commits them; others bad ≠ I good.

        LLMs can’t and should NOT replace a human, at least not yet (they’re not even that good either). If we can’t have guaranteed basic needs such as housing, food and healthcare or a UBI, then they should not keep leaving people without jobs because no one will be able to afford anything.

        • @Moira_Mayhem@beehaw.org
          1
          5 months ago

          You can’t prevent it WITH regulation.

          Just like illegal dumping: If it makes the company more than the fine, it is just a cost of business.

          BTW, that mentality of “other countries vs mine” is absurd.

          China will never agree to a limitation of tech advancement because that is their primary source of wealth, and frankly your comment shows a tragic lack of understanding of international affairs.

          This isn’t ‘us good them bad’, this is ‘China has a history of ignoring technology patents and restrictions in order to gain international advantage’. The fact that you assumed that I had petty reasons makes it clear you have nothing to contribute to this conversation.

  • @Moira_Mayhem@beehaw.org
    8
    5 months ago

    This is not actually true at all, you could train very good LLMs on public domain only info, especially science oriented ones.

    But what people want is a chatbot that can call on current events, and that is where the cost comes in.

  • @ApeNo1@lemm.ee
    English
    6
    5 months ago

    “today’s general-purpose AI tools simply could not exist” … “as a profitable venture”

  • @Midnitte@beehaw.org
    2
    5 months ago

    Interesting that Anthropic is making this argument, considering their story in the AI space. They’re certainly no OpenAI.