‘Impossible’ to create AI tools like ChatGPT without copyrighted material, OpenAI says

L4sBot@lemmy.world · 1 year ago

‘Impossible’ to create AI tools like ChatGPT without copyrighted material, OpenAI says

HelloThere@sh.itjust.works · 1 year ago

OK, so pay for it.

Pretty simple really.

Björn Tantau@swg-empire.de · 1 year ago

Or let’s use this opportunity to make copyright much less draconian.

dhork@lemmy.world · edit-2 1 year ago

¿Porque no los dos?

I don’t understand why people are defending AI companies sucking up all human knowledge by saying “well, yeah, copyrights are too long anyway”.

Even if we went back to the pre-1976 term of 28 years, renewable once for a total of 56 years, there’s still a ton of recent works that AI are using without any compensation to their creators.

I think it’s because people are taking this “intelligence” metaphor a bit too far and think if we restrict how the AI uses copyrighted works, that would restrict how humans use them too. But AI isn’t human, it’s just a glorified search engine. At least all standard search engines do is return a link to the actual content. These AI models chew up the content and spit out something based on it. It simply makes sense that this new process should be licensed separately, and I don’t care if it makes some AI companies go bankrupt. Maybe they can work adequate payment for content into their business model going forward.

deweydecibel@lemmy.world · edit-2 1 year ago

It shouldn’t be cheap to absorb and regurgitate the works of humans the world over in an effort to replace those humans and subsequently enrich a handful of silicon valley people.

Like, I don’t care what you think about copyright law and how corporations abuse it, AI itself is corporate abuse.

And unlike copyright, which does serve its intended purpose of helping small time creators as much as it helps Disney, the true benefits of AI are overwhelmingly for corporations and investors. If our draconian copyright system is the best tool we have to combat that, good. It’s absolutely the lesser of the two evils.

lolcatnip@reddthat.com · 1 year ago

Do you believe it’s reasonable, in general, to develop technology that has the potential to replace some human labor?

Do you believe compensating copyright holders would benefit the individuals whose livelihood is at risk?

the true benefits of AI are overwhelmingly for corporations and investors

“True” is doing a lot of work here, I think. From my perspective the main beneficiaries of technology like LLMs and stable diffusion are people trying to do their work more efficiently, people paying around, and small-time creators who suddenly have custom graphics to illustrate their videos, articles, etc. Maybe you’re talking about something different, like deep fakes? The downside of using a vague term like “AI” is that it’s too easy to accidently conflate things that have little in common.

EldritchFeminity · 1 year ago

There’s 2 general groups when it comes to AI in my mind: Those whose work would benefit from the increased efficiency AI in various forms can bring, and those who want the rewards of work without putting in the effort of working.

The former include people like artists who could do stuff like creating iterations of concept sketches before choosing one to use for a piece to make that part of their job easier/faster.

Much of the opposition of AI comes from people worrying about/who have been harmed by the latter group. And it all comes down the way that the data sets are sourced.

These are people who want to use the hard work of others for their own benefit, without giving them compensation; and the corporations fall pretty squarely into this group. As does your comment about “small-time creators who suddenly have custom graphics to illustrate their videos, articles, etc.” Before AI, they were free to hire an artist to do that for them. MidJourney, for example, falls into this same category - the developers were caught discussing various artists that they “launder through a fine tuned Codex” (their words, not mine, here for source) for prompts. If these sorts of generators were using opt-in data sets, paying licensing fees to the creators, or some other way to get permission to use their work, this tech could have tons of wonderful uses, like for those small-time creators. This is how music works. There are entire businesses that run on licensing copyright free music out to small-time creators for their videos and stuff, but they don’t go out recording bands and then splicing their songs up to create synthesizers to sell. They pay musicians to create those songs.

Instead of doing what the guy behind IKEA did when he thought “people besides the rich deserve to be able to have furniture”, they’re cutting up Bob Ross paintings to sell as part of their collages to people who want to make art without having to actually learn how to make it or pay somebody to turn their idea into reality. Artists already struggle in a world that devalues creativity (I could make an entire rant on that, but the short is that the starving artist stereotype exists for a reason), and the way companies want to use AI like this is to turn the act of creating art into a commodity even more; to further divest the inherently human part of art from it. They don’t want to give people more time to create and think and enjoy life; they merely want to wring even more value out of them more efficiently. They want to take the writings of their journalists and use them to train the AI that they’re going to replace them with, like a video game journalism company did last fall with all of the writers they had on staff in their subsidiary companies. They think, “why keep 20 writers on staff when we can have a computer churn out articles for our 10 subsidiaries?” Last year, some guy took a screenshot of a piece of art that one of the artists for Genshin Impact was working on while livestreaming, ran it through some form of image generator, and then came back threatening to sue the artist for stealing his work.

Copyright laws don’t favor the small guy, but they do help them protect their work as a byproduct of working for corporate interests. In the case of the Genshin artist, the fact that they were livestreaming their work and had undeniable, recorded proof that the work was theirs and not some rando in their stream meant that copyright law would’ve been on their side if it had actually gone anywhere rather than some asshole just being an asshole. Trademark isn’t quite the same, but I always love telling the story of the time my dad got a cease and desist letter from a company in another state for the name of a product his small business made. So he did some research, found out that they didn’t have the trademark for it in that state, got the trademark himself, and then sent them back their own letter with the names cut out and pasted in the opposite spots. He never heard from them again!

AnneBonny@lemmy.dbzer0.com · 1 year ago

I don’t understand why people are defending AI companies sucking up all human knowledge by saying “well, yeah, copyrights are too long anyway”.

Would you characterize projects like wikipedia or the internet archive as “sucking up all human knowledge”?

MBM@lemmings.world · 1 year ago

Does Wikipedia ever have issues with copyright? If you don’t cite your sources or use a copyrighted image, it will get removed

dhork@lemmy.world · 1 year ago

In Wikipedia’s case, the text is (well, at least so far), written by actual humans. And no matter what you think about the ethics of Wikipedia editors, they are humans also. Human oversight is required for Wikipedia to function properly. If Wikipedia were to go to a model where some AI crawls the web for knowledge and writes articles based on that with limited human involvement, then it would be similar. But that’s not what they are doing.

The Internet Archive is on a bit less steady legal ground (see the resent legal actions), but in its favor it is only storing information for archival and lending purposes, and not using that information to generate derivative works which it is then selling. (And it is the lending that is getting it into trouble right now, not the archiving).

phillaholic@lemm.ee · 1 year ago

The Internet Archive has no ground to stand on at all. It would be one thing if they only allowed downloading of orphaned or unavailable works, but that’s not the case.

randon31415@lemmy.world · 1 year ago

Wikipedia has had bots writing articles since the 2000 census information was first published. The 2000 census article writing bot was actually the impetus for Wikipedia to make the WP:bot policies.

assassin_aragorn@lemmy.world · 1 year ago

Wikipedia is free to the public. OpenAI is more than welcome to use whatever they want if they become free to the public too.

afraid_of_zombies@lemmy.world · 1 year ago

It is free. They have a pair model with more stuff but the baseline model is more than enough for most things.

assassin_aragorn@lemmy.world · 1 year ago

There should be no paid model if they aren’t going to pay for training material.

afraid_of_zombies@lemmy.world · 1 year ago

There also shouldn’t be goal post moving in lemmy threads but yet here we are. Can you move the goalposts back into position for me?

afraid_of_zombies@lemmy.world · 1 year ago

The copyright shills in this thread would shutdown Wikipedia

lolcatnip@reddthat.com · 1 year ago

I don’t understand why people are defending AI companies

Because it’s not just big companies that are affected; it’s the technology itself. People saying you can’t train a model on copyrighted works are essentially saying nobody can develop those kinds of models at all. A lot of people here are naturally opposed to the idea that the development of any useful technology should be effectively illegal.

assassin_aragorn@lemmy.world · 1 year ago

This is frankly very simple.

If the AI is trained on copyrighted material and doesn’t pay for it, then the model should be freely available for everyone to use.
If the AI is trained on copyrighted material and pays a license for it, then the company can charge people for using the model.

If information should be free and copyright is stifling, then OpenAI shouldn’t be able to charge for access. If information is valuable and should be paid for, then OpenAI should have paid for the training material.

OpenAI is trying to have it both ways. They don’t want to pay for information, but they want to charge for information. They can’t have one without the either.

BURN@lemmy.world · 1 year ago

You can make these models just fine using licensed data. So can any hobbyist.

You just can’t steal other people’s creations to make your models.

lolcatnip@reddthat.com · 1 year ago

Of course it sounds bad when you using the word “steal”, but I’m far from convinced that training is theft, and using inflammatory language just makes me less inclined to listen to what you have to say.

BURN@lemmy.world · 1 year ago

Training is theft imo. You have to scrape and store the training data, which amounts to copyright violation based on replication. It’s an incredibly simple concept. The model isn’t the problem here, the training data is.

brain_in_a_box@lemmy.ml · edit-2 9 months ago

Removed by mod

lolcatnip@reddthat.com · 1 year ago

Training is theft imo.

Then it appears we have nothing to discuss.

dhork@lemmy.world · 1 year ago

I am not saying you can’t train on copyrighted works at all, I am saying you can’t train on copyrighted works without permission. There are fair use exemptions for copyright, but training AI shouldn’t apply. AI companies will have to acknowledge this and get permission (probably by paying money) before incorporating content into their models. They’ll be able to afford it.

lolcatnip@reddthat.com · 1 year ago

What if I do it myself? Do I still need to get permission? And if so, why should I?

I don’t believe the legality of doing something should depend on who’s doing it.

BURN@lemmy.world · 1 year ago

Yes you would need permission. Just because you’re a hobbyist doesn’t mean you’re exempt from needing to follow the rules.

As soon as it goes beyond a completely offline, personal, non-replicatible project, it should be subject to the same copyright laws.

If you purely create a data agnostic AI model and share the code, there’s no problem, as you’re not profiting off of the training data. If you create an AI model that’s available for others to use, then you’d need to have the licensing rights to all of the training data.

afraid_of_zombies@lemmy.world · 1 year ago

recent works that AI are using without any compensation to their creators.

Name the creator.

dhork@lemmy.world · 1 year ago

Um… Sure?

https://authorsguild.org/news/sign-our-open-letter-to-generative-ai-leaders/

https://readwrite.com/midjourney-ai-art-program-faces-lawsuit-over-alleged-use-of-magic-the-gathering-art/

https://thehill.com/policy/technology/4392624-new-york-times-chatgpt-lawsuit-poses-new-legal-threats-to-artificial-intelligence/

These are all writers and artists who have found their works wholly sucked into these Generative AI applications, and being made into derivative works,nwithout any compensation at all. This isn’t an abstract argument, content creators are actively discovering this, and their only recourse right now is to file lawsuits.

afraid_of_zombies@lemmy.world · edit-2 1 year ago

One name not a fucking click bait article. I want one single name of the artist who is now on food stamps because openai trained their model on their art.

dhork@lemmy.world · 1 year ago

That first link to the Authors Guild is to an open letter with over 15,000 names on it, but you didn’t bother clicking on it, did you?

afraid_of_zombies@lemmy.world · 1 year ago

Having problems reading and following instructions.

Give me a name of the artist who had a nice successful career and because of AI copying their works is now poor. 1 name. Not click bait, not a slacktivism open letter. 1 name.

No victim = No crime

HelloThere@sh.itjust.works · edit-2 1 year ago

I’m no fan of the current copyright law - the Statute of Anne was much better - but let’s not kid ourselves that some of the richest companies in the world have any desire what so ever to change it.

Gutless2615@ttrpg.network · 1 year ago

My brother in Christ I’m begging you to look just a little bit into the history of copyright expansion.

HelloThere@sh.itjust.works · 1 year ago

I am well aware.

LWD@lemm.ee · edit-2 1 year ago

deleted

Gutless2615@ttrpg.network · 1 year ago

I only discuss copyright on posts about AI copyright issues. Yes, brilliant observation. I also talk about privacy y issues on privacy relevant posts, labor issues on worker rights related articles and environmental justice on global warming pieces. Truly a brilliant and skewering observation. Youre a true internet private eye.

Fair use and pushing back against (corporate serving) copyright maximalism is an issue I am passionate about and engage in. Is that a problem for you?

LWD@lemm.ee · edit-2 1 year ago

deleted

Gutless2615@ttrpg.network · 1 year ago

deleted by creator

Fisk400@feddit.nu · 1 year ago

As long as capitalism exist in society, just being able go yoink and taking everyone’s art will never be a practical rule set.

S410@lemmy.ml · edit-2 1 year ago

deleted by creator

Exatron@lemmy.world · 1 year ago

How hard it is doesn’t matter. If you can’t compensate people for using their work, or excluding work people don’t want users, you just don’t get that data.

There’s plenty of stuff in the public domain.

afraid_of_zombies@lemmy.world · 1 year ago

And artists are being compensated now fairly?

Exatron@lemmy.world · 1 year ago

Previous wrongs don’t make this instance right.

afraid_of_zombies@lemmy.world · 1 year ago

now

beckerist@lemmy.world · 1 year ago

deleted by creator

Fisk400@feddit.nu · 1 year ago

Sounds like a OpenAI problem and not an us problem.

HelloThere@sh.itjust.works · edit-2 1 year ago

I never said it was going to be easy - and clearly that is why OpenAI didn’t bother.

If they want to advocate for changes to copyright law then I’m all ears, but let’s not pretend they actually have any interest in that.

deweydecibel@lemmy.world · 1 year ago

I can guarantee you that you’re going to have a pretty hard time finding a dataset with diverse data containing things like napkin doodles or bathroom stall writing that’s compiled with permission of every copyright holder involved.

You make this sound like a bad thing.

BURN@lemmy.world · 1 year ago

And why is that a bad thing?

Why are you entitled to other peoples work, just because “it’s hard to find data”?

S410@lemmy.ml · edit-2 1 year ago

deleted by creator

BURN@lemmy.world · 1 year ago

People do not consume and process data the same way an AI model does. Therefore it doesn’t matter about how humans learn, because AIs don’t learn. This isn’t repurposing work, it’s using work in a way the copyright holder doesn’t allow, just like copyright holders are allowed to prohibit commercial use.

S410@lemmy.ml · edit-2 1 year ago

deleted by creator

BURN@lemmy.world · 1 year ago

I’m well aware of how machine learning works. I did 90% of the work for a degree in exactly it. I’ve written semi-basic neural networks from scratch, and am familiar with terminology around training and how the process works.

Humans learn, process, and most importantly, transform data in a different manner than machines. The sum totality of the human existence each individual goes through means there is a transformation based on that existence that can’t be replicated by machines.

A human can replicate other styles, as you show with your example, but that doesn’t mean that is the total extent of new creation. It’s been proven in many cases that civilizations create art in isolation, not needing to draw from any previous art to create new ideas. That’s the human element that can’t be replicated in anything less than true General AI with real intelligence.

Machine Learning models such as the LLMs/GenerativeAI of today are statistically based on what it has seen before. While it doesn’t store the data, it does often replicate it in its outputs. That shows that the models that exist now are not creating new ideas, rather mixing up what they already have.

flop_leash_973@lemmy.world · edit-2 1 year ago

If it ends up being OK for a company like OpenAI to commit copyright infringement to train their AI models it should be OK for John/Jane Doe to pirate software for private use.

But that would never happen. Almost like the whole of copyright has been perverted into a scam.

tinwhiskers@lemmy.world · 1 year ago

Using copyrighted material is not the same thing as copyright infringement. You need to (re)publish it for it to become an infringement, and OpenAI is not publishing the material made with their tool; the users of it are. There may be some grey areas for the law to clarify, but as yet, they have not clearly infringed anything, any more than a human reading copyrighted material and making a derivative work.

hperrin@lemmy.world · 1 year ago

It comes from OpenAI and is given to OpenAI’s users, so they are publishing it.

linearchaos@lemmy.world · 1 year ago

It’s being mishmashed with a billion other documents just like to make a derivative work. It’s not like open hours giving you a copy of Hitchhiker’s Guide to the Galaxy.

hperrin@lemmy.world · 1 year ago

New York Times was able to have it return a complete NYT article, verbatim. That’s not derivative.

Fraubush@lemm.ee · 1 year ago

I thought the same thing until I read another perspective into it from Mike Masnick and, from what he writes, it seems pretty clear they manipulated ChatGPT with some very specific prompts that someone who doesn’t already pay NYT for access would not be able to do. For example, feeding it 3 verbatim paragraphs from an article and asking it to generate the rest if you understand how these LLMs work, its really not surprising that you can indeed force it to do things like that but it’s an extreme and I’m qith Masnick and the user your responding to on this one myself.

I also watched most of today’s subcommittee hearing on AI and journalism. A lot of the arguments are that this will destroy local journalism. Look, strong local journalism is some of the most important work that is dying right now. But the grave was dug by these large media companies and hedge funds that bought up and gutted those local news orgs and not many people outside of the industry batted an eye while that was happening. This is a bit of a tangent but I don’t exactly trust the giant headgefunds who gutted these local news journalists ocer the padt deacde to all of a sudden care at all about how important they are.

Sorry fir the tangent butbheres the article i mentioned thats more on topic - http://mediagazer.com/231228/p11#a231228p11

hperrin@lemmy.world · 1 year ago

So they gave it the 3 paragraphs that are available publicly, said continue, and it spat out the rest of the article that’s behind a paywall. That sure sounds like copyright infringement.

linearchaos@lemmy.world · 1 year ago

And that’s not the intent of the service, it’s a bug and they’ll fix it.

A_Very_Big_Fan@lemmy.world · 1 year ago

any more than a human reading copyrighted material and making a derivative work.

It seems obvious to me that it’s not doing anything different than a human does when we absorb information and make our own works. I don’t understand why practically nobody understands this

I’m surprised to have even found one person that agrees with me

BURN@lemmy.world · 1 year ago

Because it’s objectively not true. Humans and ML models fundamentally process information differently and cannot be compared. A model doesn’t “read a book” or “absorb information”

A_Very_Big_Fan@lemmy.world · edit-2 1 year ago

I didn’t say they processed information the same, I said generative AI isn’t doing anything that humans don’t already do. If I make a drawing of Gordon Freeman or Courage the Cowardly Dog, or even a drawing of Gordon Freeman in the style of Courage the Cowardly Dog, I’m not infringing on the copyright of Valve or John Dilworth. (Unless I monetize it, but even then there’s fair-use…)

Or if I read a statistic or some kind of piece of information in an article and spoke about it online, I’m not infringing the copyright of the author. Or if I listen to hundreds of hours of a podcast and then do a really good impression of one of the hosts online, I’m not infringing on that person’s copyright or stealing their voice.

Neither me making that drawing, nor relaying that information, nor doing that impression are copyright infringement. Me uploading a copy of Courage or Half-Life to the internet would be, or copying that article, or uploading the hypothetical podcast on my own account somewhere. Generative AI doesn’t publish anything, and even if it did I think there would be a strong case for fair-use for the same reasons humans would have a strong case for fair-use for publishing their derivative works.

Syntha@sh.itjust.works · 1 year ago

Insane how this comment is downvoted, when, as far as a I’m aware, it’s literally just the legal reality at this point in time.

Milk_Sheikh@lemm.ee · 1 year ago

Wow! You’re telling me that onerous and crony copyright laws stifle innovation and creativity? Thanks for solving the mystery guys, we never knew that!

deweydecibel@lemmy.world · 1 year ago

innovation and creativity

Neither of which are being stiffled here. OpenAI didn’t write ChatGPT with copyrighted code.

What’s being “stiffled” is corporate harvesting and profiting of the works of individuals, at their expense. And damn right it should be.

SCB@lemmy.world · 1 year ago

at their expense

How?

Milk_Sheikh@lemm.ee · 1 year ago

‘Data poisoning’, encryption, & copyright.

afraid_of_zombies@lemmy.world · 1 year ago

Please show me the poor artist whose work was stolen. I want a name.

If there is no victim there is no crime.

EldritchFeminity · 1 year ago

Click here to find out more

Just because you think art isn’t actually work and artists don’t deserve to be paid for the work they do doesn’t make it okay and doesn’t make you right.

afraid_of_zombies@lemmy.world · 1 year ago

Instead of screenshots why can’t you just type in the name? Why is basic research to back up your defense of the current copyright system so freaken impossible?

LWD@lemm.ee · edit-2 1 year ago

deleted

afraid_of_zombies@lemmy.world · 1 year ago

Attack the argument not the person.

LWD@lemm.ee · edit-2 1 year ago

deleted

EldritchFeminity · 1 year ago

How many you want, apart from Sara Winters up there and 8Pxl, who I linked to. I have 25 pages of them from court documents. Roughly 4,000 names in total, in alphabetical order.

https://storage.courtlistener.com/recap/gov.uscourts.cand.407208/gov.uscourts.cand.407208.129.10.pdf

If I hadn’t included screenshots, you would’ve just claimed they were made up. Keep moving the goalposts, AI shill.

afraid_of_zombies@lemmy.world · 1 year ago

Type the name.

Gutless2615@ttrpg.network · 1 year ago

deleted by creator

LWD@lemm.ee · edit-2 1 year ago

deleted

kingthrillgore@lemmy.ml · 1 year ago

Its almost like we had a thing where copyrighted things used to end up but they extended the dates because money

Ultraviolet@lemmy.world · 1 year ago

This is where they have the leverage to push for actual copyright reform, but they won’t. Far more profitable to keep the system broken for everyone but have an exemption for AI megacorps.

gardylou@lemmy.world · 1 year ago

deleted by creator

afraid_of_zombies@lemmy.world · 1 year ago

If the rule is stupid or evil we should applaud people who break it.

kiagam@lemmy.world · 1 year ago

we should use those who break it as a beacon to rally around and change the stupid rule

Grabbels@lemmy.world · 1 year ago

Except they pocket millions of dollars by breaking that rule and the original creators of their “essential data” don’t get a single cent while their creations indirectly show up in content generated by AI. If it really was about changing the rules they wouldn’t be so obvious in making it profitable, but rather use that money to make it available for the greater good AND pay the people that made their training data. Right now they’re hell-bent in commercialising their products as fast as possible.

If their statement is that stealing literally all the content on the internet is the only way to make AI work (instead of for example using their profits to pay for a selection of all that data and only using that) then the business model is wrong and illegal. It’s as a simple as that.

I don’t get why people are so hell-bent on defending OpenAI in this case; if I were to launch a food-delivery service that’s affordable for everyone, but I shoplifted all my ingredients “because it’s the only way”, most would agree that’s wrong and my business is illegal. Why is this OpenAI case any different? Because AI is an essential development? Oh, and affordable food isn’t?

afraid_of_zombies@lemmy.world · 1 year ago

I am not defending OpenAi I am attacking copyright. Do you have freedom of speech if you have nothing to say? Do you have it if you are a total asshole? Do you have it if you are the nicest human who ever lived? Do you have it and have no desire to use it?

800XL@lemmy.world · 1 year ago

I guess the lesson here is pirate everything under the sun and as long as you establish a company and train a bot everything is a-ok. I wish we knew this when everyone was getting dinged for torrenting The Hurt Locker back when.

Remember when the RIAA got caught with pirated mp3s and nothing happened?

What a stupid timeline.

Alien Nathan Edward@lemm.ee · 1 year ago

if it’s impossible for you to have something without breaking the law you have to do without it

if it’s impossible for the artistocrat class to have something without breaking the law, we change or ignore the law

lolcatnip@reddthat.com · 1 year ago

Copyright law is mostly bullshit, though.

Krauerking@lemy.lol · 1 year ago

Oh sure. But why is it only the massive AI push that allows the large companies owning the models full of stolen materials that make basic forgeries of the stolen items the ones that can ignore the bullshit copyright laws?

It wouldn’t be because it is super profitable for multiple large industries right?

afraid_of_zombies@lemmy.world · 1 year ago

Just because people are saying the law is bad doesn’t mean they are saying the lawbreakers are good. Those two are independent of each other.

I have never been against cannabis legalization. That doesn’t mean I think people who sold it on the streets are good people.

dutchkimble@lemy.lol · 1 year ago

Cool, don’t do it then

S410@lemmy.ml · edit-2 1 year ago

deleted by creator

AntY@lemmy.world · 1 year ago

The main difference between the two in your analogy, that has great bearing on this particular problem, is that the machine learning model is a product that is to be monetized.

S410@lemmy.ml · edit-2 1 year ago

deleted by creator

GentlemanLoser@ttrpg.network · 1 year ago

Naive

MBM@lemmings.world · 1 year ago

Sounds like a solution would be to force, for any AI, to either share the source code or proof that it’s not trained on copyrighted data

deweydecibel@lemmy.world · 1 year ago

And ultimately replace the humans it learned from.

Zoboomafoo@slrpnk.net · 1 year ago

Good, I want AI to do all my work for me

afraid_of_zombies@lemmy.world · 1 year ago

Yes clearly 90 years plus death of artist is acceptable

LWD@lemm.ee · edit-2 1 year ago

deleted

testfactor@lemmy.world · 1 year ago

And real children aren’t in a capitalist society?

BURN@lemmy.world · 1 year ago

Also an “AI” is not human, and should not be regulated as such

afraid_of_zombies@lemmy.world · 1 year ago

Neither is a corporation and yet they claim first amendment rights.

BURN@lemmy.world · 1 year ago

That’s an entirely separate problem, but is certainly a problem

afraid_of_zombies@lemmy.world · 1 year ago

I don’t think it is. We have all these non-human stuff we are awarding more rights to than we have. You can’t put a corporation in jail but you can put me in jail. I don’t have freedom from religion but a corporation does.

BURN@lemmy.world · 1 year ago

Corporations are not people, and should not be treated as such.

If a company does something illegal, the penalty should be spread to the board. It’d make them think twice about breaking the law.

We should not be awarding human rights to non-human, non-sentient creations. LLMs and any kind of Generative AI are not human and should not in any case be treated as such.

afraid_of_zombies@lemmy.world · 1 year ago

Corporations are not people, and should not be treated as such.

Understand. Please tell Disney that they no longer own Mickey Mouse.

Exatron@lemmy.world · 1 year ago

The difference here is that a child can’t absorb and suddenly use massive amounts of data.

S410@lemmy.ml · edit-2 1 year ago

deleted by creator

Barbarian@sh.itjust.works · edit-2 1 year ago

I really don’t understand this whole “learning” thing that everybody claims these models are doing.

A Markov chain algorithm with different inputs of text and the output of the next predicted word isn’t colloquially called “learning”, yet it’s fundamentally the same process, just less sophisticated.

They take input, apply a statistical model to it, generate output derived from the input. Humans have creativity, lateral thinking and the ability to understand context and meaning. Most importantly, with art and creative writing, they’re trying to express something.

“AI” has none of these things, just a probability for which token goes next considering which tokens are there already.

agamemnonymous@sh.itjust.works · 1 year ago

Humans have creativity, lateral thinking and the ability to understand context and meaning

What evidence do you have that those aren’t just sophisticated, recursive versions of the same statistical process?

Barbarian@sh.itjust.works · edit-2 1 year ago

I think the best counter to this is to consider the zero learning state. A language model or art model without any training data at all will output static, basically. Random noise.

A group of humans socially isolated from the rest of the world will independently create art and music. It has happened an uncountable number of times. It seems to be a fairly automatic emergent property of human societies.

With that being the case, we can safely say that however creativity works, it’s not merely compositing things we’ve seen or heard before.

agamemnonymous@sh.itjust.works · 1 year ago

I disagree with this analysis. Socially isolated humans aren’t isolated, they still have nature to imitate. There’s no such thing as a human with no training data. We gather training data our whole life, possibly from the womb. Even in an isolated group, we still have others of the group to imitate, who in turn have ancestors, and again animals and natural phenomena. I would argue that all creativity is precisely compositing things we’ve seen or heard before.

sus@programming.dev · edit-2 1 year ago

I don’t think “learning” is a word reserved only for high-minded creativeness. Just rote memorization and repetition is sometimes called learning. And there are many intermediate states between them.

testfactor@lemmy.world · 1 year ago

Out of curiosity, how far do you extend this logic?

Let’s say I’m an artist who does fractal art, and I do a line of images where I take jpegs of copywrite protected art and use the data as a seed to my fractal generation function.

Have I have then, in that instance, taken a copywritten work and simply applied some static algorithm to it and passed it off as my own work, or have I done something truly transformative?

The final image I’m displaying as my own art has no meaningful visual cues to the original image, as it’s just lines and colors generated using the image as a seed, but I’ve also not applied any “human artistry” to it, as I’ve just run it through an algorithm.

Should I have to pay the original copywrite holder?
If so, what makes that fundamentally different from me looking at the copywritten image and drawing something that it inspired me to draw?
If not, what makes that fundamentally different from AI images?

LWD@lemm.ee · edit-2 1 year ago

deleted

testfactor@lemmy.world · 1 year ago

I feel like you latched on to one sentence in my post and didn’t engage with the rest of it at all.

That sentence, in your defense, was my most poorly articulated, but I feel like you responded devoid of any context.

Am I to take it, from your response, that you think that a fractal image that uses a copywritten image as a seed to it’s random number generator would be copyright infringement?

If so, how much do I, as the creator, have to “transform” that base binary string to make it “fair use” in your mind? Are random but flips sufficient?
If so, how is me doing that different than having the machine do that as a tool? If not, how is that different than me editing the bits using a graphical tool?

LWD@lemm.ee · edit-2 1 year ago

deleted

HelloThere@sh.itjust.works · 1 year ago

It’s a question of scale. A single child cannot replace literally all artists, for example.

Exatron@lemmy.world · 1 year ago

The problem is that a human doesn’t absorb exact copies of what it learns from, and fair use doesn’t include taking entire works, shoving them in a box, and shaking it until something you want comes out.

S410@lemmy.ml · edit-2 1 year ago

deleted by creator

Exatron@lemmy.world · 1 year ago

Except they literally don’t. Human memory doesn’t retain an exact copy of things. Very good isn’t the same as exactly. And human beings can’t grab everything they see and instantly use it.

S410@lemmy.ml · edit-2 1 year ago

deleted by creator

PipedLinkBot@feddit.rocks · 1 year ago

Here is an alternative Piped link(s):

C418 - Haggstrom, but it’s composed by John Williams

Piped is a privacy-respecting open-source alternative frontend to YouTube.

I’m open-source; check me out at GitHub.

Chee_Koala@lemmy.world · 1 year ago

But our current copyright model is so robust and fair! They will only have to wait 95y after the author died, which is a completely normal period.

If you want to control your creations, you are completely free to NOT publish it. Nowhere it’s stated that to be valuable or beautiful, it has to be shared on the world podium.

We’ll have a very restrictive Copyright for non globally transmitted/published works, and one for where the owner of the copyright DID choose to broadcast those works globally. They have a couple years to cash in, and then after I dunno, 5 years, we can all use the work as we see fit. If you use mass media to broadcast creative works but then become mad when the public transforms or remixes your work, you are part of the problem.

Current copyright is just a tool for folks with power to control that power. It’s what a boomer would make driving their tractor / SUV while chanting to themselves: I have earned this.

LWD@lemm.ee · edit-2 1 year ago

deleted

just_change_it@lemmy.world · 1 year ago

I think it’s pretty amazing when people just run with the dogma that empowers billionaires.

Every creator hopes they’ll be the next taylor swift and that they’ll retain control of their art for those life + 70 years and make enough to create their own little dynasty.

The reality is that long duration copyright is almost exclusively a tool of the already wealthy, not a tool for the not-yet-wealthy. As technology improves it will be easier and easier for wealth to control the system and deny the little guy’s copyright on grounds that you used something from their vast portfolio of copyright/patent/trademark/ipmonopolyrulelegalbullshit. Already civil legal disputes are largely a function of who has the most money.

I don’t have the solution that helps artists earn a living, but it doesn’t seem like copyright is doing them many favors as-is unless they are retired rockstars who have already earned in excess of the typical middle class lifetime earnings by the time they hit 35, or way earlier.

assassin_aragorn@lemmy.world · 1 year ago

I don’t have the solution that helps artists earn a living, but it doesn’t seem like copyright is doing them many favors as-is unless they are retired rockstars who have already earned in excess of the typical middle class lifetime earnings by the time they hit 35, or way earlier.

Just because copyright helps them less doesn’t mean it doesn’t help them at all. And at the end of the day, I’d prefer to support the retired rockstars over the stealing billionaires.

LWD@lemm.ee · edit-2 1 year ago

deleted

afraid_of_zombies@lemmy.world · 1 year ago

Current Copyright Law Imperfect,

Yeah and Joseph Stalin was a bit naughty. As long as we are seeing how understated we can be.

If you don’t have the solution, perhaps you should not attack one of the remaining defenses against rampant abuses of peoples’ livelihood.

The creator of Superman wasnt paid royalties and was laid off. Many years later he worked a restaurant delivery guy and ended up dropping off food at DC comics. The artist that built that company doing a sandwich run.

LWD@lemm.ee · edit-2 1 year ago

deleted

afraid_of_zombies@lemmy.world · 1 year ago

If you got an accusation go ahead and make it. I will be hearing downloading a fucking car

LWD@lemm.ee · edit-2 1 year ago

deleted

drislands@lemmy.world · 1 year ago

Them: “Oh yeah I have 10 minutes until my dentist appointment, I’ll check that out.”

Chee_Koala@lemmy.world · edit-2 1 year ago

First:

I truly believe that they don’t matter as an individual when looking at their creation as a whole. It matters among their loved ones, and for that person itself. Why do you need more… importance? From who? Why do you need to matter in scope of creation? Is it a creation for you? Then why publish it? Is it a creation for others? Then why does your identity matter? It just seems like egotism with extra steps. Using copyright to combat this seems like a red herring argument made by people who have portfolio’s against people who don’t…

You are not only your own person, you carry human culture remnants distilled out of 12000 years of humanity! You plagiarised almost the whole of humanity while creating your ‘unique’ addition to culture. But, because your remixed work is newer and not directly traceable to its direct origins, we’re gonna pretend that you wrote it as a hermit living without humanity on a rock and establish the rules from there on out. If it was fair for all the players in this game, it would already be impossible to not plagiarise.

PipedLinkBot@feddit.rocks · 1 year ago

Here is an alternative Piped link(s):

a great video

Piped is a privacy-respecting open-source alternative frontend to YouTube.

I’m open-source; check me out at GitHub.

lolcatnip@reddthat.com · 1 year ago

IMHO being able to “control your creations” isn’t what copyright was created for; it’s just an idea people came up with by analogy with physical property without really thinking through what purpose is supposed to serve. I believe creators of intellectual “property” have no moral right to control what happens with their creations, and they only have a limited legal right to do so as a side-effect of their legal right to profit from their creations.

kibiz0r@lemmy.world · 1 year ago

I’m dumbfounded that any Lemmy user supports OpenAI in this.

We’re mostly refugees from Reddit, right?

Reddit invited us to make stuff and share it with our peers, and that was great. Some posts were just links to the content’s real home: Youtube, a random Wordpress blog, a Github project, or whatever. The post text, the comments, and the replies only lived on Reddit. That wasn’t a huge problem, because that’s the part that was specific to Reddit. And besides, there were plenty of third-party apps to interact with those bits of content however you wanted to.

But as Reddit started to dominate Google search results, it displaced results that might have linked to the “real home” of that content. And Reddit realized a tremendous opportunity: They now had a chokehold on not just user comments and text posts, but anything that people dare to promote online.

At the same time, Reddit slowly moved from a place where something may get posted by the author of the original thing to a place where you’ll only see the post if it came from a high-karma user or bot. Mutated or distorted copies of the original instance, reformated to cut through the noise and gain the favor of the algorithm. Re-posts of re-posts, with no reference back to the original, divorced of whatever context or commentary the original creator may have provided. No way for the audience to respond to the author in any meaningful way and start a dialogue.

This is a miniature preview of the future brought to you by LLM vendors. A monetized portal to a dead internet. A one-way street. An incestuous ouroborous of re-posts of re-posts. Automated remixes of automated remixes.

–

There are genuine problems with copyright law. Don’t get me wrong. Perhaps the most glaring problem is the fact that many prominent creators don’t even own the copyright to the stuff they make. It was invented to protect creators, but in practice this “protection” gets assigned to a publisher immediately after the protected work comes into being.

And then that copyright – the very same thing that was intended to protect creators – is used as a weapon against the creator and against their audience. Publishers insert a copyright chokepoint in-between the two, and they squeeze as hard as they desire, wringing it of every drop of profit, keeping creators and audiences far away from each other. Creators can’t speak out of turn. Fans can’t remix their favorite content and share it back to the community.

This is a dysfunctional system. Audiences are denied the ability to access information or participate in culture if they can’t pay for admission. Creators are underpaid, and their creative ambitions are redirected to what’s popular. We end up with an auto-tuned culture – insular, uncritical, and predictable. Creativity reduced to a product.

But.

If the problem is that copyright law has severed the connection between creator and audience in order to set up a toll booth along the way, then we won’t solve it by giving OpenAI a free pass to do the exact same thing at massive scale.

flamingarms@feddit.uk · 1 year ago

And yet, I believe LLMs are a natural evolutionary product of NLP and a powerful tool that is a necessary step forward for humanity. It is already capable of exceptionally quickly scaffolding out basic tasks. In it, I see the assumptions that all human knowledge is for all humans, rudimentary tasks are worth automating, and a truly creative idea is often seeded by information that already exists and thus creativity can be sparked by something that has access to all information.

I am not sure what we are defending by not developing them. Is it a capitalism issue of defending people’s money so they can survive? Then that’s a capitalism problem. Is it that we don’t want to get exactly plagiarized by AI? That’s certainly something companies are and need to continue taking into account. But researchers repeat research and come to the same conclusions all the time, so we’re clearly comfortable with sharing ideas. Even in the Writer’s Guild strikes in the States, both sides agreed that AI is helpful in script-writing, they just didn’t want production companies to use it as leverage to pay them less or not give them credit for their part in the production.

EldritchFeminity · 1 year ago

The big issue is, as you said, a capitalism problem, as people need money from their work in order to eat. But, it goes deeper than that and that doesn’t change the fact that something needs to be done to protect the people creating the stuff that goes into the learning models. Ultimately, it comes down to the fact that datasets aren’t ethically sourced and that people want to use AI to replace the same people whose work they used to create said AI, but it also has a root in how society devalues the work of creativity. People feel entitled to the work of artists. For decades, people have believed that artists shouldn’t be fairly compensated for their work, and the recent AI issue is just another stone in the pile. If you want to see how disgusting it is, look up stuff like “paid in exposure” and the other kinds of things people tell artists they should accept as payment instead of money.

In my mind, there are two major groups when it comes to AI: Those whose work would benefit from the increased efficiency AI would bring, and those who want the reward for work without actually doing the work or paying somebody with the skills and knowledge to do the work. MidJourney is in the middle of a lawsuit right now and the developers were caught talking about how you “just need to launder it through a fine tuned Codex.” With the “it” here being artists’ work. Link The vast majority of the time, these are the kinds of people I see defending AI; they aren’t people sharing and collaborating to make things better - they’re people who feel entitled to benefit from others’ work without doing anything themselves. Making art is about the process and developing yourself as a person as much as it is about the end result, but these people don’t want all that. They just want to push a button and get a pretty picture or a story or whatever, and then feel smug and superior about how great an artist they are.

All that needs to be done is to require that the company that creates the AI has to pay a licensing fee for copyrighted material, and allow for copyright-free stuff and content where they have gotten express permission to use (opt-in) to be used freely. Those businesses with huge libraries of copyright-free music that you pay a subscription fee to use work like this. They pay musicians to create songs for them; they don’t go around downloading songs and then cut them up to create synthesizers that they sell.

Milk_Sheikh@lemm.ee · 1 year ago

Mutated or distorted copies of the original instance, reformated to cut through the noise and gain the favor of the algorithm. Re-posts of re-posts, with no reference back to the original, divorced of whatever context or commentary the original creator may have provided… This is a miniature preview of the future brought to you by LLM vendors. A monetized portal to a dead internet. A one-way street. An incestuous ouroborous of re-posts of re-posts. Automated remixes of automated remixes.

The internet is genuinely already trending this way just from LLM AI writing things like: articles and bot reviews, listicle and ‘review’ websites that laser focus for SEO hits, social media comments and posts to propagandize or astroturf…

We are going to live and die by how the Captcha-AI arms race is ran against the malicious actors, but that won’t help when governments or capital give themselves root access.

afraid_of_zombies@lemmy.world · 1 year ago

Too long didn’t read, busy downloading a car now. How much did Disney pay for this comment?

Ook the Librarian@lemmy.world · 1 year ago

It’s not “impossible”. It’s expensive and will take years to produce material under an encompassing license in the quantity needed to make the model “large”. Their argument is basically “but we can have it quickly if you allow legal shortcuts.”

Patches@sh.itjust.works · 1 year ago

That argument has unfortunately worked for many other Tech Bros

agitatedpotato@lemmy.world · 1 year ago

Whenever a company says something is impossible, they usually mean it’s just unprofitable.

afraid_of_zombies@lemmy.world · 1 year ago

The law is shit

unreasonabro@lemmy.world · 1 year ago

finally capitalism will notice how many times it has shot up its own foot with their ridiculous, greedy infinite copyright scheme

As a musician, people not involved in the making of my music make all my money nowadays instead of me anyway. burn it all down

██████████@lemmy.world · 1 year ago

Pitchfork fest 2024

unreasonabro@lemmy.world · 1 year ago

… that’s a good album name, might use that ;)

██████████@lemmy.world · edit-2 1 year ago

it would sell

Blackmist@feddit.uk · 1 year ago

Maybe you shouldn’t have done it then.

I can’t make a Jellyfin server full of content without copyrighted material either, but the key difference here is I’m not then trying to sell that to investors.

afraid_of_zombies@lemmy.world · 1 year ago

Maybe copyrights don’t protect artists they protect corporations

Shazbot@lemmy.world · 1 year ago

Reading these comments has shown me that most users don’t realize that not all working artists are using 1099s and filing as an individual. Once you have stable income and assets (e.g. equipment) there are tax and legal benefits to incorporating your business. Removing copyright protections for large corporations will impact successful small artists who just wanted a few tax breaks.

afraid_of_zombies@lemmy.world · 1 year ago

Ok don’t care. Ban copyright

BURN@lemmy.world · 1 year ago

They protect artists AND protect corporations, and you can’t have one without the other. It’s much better the way it is compared to no copyright at all.

afraid_of_zombies@lemmy.world · 1 year ago

Which is why no artist has ever been screwed. Nope never one happened.

BURN@lemmy.world · 1 year ago

They’re screwed less than they would be if copyright was abolished. It’s not a perfect system by far, but over restrictive is 100x better than an open system of stealing from others.

afraid_of_zombies@lemmy.world · 1 year ago

Citation needed.

Also copying isn’t stealing.

agitatedpotato@lemmy.world · edit-2 1 year ago

So without copyright, if an artist makes a cool picture and coca cola uses it to sell soda and decided not to give the artist any money, now they have no legal recourse, and that’s better? I don’t think the issue is as much copyright inherently, as much as it is who holds and enforces those rights. If all copyrights were necessarily held by the people who actually made what is copy-written, much of the problems would be gone.

afraid_of_zombies@lemmy.world · 1 year ago

If the copyright people had their way we wouldn’t be able to write a single word without paying them. This whole thing is clearly a fucking money grab. It is not struggling artists being wiped out, it is big corporations suing a well funded startup.