Make illegally trained LLMs public domain as punishment

🃏Joker@sh.itjust.works · 4 months ago

Make illegally trained LLMs public domain as punishment

sem · 4 months ago

The LLM does reproduce copyrighted data though.

FaceDeer@fedia.io · 4 months ago

How?

ClamDrinker@lemmy.world · edit-2 4 months ago

Not 1:1. Even overfitted images differ considerably from their originals. If you chose “reproduce” to make that point, that’s why OP clarified that the model isn’t literally copying training data; the actual data being stored in the model would be a different story. Because these models are (in simplified form) a bunch of really complex math that produces material, it’s a mathematical inevitability that they will sometimes produce copyrighted material, even from calculations that have nothing to do with overfitting. In the same way, infinite monkeys on infinite typewriters will eventually reproduce every piece of copyrighted text.
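To put rough numbers on the typewriter analogy, here is a minimal sketch. It assumes purely uniform random typing over a 27-symbol alphabet (26 letters plus space), which is an illustrative assumption and not a model of how an LLM samples text. The function names are made up for this example.

```python
import math

def log10_expected_attempts(text_length: int, alphabet_size: int = 27) -> float:
    """log10 of the expected number of random attempts needed to type one
    specific string, assuming every symbol is drawn uniformly at random."""
    # One attempt succeeds with probability alphabet_size ** -text_length,
    # so the expected number of attempts is alphabet_size ** text_length.
    return text_length * math.log10(alphabet_size)

def prob_at_least_once(attempts: int, text_length: int, alphabet_size: int = 27) -> float:
    """Chance that at least one of `attempts` independent random strings
    matches the target exactly (same uniform-typing assumption)."""
    p = float(alphabet_size) ** -text_length
    # 1 - (1 - p) ** attempts, computed in a numerically stable way.
    return -math.expm1(attempts * math.log1p(-p))

if __name__ == "__main__":
    # Order of magnitude of attempts needed for a 100-character passage.
    print(log10_expected_attempts(100))
    # A 3-character target is hit almost surely after a million tries.
    print(prob_at_least_once(1_000_000, 3))
```

The probability only approaches 1 as the number of attempts grows without bound, which is the "eventually" in the analogy. For any realistic number of attempts on a long passage it stays effectively zero under this uniform-typing assumption.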

But then I would point you to the camera on your phone. If you take a picture of copyrighted material with it, you’re still infringing. But the camera wasn’t created with the intention of appropriating whatever the lens captures, which is why we don’t blame the camera for that; we blame the person who used it for that purpose. Likewise, AI users have an ethical obligation not to steer the AI towards generating infringing material.

catloaf@lemm.ee · 4 months ago

And the easiest way to do that is to not include infringing material in the first place.

desktop_user · 4 months ago

*it can produce data identical to data that has been copyrighted before