• snipgan@kbin.social
    1 year ago

    I really think artists/authors/etc. are going about this the wrong way. ChatGPT and other trained models aren’t really the issue here. The issue is how the data is made available and how other software and groups collect it.

    What we should really be talking about is data privacy: who can access the data one puts on the internet, and how easily.

    • tinwhiskers@kbin.social
      1 year ago

      Well of course, putting it on the open internet is very intentionally making it available for everyone to see. If you don’t want everyone to see it, don’t put it on the open internet. The issue is what people do with it, not whether they can access it. Copyright forbids distributing copyrighted data. The entire point of it is that you can make your work available to be seen while it stays protected from people copying it. However, there is no distribution or storage of copyrighted material with an LLM - there is no copy. I think OpenAI will be OK, but these things are never certain when the big lawyers are let loose.

      Distributing the training dataset, though, that could well be a problem.