• PenguinTD@lemmy.ca
      1 year ago

      I think it’s possible to have a middle ground, just by attaching properly formed license terms, like FOSS projects do. That is, specify that the AI/bot must follow certain rules (e.g., fetch the comments but not the user IDs), because if we don’t provide data for open and free alternatives, there will be no good AI tools for common folks. And the top dogs are all hoarding data with sneaky ToS.

    • RandomBit@lemmy.sdf.org
      1 year ago

      Nothing can stop screen scraping. A nefarious LLM startup could hire an outside group to screen scrape the data and hand it over for training. Charging exorbitant amounts of money for the API hurts users far more than it hurts companies that have a profit motive to get the data.