It’s clear that companies are currently unable to make chatbots like ChatGPT comply with EU law when processing data about individuals. If a system cannot produce accurate and transparent results, it cannot be used to generate data about individuals. The technology has to follow the legal requirements, not the other way around.

  • FuglyDuck@lemmy.world · English · 71 upvotes · 8 months ago

    This is the problem with training LLMs on Reddit. It doesn’t know how to say “I don’t know”. So, like Redditors, it just makes shit up.

    • Ech@lemm.ee · English · 47 upvotes · 8 months ago

      It’s not that it doesn’t know how to say “I don’t know”. It simply doesn’t know. Period. LLMs are not sentient and they don’t think about the questions they are asked, let alone whether the answer they provide is correct. They string words together. That’s all. That we’ve gotten those strings of words to strongly resemble coherent text is very impressive, but it doesn’t make the program intelligent in the slightest.

      • Flying Squid@lemmy.world · 12 upvotes · 8 months ago

        What amazes me is that people don’t find it significant that they don’t ask questions. I would argue there is no such thing as intelligence without curiosity.

        • reev@sh.itjust.works · English · 2 upvotes · 8 months ago

          What do you even mean by that? Pi asks questions and certainly feels curious and engaged in conversation. Even ChatGPT will ask for more information if it doesn’t find the requested information in, for example, an Excel spreadsheet you upload.

    • catloaf@lemm.ee · English · 22 upvotes · 8 months ago

      They’re trained on far more than Reddit. But it’s not a training data problem; it’s a wrong-tool problem. It’s called “generative AI” for a reason: it generates text, the same way a Markov chain does. You want it to tell you something, it’ll tell you. You want factual data, don’t ask a storyteller.
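
      For anyone who hasn’t seen one, here is a minimal sketch of a word-level Markov chain text generator, just to illustrate the comparison above. The function names and the toy corpus are made up for this example, and a real LLM is a neural network conditioned on far more context than a single previous word, so treat this as an analogy rather than a description of how ChatGPT works.

```python
import random
from collections import defaultdict


def build_chain(text):
    """Map each word to the list of words that follow it in the text."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain


def generate(chain, start, length, seed=None):
    """Walk the chain from `start`, picking a random follower at each step."""
    rng = random.Random(seed)
    word = start
    output = [word]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:
            # Dead end: this word never had a successor in the corpus.
            break
        word = rng.choice(followers)
        output.append(word)
    return " ".join(output)


# Toy corpus, purely illustrative.
corpus = "the cat sat on the mat and the dog sat on the rug"
chain = build_chain(corpus)
print(generate(chain, "the", 8, seed=0))
```

      The generator only ever emits a word that followed the previous word somewhere in its input. It has no notion of whether the resulting sentence is true, which is the point of the storyteller comparison.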

    • Flying Squid@lemmy.world · 18 upvotes · edited 8 months ago

      What I think is especially funny, though, is that both the other person and I have done enough (not horrific) things in our lives to have things like mainstream media mentions, but it still got it entirely wrong.

      I’m not famous but it definitely should have known who I am.

    • Eranziel@lemmy.world · 3 upvotes · 8 months ago

      If an LLM had to say “I don’t know” when it doesn’t know, that’s all it would be allowed to say! They literally don’t know anything. They don’t even know what knowing means. They are complex (and impressive, admittedly) text generators.