Scientists Train AI to Be Evil, Find They Can’t Reverse It::How hard would it be to train an AI model to be secretly evil? As it turns out, according to Anthropic researchers, not very.

  • the_q@lemmy.world
    link
    fedilink
    English
    arrow-up
    9
    ·
    11 months ago

    Is this really that surprising? Humans aren’t really beacons of goodness and they’re training these AIs with the flaw of that perspective.

    • Obinice@lemmy.world
      link
      fedilink
      English
      arrow-up
      4
      ·
      11 months ago

      What do you mean I’m not a beacon of goodness?! Say that again and I’ll get stabby!!