• Voroxpete@sh.itjust.works · 2 days ago

    My guess is that it’s set up to treat contexts with conflicting positions as controversial, but that it will just go with responses that don’t have any controversy associated with them.

    This is significantly more reasoning and analysis than LLMs are capable of.

    When they give those “I can’t respond to that” replies, it’s because a specific programmed keyword filter was tripped, forcing the model to insert a pre-programmed response instead. The rest of the time, they’re just regurgitating a melange of the most statistically present text on the subject from their training data.
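
    For what it’s worth, the mechanism being described is trivially simple. A toy Python sketch of the idea (the blocked terms, the canned reply, and the stand-in model call are all invented for illustration, not any vendor’s actual code):

    ```python
    # Toy sketch of a keyword filter wrapping an LLM endpoint.
    # All terms and names here are hypothetical.
    BLOCKED_TERMS = {"banned_topic_a", "banned_topic_b"}
    CANNED_REPLY = "I can't respond to that."

    def fake_model_reply(prompt: str) -> str:
        # Stand-in for the real model call; just echoes for the demo.
        return f"(model output for: {prompt})"

    def guarded_reply(prompt: str) -> str:
        # If any blocked term appears in the prompt, short-circuit with the
        # pre-programmed response instead of calling the model at all.
        lowered = prompt.lower()
        if any(term in lowered for term in BLOCKED_TERMS):
            return CANNED_REPLY
        return fake_model_reply(prompt)

    print(guarded_reply("Tell me about banned_topic_a"))  # canned reply
    print(guarded_reply("Tell me about the weather"))     # model output
    ```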

    • Buddahriffic@lemmy.world · 2 days ago

      Yeah, that’s what censorship usually looks like, but look at the image in the comment I originally replied to. It didn’t say “I can’t answer that”; it said it didn’t have an opinion, and then it talked about the controversial nature of the question.

      It’s not really reasoning or analysis I’m talking about, but the way the network’s weights ended up being set during training. If its training data contained wildly different responses to questions like that, and also contained text describing such wildly divergent opinions as controversial, then that could make it believe (metaphor) that “it’s a controversial subject” is the most statistically present text.
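
      A toy illustration of what I mean (the counts are invented; the point is just that when opinions split across many conflicting positions, a single consistent “it’s controversial” framing can end up as the most statistically present response):

      ```python
      from collections import Counter

      # Hypothetical counts of how each kind of response to some contested
      # question shows up in the training data. Opinionated answers split
      # across conflicting positions; the "controversial" framing is one
      # consistent pattern, so it ends up most frequent overall.
      training_responses = Counter({
          "Position A is correct": 300,
          "Position B is correct": 280,
          "Position C is correct": 250,
          "This is a controversial subject": 400,
      })

      # Greedy pick: the single most frequent response wins, even though
      # *some* opinion appears 830 times in total (300 + 280 + 250).
      framing, count = training_responses.most_common(1)[0]
      print(framing, count)  # -> This is a controversial subject 400
      ```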