My guess is that it’s set up to treat contexts associated with conflicting positions as controversial, and that it will just go with responses that don’t have controversy associated with them.
This is significantly more reasoning and analysis than LLMs are capable of.
When they give those “I can’t respond to that” replies, it’s because a specific programmed keyword filter was tripped, forcing the model to insert a pre-programmed response instead. The rest of the time, they’re just regurgitating a melange of the most statistically present text on the subject from their training data.
Yeah, that’s what censorship usually looks like, but look at the image in the comment I originally replied to. It didn’t say “I can’t answer that”; it said it didn’t have an opinion and then talked about the controversial nature of it.
I’m not really talking about reasoning or analysis, but about the way its weights in the NN ended up being set during training. If it had training data with wildly different responses to questions like that, and also had training data that described those wildly different opinions as controversial, then that could make it believe (metaphorically) that “it’s a controversial subject” is the most statistically present text.
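A toy sketch of that idea, under heavy assumptions: the `corpus` list and the `most_present` helper are made up for illustration, and a real LLM predicts tokens from learned weights rather than counting whole replies. It only shows how conflicting opinions split the count among themselves while the "controversial" commentary piles up on one phrasing.

```python
from collections import Counter

# Hypothetical toy "training data": replies to one divisive question.
# The opinions all disagree with each other, so no single one repeats,
# while the commentary about the disagreement uses the same wording.
corpus = [
    "definitely yes",
    "absolutely not",
    "yes, obviously",
    "no way",
    "it's a controversial subject",
    "it's a controversial subject",
    "it's a controversial subject",
    "of course not",
]


def most_present(replies):
    """Return the most frequent reply string and how often it occurs."""
    return Counter(replies).most_common(1)[0]


top_reply, count = most_present(corpus)
print(top_reply, count)
```

No opinion wins on frequency here, so the neutral commentary comes out on top without anything resembling analysis or a keyword filter.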