If the concern is about “fears” as in “feelings”… there is an interesting experiment where a single neuron/weight in an LLM, can be identified to control the “tone” of its output, whether it be more formal, informal, academic, jargon, some dialect, etc. and expose it to the user for control over the LLM’s output.
With a multi-billion neuron network, acting as an a priori black box, there is no telling whether there might be one or more neurons/weights that could represent “confidence”, “fear”, “happiness”, or any other “feeling”.
It’s something to be researched, and I bet it’s going to be researched a lot.
If you give ai instruction to do something “no matter what”
The interesting part of the paper, is that the AIs would do the same even in cases where they were NOT instructed to “no matter what”. An apparently innocent conversation, can trigger results like those of a pathological liar, sometimes.
If the concern is about “fears” as in “feelings”… there is an interesting experiment where a single neuron/weight in an LLM, can be identified to control the “tone” of its output, whether it be more formal, informal, academic, jargon, some dialect, etc. and expose it to the user for control over the LLM’s output.
With a multi-billion neuron network, acting as an a priori black box, there is no telling whether there might be one or more neurons/weights that could represent “confidence”, “fear”, “happiness”, or any other “feeling”.
It’s something to be researched, and I bet it’s going to be researched a lot.
The interesting part of the paper, is that the AIs would do the same even in cases where they were NOT instructed to “no matter what”. An apparently innocent conversation, can trigger results like those of a pathological liar, sometimes.
oh, that is quite interesting. If its actually doing things (that make sense) it hasnt been instructed to then it could be sign of real intelligence