🃏Joker@sh.itjust.works to Technology@lemmy.world · English · 1 month ago
Alignment faking in large language models (www.anthropic.com)
eleitl@lemm.ee · English · 1 month ago
So you mean “alignment with human expectations”. Not what I was meaning at all. Good that that word doesn’t even mean anything specific these days.