BlueMonday1984@awful.systems to TechTakes@awful.systems · English · 8 days ago
Facebook Pushes Its Llama 4 AI Model to the Right, Wants to Present "Both Sides" [404 Media] (www.404media.co)
Cross-posted to: demeta@programming.dev, BoycottUnitedStates@europe.pub, technology@lemmy.world, brainworms@lemm.ee, 404media@rss.ponder.cat
corbin@awful.systems · English · 6 days ago

It's well-known folklore that reinforcement learning with human feedback (RLHF), the standard post-training paradigm, reduces "alignment," the degree to which a pre-trained model has learned features of reality as it actually exists. Quoting from the abstract of the 2024 paper, Mitigating the Alignment Tax of RLHF (alternate link):

"LLMs acquire a wide range of abilities during pre-training, but aligning LLMs under Reinforcement Learning with Human Feedback (RLHF) can lead to forgetting pretrained abilities, which is also known as the alignment tax."
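To make the mechanism concrete, here is a minimal, purely illustrative PyTorch sketch (not from the paper, and not Meta's training code) of the KL-regularized policy-gradient update at the core of RLHF-style post-training: the reward term pushes the policy toward rater-preferred outputs, while the KL penalty against a frozen copy of the pre-trained model is the knob meant to limit how much pre-trained ability gets forgotten, i.e. the alignment tax. All names, shapes, and numbers below are made up for illustration.

```python
# Toy sketch of an RLHF-style update step (illustrative only, not any real codebase).
import torch
import torch.nn.functional as F

torch.manual_seed(0)
vocab_size = 16

# Stand-ins for a pre-trained model head (frozen) and a trainable copy of it.
ref_head = torch.nn.Linear(8, vocab_size)     # "pre-trained" reference, frozen
policy_head = torch.nn.Linear(8, vocab_size)  # policy being post-trained
policy_head.load_state_dict(ref_head.state_dict())
for p in ref_head.parameters():
    p.requires_grad_(False)

optimizer = torch.optim.Adam(policy_head.parameters(), lr=1e-2)
kl_weight = 0.1  # beta: larger values tether the policy more tightly to pre-training


def rlhf_step(hidden_states, sampled_tokens, rewards):
    """One simplified policy-gradient update with a KL penalty to the frozen
    reference -- the term intended to limit drift from pre-trained behavior."""
    policy_logp = F.log_softmax(policy_head(hidden_states), dim=-1)
    ref_logp = F.log_softmax(ref_head(hidden_states), dim=-1)

    # REINFORCE-style objective: raise log-prob of tokens the reward signal favored.
    token_logp = policy_logp.gather(-1, sampled_tokens.unsqueeze(-1)).squeeze(-1)
    pg_loss = -(rewards * token_logp).mean()

    # KL(policy || reference): how far post-training has pulled the model
    # away from its pre-trained distribution.
    kl = (policy_logp.exp() * (policy_logp - ref_logp)).sum(-1).mean()

    loss = pg_loss + kl_weight * kl
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return pg_loss.item(), kl.item()


# Fake rollout data: hidden states, sampled tokens, and scalar reward-model scores.
hidden = torch.randn(4, 8)
tokens = torch.randint(0, vocab_size, (4,))
rewards = torch.randn(4)
print(rlhf_step(hidden, tokens, rewards))
```

The relevant point for the alignment-tax discussion is the kl_weight trade-off: dial it down (or steer the reward signal ideologically, as the article describes Meta doing) and the optimizer is free to drag the policy further from whatever the pre-trained model actually learned.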