• sushibowl@feddit.nl
    link
    fedilink
    arrow-up
    14
    ·
    3 days ago

    Most likely there is a separate censor LLM watching the model output. When it detects something that needs to be censored it will zap the output away and stop further processing. So at first you can actually see the answer because the censor model is still “thinking.”

    When you download the model and run it locally it has no such censorship.