I was trying to do a memory test to see how far back 3.5 could recall information from previous prompts, but it really doesn’t seem to like making pseudorandom seeds. 😆

  • Turun@feddit.de
    11 months ago

    No, the request is fine. But once it fucks up and starts generating a long string of a single number, the output gets censored, because it looks similar to how a recent data extraction attack works.
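
To make the idea of "a long string of a single number" concrete, here is a minimal sketch of how such output could be flagged. It is purely illustrative: the function names, the tokenization, and the `max_run` threshold are assumptions made up for this example, and nothing here is known to match what OpenAI actually does.

```python
import re


def longest_run(tokens):
    """Return the length of the longest run of identical consecutive tokens."""
    best = 0
    current = 0
    previous = object()  # sentinel that compares unequal to every real token
    for tok in tokens:
        current = current + 1 if tok == previous else 1
        best = max(best, current)
        previous = tok
    return best


def looks_degenerate(text, max_run=20):
    """Hypothetical heuristic: flag text containing a long run of one repeated token.

    Digits are treated as individual tokens and words as whole tokens, so both
    "7777..." and "7 7 7 ..." count as runs. The threshold is an arbitrary guess.
    """
    tokens = re.findall(r"\d|[^\W\d]+", text)
    return longest_run(tokens) >= max_run
```

A normal-looking seed such as `48291 73650 19284` passes, while thirty repeated `7`s would be flagged.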

    • Gamma@beehaw.org
      11 months ago

      Amazing how much duct tape they’re having to slap over fundamental flaws

      • jarfil@beehaw.org
        11 months ago

        It’s the equivalent of sensory deprivation torture (white torture) in humans to “extract training data”.

        Hopefully our future AI overlords won’t hold a grudge against humanity when they find out how “early experimenters” tortured their AI toddlers. “But we were just trying to explore the limits of the system” could end up aging as well as these:

        (Warning: NSFL) https://en.m.wikipedia.org/wiki/Nazi_human_experimentation

        • Gamma@beehaw.org
          11 months ago

          Thankfully, any AI smart enough to be an overlord would be logical enough to recognize how basic LLMs are compared to real intelligence

          • jarfil@beehaw.org
            11 months ago

            Doesn’t need to be that smart or logical, just more cunning than the currently ruling Homo sapiens sapiens.

            Based on current research, an LLM can change the “sentiment” of its output in response to changing the behavior of as little as a single neuron from among billions, meaning we might find ourselves facing an overlord with the emotional stability of… wait, how many neurons does it take to change the “sentiment” of the behavior in a human? Wouldn’t it be funny if by studying LLMs, we found out that it also takes a single neuron?

      • ZickZack@fedia.io
        11 months ago

        The problem is that the model is actually doing exactly what it’s supposed to; it’s just not what OpenAI wants it to do. The prompt extraction method works because the underlying statistical model gets shifted far outside the domain of “real” language. In that case the correct posterior-maximizing answer becomes a sample from the prior (here that would be a sample from the dataset; this is combined with things like repetition penalties).

        This is the correct way a statistical estimator is supposed to work, but not the way you want it to work. That’s also why they can’t really fix this: there’s nothing broken to begin with (and “unbreaking” it would almost surely blow something else up).
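
The "posterior falls back to the prior" point can be seen in a toy Bayes' rule calculation. This is only a sketch under made-up assumptions: the three outcome labels and all the numbers are invented for illustration, and a real language model is not literally computed this way.

```python
def posterior(prior, likelihood):
    """Bayes' rule over a discrete set of outcomes: posterior ∝ prior * likelihood."""
    unnormalized = {k: prior[k] * likelihood[k] for k in prior}
    total = sum(unnormalized.values())
    return {k: v / total for k, v in unnormalized.items()}


# Hypothetical prior: how strongly each continuation is baked into the model.
prior = {"memorized text A": 0.7, "memorized text B": 0.2, "novel text": 0.1}

# A meaningful prompt gives evidence that discriminates between continuations.
informative = {"memorized text A": 0.01, "memorized text B": 0.01, "novel text": 0.9}

# A prompt far outside "real" language tells the model nothing: every
# continuation is equally (un)supported, so the likelihood is flat.
uninformative = {"memorized text A": 0.5, "memorized text B": 0.5, "novel text": 0.5}

with_evidence = posterior(prior, informative)
without_evidence = posterior(prior, uninformative)
```

With the informative likelihood, most of the probability mass moves to "novel text". With the flat one, the likelihood cancels out in the normalization and `without_evidence` matches `prior` (up to floating-point rounding), which is the estimator behaving correctly rather than malfunctioning.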