• DavidGarcia@feddit.nl
    3 days ago

    For the small ones, GPUs draw a couple hundred watts while generating. For the large ones, it's somewhere between 10 and 100 times that.

    With specialty hardware, maybe 10x less.

    • Pennomi@lemmy.world
      3 days ago

      A lot of the smaller LLMs don’t require a GPU at all; they run just fine on a normal consumer CPU.

      • DavidGarcia@feddit.nl
        2 days ago

        Yeah, but 10x slower, at speeds that just don’t work for many use cases. When you compare energy consumption per token, there isn’t much difference.
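
To make the per-token comparison concrete, here is a rough sketch of the arithmetic: energy per token is just power draw divided by throughput. The wattages and token rates below are made-up illustrative numbers (assumptions, not measurements), picked so the CPU is 10x slower but also draws 10x less power.

```python
def joules_per_token(power_watts: float, tokens_per_second: float) -> float:
    """Energy per generated token. Watts are joules per second,
    so dividing by tokens per second gives joules per token."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return power_watts / tokens_per_second


# Hypothetical figures chosen only to illustrate the arithmetic.
gpu = joules_per_token(power_watts=300.0, tokens_per_second=50.0)
cpu = joules_per_token(power_watts=30.0, tokens_per_second=5.0)

print(f"GPU: {gpu:.1f} J/token, CPU: {cpu:.1f} J/token")
```

With these assumed inputs the two come out the same per token, even though the CPU takes ten times longer to finish. Real hardware will differ, so plug in measured power and throughput before drawing conclusions.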

      • copygirl
        3 days ago

        Wouldn’t running on a CPU (while possible) make it less energy efficient, though?

        • Pennomi@lemmy.world
          3 days ago

          It depends. A lot of LLMs are memory-constrained. If you’re constantly thrashing the GPU memory, running on the GPU can be both slower and less efficient.