• chiisana@lemmy.chiisana.net · 23 points · 2 months ago

    What are the resource requirements for the 405B model? I did some digging but couldn’t find any documentation during my cursory search.

    • modeler@lemmy.world · 38 points · 2 months ago (edited)

      Typically you need about 1 GB of graphics RAM for each billion parameters (i.e. one byte per parameter, as with 8-bit weights). This is a 405B-parameter model. Ouch.

      Edit: you can try quantizing it. This reduces the memory required per parameter to 4 bits, 2 bits, or even 1 bit. As you reduce the precision, the model’s quality can suffer. So in the extreme case you might be able to run this in under 64 GB of graphics RAM (rough numbers sketched below).
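
      A back-of-the-envelope sketch of what those bit-widths mean for a 405B-parameter model (weights only; the KV cache and runtime overhead come on top, so treat these as lower bounds):

      ```python
      # Rough memory needed just to hold the weights of a 405B-parameter model
      # at different precisions (1 GB = 1e9 bytes). Ignores the KV cache,
      # activations and framework overhead, so real usage will be higher.
      PARAMS = 405e9

      for bits in (16, 8, 4, 2, 1):
          gb = PARAMS * bits / 8 / 1e9
          print(f"{bits:>2}-bit weights: ~{gb:.0f} GB")
      ```

      That works out to roughly 810 GB at 16-bit, 405 GB at 8-bit, 203 GB at 4-bit, 101 GB at 2-bit and about 51 GB at 1-bit, which is where the “under 64 GB” figure comes from.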

      • cheddar@programming.dev · 21 points · 2 months ago

        > Typically you need about 1 GB of graphics RAM for each billion parameters (i.e. one byte per parameter, as with 8-bit weights). This is a 405B-parameter model.

      • Siegfried@lemmy.world · 8 points · 2 months ago (edited)

        At work we have a small cluster totalling around 4 TB of RAM.

        It has 4 cooling units, a cubic metre of PSUs, and it must take up something like 30 m² of floor space.

      • obbeel@lemmy.eco.br · 2 points · 2 months ago

        According to Hugging Face, you can run a 34B model using at most 22.4 GB of VRAM. That’s an RTX 3090 Ti.

      • Longpork3@lemmy.nz · 1 point · 2 months ago (edited)

        Hmm, I probably have that much distributed across my network… maybe I should look into some way of distributing it across multiple GPUs.

        Frak, just counted and I only have 270 GB installed. Approx. 40 GB more if I install some of the deprecated cards in any spare PCIe slots I can find.
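
        If the cards all live in one machine, a sharded load is the easy path: Hugging Face Transformers with `device_map="auto"` will spread the layers across every visible GPU and spill the rest to CPU RAM. A minimal sketch, assuming a bitsandbytes 4-bit load; the checkpoint name here is a placeholder for whatever model you actually pull:

        ```python
        # Sketch only: shard a large checkpoint across all GPUs in one box using
        # Transformers + Accelerate + bitsandbytes. The model id is an assumption.
        import torch
        from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

        model_id = "meta-llama/Llama-3.1-405B"  # placeholder, gated on Hugging Face

        quant = BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_compute_dtype=torch.bfloat16,
        )

        tokenizer = AutoTokenizer.from_pretrained(model_id)
        model = AutoModelForCausalLM.from_pretrained(
            model_id,
            device_map="auto",          # split layers across every visible GPU, then CPU
            quantization_config=quant,  # 4-bit weights via bitsandbytes
        )

        inputs = tokenizer("The resource requirements are", return_tensors="pt").to("cuda:0")
        print(tokenizer.decode(model.generate(**inputs, max_new_tokens=32)[0]))
        ```

        Spreading it across machines on a network is a different problem; last I looked, llama.cpp’s RPC backend was one of the few ways to do that.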

    • Blaster M@lemmy.world · 6 points · 2 months ago

      As a general rule of thumb, you need about 1 GB per 1B parameters, so you’re looking at about 405 GB for the full size of the model.

      Quantization can compress it down to 1/2 or 1/4 that, but “makes it stupider” as a result.
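
      Spelled out, with the same weights-only assumption as above:

      ```python
      # The "1/2 or 1/4" arithmetic, starting from the ~1 GB per billion
      # parameter (8-bit) baseline. Weights only, no KV cache or overhead.
      full_8bit = 405            # GB for a 405B-parameter model at 8-bit
      print(full_8bit / 2)       # ~203 GB at 4-bit
      print(full_8bit / 4)       # ~101 GB at 2-bit
      ```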