GGUF quants are already up and llama.cpp was updated today to support it.
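
If anyone wants to kick the tires locally, here's a minimal sketch using the llama-cpp-python bindings (note the bindings can lag llama.cpp master, so a day-old architecture may need a rebuild from source; the model path is a placeholder):

```python
# Minimal sketch: load one of the GGUF quants with llama-cpp-python.
# The filename is a placeholder -- point it at whichever quant you grabbed.
from llama_cpp import Llama

llm = Llama(
    model_path="./model-Q4_K_M.gguf",  # placeholder path
    n_ctx=8192,        # context window; larger values cost KV-cache memory
    n_gpu_layers=-1,   # offload every layer to the GPU (0 = CPU only)
)

out = llm("Explain GGUF quantization in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```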

  • Picasso@sh.itjust.works · 2 months ago

    I'm especially interested in its advanced OCR capabilities. I'll be testing this one out in LM Studio.
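
    Since LM Studio exposes an OpenAI-compatible local server (default port 1234), an OCR test like that can be scripted. A rough sketch, where the model id and image path are placeholders and any vision-capable model loaded in LM Studio should work:

    ```python
    # Rough sketch of an OCR-style request against LM Studio's local
    # OpenAI-compatible server. Model id and image path are placeholders.
    import base64
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

    with open("scan.png", "rb") as f:  # placeholder image to transcribe
        img_b64 = base64.b64encode(f.read()).decode()

    resp = client.chat.completions.create(
        model="local-model",  # LM Studio serves whichever model is loaded
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": "Transcribe all text in this image."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{img_b64}"}},
            ],
        }],
    )
    print(resp.choices[0].message.content)
    ```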

  • brucethemoose@lemmy.world · 2 months ago

    I tested these out and found they're really bad at longer contexts… at least in settings that can sanely fit on most GPUs.

    Seems the Gemma family is mostly for short-context work, still.
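
    For anyone wondering why long contexts are hard to "sanely fit": the KV cache grows linearly with context length, and on a consumer GPU it starts crowding out the weights. A back-of-envelope estimator; the layer/head/dim numbers below are illustrative placeholders, not the real Gemma config (read the real ones out of the GGUF metadata):

    ```python
    # Back-of-envelope KV-cache sizing: why long contexts stop fitting on
    # consumer GPUs. The model shape is an illustrative placeholder.
    def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_ctx, bytes_per_elem=2):
        # 2x for the K and V tensors; fp16 (2 bytes) per element by default
        return 2 * n_layers * n_kv_heads * head_dim * n_ctx * bytes_per_elem

    for ctx in (8_192, 32_768, 131_072):
        gib = kv_cache_bytes(n_layers=40, n_kv_heads=8,
                             head_dim=128, n_ctx=ctx) / 2**30
        print(f"{ctx:>7} tokens -> {gib:5.1f} GiB of KV cache")
    ```

    With those placeholder shapes, 128K of context costs ~20 GiB for the cache alone, before any weights. llama.cpp can quantize the KV cache to claw some of that back, but the linear scaling is the same.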