GGUF quants are already up and llama.cpp was updated today to support it.
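For anyone wanting to try the quants locally, here's a minimal sketch using llama-cpp-python (you'll need a build recent enough to include today's llama.cpp update). The filename and settings are placeholders, not from the post above:

```python
# Minimal sketch: load a GGUF quant with llama-cpp-python
# (pip install llama-cpp-python). Model path is hypothetical --
# point it at whichever quant you downloaded.
from llama_cpp import Llama

llm = Llama(
    model_path="./gemma-3-12b-it-Q4_K_M.gguf",  # placeholder filename
    n_ctx=8192,        # context window; raise if your VRAM allows
    n_gpu_layers=-1,   # offload all layers to GPU (use 0 for CPU-only)
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize GGUF in one sentence."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```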
I'm especially interested in its advanced OCR capabilities. Will be testing this one out in LM Studio.
I tested these out and found they are really bad at longer context… at least in settings that can sanely fit on most GPUs.
Seems the Gemma family is still mostly for short-context work.