• Smorty [she/her]
    7 days ago

    Something similar to this already kinda exists on HF with the 1.58-bit quantisation (ternary weights, so log2(3) ≈ 1.58 bits each), which seems to get very similar performance to the original Llama 3 8B model. That’s essentially a two-bit quantisation with reasonable performance!
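
For anyone wondering where the 1.58 comes from, here is a rough sketch in Python. It assumes an absmean-style ternary scheme, where each weight is divided by the mean absolute value and then rounded and clipped to -1, 0 or 1. I have not checked that the HF model quantises exactly this way, and the function names are made up for illustration.

```python
import math

# One ternary weight can take three values, so it carries log2(3) bits.
BITS_PER_TERNARY_WEIGHT = math.log2(3)  # about 1.585

def ternary_quantise(weights):
    """Map floats to {-1, 0, 1} plus one shared scale (illustrative sketch)."""
    # Assumed scale: mean absolute value of the weights.
    scale = sum(abs(w) for w in weights) / len(weights)
    if scale == 0:
        return [0] * len(weights), 0.0
    # Round to the nearest integer, then clip into the ternary range.
    quantised = [max(-1, min(1, round(w / scale))) for w in weights]
    return quantised, scale

def dequantise(quantised, scale):
    """Approximate the original weights from the ternary codes."""
    return [q * scale for q in quantised]

codes, scale = ternary_quantise([0.9, -0.05, -1.2, 0.3])
print(codes)                     # [1, 0, -1, 0]
print(dequantise(codes, scale))  # coarse approximation of the inputs
```

Storing each code in two bits is the simple layout, which is why it ends up looking like a two-bit quantisation in practice.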