• Fisch@discuss.tchncs.de
    1 month ago

    Being able to run 7B quality models on your phone would be wild. It would also make it possible to run those models on my server (which is just a mini pc), so I could connect it to my Home Assistant voice assistant, which would be really cool.

    • Smorty [she/her]
      1 month ago

      Something similar to this already kinda exists on HF with the 1.58-bit quantisation, which seems to get very similar performance to the original Llama 3 8B model. That’s essentially a two-bit quantisation with reasonable performance!
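
      For anyone curious where the 1.58 comes from: the weights are ternary (-1, 0, +1), and log2(3) ≈ 1.585 bits per weight. A minimal sketch of the absmean-style ternary quantisation described in the BitNet b1.58 paper (function name and sample values are just illustrative):

      ```python
      import numpy as np

      def absmean_ternary(w, eps=1e-6):
          # Scale weights by their mean absolute value, then round
          # and clip to {-1, 0, +1}. Three states per weight is where
          # the "1.58-bit" name comes from: log2(3) ~ 1.585.
          scale = np.mean(np.abs(w)) + eps
          q = np.clip(np.round(w / scale), -1, 1)
          return q, scale

      w = np.array([0.4, -1.2, 0.05, 0.9])
      q, scale = absmean_ternary(w)
      # q is ternary; q * scale gives the dequantised approximation of w
      ```

      In practice the ternary values still get packed into 2-bit storage (you can’t address 1.58 bits), which is why it behaves like a two-bit quant size-wise.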