LLM ASICs on USB sticks?
makeasnek@lemmy.ml to AI@lemmy.ml · English · 6 months ago · cross-posted to: aicompanions@lemmy.world
Source (nostr): https://snort.social/nevent1qqsg9c49el0uvn262eq8j3ukqx5jvxzrgcvajcxp23dgru3acfsjqdgzyprqcf0xst760qet2tglytfay2e3wmvh9asdehpjztkceyh0s5r9cqcyqqqqqqgt7uh3n
Paper: https://arxiv.org/abs/2406.02528
Smorty [she/her] · 1 month ago
Something similar to this already kinda exists on HF with the 1.58-bit quantisation, which seems to get very similar performance to the original Llama 3 8B model. That’s essentially a two-bit quantisation with reasonable performance!
Fisch@discuss.tchncs.de · English · 1 month ago
That’s really interesting, gonna try out how well it runs
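For anyone wondering where the "1.58 bit" figure in the comment above comes from: a weight restricted to three values (-1, 0, +1) carries log2(3) ≈ 1.58 bits of information, which is why it gets rounded up to "two bit" in practice. Below is a rough sketch of that arithmetic plus one common way to ternarise weights (scale by the mean absolute value, round, then clip). The function names `bits_per_weight` and `ternary_quantize` are made up for illustration, and this is not claimed to be the exact scheme used by the HF models or the linked paper.

```python
import math


def bits_per_weight(num_levels: int) -> float:
    """Information content of one weight that can take `num_levels` values."""
    return math.log2(num_levels)


def ternary_quantize(weights):
    """Map float weights to {-1, 0, +1} plus one shared scale.

    Assumption: scale is the mean absolute weight ("absmean" style);
    real implementations may pick the scale differently.
    """
    scale = sum(abs(w) for w in weights) / len(weights)
    if scale == 0:
        return [0] * len(weights), 0.0
    # Divide by the scale, round to the nearest integer, clip to [-1, 1].
    quantized = [max(-1, min(1, round(w / scale))) for w in weights]
    return quantized, scale


if __name__ == "__main__":
    print(f"bits per ternary weight: {bits_per_weight(3):.3f}")
    q, s = ternary_quantize([0.9, -0.1, -1.2, 0.4])
    print("quantized:", q, "scale:", round(s, 3))
```

To approximately reconstruct a weight you multiply its ternary value by the shared scale, so storage is roughly 1.58 bits per weight (or 2 bits with a simple packing) plus one float per tensor or group.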