LLM in a flash: Efficient Large Language Model Inference with Limited Memory (huggingface.co)
Posted by kenna@lemmy.dbzer0.com to human centered computing@lemmy.dbzer0.com · 11 months ago
Cross-posted to: hackernews@lemmy.smeargle.fans, hackernews@derp.foo