Cohere Drops Command-R 35B 08-2024 Update, Just About a Perfect Local LLM for 24GB GPUs.

brucethemoose@lemmy.world · 8 个月前

Cohere Drops Command-R 35B 08-2024 Update, Just About a Perfect Local LLM for 24GB GPUs.

Smorty [she/her] · 3 个月前

i totally agree… with everything. 6GB really is smol and, cuz imma crazy person, i currently try and optimize everything for llama3.2 3B Q4 model so people with even less GB VRAM can use it. i really like the idea of people just having some smollm laying around on their pc and devs being able to use it.

i really should probably opt for APIs, you’re right. the only API I ever used was Cohere, cuz yea their CR+ model is real nice. but i still wanna use smol models for a smol price if any. imma have a look at the APIs you listed. Never heard of Kobold Horde and Samba so i’ll have a look at those… or i go for the lazy route and chose depseek cuz it’s apparently unreasonably cheap for SOTA perf. so eh…

also yes! Lemmy really does seem anti AI, and i’m fine with that. i just say yeah companies use it in obviously dum ways but the tech is super interesting which is a reasonable argument i think.

so yes, local llm go! i wanna get that new top amd gpu once that gets announced. so i’ll be able to run those spicy 32B models. for now i’ll just stick with 8B and 3B cuz they work quick and kinda do what i want.

brucethemoose@lemmy.world · 3 个月前

Oh yeah, I was thinking of free APIs. If you are looking for paid APIs, Deepseek and Cohere are of course great. Gemini Pro is really good too, and free for 50 requests a day. Cerebras API is insanely fast, like way above anything else. Check out Openrouter too, they host tons of models.

Cohere Drops Command-R 35B 08-2024 Update, Just About a Perfect Local LLM for 24GB GPUs.

Cohere Drops Command-R 35B 08-2024 Update, Just About a Perfect Local LLM for 24GB GPUs.

CohereForAI/c4ai-command-r-08-2024 · Hugging Face