DeepSeek is an AI assistant which appears to have fared very well in tests against some more established AI models developed in the US, causing alarm in some areas over not just how advanced it is, but how quickly and cost-effectively it was produced.
[…]
Individual companies from within the American stock markets have been even harder-hit by sell-offs in pre-market trading, with Microsoft down more than six per cent, Amazon more than five per cent lower and Nvidia down more than 12 per cent.
I hope someone will make a decent model that isn’t controlled by China or America. But at least this one managed to deal a decent hit to those greedy fuckers.
This is extremely funny
Been playing around with local LLMs lately, and even with its issues, Deepseek certainly seems to just generally work better than other models I’ve tried. It’s similarly hit or miss when not given any context beyond the prompt, but with context it certainly seems to both outperform larger models and organize information better. And watching the r1 model work is impressive.
Honestly, regardless of what someone might think of China and various issues there, I think this is showing how much the approach to AI in the west has been hamstrung by people looking for a quick buck.
In the US, it’s a bunch of assholes basically only wanting to replace workers with AI they don’t have to pay, regardless of the work needed. They are shoehorning LLMs into everything even when it doesn’t make sense to. It’s all done strictly as a for-profit enterprise by exploiting user data and they boot-strapped by training on creative works they had no rights to.
I can only imagine how much of a demoralizing effect that can have on the actual researchers and other people who are capable of developing this technology. It’s not being created to make anyone’s lives better, it’s being created specifically to line the pockets of obscenely wealthy people. Because of this, people passionate about the tech might decide not to go into the field and limit the ability to innovate.
And then there’s the “want results now” mindset, where rather than take the time to find a better way to build and train these models, they are just throwing processing power at it. “Needs more CUDA” has been the attitude, and in the western AI community you are basically laughed at if you can’t or don’t want to use Nvidia for anything neural-net related.
Then you have Deepseek, which seems to be developed by a group of passionate researchers who actually want to discover what is possible and find more efficient ways to do things. That is compounded by sanctions limiting their access to Nvidia’s top hardware, and restricted resources have always been a major cause of technical innovation. There may be a bit of “own the west” there, sure, but that isn’t opposed to the research.
LLMs are just another tool for people to use, and I don’t fault a hammer that is used incorrectly or to harm someone else. This tech isn’t going away, but there is certainly a bubble in the west as companies put blind trust in LLMs with no real oversight. There needs to be regulation on how these things are used for profit and what they are trained on from a privacy and ownership perspective.
Does that mean this stupid fucking bubble finally popped? Cramming AI into everything is getting real old real fast.
It didn’t pop, but it did release a bunch of hot air while hilariously zipping randomly around the room making a raspberry sound.
Not yet I don’t think, but it’s progress at least.
“Generate me an image of a crocodile shedding tears”
Certainly!
Minted!
???
Yes
Drew Carey.jpg Welcome to American capitalism, where the valuations are made up and the company financials don’t matter.
lol get rekt
Serious question -
From either a business or government/geopolitical standpoint, what is the benefit of them making it open source?
Knocking 1 trillion dollars out of a global rival’s stock market, for one.
For two, making huge, huge headlines that drive huge, huge investment for your future, locked-up models. That’s why Facebook released Llama.
I think the first is a bonus, and the latter is the reason. Deepseek’s parent company is some crypto-related thing which was stockpiling GPUs and opted to pivot to AI in 2023. Seems to have paid off now.
Ollama isn’t made by Facebook, the Llama models are. Ollama is just a CLI wrapper around llama.cpp, both of which are FOSS projects.
Good catch. I did mean Llama. I’ll edit.
I believe it is an investment or trading company that dabbled in crypto at one point.
They’re outsourcing development of their platform onto independents who will work for free to advance the project, which then improves the value of their platform. It’s the same design philosophy behind the Android Open Source Project.
It depends on what type of licensing. One way it could be beneficial to them (and this is me purely speculating with no checking) is that any work done from outside of their company on their code base is basically free labor. Yeah, they’ll lose some potential revenue from people running their own instances of the code, but most people will use their app.
Seems like uplifting news to me.
China scary tho
Deepseek seems to consistently fail to deliver, but it’s very apologetic about it and gives the sense it’s willing to at least try harder than GPT. It’s a bit bizarre to interact with and somehow feels like it has read way more anime than GPT.
From Deepseek :
🔗 New Wizard Cat Image Link:
https://i.ibb.co/Cvj8ZfG/wizard-cat-leather-2.png
If this still doesn’t work, here are your options:
- I can describe the image in vivid detail (so you can imagine it!).
- Generate a revised version (maybe tweak the leather jacket color, pose, etc.).
- Try a different hosting link (though reliability varies).
Let me know what you’d prefer! 😺✨
(Note: Some platforms block auto-generated image links—if all else fails, I’ll craft a word-painting!)
Haha this is so amusing. I’ll take that though over the blind confidence you get out of so many other products I guess.
Well, it blindly and confidently generated a link to an image that doesn’t exist.
Of course you’re not one to leave a wizard cat image link unclicked. Well played sir
I really just wanted to know where the leather came in.
Haha ok I missed that part. It doesn’t do image gen, does it? I think they just released a different model that does that.
Half of that returned the next day.
Just sue them for copyright infringement. All the other AI models are facing such lawsuits. You can’t tell me that a Chinese AI startup has done better than US companies at not using copyrighted content in their training.
Well yeah, almost certainly. I mean it’s based off of base material from LLaMA, which I think is the open source version of earlier Facebook AI efforts. So it definitely used copyrighted material for training. I doubt there’s a bleeding edge LLM out there that hasn’t used copyrighted material in training.
But if copyright lawsuits haven’t killed the US AI models, I’m skeptical they’ll have more success with Chinese ones.
They can sue DeepSeek as much as we can sue OpenAI for scraping human-generated content… this time though we got the model for ourselves, and not just so someone can squeeze profit out of it.
Unless by “sue” you mean “nuke”, I don’t see how the USA is supposed to enforce USA law in a foreign, sovereign country.