I do not feel bad for them.
Between OpenAI and GitHub Copilot, my intellectual property has been taken without my consent.
I wrote open source packages and blog posts that got fed to AI. ChatGPT can reproduce the code from my blog posts verbatim (with the same variables, default parameters, and everything).
So, if DeepSeek wants to distill OpenAI into their models, that’s exactly as fair as how my work was treated.
My blog doesn’t get much attention anymore, even though it’s been a funnel for job offers for me.
I know we stole all of Bibliotik, a private torrent site for pirated ebooks, in the books3 corpus, and that we used that to train our first models… but this is entirely different! This is theft of intellectual property!
- Sam Altman, probably.
Narrator: It was not actually different at all.
OpenAI accidentally killing themselves over this.