Prunebutt@slrpnk.net · 3 months ago

About as open source as a binary blob without the training data

WraithGear@lemmy.world · edit-2 3 months ago

So I am learning as much as I can here, so bear with me. As I understand it, it accepts tokenized data and structures it via a transformer, with the configuration stored as a JSON file or some such. The weights are a separate binary file that is used to, well, modify the tokenized data to generate outcomes. As long as you used a compatible tokenization structure and weights structure, you could create a new training set. But that can be done with any LLM. You can't pull the data from this, just as you can't make wheat from dissecting bread. But they provide the tools to set your own data, and the way the LLM handles that data is novel, due to being hamstrung by US sanctions. "Necessity is the mother of invention" and all that. Running comparable AIs on inferior hardware and a much smaller budget is what makes this one stand out, not the training data.
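To make the tokenizer/weights split concrete, here is a toy sketch in Python. Everything in it is an assumption for illustration: the four-word vocabulary, the made-up numbers, and the helper names (`tokenize`, `pack_weights`, `load_weights`, `next_token`) are hypothetical, and the "model" is a simple lookup table of scores, not a transformer and not the format any real LLM ships in. It only shows that the vocabulary can live in readable JSON while the weights are an opaque blob of numbers with no training text inside.

```python
import json
import struct

# Hypothetical toy vocabulary; real models ship a much larger tokenizer file.
VOCAB_JSON = json.dumps({"open": 0, "source": 1, "weights": 2, "data": 3})


def tokenize(text, vocab):
    """Map whitespace-separated words to integer ids, skipping unknown words."""
    return [vocab[w] for w in text.lower().split() if w in vocab]


def pack_weights(matrix):
    """Serialize a matrix of floats into an opaque binary blob (32-bit floats)."""
    flat = [x for row in matrix for x in row]
    return struct.pack(f"<{len(flat)}f", *flat)


def load_weights(blob, size):
    """Read the blob back into a size x size matrix."""
    flat = struct.unpack(f"<{size * size}f", blob)
    return [list(flat[i * size:(i + 1) * size]) for i in range(size)]


def next_token(token_id, weights):
    """Pick the id with the highest score in the row for token_id."""
    row = weights[token_id]
    return max(range(len(row)), key=row.__getitem__)


vocab = json.loads(VOCAB_JSON)

# Made-up numbers standing in for trained weights.
blob = pack_weights([
    [0.1, 0.9, 0.0, 0.0],
    [0.0, 0.1, 0.7, 0.2],
    [0.0, 0.0, 0.1, 0.9],
    [0.5, 0.2, 0.2, 0.1],
])

weights = load_weights(blob, len(vocab))
ids = tokenize("open source", vocab)
predicted = next_token(ids[-1], weights)
```

The blob here is just 64 bytes of floats. You can run predictions with it, but nothing in those bytes tells you which sentences produced the numbers, which is the bread-and-wheat point.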

Prunebutt@slrpnk.net · edit-2 3 months ago

It’s still not open source. No matter how extendable the weights are.

WraithGear@lemmy.world · 3 months ago

I mean, this does not help me understand.

Prunebutt@slrpnk.net · edit-2 3 months ago

https://slrpnk.net/comment/13455788

Edit: this one is a more thorough explanation: https://lemmy.ml/comment/16365208