• flamingmongoose
    1 month ago

    BERT and early versions of GPT were trained on copyright-free datasets like Wikipedia and out-of-copyright books. I'm unsure whether those would be big enough for modern ChatGPT-type models.