i jus wanted to get dis outta my system >v< …

i dun like those boring linear model structures… they work… bt they dun look fun, nor intuitive. they jus produce output… which is boring!

pls, if some researcher with lotsa gpus sees this, maybsies try this kinda architecture… u dont evn have to credit me, just try it out n see where it goes ~ ~ ~

  • Smorty [she/her] OP M
    8 days ago

    yeaaaa you’re right… i was referring specifically to LLMs, but yes, recurrent models are essentially everywhere else.

    i am just surprised we don’t have many LLMs with recurrent blocks in them, like this model here did for example. i really hope we go that direction soon…

    • pixxelkick@lemmy.world
      8 days ago

      Afaik all LLMs have very deep recurrence, as that’s what provides their context window size.

      The more recurrent params they have, the more context window they can store.