• fmstrat@lemmy.nowsci.com
    English
    arrow-up
    4
    ·
    4 hours ago

    I love regex. I know, most don’t, but I do. GPT/Claude can write some convincing code, but their regexes can be spotted a mile away.

  • andrew_bidlaw@sh.itjust.works
    English
    arrow-up
    5
    ·
    5 hours ago

    As it learns from our data, no wonder it fucks up at regexps. They are the arcane knowledge not accessible to us mere mortals, nor to LLMs.

    • ryathal@sh.itjust.works
      arrow-up
      4
      ·
      4 hours ago

      If you know even a little about how an LLM works, it’s obvious why regex is basically impossible for it. I suspect Perl has similar problems, but no one is capable of actually validating that.

  • gravitas_deficiency@sh.itjust.works
    English
    arrow-up
    24
    ·
    edit-2
    7 hours ago

    You know what? If your management is telling you to use AI generated code to “go faster”, just go ahead and do it. But fork the repo first, in case you’re still around when they get fired and someone sensible says to put it back how it was before.

  • Snot Flickerman
    English
    arrow-up
    65
    ·
    edit-2
    10 hours ago

    Management: Fuck it, ship it.


    The people at the top honestly don’t give a fuck if it barely works as long as it’s an excuse to cut costs. In things like Customer Service, barely working is a bonus, because it makes customers give up before they try to get their issue solved.

  • cm0002@lemmy.world
    arrow-up
    16
    ·
    edit-2
    10 hours ago

    Just outta curiosity:

    Full o1 model

    “\\id:\[]]+\\\\[]]+\\\”

    Claude 3.5 Haiku:

    Never used elisp, no idea if any of this is right lmao

    • Skullgrid@lemmy.world
      arrow-up
      1
      ·
      6 hours ago

      I swear to god, someone must have written an intermediary language between regex and actual programming, or I’m going to eventually do it before I blow my fucking brains out.

      • BassTurd@lemmy.world
        arrow-up
        3
        ·
        5 hours ago

        How do you think that would look? Regex isn’t particularly complicated, just a bit to remember. I’m trying to picture how you would represent a regular expression in a higher-level language. I think one of its biggest benefits is the ability to shove so much information into a random-looking string. I suppose you could write functions like, startswith, endswith, alpha(4), or something like that, but in the end, is that better?

        • frezik@midwest.social
          arrow-up
          4
          ·
          5 hours ago

          People have unironically done that. No, it isn’t better. The fundamental mental model is the same.

        • Skullgrid@lemmy.world
          arrow-up
          4
          ·
          5 hours ago

          I suppose you could write functions like, startswith, endswith, alpha(4), or something like that,

          yes.

          but in the end, is that better?

          YES.

          startswith('text');
          lengthMustBe(5);
          onlyContain(CHARSETS.ALPHANUMERICS); 
          endswith('text');
          

          is much more legible than []],[.<{}>,]‘text’[[]]][][)()(a-z,0-9){}{><}<>{}‘text’{}][][
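
For readers wondering what the named-function style above might look like in practice, here is a minimal Python sketch. `PatternBuilder`, its method names, and `ALPHANUMERICS` are hypothetical, invented for this illustration and not taken from any existing library. Each call just appends one lookahead to an ordinary regex, which is roughly the point made upthread that the mental model stays the same.

```python
import re

ALPHANUMERICS = "a-zA-Z0-9"  # hypothetical stand-in for CHARSETS.ALPHANUMERICS


class PatternBuilder:
    """Hypothetical fluent wrapper: every call adds one lookahead to a regex."""

    def __init__(self):
        self._parts = []

    def starts_with(self, text):
        # The candidate must begin with the literal text.
        self._parts.append(rf"(?={re.escape(text)})")
        return self

    def ends_with(self, text):
        # The literal text must sit at the very end of the candidate.
        self._parts.append(rf"(?=.*{re.escape(text)}\Z)")
        return self

    def length_must_be(self, n):
        # Exactly n characters between the start and the end.
        self._parts.append(rf"(?=.{{{n}}}\Z)")
        return self

    def only_contain(self, charset):
        # charset is the raw body of a character class, e.g. "a-z0-9".
        self._parts.append(rf"(?=[{charset}]*\Z)")
        return self

    def pattern(self):
        return "".join(self._parts)

    def matches(self, candidate):
        # re.match anchors at the start; the lookaheads do the rest.
        return re.match(self.pattern(), candidate, re.DOTALL) is not None


spec = (
    PatternBuilder()
    .starts_with("ab")
    .length_must_be(5)
    .only_contain(ALPHANUMERICS)
    .ends_with("yz")
)
print(spec.pattern())
print(spec.matches("ab1yz"), spec.matches("ab_yz"), spec.matches("ab12yz"))
```

The builder reads more easily line by line, but `spec.pattern()` still hands back a regex, so whoever debugs it needs to know regex anyway.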

          • BassTurd@lemmy.world
            arrow-up
            1
            ·
            5 hours ago

            Assuming “text” in your example is a placeholder for a 5-character alphanumeric string, it can be written like this in regex: /^[a-zA-Z0-9]{5}$/

            If “text” is literal, then your statement is impossible: a 5-character string cannot both start and end with the 4-character “text”.

            I think that when it gets to more complex expressions like a phone number with country code that accepts different formats, the verbosity of a higher level language will be more confusing, or at least more difficult to take in quickly.
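
Both claims in the comment above can be checked with a few lines of Python. This is only a sketch: the sample strings are made up, and it uses the `[a-zA-Z0-9]{5}` class and quantifier with `re.fullmatch`, which requires the whole string to match.

```python
import re
import string

FIVE_ALNUM = re.compile(r"[a-zA-Z0-9]{5}")

# fullmatch demands that the entire string is exactly five alphanumerics.
exact = FIVE_ALNUM.fullmatch("ab12Z") is not None
too_short = FIVE_ALNUM.fullmatch("ab12") is not None

# search only needs five alphanumerics somewhere inside the string,
# which is why an unanchored pattern accepts longer input.
embedded = FIVE_ALNUM.search("!!ab12Z!!") is not None

# Literal reading: a 5-character string that starts with "text" has the
# form "text" + c. Try every alphanumeric c and see if any ends with "text".
alnum = string.ascii_letters + string.digits
literal_possible = any(("text" + c).endswith("text") for c in alnum)

print(exact, too_short, embedded, literal_possible)
```

The last value comes out `False`: no fifth character makes the string both start and end with the literal “text”.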

    • ChaoticNeutralCzech@feddit.org
      English
      arrow-up
      8
      ·
      edit-2
      9 hours ago

      o1 without Markdown misformatting:

      \\id:\\[^]]+\\\\\[^]]+\\\
      

      No idea what the rectangles are supposed to be, I just copy-pasted it

      • marcos@lemmy.world
        arrow-up
        2
        ·
        5 hours ago

        They are valid Unicode code points that your font doesn’t know about.

        … or at least they represent that, but I think there’s a character that looks like one too.

        • ChaoticNeutralCzech@feddit.org
          English
          arrow-up
          1
          ·
          3 hours ago

          It’s U+E001 from a Private Use Area. The UnicodePad app renders it as something between 鉮 and 鋁 (separate boxes stricken through; I wasn’t able to find it even with Google Lens)
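
The Private Use Area identification can be double-checked with Python's standard `unicodedata` module. A small sketch, assuming the pasted rectangle really is U+E001 as stated above:

```python
import unicodedata

ch = "\ue001"

# General category "Co" means "Other, private use".
category = unicodedata.category(ch)

# Private-use code points have no assigned name; the second argument is
# the default returned instead of raising ValueError.
name = unicodedata.name(ch, None)

# The Basic Multilingual Plane's Private Use Area spans U+E000..U+F8FF.
in_bmp_pua = 0xE000 <= ord(ch) <= 0xF8FF

print(f"U+{ord(ch):04X}", category, name, in_bmp_pua)
```

Because the code point has no standard meaning, whatever glyph appears depends entirely on the font doing the rendering, which would explain why different apps show different shapes.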