• dual_sport_dork 🐧🗡️@lemmy.world · 228 points · 6 months ago

    Say it with me again now:

    For fact-based applications, the amount of work required to develop and subsequently babysit the LLM to ensure it is always producing accurate output is exactly the same as doing the work yourself in the first place.

    Always, always, always. This is a mathematical law. It doesn’t matter how much you whine or argue, or cite anecdotes about how you totally got ChatGPT or Copilot to generate you some working code that one time. The LLM does not actually have comprehension of its input or output. It doesn’t have comprehension, period. It cannot know when it is wrong. It can’t actually know anything.

    Sure, very sophisticated LLMs might get it right some of the time, or even a lot of the time in the case of very specific topics with very good training data. But their accuracy cannot be guaranteed unless you fact-check 100% of their output.
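    To put a rough number on why 100% review is the bar, here's a quick sketch (the per-claim accuracy is a made-up number, purely illustrative):

```python
# Illustrative only: assume each individual claim in an article is
# independently correct with probability p. The chance that the whole
# article is error-free shrinks geometrically with its length, which
# is why spot-checking a sample of the output is not enough.
def p_fully_correct(p_per_claim: float, n_claims: int) -> float:
    return p_per_claim ** n_claims

print(p_fully_correct(0.95, 20))  # ~0.36: most 20-claim articles have at least one error
```

    Even at a hypothetical 95% per-claim accuracy, roughly two out of three such articles would ship with at least one error unless every claim gets checked.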

    Underpaid employees were asked to feed published articles from other news services into generative AI tools and spit out paraphrased versions. The team was soon using AI to churn out thousands of articles a day, most of which were never fact-checked by a person. Eventually, per the NYT, the website’s AI tools randomly started assigning employees’ names to AI-generated articles they never touched.

    Yep, that right there. I could have called that before they even started. The shit really hits the fan when the computer is inevitably capable of spouting bullshit far faster than humans are able to review and debunk its output, and that’s only if anyone is actually watching and has their hand on the off switch. Of course, the end goal of these schemes is to be able to fire as much of the human staff as possible, so it ultimately winds up that there is nobody left to actually do the review. And whatever emaciated remains of management are left don’t actually understand how the machine works nor how its output is generated.

    Yeah, I see no flaws in this plan… Carry the fuck on, idiots.

    • KeriKitty (They(/It))@pawb.social · 57 points · 6 months ago

      Did you enjoy humans spouting bullshit faster than humans can debunk it? Well, brace for impact because here comes machine-generated bullshit! Wooooeee’refucked! 🥳

        • gravitas_deficiency@sh.itjust.works · 6 points · 6 months ago

          A human can only do bad or dumb things so quickly.

          A human writing code can do bad or dumb things at scale, as well as orders of magnitude more quickly.

          • dual_sport_dork 🐧🗡️@lemmy.world · 2 points · 6 months ago

            And untangling that clusterfuck can be damn near impossible.

            The reaper may not present his bill immediately, but he will always present it eventually. This is a zero-sum thing: there are no net savings, because the work required can be front-loaded or back-loaded, and you, sitting at the terminal in the present, might not know which. Yet.

            There are three phases where time and effort are input, and wherein asses can be bitten either preemptively or after the fact:

            1. Loading the algorithm with all the data. Where did all that data come from? In the case of LLMs, it came from an infinite number of monkeys typing on an infinite number of keyboards. That is, us. The system is front-loaded with all of this time and effort – stolen, in most cases. Also the time and effort spent by those developing the system and loading it with said data.
            2. At execution time. This is the classic example, i.e. the algorithm spits out into your face something that is patently absurd. We all point and laugh, and a screen shot gets posted to Lemmy. “Look, Google says you should put glue on your pizza!” Etc.
            3. Lurking horrors. You find out about the problem later. Much later. After the piece went to print, or the code went into production. “Time and effort were saved,” producing the article or writing the code. Yes, they appeared to be – then. Now it’s now. Significant expenditure must be made cleaning up the mess. Nobody actually understood the code but now it has to be debugged. And somebody has to pay the lawyers.
    • Blue_Morpho@lemmy.world · 19 points · 6 months ago

      Your statement is technically true but wrong in practice, because it applies to EVERYTHING on the Internet. We had tons of error-ridden garbage articles written by underpaid interns long before AI.

      And no, fact checking is quicker than writing something from scratch. Just like verifying Wikipedia sources is quicker than writing a Wikipedia article.

    • raspberriesareyummy@lemmy.world · 12 points · 6 months ago

      A-MEN. Well put. I wouldn't use so many words, though; I'd just settle for "Fuck LLMs, and fuck the dipshits who label them AI or think they have anything to do with AI."

    • yokonzo@lemmy.world · 10 points · 6 months ago

      Okay, yes, I agree with you fully, but you can't just say it's a mathematical law without proof. That's a claim you need to back up with numbers, and I don't think "work" is quantifiable.

      Again, yes, they need to slow down, but I have an issue with your claim unless you're going to back it up. Otherwise you're just a crazy dude standing on a soapbox.

    • henfredemars@infosec.pub · 9 points · 6 months ago

      The cost, however, is not the same. I can totally see the occasional lawsuit being treated as a cost of doing business by a company that employs AI.

      • dual_sport_dork 🐧🗡️@lemmy.world · 20 points · 6 months ago

        This is almost certainly what we’re looking at here. It’s the Ford Pinto for the modern age. “So what if a few people get blown up/defamed? Paying for that will cost less than what we made, so we’re still in the black.” Yeah, that’s grand.

        Further, generative "AIs" and language models like these are fine when used for noncritical purposes where the veracity of the output is not a requirement. Dall-E is an excellent example: all it's doing is making varying levels of abstract art, and provided nobody is stupid enough to take what it spits out for an actual photograph documenting evidence of something, it doesn't matter. Or, "Write me a poem about crows." Who cares if it might file crows in the wrong taxonomy, as long as the poem sounds nice?

        Facts and LLMs don't mix, though.

      • anton · 4 points · 6 months ago

        While that works for “news agencies” it’s a free money glitch when used in a customer support role for the consumer.

        Edit: clarification

        • henfredemars@infosec.pub · 16 points · 6 months ago

          Pretty sure an airline was forced to pay out on a fake policy that one of their support bots spouted.

    • dependencyinjection@discuss.tchncs.de · 8 points · 6 months ago

      Simply false in my experience.

      We use CoPilot at work and there is no babysitting required.

      We are software developers / engineers, and it saves countless hours writing boilerplate code, generating code blocks from a comment, and sticking to our coding conventions.

      Sure, it isn't 100% right, but the owner and lead engineer calculates it to be around 70% accurate, and even when it misses the mark, we have a whole lot fewer keypresses to make.

      • Olap@lemmy.world · 7 points · 6 months ago

        What if I told you that typing in software engineering encompasses less than 5% of your day?

        • dependencyinjection@discuss.tchncs.de · 4 points · 6 months ago

          I’m a developer, and typing encompasses most of my day. The owner and lead engineer has many meetings and admin work, but still spends around 30% of his time writing code and scaffolding new projects.

          • dual_sport_dork 🐧🗡️@lemmy.world · 3 points · 6 months ago

            I’m a developer and typing encompasses most of my day as well, but increasingly less of it is actually producing code. Ever more of it is in the form of emails, typically in the process of being forced to argue with idiots about what is and isn’t feasible/in the spec/physically possible, or explaining the same things repeatedly to the types of people who should not be entrusted with a mouse.

    • NeptuneOrbit@lemmy.world · 5 points · 6 months ago

      I think it’s worse than that. The work is about the same. The skill and pay for that work? Lower.

      Why pay 10 experienced journalists when you can pay 10 expendable fact checkers who just need to run some facts/numbers by a Wikipedia page?

    • null@slrpnk.net · 4 points · 6 months ago

      Always, always, always. This is a mathematical law.

      Total bullshit. We use LLMs at work for tasks that would be nearly impossible and require obscene amounts of manpower to do by hand.

      Yes, we have to check the output, but it's not even close to the amount of work it would take to do it by hand. Like, by orders of magnitude.

      • Balder@lemmy.world · 4 points · 6 months ago

        Yeah. I’m not sure that statement applies. It’s easier for humans to check something than to come up with something in the first place. But the thing is, the person doing the checking also needs to be proficient in the subject.

    • Echo Dot@feddit.uk · 4 points · 6 months ago

      I disagree with the “always” bit. At some point in the future AI is actually going to get to the point where we can basically just leave it to it, and not have to worry.

      But I do agree that we are not there yet. And that we need to stop pretending that we are.

      Having said that, my company uses AI for a lot of business-critical tasks and we haven't gone bankrupt yet. Of course, that's not quite the same as saying that a human wouldn't have done it better. Perhaps we're spending more money than we need to because of the AI; who knows?

      • dual_sport_dork 🐧🗡️@lemmy.world · 10 points · 6 months ago

        …Nnnnno, actually always.

        The current models that are in use now (and the subject of the article) are not actual AIs. There is no thinking going on in there. They are statistical language models that are literally incapable of producing anything that was not originally part of their training input data, reassembled and strung together in different ways. These LLMs can't actually generate new content, they can't think up anything novel, and of course they can't actually think at all. They are completely at the mercy of whatever garbage is fed into them, and are by definition not capable of actually "understanding" their output because they are not capable of understanding at all. Because these are statistical models, the output also always depends to some extent on an internal dice roll, and the possibility of rolling snake eyes is always there no matter how clever or well tuned the algorithm is.
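        The internal dice roll can be made concrete with a toy sampler (the token probabilities here are hypothetical, purely illustrative): even when the model heavily favors the right continuation, sampling still picks a wrong one some fraction of the time.

```python
import random

# Hypothetical next-token distribution: the "right" continuation is
# heavily favored, but wrong ones still carry nonzero probability.
dist = {"correct": 0.95, "plausible-but-wrong": 0.04, "nonsense": 0.01}

def sample_token(dist: dict, rng: random.Random) -> str:
    """Draw one token from the distribution -- the internal dice roll."""
    r = rng.random()
    cumulative = 0.0
    for token, p in dist.items():
        cumulative += p
        if r < cumulative:
            return token
    return token  # guard against floating-point rounding at the tail

rng = random.Random(0)
draws = [sample_token(dist, rng) for _ in range(10_000)]
error_rate = sum(t != "correct" for t in draws) / len(draws)
print(error_rate)  # roughly 0.05 across 10,000 draws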

        This is not to say humans are infallible, either, but at least we are conceptually capable of understanding when and more importantly how we got something wrong when called on it. We are also capable of researching sources and weighing the validity of different sources and/or claims, which an LLM is not – not without human intervention, anyway, which loops back to my original point about doing the work yourself in the first place. An LLM cannot determine if a published sequence of words is bogus. It can of course string together a new combination of words in a syntactically valid manner that can be read and will make sense, but the truth of the constructed text cannot actually be determined programmatically. So in any application where accuracy is necessary, it is downright required to thoroughly review 100% of the machine output to verify that it is factual and correct. For anyone capable of doing that without smoke coming out of their own ears, it is then trivial to take the next step and just reproduce what the machine did for you. Yes, you may as well have just done it yourself. The only real advantage the machine has is that it can type faster than you and it never needs more coffee.

        The only way to cast off these limitations would be to develop an entirely new real AI model that is genuinely capable of understanding the meaning of both its input and output, and legitimately capable of drawing new conclusions from its own output also taking into account additional external data when presented with it. And being able to show its work, so to speak, to demonstrate how it arrived at its conclusions to back up their factual validity. This requires throwing away the current LLM models completely – they are a technological dead end. They’re neat, and capable of fooling some of the people some of the time, but on a mathematical level they’re never capable of achieving internally provable, consistent truth.

        • Balder@lemmy.world · 4 points · 6 months ago

          I think people don’t yet grasp that LLMs don’t produce any novel output. If that was the case, considering the amount of knowledge they have, they’d be making incredible new connections and insights that humanity never made before. Instead, they can only explain stuff that was already well documented before.

    • gl4d10@lemmy.world · 3 points · 6 months ago

      The other big thing is that once it does start spouting bullshit, or even just latches onto a phrase or string of words, it's so hard to get it out. You really just have to start your instance over or purge the memory. They pick up these obsessions so easily, sometimes without sacrificing relevancy to the topic entirely.

  • ocassionallyaduck@lemmy.world · 81 points · 6 months ago

    I hope he wins, and I hope the fine makes Microsoft's eyes water. Everyone needs to slow the fuck down with this, and they won't until there are real, painful consequences.

    MS can drop billions on game company acquisitions like it's no big deal? Cool, give this guy 1 billion dollars for randomly singling him out and automatically accusing him of sex crimes.

    Maybe then all the tech bros might pause for 3 seconds before they keep feeding shit into their models illegally.

  • Boozilla@lemmy.world · 42 points · 6 months ago

    This US election was going to be a no-good-choices shitshow no matter what. But I really dread the AI-amped shitshow we’re gonna get.

    • datavoid@lemmy.ml · 11 points · 6 months ago

      Personally I think it will be quite entertaining.

      That being said, I'm not American.

      • Boozilla@lemmy.world · 12 points · 6 months ago

        It’s funny from the outside for sure. Until some crazy old creep tries to put in a launch code. Hopefully, they just give them a fake button to mash. Seems like the smart thing to do.

      • neo@lemy.lol · 3 points · 6 months ago

        While it is indeed entertaining to have a great view of the iceberg crashing into the ship, I'm quite worried about the consequences.

        Especially since many other decks (countries) already have severe problems.

        • datavoid@lemmy.ml · 1 point · 6 months ago

          That’s fair for sure. In my mind if we made it through the first 4 years, hopefully we will survive the next 4 too.

  • AutoTL;DR@lemmings.world (bot) · 8 points · 6 months ago

    This is the best summary I could come up with:


    Worse yet, the erroneous reporting was scooped up by MSN — the somehow not-dead-yet Microsoft site that aggregates news — and was featured on its homepage for several hours before being taken down.

    It’s an unfortunate example of the tangible harms that arise when AI tools implicate real people in bad information as they confidently — and convincingly — weave together fact and fiction.

    And if Bigfoot conspiracies slip through MSN’s very large and automated cracks, it’s not surprising that a real-enough-looking AI-generated article like “Prominent Irish broadcaster faces trial over alleged sexual misconduct” made it onto the site’s homepage.

    According to the NYT, the website was founded by an alleged abuser and tech entrepreneur named Gurbaksh Chahal, who billed BNN as “a revolution in the journalism industry.”

    Underpaid employees were asked to feed published articles from other news services into generative AI tools and spit out paraphrased versions.

    Eventually, per the NYT, the website’s AI tools randomly started assigning employees’ names to AI-generated articles they never touched.


    The original article contains 559 words, the summary contains 167 words. Saved 70%. I’m a bot and I’m open source!

    • VeryVito@lemmy.ml · 26 points · 6 months ago

      And now I’m reading a computer’s version of a story describing how a computer wrote a story that should have been discarded.

      • Bluetreefrog@lemmy.world · 12 points · 6 months ago

        It’s even better than that. It’s a computer’s version of a story describing how a computer wrote a story which was then front-paged by a computer.