• @[email protected]
    link
    fedilink
    English
    25
    edit-2
    24 days ago

    I’d just like to point out that, from the perspective of somebody watching AI develop for the past 10 years, completing 30% of automated tasks successfully is pretty good! Ten years ago they could not do this at all. Overlooking all the other issues with AI, I think we are all irritated with the AI hype people for saying things like they can be right 100% of the time – Amazon’s new CEO actually said they would be able to achieve 100% accuracy this year, lmao. But being able to do 30% of tasks successfully is already useful.

    • @[email protected]
      link
      fedilink
      English
      14
      23 days ago

      being able to do 30% of tasks successfully is already useful.

      If you have a good testing program, it can be.

      If you use AI to write the test cases…? I wouldn’t fly on that airplane.

    • @[email protected]
      link
      fedilink
      English
      1
      23 days ago

      Thing is, they might achieve 99% accuracy given the speed of progress. Lots of brainpower is getting poured into LLMs. Honestly, it is soo scary. It could be replacing me…

      • @[email protected]
        link
        fedilink
        English
        14
        24 days ago

        I’m not claiming that the use of AI is ethical. If you want to fight back you have to take it seriously though.

        • @[email protected]
          link
          fedilink
          English
          0
          24 days ago

          It can’t do 30% of tasks correctly. It can do tasks correctly as much as 30% of the time, and since it’s LLM shit you know those numbers have been more massaged than any human in history has ever been.

              • @[email protected]
                link
                fedilink
                English
                5
                24 days ago

                Yes, that’s generally useless, and it shouldn’t be shoved down people’s throats. But 30% accuracy still has its uses, especially if the result can be programmatically verified.

                • @[email protected]
                  link
                  fedilink
                  English
                  3
                  23 days ago

                  Run something with a 70% failure rate 10x and you get to a cumulative pass rate of roughly 97%. LLMs don’t get tired and they can be run in parallel.
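
                  A quick back-of-the-envelope check (a Python sketch, assuming the optimistic case where every attempt is an independent 30% coin flip):

                  ```python
                  # Chance that at least one of n independent attempts passes,
                  # given a 30% per-attempt success rate.
                  p_fail = 0.7
                  for n in (1, 5, 10):
                      print(f"{n:2d} attempts: {1 - p_fail ** n:.1%}")
                  #  1 attempts: 30.0%
                  #  5 attempts: 83.2%
                  # 10 attempts: 97.2%
                  ```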

                  • @[email protected]
                    link
                    fedilink
                    English
                    4
                    23 days ago

                    I have actually been doing this lately: iteratively prompting the AI to write software and to fix its own errors until something useful comes out. It’s a lot like machine translation. I speak fluent C++ but I don’t speak Rust, and I can hammer away at the AI (with English-language prompts) until it produces passable Rust for something I could have written myself in C++ in half the time and effort.
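
                    The loop is roughly the sketch below (Python; `ask_llm` and `hammer_out_rust` are made-up names standing in for whatever model API and wrapper you actually use, so treat it as an outline rather than a recipe):

                    ```python
                    import subprocess

                    def ask_llm(prompt: str) -> str:
                        """Stand-in for whichever LLM API you call; returns Rust source text."""
                        raise NotImplementedError

                    def hammer_out_rust(spec: str, max_rounds: int = 10) -> str | None:
                        prompt = f"Write a Rust program that does the following:\n{spec}"
                        for _ in range(max_rounds):
                            source = ask_llm(prompt)
                            with open("src/main.rs", "w") as f:
                                f.write(source)
                            build = subprocess.run(["cargo", "build"], capture_output=True, text=True)
                            if build.returncode == 0:
                                return source  # it compiles; it still needs review and tests
                            # Feed the compiler's complaints back and ask for a fix.
                            prompt = (
                                f"This Rust code fails to compile:\n{source}\n\n"
                                f"Compiler errors:\n{build.stderr}\n\nFix it."
                            )
                        return None  # gave up after max_rounds attempts
                    ```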

                    I also don’t speak Finnish, but Google Translate can take what I say in English and put it into at least somewhat comprehensible Finnish without egregious translation errors most of the time.

                    Is this useful? When C++ is getting banned for “security concerns” and Rust is the required language, it’s at least a little helpful.

                  • @[email protected]
                    link
                    fedilink
                    English
                    4
                    23 days ago

                    The problem is that the attempts are not i.i.d., so this doesn’t really work. It works a bit, which in my opinion is why chain-of-thought is effective (it gives the LLM a chance to posit a couple of answers first). However, we’re already looking at “agents,” so they’re probably already doing chain-of-thought.
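
                    A toy simulation shows the effect (Python, with made-up numbers: both models have the same ~30% average per-attempt success rate, but in the second one each task is either something the model can usually do or something it almost never can):

                    ```python
                    import random

                    random.seed(0)
                    N_TASKS, N_ATTEMPTS = 100_000, 10

                    # Independent attempts: every retry is a fresh 30% coin flip.
                    iid = sum(
                        any(random.random() < 0.3 for _ in range(N_ATTEMPTS))
                        for _ in range(N_TASKS)
                    ) / N_TASKS

                    # Correlated attempts: each task is "easy" (90% per try) or "hard"
                    # (~4% per try); the average per-attempt success rate is still ~30%.
                    def per_task_p():
                        return 0.9 if random.random() < 0.3 else 0.043

                    corr = sum(
                        any(random.random() < p for _ in range(N_ATTEMPTS))
                        for p in (per_task_p() for _ in range(N_TASKS))
                    ) / N_TASKS

                    print(f"independent attempts: {iid:.1%}")   # ~97%
                    print(f"correlated attempts:  {corr:.1%}")  # ~55% with these made-up numbers
                    ```

                    Retries only compound the way the 1 - 0.7^n arithmetic suggests if the failures are uncorrelated; when the hard tasks stay hard, retrying them mostly just burns compute.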

                • @[email protected]
                  link
                  fedilink
                  English
                  -6
                  edit-2
                  24 days ago

                  Less broadly useful than 20 tons of mixed-texture human shit, and more ecologically devastating.

                  • @[email protected]
                    link
                    fedilink
                    English
                    4
                    24 days ago

                    Are you just trolling, or do you seriously not understand how something that can do a task correctly with 30% reliability can be made useful if the result can be automatically verified?

                • @[email protected]
                  link
                  fedilink
                  English
                  1
                  23 days ago

                  Those are people who could be living their lives, pursuing their ambitions, whatever. They could get some shit done. The comparison isn’t valid.

                  • @[email protected]
                    link
                    fedilink
                    English
                    -3
                    23 days ago

                    The comparison is about the correctness of their work.

                    Their lives have nothing to do with it.