• anotherandrew
    62
    2 months ago

    Grab a copy of the Stack Overflow data dump and use it locally, or train your own local LLM on that data.
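For anyone wondering what "use it locally" could look like, here is a minimal sketch. It assumes the dump follows the usual Stack Exchange layout, where `Posts.xml` holds one `<row>` element per post with `Id`, `PostTypeId`, and `Title` attributes. The function name `search_posts` and the file path in the usage note are made up for illustration.

```python
import xml.etree.ElementTree as ET


def search_posts(xml_path, keyword, limit=10):
    """Stream a Posts.xml dump and return (Id, Title) pairs for questions
    whose title contains `keyword` (case-insensitive)."""
    keyword = keyword.lower()
    hits = []
    # iterparse reads the file incrementally, so a multi-gigabyte dump
    # does not have to fit in memory as one tree.
    for _event, elem in ET.iterparse(xml_path, events=("end",)):
        if elem.tag == "row":
            title = elem.get("Title")
            # Assumption: PostTypeId "1" marks questions; answers carry no Title.
            if elem.get("PostTypeId") == "1" and title and keyword in title.lower():
                hits.append((elem.get("Id"), title))
                if len(hits) >= limit:
                    break
        # Drop the element's attributes and children once it has been inspected.
        elem.clear()
    return hits
```

Something like `search_posts("Posts.xml", "segfault")` would then list the first ten matching question titles. It is only a linear scan; for regular use you would probably load the rows into SQLite or just use Kiwix.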

    And if you can, donate to the Internet Archive – those people do really important work in today’s age of killing off old information and constant enshittification.

    • @[email protected]
      English
      11
      2 months ago

      Came here to say something similar about a local archive.

      You can also use the Kiwix app to make it a little easier to download and search (and to grab several other doc archives, like the Python PEPs and Wikipedia).

      • anotherandrew
        4
        2 months ago

        Completely forgot about Kiwix; I have that on my iPad and laptop, along with Dash, which is like a modern-day HELPPC.COM, if anyone remembers that thing…

        • @[email protected]
          English
          2
          2 months ago

          I didn’t know about Dash, but it sounds pretty great. Appears to be Mac only, though, and requires a subscription for the latest version.

          Also found someone who appears to have converted HelpPC to HTML. Can’t speak to the legitimacy of it, though.

          https://www.stanislavs.org/helppc/

    • melroy
      2
      2 months ago

      Bad news: AI can only answer what it knows. If you have a legit question that isn’t yet covered on Stack Overflow, you get a bad AI response.

      In that case you could ask it on the Stack Overflow website. But because everybody now relies only on AI, Stack Overflow is dead. Well, there you go, you just killed the source of truth.

      • @[email protected]
        English
        3
        2 months ago

        Which is eventually going to cause AI model collapse, since AI no longer has any source of truth to train on. This is such an interesting technology being used in such a stupid and irresponsible way.

        • melroy
          1
          2 months ago

          Exactly my point. So what you see now is AI generating AI content that then gets used for training. Also known as synthetic data… I know, right?

      • anotherandrew
        1
        2 months ago

        I don’t know if it’s just my age/experience or some kind of innate “horse sense”, but I tend to do alright at detecting shit responses, whether they come from human trolls or an LLM that is lying through its virtual teeth. I don’t see that as bad news; I see it as understanding the limitations of the system. Perhaps with a reasonable prompt an LLM can be more honest about when it’s hallucinating?