Tech experts are starting to doubt that ChatGPT and A.I. ‘hallucinations’ will ever go away: ‘This isn’t fixable’::Experts are starting to doubt it, and even OpenAI CEO Sam Altman is a bit stumped.

  • @[email protected]
    link
    fedilink
    English
    1
    2 years ago

    They don’t really change the meaning of the words, they just look for the “best” words given the recent context, taking into account the different possible meanings of the words

    • @[email protected]
      link
      fedilink
      English
      1
      edit-2
      2 years ago

      No, they do; that’s one of the key innovations of LLMs: the attention and feed-forward steps, where they propagate information from related words into each other based on context. From https://www.understandingai.org/p/large-language-models-explained-with?r=cfv1p

      For example, in the previous section we showed a hypothetical transformer figuring out that in the partial sentence “John wants his bank to cash the,” his refers to John. Here’s what that might look like under the hood. The query vector for his might effectively say “I’m seeking: a noun describing a male person.” The key vector for John might effectively say “I am: a noun describing a male person.” The network would detect that these two vectors match and move information about the vector for John into the vector for his.
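      The query/key matching described above can be sketched numerically. This is a toy, made-up example (not the article's or any real model's code, and all vectors are invented): scaled dot-product attention over 4-d vectors, where the query for “his” matches the key for “John” and pulls John's value vector into the output for “his”.

```python
import numpy as np

tokens = ["John", "wants", "his"]

# Hypothetical per-token query/key/value vectors; in a real model
# these come from learned projection matrices, not hand-picked numbers.
Q = np.array([[0.1, 0.0, 0.2, 0.0],   # John
              [0.0, 0.3, 0.0, 0.1],   # wants
              [0.9, 0.0, 0.8, 0.0]])  # his: "seeking: a male noun"
K = np.array([[1.0, 0.0, 0.9, 0.0],   # John: "I am: a male noun"
              [0.0, 1.0, 0.0, 0.2],   # wants
              [0.1, 0.2, 0.0, 0.0]])  # his
V = np.eye(3, 4)                      # one-hot stand-ins for each
                                      # token's information

scores = Q @ K.T / np.sqrt(Q.shape[1])   # query-key match strength
weights = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)
output = weights @ V                     # weighted mix of value vectors

# The attention weights for "his" put most mass on "John", i.e.
# information about John is moved into the position of "his".
print(dict(zip(tokens, weights[2])))
```

      The key step is the `Q @ K.T` dot product: because the query for “his” points in roughly the same direction as the key for “John”, that pair gets the largest score, and after the softmax most of the mixing weight for “his” lands on John's value vector.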

      • @[email protected]
        link
        fedilink
        English
        1
        edit-2
        2 years ago

        That’s exactly what I said

        They don’t really change the meaning of the words, they just look for the “best” words given the recent context, taking into account the different possible meanings of the words

        The words’ meanings haven’t changed; the model just chooses based on the context, accounting for the different possible meanings of the words

        • @[email protected]
          link
          fedilink
          English
          1
          2 years ago

          The key vector for John might effectively say “I am: a noun describing a male person.” The network would detect that these two vectors match and move information about the vector for John into the vector for his.

          This is the bit you are missing: the attention network actively changes the token vectors depending on context, transferring new information into the representation of that word.
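          A toy illustration of that point, with entirely made-up 2-d vectors: one self-attention pass (with identity projections plus the residual connection used in transformer blocks) mixes context into a token's vector, so the same starting vector for “bank” ends up different next to “river” than next to “cash”.

```python
import numpy as np

def attend(x):
    # Minimal single-head self-attention with identity Q/K/V
    # projections, plus the residual connection: out = x + Attn(x).
    scores = x @ x.T / np.sqrt(x.shape[1])
    w = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)
    return x + w @ x

bank = np.array([0.5, 0.5])           # identical starting vector
river_ctx = np.vstack([[1.0, 0.0], bank])  # "river", "bank"
money_ctx = np.vstack([[0.0, 1.0], bank])  # "cash",  "bank"

bank_near_river = attend(river_ctx)[1]
bank_near_money = attend(money_ctx)[1]
print(bank_near_river, bank_near_money)  # no longer equal
```

          The input vector for “bank” is identical in both runs; only the neighbouring token differs, and after one attention pass the two output vectors for “bank” diverge.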

          • @[email protected]
            link
            fedilink
            English
            1
            edit-2
            2 years ago

            The network doesn’t detect matches, but the model definitely works on similarities. Words are mapped into a high-dimensional space, with the idea that this space can mathematically capture conceptual similarity as spatial proximity.

            Words are transformed into a mathematical representation that retains (or at least tries to retain) the semantic information of the words.
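            The “similarity as spatial proximity” idea can be sketched with toy embeddings. The 3-d vectors below are invented for illustration (real models use hundreds or thousands of learned dimensions); the point is only that related words sit closer together, as measured by cosine similarity.

```python
import numpy as np

# Made-up embeddings; real ones are learned from data.
embeddings = {
    "king":   np.array([0.9, 0.8, 0.1]),
    "queen":  np.array([0.9, 0.7, 0.2]),
    "banana": np.array([0.1, 0.0, 0.9]),
}

def cosine(a, b):
    # Cosine similarity: near 1.0 = same direction, near 0 = unrelated.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(embeddings["king"], embeddings["queen"]))   # high
print(cosine(embeddings["king"], embeddings["banana"]))  # low
```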

            But the different meanings of words belong to the words themselves and are defined by the language; the model cannot modify them