• Max-P@lemmy.max-p.me
    1 year ago

    Because an LLM is more than just data: it’s like a big network of how syllables and words go together based on some context. And that’s useful because language is how we communicate, how we connect ideas together, it’s how we share stories. It’s not just Wikipedia articles, it’s a database of relationships between words and concepts. It approximates how we think as humans.

    Yes, AI is hella overhyped. Everyone wants to AI everything. But really for this particular situation, I think the model data would actually be the best precompiled database of knowledge we can possibly provide to learn about humans for the size.

    No, it’s not magic compression, but 4GB worth of parameters is still a lot. GPT4All has models just under 4GB. They’re not particularly impressive compared to OpenAI’s offerings, but I think you can extract a lot more practical information for first contact out of a basic model than out of 4GB worth of Wikipedia. It’s extremely lossy compression: it’s never gonna spit out articles verbatim, and it will hallucinate a ton of stuff.
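
    To put "4GB worth of parameters" in perspective, here's some back-of-the-envelope arithmetic in Python. It's only a rough sketch: the helper name `params_that_fit` is made up for illustration, the precisions are common storage choices rather than figures for any particular GPT4All model, and it ignores file-format overhead.

```python
# Rough arithmetic: how many parameters fit in a given file size?
# The precisions below are common choices, not figures from any specific model,
# and real model files also carry some metadata overhead that is ignored here.

def params_that_fit(size_bytes: int, bits_per_param: int) -> int:
    """Number of parameters storable in size_bytes at the given precision."""
    return size_bytes * 8 // bits_per_param

FOUR_GB = 4 * 10**9  # decimal gigabytes, for simplicity

for label, bits in [("32-bit float", 32), ("16-bit float", 16), ("4-bit quantized", 4)]:
    billions = params_that_fit(FOUR_GB, bits) / 1e9
    print(f"{label}: about {billions:.0f} billion parameters")
```

    So depending on how aggressively the weights are quantized, 4GB holds somewhere between roughly one and eight billion numbers, each one a small piece of how words relate to each other.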

    If we had more space I’d send all the major AIs we have, like DALL-E, LLaMA and GPT-4. Imagine you’re an alien, you’re presented with a keyboard and a monitor, and you know nothing about us. You can use DALL-E to try random letters and words and see if the output makes sense. Maybe you find out what a cat, dog, bat, frog or apple looks like. You can then input those words into ChatGPT and get context as to when they’re used. What’s “a horse”? What’s “riding”? Put those into DALL-E, and now you know what a “cat riding a horse” looks like. It can generate as many as you want, in any combination.

    Now imagine you’re a very advanced alien species that can easily process the model’s parameters. You’ve just downloaded the basics of humanity.

    • DogMuffins@discuss.tchncs.de
      1 year ago

      Sorry chief, you haven’t really explained why an AI model would be the best format.

      It’s less dense than Wikipedia text. End of.

      • Max-P@lemmy.max-p.me
        1 year ago

        I’ve already explained some upsides and even given an example of how it could be used. What are your counterarguments? What advantages would the raw Wikipedia text have that would make it more useful? What assumptions are we making about our aliens’ knowledge about us? Are we looking to share our latest scientific breakthroughs, or just showing them what humanity looks like? Are we trying to send them plans on how to build a spaceship to come visit us?

        Arguably if we’re looking for the perfect dataset to send them, we’d have to redo what we did with the golden discs we sent along with the Voyager probes, and carefully consider every bit of data we send and for what purpose and how we expect them to be able to process it and understand it. This is a broad philosophical discussion, I’m not looking to be right or have the best answer, I’m providing one idea, one potential answer. Everyone’s first thought is to send out as much of Wikipedia as we can. Doesn’t make for great discussion.

        “It’s less dense than Wikipedia text. End of.”

        That’s making a strong assumption that Wikipedia is already the most dense and detailed source of information we have, that aliens are able to read and understand English, and that this is the optimal format to present our knowledge.

        I’m not arguing that LLMs encode more information. They certainly don’t. That’s not the point. I’m arguing that I think it has a higher likelihood of being useful to communicate with us. That’s the first thing we want to do with an alien species: open dialogue. Language is the fabric of our entire world, and that’s what Large Language Models do: language. The model is a representation of billions of relationships between the words (or tokens, to be technical) in the input and the probabilities that the output will be some other set of words/tokens. When it sees “wheel”, that signal propagates through the network and all the weights, and it comes up with probabilities that it’s related to “car”, “bicycle”, “tree”, “mountain”. Does it even know what that implies? Nope. It just knows you’re much more likely to be talking about cars than trees when a wheel is involved. Billions of those relationships are encoded in an LLM, along with how weak or strong each relationship is. That’s useful information, especially when language and communication are involved.
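
        To make the “wheel” example a bit more concrete, here’s a minimal sketch in Python of how raw scores turn into probabilities (a softmax). The scores in `toy_scores` are invented purely for illustration; a real LLM computes them from its billions of learned weights, over a vocabulary of many thousands of tokens instead of four.

```python
import math

# Made-up scores for how strongly each candidate is associated with "wheel".
# These numbers are assumptions for illustration, not output from any real model.
toy_scores = {"car": 4.0, "bicycle": 3.5, "tree": 0.5, "mountain": 0.2}

def softmax(scores: dict[str, float]) -> dict[str, float]:
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(s - peak) for tok, s in scores.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

probs = softmax(toy_scores)

# Print candidates from most to least likely.
for token, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(f"{token:10s} {p:.3f}")
```

        The model doesn’t “know” what a car is here. All it has is that “car” and “bicycle” score far higher than “tree” or “mountain” once a wheel is involved, which is exactly the kind of weak-or-strong relationship I’m talking about.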

        If we had an LLM for ancient and long-forgotten languages, we wouldn’t even need things like the Rosetta Stone. We could keep throwing inputs at it, see what comes out, and make deductions based on that. We’d also get some information and stories from the time as a bonus, a side effect of those being somewhat embedded in the model. But the main point is, you can give it as many inputs as you want and it’ll generate as many outputs as you ask for. Way, way more than the size of the model itself. You could have an entire conversation with an AI Egyptian or something, and learn the language. Similarly, an alien could get semi-fluent in English by practicing with the model for as long as they need. Heck, we already do this as humans: there are so many tips about using ChatGPT to practice and refine your presentations and papers, prepare for interviews, etc.

        That’s my value proposition for shipping an AI model: language and general culture over raw scientific data.