• 0 Posts
  • 176 Comments
Joined 1 year ago
Cake day: June 30th, 2023

  • But is chicken-ness actually defined by genetics? An important characteristic of a chicken is its domesticated status. The birds chickens descend from are remarkably similar to them, and it’s hard to imagine that any one mutation would have been what caused people to start calling them by their own name or considering them a separate species. It’s possible that the first chicken became the first chicken when it was captured by humans, and so preceded the first chicken egg.


  • But I think the point is, the OP meme is wrong to paint this as some kind of society-wide psychological pathology, when it’s really just business people coming up with simple, reliable formulas to make money. The space of possible products people could want is large, and this choice isn’t only about what people want, but about what will get attention. People will readily pay attention to, and discuss with others, something they already have a connection to in a way they wouldn’t with some new thing, even if they would rather have something new.


  • that is not the … available outcome.

    It demonstrably is already though. Paste a document in, then ask questions about its contents; the answer will typically take what’s written there into account. Ask about something you know is in a Wikipedia article that would have been part of its training data, same deal. If you think it can’t do this sort of thing, you can just try it yourself.

    Obviously it can handle simple sums, this is an illustrative example

    I am well aware that LLMs can struggle especially with reasoning tasks, and have a bad habit of making up answers in some situations. That’s not the same as being unable to correlate and recall information, which is the relevant task here. Search engines also use machine learning technology and have been able to do that to some extent for years. But with a search engine, even if it’s smart enough to figure out what you wanted and give you the correct link, that’s useless if the content behind the link is only available to institutions that pay thousands a year for the privilege.

    Think about these three things in terms of what information they contain and their capacity to convey it:

    • A search engine

    • A dataset of pirated content from behind academic paywalls

    • An LLM model file that has been trained on said pirated data

    The latter two each have their pros and cons and would likely work better in combination with each other, but they both have an advantage over the search engine: they can tell you about the locked up data, and they can be used to combine the locked up data in novel ways.


  • Ok, but I would say that these concerns are all small potatoes compared to the potential of the general public gaining the ability to query a system with synthesized expert knowledge obtained from scraping all academically relevant documents. If you’re wondering about something and don’t know what you don’t know, or don’t have any idea where to start looking to learn what you want to know, an LLM is an incredible resource even with caveats and limitations.

    Of course, it would be better if it could also directly reference and provide the copyrighted/paywalled sources it draws its information from at runtime, in the interest of verifiably accurate information. Fortunately, local models are becoming increasingly powerful and have a lower barrier to entry, so the legal barriers to such a thing existing might not be able to stop it for long in practice.


  • this will force us humans to go actually outside, make friends, form deep social relationship, and build lasting, resilient communities

    There is no chance it goes that way. How is talking to people outside even an option for someone used to just being on the internet? Even if the content gets worse, the basic mechanisms that keep people scrolling still function, while the physical and social infrastructure necessary for in-person community building is nonexistent.