I started lemdro.id. Pretty cool domain name, right?

  • 0 Posts
  • 38 Comments
Joined 1 year ago
Cake day: June 29th, 2023

  • No, they take exponentially increasing resources as a consequence of having imperfect recall. Smaller models have “worse” recall: they’ve been trained on smaller datasets (or pruned more heavily).

    As you increase the size of the model (the number of “neurons” that can be weighted), you increase its ability to retain and use information. But that information isn’t retained in the same form in which it was input. A model trained on the English language (an LLM, like ChatGPT) does not know every possible word, nor does it actually know ANY words.

    All ChatGPT knows is which characters are statistically likely to follow one another in a long sequence. With enough neurons and layers, combined with large amounts of processing power and training time, this yields a weighted model that is orders of magnitude smaller than the dataset it was trained on.
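A toy sketch of that idea (real LLMs use neural networks over tokens, not raw bigram counts, so this is only an illustration of “learning which characters statistically follow which”):

```python
from collections import Counter, defaultdict

# Hypothetical tiny corpus for illustration.
text = "the cat sat on the mat and the cat ran"

# Count character bigrams: follows[a][b] = how often b comes right after a.
follows = defaultdict(Counter)
for a, b in zip(text, text[1:]):
    follows[a][b] += 1

# The "model" keeps only these statistics, not the original text itself.
# Most statistically likely character after 'h' in this corpus:
print(follows["h"].most_common(1)[0][0])  # prints 'e'
```

Note that you cannot reconstruct the original sentence from these counts alone; the statistics are a lossy summary of the training text, which is the whole point.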

    Since the model’s weights are smaller than the input dataset, it is literally impossible (by the pigeonhole principle) for the model to have perfect recall of that dataset. So by definition, these models have imperfect recall.
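A quick back-of-envelope check of the size gap, using approximate public figures for GPT-3 (175B parameters; roughly 570 GB of filtered training text). These numbers are rough assumptions for illustration only:

```python
# Approximate, assumed figures -- not exact values.
params = 175e9               # GPT-3 parameter count
bytes_per_param = 2          # fp16 weights
weights_gb = params * bytes_per_param / 1e9
dataset_gb = 570             # filtered training text, roughly

print(f"weights ~ {weights_gb:.0f} GB, training data ~ {dataset_gb} GB")
# The weights cannot losslessly encode a larger dataset.
assert weights_gb < dataset_gb
```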