the thing is, text is the most efficient detailed encoding of reality that humans have produced. in writing things down, humans have already picked out what is worth encoding about reality.
for example, you can finetune gpt-2 to have an idea of sexual biology by having it read erotica, just like a model could learn the same thing by watching porn. but reading the text is much more efficient, since far less of the information in it is "useless"