300 GB is nothing compared to the vastness of information in the universe (hence its fitting on a disk). AI is approximating a function, and the function models are now learning to approximate is us.
From [1], with my own editing...
Comparing current model performance with human performance:
> ...[humans] can achieve closer to 0.7 bits per character. What is in that missing >0.4?
> Well—everything! Everything that the model misses. While just babbling random words was good enough at the beginning, at the end, it needs to be able to reason its way through the most difficult textual scenarios requiring causality or commonsense reasoning... every time that it lacks the theory of mind to compress novel scenes describing the Machiavellian scheming of a dozen individuals at dinner jockeying for power as they talk...
> If we trained a model which reached that loss of <0.7, which could predict text indistinguishable from a human, whether in a dialogue ...how could we say that it doesn’t truly understand everything?
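The "bits per character" in the quote is just cross-entropy loss expressed in base 2. A minimal sketch of the arithmetic, using a hypothetical model loss of 0.78 nats/character purely for illustration (the 0.7 human figure is the one quoted above; the model number is an assumption, not a measurement):

```python
import math

def nats_to_bpc(loss_nats_per_char: float) -> float:
    """Convert per-character cross-entropy loss from nats (base e) to bits (base 2)."""
    return loss_nats_per_char / math.log(2)

# Hypothetical model loss of 0.78 nats/char -> about 1.13 bits/char.
model_bpc = nats_to_bpc(0.78)
human_bpc = 0.7  # the human estimate quoted above
gap = model_bpc - human_bpc  # the ">0.4" bits/char the quote asks about
```

The gap is small in absolute terms, which is the quote's point: the last fraction of a bit is where all the hard reasoning lives.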
[1] https://www.gwern.net/Scaling-hypothesis