What do you think an end-to-end solution would look like?
I missed the LayerNorm squashing bit; I skimmed that part when I read the paper. I'll re-read. From your description, I agree that's a problem. I wonder if something closer to a floating point / exponential representation would be better, if it let you avoid squashing -- e.g., the model would become a great order-of-magnitude numerist.
For me, an end-to-end solution would look like a model that can 1) read numbers in any format from the string, 2) manipulate and reason about them correctly, and 3) reliably output structured data if requested. No need for special encodings. Obviously, we're nowhere near that!
In the interim, I agree with you that a representation that explicitly encodes a mantissa and an exponent seems like a better stopgap solution. We human beings already do that to some extent -- e.g., we often think in terms of tens, hundreds, thousands, and so on, as numbers get bigger, and in terms of percents, basis points, and decimal points as they get smaller.
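To make the stopgap concrete, here's a toy sketch (my own illustration, not anything from the paper) of decomposing a number into an explicit mantissa and base-10 exponent, the way a model might be fed "1.234 times ten to the 4" instead of the raw digit string:

```python
import math

def to_sci(x: float, digits: int = 3):
    """Decompose x into (mantissa, exponent) with x ~= mantissa * 10**exponent
    and 1 <= |mantissa| < 10. Zero is handled as a special case."""
    if x == 0:
        return 0.0, 0
    exponent = math.floor(math.log10(abs(x)))
    mantissa = round(x / 10 ** exponent, digits)
    return mantissa, exponent

# Big and small numbers land in the same narrow mantissa range,
# with magnitude carried separately by the exponent.
print(to_sci(12345))   # mantissa near 1.234, exponent 4
print(to_sci(0.007))   # mantissa near 7.0, exponent -3
```

The appeal for a model is that the mantissa always lives in a small, well-conditioned range, so a squashing normalization hurts much less; all the dynamic range moves into the (integer) exponent.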
I wouldn't be surprised if large models eventually learn, implicitly, to represent numbers internally as mantissas and exponents.