Hacker News

OpenAI is experimenting with process supervision, which corrects many of these issues:

Problem: Let x^8 + 3x^4 − 4 = p1(x) p2(x) ⋯ pk(x), where each non-constant polynomial pi(x) is monic with integer coefficients, and cannot be factored further over the integers. Compute p1(1) + p2(1) + ⋯ + pk(1).

Model attempt

Here, GPT-4 successfully performs a complex series of polynomial factorizations. The use of the Sophie Germain identity in step 5 is a key move that could be considered insightful.

1. I notice that the given polynomial has even degree and only even powers of x, so I can try to make a substitution to simplify it.

2. Let y = x^4; then the polynomial becomes y^2 + 3y − 4, which is a quadratic.

3. I can factor this quadratic as (y + 4)(y − 1), so the original polynomial is (x^4 + 4)(x^4 − 1).

4. Now I need to factor each of these factors further, if possible, over the integers.

5. For the first factor, x^4 + 4, I recall the Sophie Germain identity, which says that a^4 + 4b^4 = (a^2 + 2b^2 + 2ab)(a^2 + 2b^2 − 2ab).

6. Applying this identity with a = x and b = 1, I get x^4 + 4 = (x^2 + 2x + 2)(x^2 − 2x + 2).

7. For the second factor, x^4 − 1, I can use the difference of squares formula, which says that a^2 − b^2 = (a + b)(a − b).

8. Applying this formula with a = x^2 and b = 1, I get x^4 − 1 = (x^2 + 1)(x^2 − 1).

9. Then I can apply the difference of squares formula again with a = x and b = 1, to get x^2 − 1 = (x + 1)(x − 1).

10. So the original polynomial is (x^4 + 4)(x^4 − 1) = (x^2 + 2x + 2)(x^2 − 2x + 2)(x^2 + 1)(x + 1)(x − 1).

11. These are all the irreducible factors over the integers, so p1(x) = x^2 + 2x + 2, p2(x) = x^2 − 2x + 2, p3(x) = x^2 + 1, p4(x) = x + 1, and p5(x) = x − 1.

12. To find the sum of their values at x = 1, I just need to plug x = 1 into each factor and add them up.

13. I get p1(1) = 1 + 2 + 2 = 5, p2(1) = 1 − 2 + 2 = 1, p3(1) = 1 + 1 = 2, p4(1) = 1 + 1 = 2, and p5(1) = 1 − 1 = 0.

14. So the final answer is 5 + 1 + 2 + 2 + 0 = 10.

15. Answer: 10
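The factorization above is easy to check mechanically. A self-contained sketch (plain Python, coefficient lists are my own representation, not anything from the thread) that multiplies the claimed irreducible factors back together and sums their values at x = 1:

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

# The five claimed irreducible factors, lowest-degree coefficient first:
# x^2 + 2x + 2, x^2 - 2x + 2, x^2 + 1, x + 1, x - 1
factors = [[2, 2, 1], [2, -2, 1], [1, 0, 1], [1, 1], [-1, 1]]

product = [1]
for f in factors:
    product = poly_mul(product, f)

# x^8 + 3x^4 - 4, lowest degree first
target = [-4, 0, 0, 0, 3, 0, 0, 0, 1]
print(product == target)           # product of the factors matches the original polynomial

# p(1) is just the sum of a polynomial's coefficients
print(sum(sum(f) for f in factors))  # 5 + 1 + 2 + 2 + 0 = 10
```

Running it confirms both that the factors multiply back to x^8 + 3x^4 − 4 and that the answer is 10.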



It's an impressive result, but it shouldn't be seen as a "correction". Framing it as a (drastic) reduction in mistakes is more useful here.

If the model is productionized (read: dumbed down so it isn't as expensive to run), the reasoning abilities drastically decline again.

And these reasoning abilities are still built around a language model, rather than around abstract models.

This is a very effective party trick for general math, whose language quite directly maps onto these abstract concepts, but there are holes. Information about, e.g., which values may be zero isn't encoded in the language, so this approach is liable to blunder into division-by-zero issues.
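A minimal illustration of this kind of blind spot, using SymPy as a stand-in for purely symbolic manipulation (my example, not from the thread): cancelling a common factor silently discards the point where the original expression was undefined.

```python
from sympy import symbols, cancel

x = symbols('x')

expr = (x**2 - 1) / (x - 1)
simplified = cancel(expr)   # x + 1: the common factor (x - 1) is cancelled
print(simplified)

# The original expression is undefined at x = 1, but the simplified form
# quietly evaluates there; nothing in the algebra carried the
# "the denominator may be zero" information along.
print(expr.subs(x, 1))        # nan (0/0)
print(simplified.subs(x, 1))  # 2
```

Language-level manipulation has exactly the same failure mode: the cancellation step reads as valid, and the side condition x ≠ 1 simply vanishes.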

If you want a particular example to toy around with, LLMs are not fond of quaternions and their conversion to other representations.
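For reference, the quaternion-to-rotation-matrix conversion is mechanical but sign-fiddly, which is presumably why it trips models up. A minimal pure-Python sketch, assuming the (w, x, y, z) scalar-first convention (the convention choice itself is a common source of errors):

```python
import math

def quat_to_matrix(w, x, y, z):
    """Convert a unit quaternion (w, x, y, z) to a 3x3 rotation matrix.
    Normalizes first, so a slightly off-unit quaternion still works."""
    n = math.sqrt(w*w + x*x + y*y + z*z)
    w, x, y, z = w/n, x/n, y/n, z/n
    return [
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ]

# 90-degree rotation about the z-axis: w = cos(45 deg), z = sin(45 deg)
c = math.cos(math.pi / 4)
R = quat_to_matrix(c, 0.0, 0.0, c)
# First column is the image of the x-axis: should be (0, 1, 0)
print(R[0][0], R[1][0], R[2][0])
```

Getting an LLM to reproduce those off-diagonal signs correctly, or to convert to Euler angles with a stated axis order, is a good stress test of whether it is reasoning or pattern-matching.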



