It executes just one (well predicted) branch every 30 numbers written, and incrementing/printing the line number is branchless too.
It's not as fast as the subject of the post (40 GB/s) but it's only a few hours of work.
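For anyone wondering why 30 and not 15: this is not the commenter's code, just a rough Python sketch of one way the idea could be structured. The assumption here is that 30 is used because it is lcm(15, 10), so inside a 30-number block both the Fizz/Buzz pattern and the last digit of each number are fixed, and only the leading digits change. Python cannot show the actual branch behaviour; the names `TEMPLATE` and `fizzbuzz_blocks` are made up for illustration.

```python
def build_template():
    """Build a 30-line format string; {0}, {1}, {2} stand for the leading digits."""
    lines = []
    for j in range(1, 31):
        if j % 15 == 0:
            lines.append("FizzBuzz")
        elif j % 3 == 0:
            lines.append("Fizz")
        elif j % 5 == 0:
            lines.append("Buzz")
        else:
            # The last digit is fixed per slot; the decade (0, 1 or 2) picks the prefix.
            lines.append("{%d}%d" % (j // 10, j % 10))
    return "\n".join(lines) + "\n"


# All the divisibility tests happen once, here, not in the output loop.
TEMPLATE = build_template()


def fizzbuzz_blocks(num_blocks):
    """Return FizzBuzz output for 1..30*num_blocks, one 30-line block per iteration."""
    out = []
    for k in range(num_blocks):  # the only loop test per 30 numbers
        base = 3 * k  # leading digits advance by 3 per block
        # A leading value of 0 prints as nothing (only relevant in the first block).
        prefixes = [str(base + d) if base + d else "" for d in range(3)]
        out.append(TEMPLATE.format(*prefixes))
    return "".join(out)


print(fizzbuzz_blocks(1), end="")
```

A real implementation would presumably keep the leading digits as ASCII in a buffer and bump them in place instead of calling `str()`, but that part is guesswork on my side.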
Output as an ASCII string, it is just over 400 bytes including \n’s (413 for the classic 1 to 100).
If you’re counting GB/s, you’ve failed to understand the specification.
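A quick way to check the output size mentioned above, assuming the classic range of 1 to 100 (the range is my assumption, the comment does not state it). `fizzbuzz_text` is just a throwaway helper name.

```python
def fizzbuzz_text(limit=100):
    """Return the full FizzBuzz output for 1..limit, one entry per line."""
    lines = []
    for i in range(1, limit + 1):
        word = ("Fizz" if i % 3 == 0 else "") + ("Buzz" if i % 5 == 0 else "")
        lines.append(word or str(i))  # fall back to the number itself
    return "\n".join(lines) + "\n"


text = fizzbuzz_text()
size_with_newlines = len(text.encode("ascii"))
size_without_newlines = size_with_newlines - text.count("\n")
print(size_with_newlines, size_without_newlines)  # 413 313
```

So the whole thing fits in a handful of cache lines either way.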
One of the sophisticated engineering aspects of FizzBuzz is that all optimizations are premature.
It is a boring problem. People, being people, invent different problems to give themselves a chance to be clever.