44 million, according to their paper, and they used 5000 TPUs, which together are capable of 4.6×10^17 operations per second.
(The operations a TPU can run are far simpler than what a supercomputer can do, but just for the sake of comparison, the current top supercomputer in the world can do 1.25×10^17 floating-point operations per second.)
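To put the two figures side by side, here is a rough back-of-the-envelope sketch. It assumes the 4.6×10^17 figure is the combined throughput of all 5000 TPUs, and the variable names are just illustrative. It also compares raw operation counts only, so it ignores the point that TPU operations are much simpler than general floating-point operations.

```python
# Figures quoted in the text; names are illustrative, not from the paper.
TPU_COUNT = 5000
TPU_TOTAL_OPS_PER_SEC = 4.6e17   # assumed: combined throughput of all 5000 TPUs
SUPERCOMPUTER_FLOPS = 1.25e17    # top supercomputer, floating-point ops per second

# Implied throughput of a single TPU under that assumption.
per_tpu_ops = TPU_TOTAL_OPS_PER_SEC / TPU_COUNT

# Raw ratio of operation counts (not an apples-to-apples comparison).
ratio = TPU_TOTAL_OPS_PER_SEC / SUPERCOMPUTER_FLOPS

print(f"Per-TPU throughput: {per_tpu_ops:.2e} ops/s")    # 9.20e+13 ops/s
print(f"TPU cluster vs. supercomputer: {ratio:.2f}x")    # 3.68x
```

So on raw operation count alone, the TPU setup comes out at roughly 3.7 times the supercomputer figure, keeping in mind that the two kinds of "operation" are not equivalent.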