The P100s have full support for half-precision (i.e. 16-bit) floating point ops. This can mean ~2x improvements in speed and memory usage compared to the Pascal Titan X, which is the top "consumer" card. This difference is significant for almost any machine learning workload, which is what a lot of these cards will be used for.
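For the curious, the ~2x speed comes from packed half2 math: one instruction operates on two FP16 values at once. A minimal CUDA sketch of the idea (the kernel is my own toy, not anything NVIDIA ships; __hfma2 and the __half2 type come from cuda_fp16.h and need sm_53 or newer, so the P100 at sm_60 qualifies):

    #include <cuda_fp16.h>

    // Each array element is a half2, i.e. two packed FP16 values, so every
    // fused multiply-add below performs two half-precision FLOPs at once.
    __global__ void axpy_fp16(int n2, __half2 a, const __half2 *x, __half2 *y) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n2)
            y[i] = __hfma2(a, x[i], y[i]);  // two FP16 FMAs per instruction
    }

The memory savings fall out for the same reason: each value is 2 bytes instead of 4, so the same model fits in half the VRAM.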

NVIDIA gimped half-precision on the consumer cards to drive datacenters, hedge funds, machine learning companies, etc. towards the "professional" cards (and their huge markup).



FP16 performance is only relevant until people figure out how to train NNs using INT8. See, for example, [1] for recent advances in that direction.

After that, it's going to be mostly about memory size and bandwidth.

[1] https://arxiv.org/abs/1603.01025
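The basic trick in that line of work is to keep a per-tensor scale factor and store the values themselves as 8-bit integers. A rough sketch of symmetric quantization in CUDA (names and details are my own illustration, not the paper's exact scheme):

    #include <cstdint>

    // Illustrative symmetric quantization: map floats onto [-127, 127]
    // using a per-tensor scale. inv_scale = 127.0f / max_abs(x), computed
    // on the host beforehand from the tensor's observed range.
    __global__ void quantize_int8(int n, float inv_scale,
                                  const float *x, int8_t *q) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) {
            float v = fminf(fmaxf(x[i] * inv_scale, -127.0f), 127.0f);
            q[i] = static_cast<int8_t>(__float2int_rn(v));  // round to nearest
        }
    }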


First NVIDIA solidified their monopoly by forcing CUDA... then they gimped half-precision on consumer cards.

We really need more frameworks that work with OpenCL, so that we can have some competition from AMD, whose consumer cards are not gimped.


"Gimping", in this case, actually means adding hardware that costs quite a bit of silicon area to one chip that will probably never be sold as a consumer GPU.

I don't see the issue with a company making a very high-end product, adding features that aren't of much use to consumers, and asking extra money for the effort.

AMD doesn't have double-rate FP16 on its current GPUs either. The latest generation runs FP16 at the same speed as FP32, but at that point you might as well just use FP32 everywhere.

And let's not forget: the Nvidia consumer GPUs have the quad-INT8 deep learning operations (dp4a) enabled at all times. They didn't need to do that and could have reserved it for their Tesla product line only.
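For reference, that's exposed as the __dp4a intrinsic on sm_61+ parts, i.e. consumer Pascal like the GTX 1080. A toy dot-product kernel to show the shape of it (the kernel and its names are mine, just for illustration; out must be zeroed beforehand):

    // Each int packs four signed 8-bit values; __dp4a computes a 4-way
    // int8 dot product and accumulates into a 32-bit integer, all in one
    // instruction.
    __global__ void dot_int8(int n4, const int *a, const int *b, int *out) {
        int acc = 0;
        for (int i = threadIdx.x; i < n4; i += blockDim.x)
            acc = __dp4a(a[i], b[i], acc);
        atomicAdd(out, acc);  // combine per-thread partial sums
    }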



