
Check out the specs here: http://images.nvidia.com/content/technologies/deep-learning/...

though I'm most curious about what motherboard is in there to support NVLink and NVHS.

Good overview of Pascal here: https://devblogs.nvidia.com/parallelforall/inside-pascal/

1 question: will we see NVLink become an open standard for use in/with other coprocessors?

1 gripe: they give relative performance data as compared to a CPU -- of course it's faster than a CPU.



You mean you're not surprised that a machine with 8 GPUs, apparently costing $129k USD (from comment below), can outperform a single CPU? :)

(Of course, a better metric is that it's getting ~56x the performance at probably ~10x the TDP, but that's not surprising for a GPU with the current state of deep learning code.)
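Back-of-the-envelope, with my own assumptions plugged in (roughly 170 TFLOPS FP16 peak and ~3200 W for the DGX-1 per the spec sheet, ~3 TFLOPS and ~300 W for a dual-socket Xeon -- my estimates, not NVIDIA's comparison):

    # Rough perf/W arithmetic; all inputs are my estimates, not NVIDIA's.
    dgx_tflops, dgx_watts = 170.0, 3200.0   # ~spec-sheet FP16 peak / system power
    cpu_tflops, cpu_watts = 3.0, 300.0      # rough dual-socket Xeon figures

    speedup = dgx_tflops / cpu_tflops        # ~57x raw throughput
    power_ratio = dgx_watts / cpu_watts      # ~10.7x the power draw
    perf_per_watt = speedup / power_ratio    # ~5.3x better perf/W

    print(f"speedup ~{speedup:.0f}x, power ~{power_ratio:.1f}x, "
          f"perf/W advantage ~{perf_per_watt:.1f}x")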

To their credit, the thermal and power engineering needed to get that dense a compute deployment is challenging. (Been there, done that; I have the corpses of power supplies to show for it.) But the price means it's going to be limited to hyper-dense HPC deployments by companies that don't have the resources to engineer their own for substantially less money, the way Facebook did with its Big Sur design: https://code.facebook.com/posts/1687861518126048/facebook-to... . And, of course, the academics and hobbyists will continue to use consumer GPUs, which give much better performance/$ but aren't nearly as HPC-friendly.


To be fair, they are comparing it to a dual-socket CPU, which is twice as fair as comparing to a single one!

What I was getting at was: I want to know the relative performance compared to another 8-Tesla box. I know comparing apples to apples isn't good marketing, but c'mon.


They gave a pseudo-comparison to other GPUs in the keynote:

http://images.anandtech.com/doci/10225/DGX-1Speed.jpg


$129K buys you a lot of dual 22-core servers.


What kind of server pricing are you getting? Base servers are cheap, but add high-end Xeons and memory (not to mention interconnect) and I get something like 7 decently configured 1U servers for $129K: 2x 20-core with lots of RAM, 10GbE NICs, and mirrored boot/swap, with no interconnect switching. That's for 20-core Haswell, because I don't yet have discount pricing for Broadwell Xeons. I'm sure one could do better at hyperscaler discount, but this is startup low-ish quantity.
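For concreteness, here's a rough per-node tally consistent with that estimate (every price is my illustrative guess, not a quote):

    # Hypothetical dual-socket 1U build; all prices are rough estimates.
    node = {
        "2x 20-core Xeon":          9000,
        "512 GB DDR4":              4500,
        "dual 10GbE NIC":            500,
        "mirrored boot/swap SSDs":   600,
        "1U chassis/board/PSU":     3500,
    }
    per_node = sum(node.values())     # ~$18,100
    budget = 129_000
    print(f"${per_node:,} per node -> {budget // per_node} nodes "
          f"for ${budget:,}")        # -> 7 nodes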


Not really. Not a lot, at least.


If you include space, power, HVAC, and networking?


It looks like it uses a separate daughterboard that houses the GPUs + NVLink, connected to the main motherboard using quad InfiniBand EDR (400 Gb/s aggregate) + RDMA. http://images.anandtech.com/doci/10225/SSP_85.JPG


The diagram is confusing, but the GPUs are connected to each other by the NVLink matrix and to the motherboard via the PLX PCIe switches. The quad IB / dual 10GbE are separate I/O attached to the motherboard.

https://devblogs.nvidia.com/parallelforall/inside-pascal/
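Here's one plausible reading of that diagram as a tree (my reading, not an official topology dump):

    # Assumed DGX-1 I/O tree, per my reading of the Inside Pascal diagram:
    # each CPU feeds two PLX PCIe switches; each switch fans out to two
    # P100s and one IB HCA. NVLink is a separate GPU-to-GPU mesh that
    # never touches PCIe.
    topology = {
        "CPU0": {"PLX0": ["GPU0", "GPU1", "IB0"],
                 "PLX1": ["GPU2", "GPU3", "IB1"]},
        "CPU1": {"PLX2": ["GPU4", "GPU5", "IB2"],
                 "PLX3": ["GPU6", "GPU7", "IB3"]},
    }
    for cpu, switches in topology.items():
        for plx, devices in switches.items():
            print(f"{cpu} -> {plx} -> {', '.join(devices)}")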


That would make much more sense. Thanks! The PCIe bandwidth must be fairly limited: 4x 100G InfiniBand takes 64 PCIe lanes out of the 80 available.
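The lane math, spelled out (assuming a PCIe 3.0 x16 slot per EDR HCA and the usual 40 lanes per Xeon):

    # PCIe 3.0 lane budget on a dual-socket Xeon: 40 lanes per CPU.
    lanes_available = 2 * 40          # 80 lanes total
    lanes_per_hca = 16                # EDR (100 Gb/s) wants x16;
                                      # x16 is ~126 Gb/s usable
    ib_lanes = 4 * lanes_per_hca      # 64 lanes for the quad IB HCAs
    print(f"{ib_lanes}/{lanes_available} lanes on IB, "
          f"{lanes_available - ib_lanes} left for everything else")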



