We don't host our voice servers on google cloud as that's actually really expens...

leonroy · on Sept 10, 2018

> We rent commodity dedicated servers.

So not even a VPS like Linode, Discord rent physical servers across the board? Or is there a mix of AWS or some other vendor in there?

jhgg · on Sept 10, 2018

We host our voice server on dedicated hardware, not VPS. Visualization overhead for networking is too high for the cost.

Additionally, you can buy bandwidth for much cheaper from dedicated hosting providers as opposed to cloud providers. For our use case, AWS would be approximately 15,000x to 30,000x more expensive due to bandwidth pricing.

gregdunn · on Sept 11, 2018

>Visualization overhead for networking is too high for the cost.

Do you mean virtualization?

If so, I recommend looking into testing this with SR-IOV based NICs and passing through a VF to the guest. Even in regular operation, the latency difference between bare metal and an ixgbevf virtualized NIC all but disappears into levels well below anything that would be meaningful for voice communication.

Moving to a DPDK based poll mode driver would reduce the latency differences even further.

Edit: https://01.org/packet-processing/blogs/nsundar/2018/nfv-i-ho... some actual numbers w/ DPDK on bare metal vs vm

Disclaimer: I work for a cloud company, but SR-IOV knowledge in general is something I had from my days running a vmware environment, and not anything new :)

mmt · on Sept 11, 2018

With potentially substantial engineering effort, including needing to hire someone with a relatively rare expertise, they could [1] eliminate that specific downside of virtualization.

However, there are remaining overhead/downsides, and virtualization may be a solution looking for a problem in their environment.

[1] Also, presumably, dependent on specific NIC hardware, but I expect they're already using something compatible. It's merely another constraint.

gregdunn · on Sept 11, 2018

By no means do you need DPDK to basically eliminate the latency difference - I just wanted to point out how low latency can go in general.

A vm using SR-IOV with ixgbevf on good ol' Intel 82599 from 7 years ago will not have a latency difference noticeable to the overwhelming majority of use cases vs. bare metal.

mmt · on Sept 11, 2018

> By no means do you need DPDK to basically eliminate the latency difference

I didn't mean to imply that was your argument.

> I just wanted to point out how low latency can go in general.

Rather, I meant that "can" isn't the same as "does", absent exceptional circumstances.

> A vm using SR-IOV

Whether this qualifies as exceptional is, of course, arguable, but I'm arguing that it is. I could understand the point that it doesn't have to be, but, to be actually convinced, I'd want to see evidence that it's well understood and well implemented enough that neither rare expertise, substantial engineering effort, nor constrained configuration (hardware or software) would be required to take advantage of it. I'd expect most technically-minded decision makers to think similarly.

gregdunn · on Sept 12, 2018

>Whether this qualifies as exceptional is, of course, arguable, but I'm arguing that it is. I could understand the point that it doesn't have to be, but, to be actually convinced, I'd want to see evidence that it's well understood and well implemented enough that neither rare expertise, substantial engineering effort, nor constrained configuration (hardware or software) would be required to take advantage of it. I'd expect most technically-minded decision makers to think similarly.

Oh. My apologies for misunderstanding your point.

SR-IOV is available on basically any and all server grade NICs, and is quite simple to use. With Azure and AWS it's basically just making sure you have the proper driver installed (gotten for free on basically all modern kernels) and flipping a command switch.

If you're rolling your own virtualization stack, it's generally about as simple as any other task for that stack.

With vSphere it takes a matter of seconds: https://docs.vmware.com/en/VMware-vSphere/6.7/com.vmware.vsp...

Similarly easy for XenServer: https://support.citrix.com/article/CTX126624

A little bit more work with the common KVM management options, but still a very simple task as far as Linux sysadmin tasks go: https://access.redhat.com/documentation/en-us/red_hat_enterp...

OpenStack is a bit more complicated, but frankly, less complicated than plenty of other tasks in OpenStack: https://docs.openstack.org/mitaka/networking-guide/config-sr...

All of the real setup work has to be done at the hypervisor level, but you're primarily just doing two things: Creating VFs, and assigning them to VMs. The driver does all of the rest of the hard work. I would argue that any Linux or vSphere admin with any real amount of experience should be able to read any of the documentation I linked and be able to confidently work through it in an hour or two.
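To make the "creating VFs" half concrete, here is a minimal sketch for a Linux hypervisor, assuming the PF driver exposes the standard `sriov_totalvfs` and `sriov_numvfs` sysfs attributes. The function name, the `sysfs_root` parameter (there only so it can be pointed at a scratch directory), and the interface name in the usage comment are hypothetical. Assigning the resulting VFs to guests is specific to your virtualization stack and is not shown.

```python
from pathlib import Path


def create_vfs(pf_name, num_vfs, sysfs_root="/sys"):
    """Ask the PF driver to create num_vfs virtual functions via sysfs."""
    dev = Path(sysfs_root) / "class" / "net" / pf_name / "device"

    # The hardware/driver limit on the number of VFs for this PF.
    total = int((dev / "sriov_totalvfs").read_text())
    if not 0 <= num_vfs <= total:
        raise ValueError(f"{pf_name} supports at most {total} VFs")

    numvfs = dev / "sriov_numvfs"
    # Going from one non-zero VF count to another is generally rejected,
    # so reset to 0 first if VFs already exist.
    if int(numvfs.read_text()) != 0:
        numvfs.write_text("0")
    numvfs.write_text(str(num_vfs))
    return num_vfs


# Hypothetical usage on the hypervisor (needs root):
# create_vfs("eth0", 4)
```

This is roughly what the `echo N > .../sriov_numvfs` step in the linked KVM-style guides does; the management tools for vSphere and XenServer wrap the equivalent step in their own interfaces.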

For the guest, just making sure the driver is installed should be all that's required. For ixgbevf, the ubiquitous commercial option, it's been in-tree for the Linux kernel for at least half a decade.

Once VFs are created and assigned, it largely "Just Works". The only real caveat I know of is that seamless live migration of the guest is no longer an option, because now all of the network virtualization is handled in the hardware instead of the hypervisor.

mmt · on Sept 12, 2018

> I would argue that any Linux or vSphere admin with any real amount of experience should be able to read any of the documentation I linked and be able to confidently work through it in an hour or two.

Having glanced through those documents, I agree that it doesn't appear to be overly complex. However, considering how much CLI there was in those instructions, I'd argue that it's evidence that this feature is not what could safely be called "well implemented" (or perhaps "well integrated" would have been better for me to use) and probably not "well understood".

If it actually only ever requires that hour or two and nothing ever again and isn't brittle, that's great. If it ever needs debugging, especially if a critical performance problem crops up, a rare expert might be needed after all.

I realize my overall point is, essentially, FUD, but, absent a large enough installed base, that's not a totally outlandish stance for a decision-maker with an already-working solution.

> Once VFs are created and assigned, it largely "Just Works". The only real caveat I know of is that seamless live migration of the guest is no longer an option

If they have to be individually/manually assigned (or automated, just not already integrated into the usual VM management mechanisms), wouldn't this also prevent other forms of virtualization flexibility?

Ultimately, though, especially in this case, it seems like virtualization is a solution looking for a problem. That there may be (even nearly complete) mitigations for some performance issues doesn't mean that there won't still be some overhead and, more importantly, at scale, the virtualized options are always going to be noticeably more expensive than bare metal.

fulafel · on Sept 11, 2018

Probably the biggest jitter comes from your VPS getting preempted by other customers due to oversubscription (CPU or network), not the relatively benign and fixed amount of packet processing overhead from virtualization itself.

gregdunn · on Sept 11, 2018

For sure! But that's a matter of the provider's placement decisions, and not inherent to virtualization.

Slartie · on Sept 10, 2018

Regarding bandwidth, since that is probably the most important resource required by your voice servers: do you strategically use the mixed-calculation-type pricing that some dedicated server providers offer (where you can get large amounts of traffic included in cheap server prices because very few heavy users are subsidized by many users that only use a tiny fraction of their bandwidth share) or are you rather renting dedicated uplinks for your servers that you and your provider expect to be fully saturated most of the time?

bcheung · on Sept 11, 2018

Can't speak for Discord, but at that scale a provider is probably going to charge you either for the physical pipe (100 Mbps, 1 Gbps, 10 Gbps) or by the bandwidth used (Mbps) using the 95th-percentile method.
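For anyone unfamiliar with 95th-percentile billing, here is a minimal sketch of how it is usually described: sample usage at a fixed interval, sort the samples, discard the top 5%, and bill on the highest remaining sample. The function name and the sample data are made up for illustration, and real providers differ in details such as sampling interval and how inbound and outbound traffic are combined.

```python
def billable_mbps_95th(samples_mbps):
    """Return the 95th-percentile sample using the nearest-rank method."""
    if not samples_mbps:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_mbps)
    # Nearest rank: ceil(0.95 * n), computed with integers to avoid float rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


# Hypothetical billing period: 95 samples at 200 Mbps, 5 short bursts to 900 Mbps.
samples = [200] * 95 + [900] * 5
print(billable_mbps_95th(samples))  # the bursts fall in the discarded top 5%
```

The practical upshot is that short bursts are free as long as they stay within the top 5% of samples, while sustained high usage is billed at close to its peak.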

bcheung · on Sept 11, 2018

It really amazes me how ridiculous cloud providers are with their bandwidth pricing. Anything with user generated video is not feasible on the current cloud providers. Dedicated server providers like OVH are more than an order of magnitude cheaper. Anyone know why cloud providers have such high markup on bandwidth?

mmt · on Sept 11, 2018

It can be at least partially explained by a mismatch between the "cloud" model and the underlying engineering (and market) reality [1].

One has to build (or buy) for peak bandwidth. Selling it pay-as-you-go, with no regard to local maxima, means one has to price that rate high enough to account for the typical (and then some) spikiness in traffic. [2]
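A back-of-the-envelope sketch of that argument, under the simplifying assumption that cost scales with provisioned peak capacity while revenue scales with average usage. All names and numbers here are hypothetical.

```python
def break_even_price(cost_per_provisioned_mbps, peak_to_average):
    """Price per average Mbps needed to cover capacity built for the peak."""
    if peak_to_average < 1:
        raise ValueError("peak cannot be below average")
    # Every Mbps of average usage requires peak_to_average Mbps of capacity.
    return cost_per_provisioned_mbps * peak_to_average


# Hypothetical: capacity costs 1.0 per Mbps, traffic peaks at 4x its average.
print(break_even_price(1.0, 4.0))
```

In this toy model, a customer base that is twice as spiky needs twice the pay-as-you-go rate just to break even.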

It's not hard to imagine that something like a UGC video site might significantly increase that spikiness ratio, if only because of the sheer quantity of data involved. Moreover, it's a large quantity of data transfer per user, so even modest user growth would result in huge network use growth. As a sibling comment pointed out, a cloud provider "may not really want that type of client".

Perhaps cloud providers could start charging on a more traditional-ISP 95th-percentile style basis for larger customers and engineer their networks accordingly, but then they might have to keep those customers corralled in specific datacenters, which would remove part of the value of cloud infrastructure.

[1] Forgetting that "the cloud is just somebody else's servers" also led to the delusion that one doesn't have to "worry" about hardware failures in the cloud. Fortunately, it's now common knowledge that EC2 instances are subject to disappearing due to hardware reasons and that this needs to be "worried" about (engineered around).

[2] There is a similar issue with residential electricity pricing, where consumers pay a flat rate but the utility actually pays time-of-use (potentially a much higher rate on the spot market). Somewhat related to but not identical to rooftop solar using the grid as a "free battery", since that's also time-of-use. These come up routinely on HN discussions of electric power.

bcheung · on Sept 13, 2018

Wouldn't it be cheaper for cloud providers though? They are buying more bandwidth so they can get it cheaper. Also, they are taking advantage of the fact that clients have unused bandwidth so they can overprovision and get cost savings that way as well. I would think that that SHOULD make it cheaper for clients, but the opposite seems to be the case.

mmt · on Sept 13, 2018

> They are buying more bandwidth so they can get it cheaper.

I don't think that's actually true. The first assumption, that they are buying something from someone else, is potentially flawed, and the conclusion is based on another potentially flawed assumption, that bandwidth has an inherent volume discount.

I say "potentially" flawed because these assumptions easily hold true for small enough providers and little enough bandwidth.

At large provider scale, it's probably safer to assume that they're building instead of buying, and those costs follow fairly large, discrete steps.

Increasing bandwidth means buying faster DWDM modules and, possibly, higher-end equipment that supports them. It might mean doing that for their network peer, too.

In many cases, I expect it would mean bypassing shared infrastructure like internet exchanges, which might be limited to as little as 10Gb/s or even 1Gb/s and getting direct peering arrangements (including physical connections) with other networks, including possibly reimbursing them for their costs. This can be complicated by the new peer only having just enough bandwidth to that exchange point to match the exchange's maximum bandwidth, in which case peering will require co-locating somewhere else, with all the hardware and (hopefully dark but not always possible) fiber leasing costs.

None of those costs are necessarily high if considering maximum available bandwidth, such as if they spent 5x to get 20x or even 100x the capacity. However, if they only did it for a single customer that, on average, only uses 2x the bandwidth (and only peaks at 20x-100x at rare times or only on certain, unpredictable in advance, connections to peers), they experienced a volume premium, rather than a discount.
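Spelling out the arithmetic with the illustrative multiples above (5x the cost, 20x the capacity, 2x the average usage); the function name is made up:

```python
from fractions import Fraction


def unit_cost_ratio(cost_multiple, bandwidth_multiple):
    """New cost per unit of bandwidth, relative to the old cost per unit."""
    return Fraction(cost_multiple, bandwidth_multiple)


# Measured against available capacity: 5x cost for 20x capacity.
print(unit_cost_ratio(5, 20))  # 1/4, a volume discount

# Measured against what the customer uses on average: 5x cost for 2x usage.
print(unit_cost_ratio(5, 2))   # 5/2, a volume premium
```

The same upgrade looks like a 4x discount per unit of capacity and a 2.5x premium per unit actually used, which is the gap a pay-as-you-go price has to cover.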

jsjohnst · on Sept 11, 2018

> Anyone know why cloud providers have such high markup on bandwidth?

1) because they can

2) because they may not really want that type of client

3) because due to the nature of peering agreements, they want to avoid paying as much as they can

Not sure which of the above applies, but the list is very likely at least part of the reason.