GKE at least used to be a bit better, but that’s only worth so much compared to the rest of the services, where you’ll frequently find feature gaps where you end up adding a +1 to an issue which has been open for years, and building your own service (yay, more toil). There’s some room for debate on security depending on exactly what features you use and what your needs are, but if you use GCP, I’ll just note that Google gave all of your compute instances privileged access to your entire project, and you should fix that immediately if you haven’t already done so.
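(If you want to check what an instance actually got, here’s a minimal sketch; it assumes the `requests` package and has to run from inside a GCE VM, since it reads the instance metadata server.)

```python
# Minimal sketch: ask the GCE metadata server which service account and scopes
# this instance was given. Assumes `requests` is installed; only works on a VM.
import requests

META = "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default"
HEADERS = {"Metadata-Flavor": "Google"}

email = requests.get(f"{META}/email", headers=HEADERS).text
scopes = requests.get(f"{META}/scopes", headers=HEADERS).text.split()

print(f"service account: {email}")
for scope in scopes:
    print(f"  scope: {scope}")

# The default compute service account has the project-wide Editor role, so any
# instance running as it (especially with the cloud-platform scope) can touch
# most of the project.
if email.endswith("compute@developer.gserviceaccount.com"):
    print("WARNING: running as the default compute service account")
```

The usual fix is to run instances under dedicated, minimally privileged service accounts instead of the default one.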
I’d also note that it used to be fairly comparable on pricing, but they’ve been raising prices on some things; in my case that meant migrating away actually came with a cost savings.
> you’ll frequently find feature gaps where you end up adding a +1 to an issue which has been open for years, and building your own service (yay, more toil)
Yup, exactly my experience and what I meant by amateurish. We aren't even talking about "crazy" features.
For example, their identity management service supports MFA only through an SMS text. Adding support for Google auth was a years-old issue where we were told "yeah, lots of other people want this, and if you plus-1 it maybe a PM will prioritize it".
It's probably the best managed Kubernetes out there. People just like to exaggerate. The consumer stuff, especially the free products, gets abandoned every now and then, but the "enterprise" services are pretty good.
GKE is an abomination. Its reputation is completely undeserved. Sure, there are some nice things out of the box, like authenticating to clusters with your Google/GCP account, but day-2 operations are a constant frustration.
What sucks?
1. The Kubernetes Pod garbage collector is configured to be abominably slow, keeping terminated pods in the API server for far too long. This interferes with cluster monitoring by making it seem like there's a consistently high number of OOMKilled etc. pods rather than a brief blip as it happens. GCP support claims this is working as intended and recommends manually running a script to clean up the API server if it bothers you (this is a managed service?!); a sketch of that kind of cleanup script follows this list. See e.g. https://stackoverflow.com/questions/75374590/why-kubernetes-... .
2. The rest of the Kubernetes world moved on from kube-dns to CoreDNS. Not GKE! On GKE your two options are kube-dns and the GCP VPC-native Cloud DNS (i.e. Kubernetes service and pod records are listed in the private DNS zone for the VPC). Surprise, surprise: if you pick Cloud DNS to help scale your cluster, because GCP isn't operating kube-dns well enough on its managed control plane, then you're on the hook for paying for the Cloud DNS zone as well; it's not included in the GKE cluster costs. See e.g. https://cloud.google.com/kubernetes-engine/docs/how-to/cloud... .
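For reference, the sort of cleanup script support points you toward looks roughly like this; a sketch assuming the official `kubernetes` Python client and a working kubeconfig for the cluster:

```python
# Delete terminated (Succeeded/Failed) pods that the slow GC leaves lingering
# in the API server. Sketch only; assumes kubeconfig access to the cluster.
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

for phase in ("Succeeded", "Failed"):
    pods = v1.list_pod_for_all_namespaces(field_selector=f"status.phase={phase}")
    for pod in pods.items:
        print(f"deleting {pod.metadata.namespace}/{pod.metadata.name} ({phase})")
        v1.delete_namespaced_pod(name=pod.metadata.name, namespace=pod.metadata.namespace)
```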
A lot of this criticism could also be levelled at Azure’s AKS service.
IPv6 is broken there as well, and they similarly overcharge for logging. However, instead of 50c per GB, they charge $2.75 per GB, which is highway robbery. That’s more than 5x what GCP or AWS charge.
I swear Microsoft must have been aiming for “half price to be competitive” and then accidentally put the decimal point in the wrong place.
Let this sink in: they charge the price of a serviceable used car or a decent gaming PC to store 1 TB of text for a month!
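Back-of-the-envelope, using the per-GB figures quoted in this thread (not official price sheets):

```python
# Rough monthly cost to ingest 1 TB of logs at the per-GB prices cited above.
prices_per_gb = {"GCP / AWS": 0.50, "Azure (AKS logging)": 2.75}
one_tb_in_gb = 1_000

for provider, price in prices_per_gb.items():
    print(f"{provider}: ${price * one_tb_in_gb:,.0f}/month for 1 TB of logs")
# GCP / AWS: $500/month for 1 TB of logs
# Azure (AKS logging): $2,750/month for 1 TB of logs
```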
GCP is clearly good if your use cases match their services. However, there are a lot of things that are sub-optimal.
IPv6 is only available with premium networking; there's no non-premium option. Premium networking is nice, but non-premium networking is less expensive.
Instances sometimes take more than 5 minutes to shut down. A lot of things seem very slow like this. Really frustrating to use for testing when it takes so long to bring an environment up and to clean it up afterwards.
Load balancing is hard to use outside of HTTP-style short-connection use cases. There's no load feedback mechanism, and at small request rates, requests are severely unbalanced anyway. Managed instance groups can auto scale downwards, but connection draining is implemented as: take the instance out of rotation, wait a configurable time, then destroy it. If the instance drains faster, it won't be destroyed faster. If you want to drain for more than 60 minutes, that's too bad (this isn't that unreasonable, but while I'm ranting...)
Google's Container-Optimized OS has documentation that tells you how to configure Docker log retention... but then their container runner (konlet) forces its own log settings for the main container, so your settings are ignored.
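If you want to confirm that on a node, a rough sketch (assumes the docker CLI on the COS node; the container name here is hypothetical, konlet picks its own):

```python
# Inspect the log configuration docker actually applied to the konlet-managed
# container, to see whether your daemon.json settings were overridden.
import json
import subprocess

container = "klt-my-workload"  # hypothetical; use the name konlet gave your container

out = subprocess.run(
    ["docker", "inspect", container],
    capture_output=True, text=True, check=True,
)
log_config = json.loads(out.stdout)[0]["HostConfig"]["LogConfig"]
print(json.dumps(log_config, indent=2))
# If this shows a driver/options you never configured, konlet clobbered yours.
```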