This is true for AirPlay, but not really for Google Cast: While a Cast-capable device is required to initially launch Spotify, the Spotify Cast app is a full-featured Spotify Connect client.
For example, you can launch Spotify on a Google Home and control it from any device logged in to the same account, regardless of whether it understands the Cast protocol, is on the same network etc.
I suspect that once Spotify offers HomePod support, the situation might look somewhat similar for AirPlay (since being able to continue streaming with your phone out of battery etc. is possible on Apple Music on a HomePod already).
I kinda knew this since I think the feature you described is also available through Alexa/Echo, but I never really took the time to think about what was going on behind the scenes.
However, I'm still struggling to understand what is going on in your example.
When you control Spotify from your phone (volume++, next, new song, etc.), how does Google Home know to do something?
When you tell Google Home "google play never gonna give you up", how does the Spotify app on your phone know to reflect what's playing on the Google Home?
Technically I can imagine that there is a shadow state of your Spotify instance sitting out in the cloud somewhere, but what are the mechanisms that make all this work together? Is this part of the "cast" protocol/architecture? Is this Spotify-specific IP/tech that they can push on Google and Alexa to include in their IoT/smart services since they have the clout and pull?
Do you have any good links (not too technical, not too light) on how this works?
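To make the "shadow state in the cloud" guess concrete, here is a rough sketch of what I'm picturing. This is purely my assumption, not Spotify's or Google's actual protocol: `ShadowStateHub`, `PlaybackState`, and the command names are all made up for illustration. The idea is that every device on the account sends commands to one cloud-side state and gets notified when it changes, so neither the phone nor the speaker needs to talk to the other directly.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class PlaybackState:
    """Hypothetical per-account playback state held in the cloud."""
    track: str = ""
    volume: int = 50
    playing: bool = False
    active_device: str = ""


class ShadowStateHub:
    """Hypothetical hub: applies commands, then notifies every device."""

    def __init__(self) -> None:
        self.state = PlaybackState()
        self._listeners: Dict[str, Callable[[PlaybackState], None]] = {}

    def register(self, device_id: str,
                 on_change: Callable[[PlaybackState], None]) -> None:
        self._listeners[device_id] = on_change

    def send_command(self, sender: str, command: str, **args) -> None:
        if command == "play":
            self.state.track = args["track"]
            self.state.playing = True
            # If nothing is playing anywhere yet, the sender becomes
            # the device that actually outputs the audio.
            self.state.active_device = self.state.active_device or sender
        elif command == "volume_up":
            self.state.volume = min(100, self.state.volume + args.get("step", 10))
        elif command == "pause":
            self.state.playing = False
        else:
            raise ValueError(f"unknown command: {command}")
        # Fan the new state out to every registered device.
        for callback in self._listeners.values():
            callback(self.state)


hub = ShadowStateHub()
phone_log: List[Tuple[str, int, str]] = []
speaker_log: List[Tuple[str, int, str]] = []
hub.register("phone", lambda s: phone_log.append((s.track, s.volume, s.active_device)))
hub.register("speaker", lambda s: speaker_log.append((s.track, s.volume, s.active_device)))

# Voice command handled by the speaker; the phone UI gets the update too.
hub.send_command("speaker", "play", track="Never Gonna Give You Up")
# Volume change from the phone; the speaker gets the update too.
hub.send_command("phone", "volume_up")
```

In this toy version both logs end up identical, which is the behavior I see in practice: whichever device issues the command, all of them show the same track and volume. Whether the real thing is a push channel like this or something else entirely is exactly what I'd like to read about.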