Starting a single unicast stream would not be much faster than the mcast join, if at all. Most channels are already broadcast to the local head node (the main router in a service area). The join just builds the tree from the head node down to the end user. The last time I looked at / worked on / designed these systems was maybe 8 years ago, but I do not expect it to have changed too much. But you never know. I think the most interesting thing was that this use case is what mcast was really designed for, and it is the only real-world use case I have seen beyond the finance market (market data feeds from the exchanges are multicast). Well, there is OSPF...
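For anyone curious what the "join" looks like from the receiver side, here is a minimal sketch in Python. It only covers the host end: the socket option makes the kernel send an IGMP membership report, and the network then extends the tree down to that host. The group address 239.1.1.1 and port 5004 are made-up placeholders, not what any real provider uses, and a real set-top box certainly does more than this.

```python
import socket


def build_mreq(group: str, iface: str = "0.0.0.0") -> bytes:
    """Pack an ip_mreq: 4-byte group address followed by 4-byte local interface address."""
    return socket.inet_aton(group) + socket.inet_aton(iface)


def join_group(group: str, port: int, iface: str = "0.0.0.0") -> socket.socket:
    """Open a UDP socket and ask the kernel to join the multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # This call triggers the IGMP membership report; upstream routers
    # then graft this host onto the distribution tree for the group.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, build_mreq(group, iface))
    return sock


if __name__ == "__main__":
    # Hypothetical channel address and port, for illustration only.
    s = join_group("239.1.1.1", 5004)
    data, sender = s.recvfrom(2048)  # blocks until the first packet arrives
    print(len(data), "bytes from", sender)
```

Timing the gap between `join_group` returning and the first `recvfrom` completing would give a rough feel for the join latency on a given network.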
Yes, the inner workings of these setups are really just a big mystery if you don't happen to work in that field. The unicast stream really did start almost instantaneously when switching channels (sub-second), so they somehow optimized that well, but multicast always took 2 to 3 seconds...