Quick question for you C++ gurus: Why doesn't C++ have native generators? Is there some technical reason that makes it hard to compile such a feature, or is there a political reason?
Plenty of lightweight thread systems exist for C++. You're thinking about the problem the wrong way. You don't really want generators: you want fibers. Once you have fibers, generators are trivial to implement. One implementation of fibers for Unix-like systems is GNU nPth.
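To illustrate the "generators on top of fibers" claim, here is a minimal sketch that builds a generator from the POSIX ucontext calls. It is not how nPth does it; the names Fiber, fiber_start, fiber_next and yield_value are made up for this example, it assumes a platform that still ships <ucontext.h> (e.g. glibc on Linux), and it uses a single global "current fiber" pointer, so it is not thread-safe.

```cpp
#include <ucontext.h>
#include <vector>

// Hypothetical fiber record: one saved context for each side plus a private stack.
struct Fiber {
    ucontext_t caller{};   // where to return to when the fiber yields or finishes
    ucontext_t self{};     // the fiber's own saved context
    std::vector<char> stack = std::vector<char>(64 * 1024);
    int limit = 0;         // argument for the generator body
    int value = 0;         // last yielded value
    bool done = false;
};

// Assumption: one global "current fiber", fine for a single-threaded sketch.
static Fiber* g_current = nullptr;

// "yield": publish a value and switch back to whoever resumed us.
static void yield_value(int v) {
    Fiber* f = g_current;
    f->value = v;
    swapcontext(&f->self, &f->caller);
}

// The generator body is an ordinary function with an ordinary loop.
static void body() {
    Fiber* f = g_current;
    for (int i = 0; i < f->limit; ++i) yield_value(i);
    f->done = true;  // returning resumes uc_link, i.e. the caller context
}

// Prepare the fiber. The Fiber object must not be moved afterwards,
// because uc_link points at its caller member.
void fiber_start(Fiber& f, int limit) {
    f.limit = limit;
    getcontext(&f.self);
    f.self.uc_stack.ss_sp = f.stack.data();
    f.self.uc_stack.ss_size = f.stack.size();
    f.self.uc_link = &f.caller;
    makecontext(&f.self, body, 0);
}

// Resume the fiber; returns true if f.value holds a freshly yielded value.
bool fiber_next(Fiber& f) {
    if (f.done) return false;
    g_current = &f;
    swapcontext(&f.caller, &f.self);
    return !f.done;
}
```

Calling fiber_start(f, 3) and then looping on fiber_next(f) reads f.value as 0, 1, 2. Every resume and every yield is a full swapcontext, which is exactly the cost being argued about in this thread.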
Not exactly. I'm not familiar with nPth specifically, but it appears it uses swapcontext, which is a lot slower than doing it in an event loop the proper way. Like, 1000 times slower (on Linux)[1]. Having proper generators and extendable event loops is a far cleaner solution to the problem, in my humble opinion. Python's asyncio is perhaps the best implementation of such an event loop I've seen.
nPth can use swapcontext if everything else fails, but its preferred portable switching strategy is the so-called sigaltstack trick, which is usually significantly faster.
BTW, I'm the author of the above page. Coroutine libraries have been my personal hobby since forever.
I'm a bit out of my wheelhouse here, I usually do C and rarely touch C++, and in addition the coding standard I have to adhere to prevents setjmp/longjmp, but doesn't setjmp need to be used in conjunction with longjmp, which makes it completely unsafe to use malloc, fopen, etc? Also, it would prevent RAII from running.
getcontext and setcontext are fundamentally about saving and restoring CPU registers. setjmp and longjmp are also about saving and restoring CPU registers, and they (and related functions) give you much more control over whether you want to save the signal mask. Since saving and restoring the signal mask is the objectionable part of setcontext, setjmp is a decent alternative.
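For what it's worth, the "related functions" that expose that choice are the POSIX pair sigsetjmp/siglongjmp: the second argument to sigsetjmp says whether the signal mask is saved. A minimal sketch (bounce_once is a made-up name, and the jump stays inside one live stack frame, so this only shows the API and is not a coroutine switch):

```cpp
#include <setjmp.h>

// Returns how many times control passed the sigsetjmp call site.
int bounce_once() {
    sigjmp_buf env;
    // volatile: the value is modified between sigsetjmp and siglongjmp
    // and must survive the jump.
    volatile int passes = 0;

    // Second argument 0: do NOT save the signal mask, so restoring the
    // context later does not need to touch the mask.
    if (sigsetjmp(env, 0) == 0) {
        passes = passes + 1;   // first pass: direct return from sigsetjmp
        siglongjmp(env, 1);    // jump back; sigsetjmp now returns 1
    }
    passes = passes + 1;       // second pass: arrived via siglongjmp
    return passes;
}
```

Passing a nonzero second argument instead makes sigsetjmp save the mask and siglongjmp restore it, which is the part you are trying to avoid when switching contexts frequently.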
If you want to unwind, just unwind. setjmp and setcontext neither help nor hurt you here.
Fair enough. I still have doubts about whether that would scale up to millions or billions of coroutines, but I will defer to your knowledge in that it's probably Good Enough™ for most purposes. Thanks for the explanation!
It is actually pretty easy: stackless generators desugar to a switch statement. MS's implementation of their C++ coroutine proposal already demonstrates that. The problem is that, as currently specified, it requires memory allocation, which can't always be optimised out, and that has proven contentious with many.
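As a rough illustration of the desugaring, here is a hand-written sketch of what a generator that yields 0..limit-1 could turn into. This is not the output of any particular compiler or of the MS proposal; CountTo and sum_count_to are hypothetical names. The locals are hoisted into a struct and a state field records the resume point.

```cpp
// Hand-desugared form of a hypothetical generator:
//   for (int i = 0; i < limit; ++i) yield i;
struct CountTo {
    int limit;
    int i = 0;      // loop variable hoisted out of the function body
    int state = 0;  // 0 = not started, 1 = suspended at the yield, 2 = done
    int value = 0;  // last yielded value

    explicit CountTo(int limit) : limit(limit) {}

    // Returns true if a new value was produced, false once exhausted.
    bool next() {
        switch (state) {
        case 0:
            for (i = 0; i < limit; ++i) {
                value = i;
                state = 1;
                return true;   // "yield": suspend by returning to the caller
        case 1:;               // resume point: jump back into the loop body
            }
            state = 2;
            break;
        default:
            break;
        }
        return false;
    }
};

// Usage: drain the generator and add up what it yields.
int sum_count_to(int limit) {
    CountTo gen(limit);
    int total = 0;
    while (gen.next()) total += gen.value;
    return total;
}
```

Here the frame is just the struct, so it lives wherever the caller puts it. The allocation question comes up when the frame's size and lifetime are hidden behind a type-erased handle, as in the proposal.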
Say function A calls into generator B, which has entry points X,Y,Z. Maybe you could generate code for functions A<X>, A<Y> and A<Z>, specialised on the entry point into B. When B yields, it actually "calls" the appropriate A, but does something funny with the stack pointer. Maybe?
The generated code would explode if you had too many coroutines going at once, and you need a lot of things to be known at compile time, and maybe the specific things you'd need to do to the stack would slow you down. I don't think you'd have to mess with the return address, though, so maybe it wouldn't be so bad.
Yes of course. At that point any standard optimisation that can be applied to normal code is applicable.
In this case you wouldn't expect any call at all: the switch statement is inlined in the caller and standard constant propagation can remove the switch itself.
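One way to see that nothing in the switch form needs a real call or an allocation is to make a desugared generator constexpr and consume it in a constant expression. This is only a sketch (Iota and sum_iota are hypothetical names, C++14 or later assumed), and compile-time evaluation is not the same thing as the optimiser's constant propagation, but it shows the whole state machine is visible to the compiler:

```cpp
// Hypothetical desugared generator yielding 0, 1, ..., limit-1.
struct Iota {
    int limit;
    int i;
    int state;  // 0 = not started, 1 = suspended at the yield, 2 = done

    constexpr explicit Iota(int limit) : limit(limit), i(0), state(0) {}

    // Writes the next value to `out` and returns true, or returns false when done.
    constexpr bool next(int& out) {
        switch (state) {
        case 0:              // first entry: initialise the loop variable
            i = 0;
            state = 1;
            // fall through
        case 1:              // resumed after a yield
            if (i < limit) {
                out = i++;
                return true;
            }
            state = 2;
            return false;
        default:             // finished
            return false;
        }
    }
};

// The caller: once next() is inlined here, `state` is a known constant at
// each step, so the switch can be folded away.
constexpr int sum_iota(int limit) {
    Iota gen(limit);
    int total = 0;
    int v = 0;
    while (gen.next(v)) total += v;
    return total;
}

static_assert(sum_iota(5) == 10, "evaluated entirely at compile time");
```

With a runtime limit you would instead rely on the optimiser inlining next() into the loop; whether a given compiler reduces that to a plain counted loop is something to check in the generated assembly rather than assume.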