Hacker News | CompuIves's comments

I think this is very similar! Really cool to see.

The first version we launched used the exact same approach (MAP_PRIVATE). Later on, we bypassed the file system by using shared memory and userfaultfd, because ultimately the NVMe became the bottleneck (https://codesandbox.io/blog/cloning-microvms-using-userfault... and https://codesandbox.io/blog/how-we-scale-our-microvm-infrast...).
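
For anyone unfamiliar with the MAP_PRIVATE trick, here is a minimal sketch of the copy-on-write behaviour it relies on. The temporary file stands in for a memory snapshot and `private_clone_demo` is a made-up name; this only illustrates the kernel semantics, not how any particular VMM wires it up.

```python
import mmap
import tempfile

PAGE = mmap.PAGESIZE


def private_clone_demo() -> tuple[bytes, bytes]:
    """Map a stand-in 'snapshot' file copy-on-write and write through the mapping."""
    with tempfile.NamedTemporaryFile() as snap:
        snap.write(b"A" * PAGE)  # hypothetical snapshot contents
        snap.flush()
        # MAP_PRIVATE: writes land in private copies of the pages, never in the file.
        with mmap.mmap(
            snap.fileno(),
            PAGE,
            flags=mmap.MAP_PRIVATE,
            prot=mmap.PROT_READ | mmap.PROT_WRITE,
        ) as mem:
            mem[0:5] = b"child"
            seen_by_clone = bytes(mem[0:5])
        # Re-read the backing file: it still holds the original bytes.
        snap.seek(0)
        on_disk = snap.read(5)
    return seen_by_clone, on_disk
```

The mapping sees its own write while the backing file stays untouched, which is why many clones can share one snapshot file. Untouched pages still have to be read from that file on first access.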


I tried to do something similar well over a decade ago during an internal hackathon (the motivation back then being speeding up destructive integration tests). My idea was to have the memory be a file on tmpfs, and simply `cp --reflink` to get a copy-on-write clone. Then you wouldn't need to bother with userfaultfd or slow storage as the kernel would just magically do the right thing.

Unfortunately, the Linux kernel didn't support reflink on tmpfs (and still doesn't), and I'm not genius enough to have been able to implement that within 24 hours. :-)

I still believe it'd be nice to implement reflink for tmpfs, though. It's the perfect interface for copy-on-write forking of VM memory.
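
A rough sketch of what the `cp --reflink` idea looks like from code, assuming Linux and the FICLONE ioctl. `clone_file` is a hypothetical helper, and the fallback to a full copy is my addition, since tmpfs rejects the ioctl today.

```python
import fcntl
import shutil
import sys

# Linux FICLONE ioctl number, _IOW(0x94, 9, int): make dst share src's extents.
FICLONE = 0x40049409


def clone_file(src: str, dst: str) -> bool:
    """Try a reflink clone; fall back to a full copy. Returns True if reflinked."""
    if sys.platform.startswith("linux"):
        with open(src, "rb") as s, open(dst, "wb") as d:
            try:
                fcntl.ioctl(d.fileno(), FICLONE, s.fileno())
                return True
            except OSError:
                pass  # filesystem does not support reflink (tmpfs, ext4, ...)
    # Fallback: an ordinary byte-for-byte copy, with no sharing.
    shutil.copyfile(src, dst)
    return False
```

On a filesystem with reflink support (Btrfs, or XFS with reflink enabled) this returns True and the clone shares blocks until either side writes. On tmpfs it returns False and you pay for the full copy, which is exactly the gap described above.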


Glad to see the approach validated at scale! I hadn't seen your blog posts until they were linked here, going to dig into the userfaultfd path. Would love to chat if you're open to it.

Yes, that's right. The Firecracker team has written a fantastic doc about this as well: https://github.com/firecracker-microvm/firecracker/blob/main....

It's important to refresh entropy immediately after clone. Still, there can be code that didn't assume it could be cloned (even though there's always been `fork`, of course). Because of this, we don't live clone across workspaces for unlisted/private sandboxes and limit the use case to dev envs where no secrets are stored.
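
To make the entropy point concrete, here is a toy illustration using a userspace PRNG. `snapshot_and_clone` and `reseed` are hypothetical stand-ins for a VM clone and a post-clone reseed. In a real clone the guest kernel's own pool is duplicated as well, so the fresh entropy has to come from outside the snapshot; here `os.urandom` simply plays that role.

```python
import os
import random


def snapshot_and_clone(rng: random.Random) -> random.Random:
    """Stand-in for a VM clone: the copy inherits the exact PRNG state."""
    clone = random.Random()
    clone.setstate(rng.getstate())
    return clone


def reseed(rng: random.Random) -> None:
    """Mix in fresh entropy (assumed here to come from the OS)."""
    rng.seed(os.urandom(32))


parent = random.Random(1234)
child = snapshot_and_clone(parent)

# Identical state means identical "random" output in parent and child.
same_before = parent.getrandbits(128) == child.getrandbits(128)

# After reseeding the child, the two streams diverge.
reseed(child)
same_after = parent.getrandbits(128) == child.getrandbits(128)
```

Anything derived from the duplicated state (tokens, nonces, temp file names) would collide between clones until the reseed happens, which is why it has to happen immediately.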


I talk a bit about this here: https://codesandbox.io/blog/cloning-microvms-using-userfault.... Before VM A updates its data, the old data is copied over to VM B if VM B hasn't read or written that data yet.
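
A toy model of that ordering, with whole pages as dictionary entries. The `Parent` and `Child` classes are made up for illustration; the real mechanism works on guest memory via userfaultfd, not Python objects.

```python
PAGE_COUNT = 4


class Parent:
    """VM A: owns the memory the child was cloned from."""

    def __init__(self) -> None:
        self.pages = {i: f"p{i}-v1" for i in range(PAGE_COUNT)}
        self.children: list["Child"] = []

    def write(self, page: int, data: str) -> None:
        # Before overwriting, hand the old contents to every child that has
        # not touched this page yet, so it keeps its clone-time view.
        for child in self.children:
            if page not in child.pages:
                child.pages[page] = self.pages[page]
        self.pages[page] = data


class Child:
    """VM B: starts with no pages and fills them lazily."""

    def __init__(self, parent: Parent) -> None:
        self.parent = parent
        self.pages: dict[int, str] = {}
        parent.children.append(self)

    def read(self, page: int) -> str:
        if page not in self.pages:  # simulated page fault
            self.pages[page] = self.parent.pages[page]
        return self.pages[page]

    def write(self, page: int, data: str) -> None:
        self.pages[page] = data  # the child now owns this page
```

If the parent writes page 1 after the clone, the child still reads the clone-time contents of page 1, because the old page was pushed over before the write went through.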


Clever! Thank you.


Oh wow! Unexpected and cool to see this post on Hacker News! Since then we have evolved our VM infra a bit, and I've written two more posts about this.

First, we started cloning VMs using userfaultfd, which allows us to bypass the disk and let children read memory directly from parent VMs [1].

We also moved to storing memory snapshots compressed. To keep VM boots fast, we need to decompress on the fly as VMs read from the snapshot, so we chunk up snapshots into 4 KB to 8 KB pieces that are zstd-compressed [2].

Happy to answer any questions here!

[1]: https://codesandbox.io/blog/cloning-microvms-using-userfault...

[2]: https://codesandbox.io/blog/how-we-scale-our-microvm-infrast...
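
A small sketch of the chunked-compression idea from the comment above: compress each chunk independently so a read only has to decompress the chunks it touches. Note the assumptions: `zlib` stands in for zstd because it ships with Python, the 8 KiB chunk size is just the upper end of the range mentioned, and the function names are made up.

```python
import zlib

CHUNK = 8 * 1024  # assumed chunk size


def compress_chunks(snapshot: bytes) -> list[bytes]:
    """Compress fixed-size chunks independently so each can be read on its own."""
    return [
        zlib.compress(snapshot[i:i + CHUNK])
        for i in range(0, len(snapshot), CHUNK)
    ]


def read_at(chunks: list[bytes], offset: int, length: int) -> bytes:
    """Read a byte range, decompressing only the chunks that range touches."""
    out = bytearray()
    first = offset // CHUNK
    last = (offset + length - 1) // CHUNK
    for idx in range(first, last + 1):
        out += zlib.decompress(chunks[idx])
    start = offset % CHUNK  # position of the range inside the first chunk
    return bytes(out[start:start + length])
```

Smaller chunks mean less wasted decompression per page fault but a worse compression ratio, so the chunk size is a trade-off rather than a fixed rule.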


The example code present in the link is not available. Would you know where it went? Thanks, great article!


Enjoyed reading all these and learnt a lot! Thanks for taking the time out to write the blogs!


It's a block for non-US IPs, I think. The page is an application form for joining them, and it only accepts US citizens.


US citizens have to pay federal taxes abroad but now can't view federal websites.


That's been the case for a long time. I made a big stink about not being able to fill out my census survey to my congresswomen. It got nowhere.


Yup, that checks out


I think this is it. In the EU I get blocked, but if I access it via a VPN I can open the page.


A new post in what is slowly becoming a series! Happy to discuss this or answer any questions here.


Exactly, the result would've been different if the author had not disabled caching.

In this case it's because the iframes are loaded/unloaded multiple times, but we also spawn web workers where the same worker is spawned multiple times (for transpiling code in multiple threads, for example). In all those cases we rely on caching so we don't have to download the same worker code more than once.


Wouldn't it be better if the code editors only activated if the user interacts with one?

Even with caching it's absurd to download so much JS for a feature that probably most users will not use. It's a docs site after all.


If you want to be efficient in Amsterdam, you take the bike or public transport. That has been faster than cars even before this change, and now more so.


Hey all! I'm one of the co-founders of CodeSandbox. Happy to answer any (technical) questions about this release or CodeSandbox in general!


Yes! But I work on CodeSandbox, so that creates some bias :). We've been working on our own CDE solution, though we've taken a different spin to improve speed and cost.

Our solution is based on Firecracker, which enables us to "pause" (& clone) a VM at any point in time and resume it later exactly where it left off, within 1.5s. This gives the benefit that you won't have to wait for your environment to spin up when you request one, or when you continue working on one after some inactivity.

However, there's another benefit to that: we can now "preload" development environments. Whenever someone opens a pull request (even from local), we create a VM for it in the background. We run the dev server/LSPs/everything you need, and then pause the VM. Now whenever you want to review that pull request, we resume that environment and you can instantly review the code or check the dev server/preview like a deployment preview.

It also reduces cost. We can pause the VM after 5 minutes of inactivity, and when you come back, we'll resume it so it won't feel like the environment was closed at all. In other solutions you either need to keep a server running in the background, or increase the "hibernation timeout" to avoid a cold boot.

It's kind of like your laptop: if you close it, you don't expect it to shut down and boot the whole OS again when you open it. I've written more about how we do the pausing/cloning here (https://codesandbox.io/blog/how-we-clone-a-running-vm-in-2-s...) and here (https://codesandbox.io/blog/cloning-microvms-using-userfault...).

