The only thing that fsync() does is take data which is *already* in the operatin...

batbomb · on May 5, 2015

The point of a return from fsync is that you are guaranteed the file has been written to disk[1]. If you don't block on fsync, you can't guarantee the file was written to disk, because the server may have died in any number of ways.

[1] This guarantee occasionally fails too: if you have a battery-backed NVRAM RAID controller, the guarantee is only that the write has hit the NVRAM controller, with the expectation that it will hit a disk before the battery dies. Throw in a 72-hour power outage, a controller failure, or a massive disk failure, and you can't even guarantee that.
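To make "block on fsync" concrete, here is a minimal Python sketch. It is not taken from Elasticsearch, and `durable_append` is a hypothetical helper name. The point it illustrates is that the function only returns after `os.fsync` has returned, so a caller that acknowledges the write afterwards is relying on the guarantee described above (subject to the same hardware caveats).

```python
import os
import tempfile


def durable_append(path: str, payload: bytes) -> int:
    """Append payload to path and return only after fsync() has returned.

    Hypothetical helper for illustration; returns the number of bytes written.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        written = 0
        # write() may accept fewer bytes than requested, so loop until done.
        while written < len(payload):
            written += os.write(fd, payload[written:])
        # Block until the kernel reports the data flushed to the storage device.
        os.fsync(fd)
        return written
    finally:
        os.close(fd)


# Usage: only acknowledge the record once durable_append has returned.
with tempfile.TemporaryDirectory() as tmpdir:
    log_path = os.path.join(tmpdir, "example.log")
    durable_append(log_path, b"record 1\n")
```

Skipping the `os.fsync` call makes the function return sooner, but then nothing is promised about what survives a power loss or kernel crash.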

teraflop · on May 5, 2015

No, I understand that. Maybe I'm not explaining my point properly, so I'll try again:

If you issue a write() syscall from a process, and the syscall succeeds, then the data that was written is present in the OS's cached view of the filesystem, even if the process dies a nanosecond later. That view is shared consistently by all processes on the system. It's true that the changes may not actually be stored persistently on disk, but that difference is unobservable unless something happens to make the kernel lose its cached data.

So from the test suite's point of view, unless part of the test involves actually killing VMs and not processes, it should not be possible for the results to depend on whether or not fsync() was called.
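The claim above can be checked with a small experiment. This is a sketch assuming a POSIX system; the names `CHILD`, `write_then_die` and `demo` are made up for illustration. A child process issues a single write() syscall with no fsync, kills itself with SIGKILL, and the parent then reads the file through the same kernel.

```python
import os
import signal
import subprocess
import sys
import tempfile

# Child program: one write() syscall, then immediate death. No fsync, no close.
CHILD = r"""
import os, signal, sys
fd = os.open(sys.argv[1], os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
os.write(fd, b"acknowledged\n")        # data is now in the kernel's page cache
os.kill(os.getpid(), signal.SIGKILL)   # equivalent to kill -9 on ourselves
"""


def write_then_die(path: str) -> int:
    """Run the child and return its exit status (negative signal number on POSIX)."""
    return subprocess.run([sys.executable, "-c", CHILD, path]).returncode


def demo() -> bytes:
    """Return what a different process can read after the writer was killed."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "log")
        status = write_then_die(path)
        assert status == -signal.SIGKILL  # the child really died from SIGKILL
        with open(path, "rb") as f:
            return f.read()


surviving = demo()
```

If the reasoning holds, `surviving` contains the full record even though fsync() was never called, because only the process died and the kernel's cache was untouched.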

eclark · on May 5, 2015

Jepsen is just doing a kill -9 on the java process.

I posted a comment on the blog: https://aphyr.com/posts/323-call-me-maybe-elasticsearch-1-5-...

First I made sure that read() goes through the page cache (it does, as long as the file isn't opened with O_DIRECT). Then I went and checked the write-ahead log in ES.

From my reading, it turns out that ES considers a write durable as soon as it has been put into a userspace buffer.

https://github.com/elastic/elasticsearch/blob/master/src/mai...

Data is pushed to kernel space whenever the buffer gets full. Then it is fsync'd on a timer.
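Here is a rough Python analogue of that failure mode. It assumes a POSIX system and is not the actual ES translog code; `CHILD` and `surviving_bytes` are hypothetical names. The child writes a small record into a large userspace buffer and is then killed with SIGKILL, either before or after flushing the buffer to the kernel.

```python
import os
import signal
import subprocess
import sys
import tempfile

# Child program: buffered write, optional flush, then kill -9 on itself.
CHILD = r"""
import os, signal, sys
f = open(sys.argv[1], "ab", buffering=65536)  # large userspace buffer
f.write(b"acknowledged\n")                    # stays in this process's memory
if sys.argv[2] == "flush":
    f.flush()                                 # write() syscall: now in page cache
os.kill(os.getpid(), signal.SIGKILL)          # buffer contents die with the process
"""


def surviving_bytes(flush: bool) -> bytes:
    """Return the file contents visible after the buffered writer is killed."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "translog")
        mode = "flush" if flush else "noflush"
        subprocess.run([sys.executable, "-c", CHILD, path, mode])
        with open(path, "rb") as f:
            return f.read()


lost = surviving_bytes(flush=False)
kept = surviving_bytes(flush=True)
```

Without the flush, the record never leaves the process, so a plain kill -9 is enough to lose it. With the flush, it survives the kill even though fsync() was never called.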

teraflop · on May 5, 2015

Nice research.

In case anyone else is wondering why that GitHub link is broken, the file in question was renamed a few hours ago. Here's a working permalink: https://github.com/elastic/elasticsearch/blob/fafd67e1aef091...