Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I'm not much for using SQLite but using mmap() is a no-brainer for fast file access and I'm surprised they hadn't used it before. I mean, its trivial to use if you know how to use fopen()...


D. Richard Hipp, the inventor and primary developer of SQLite considers the rock-solid reliability of SQLite of primary importance, which is one of the reasons everything from iOS to Skype uses it. (The fact that it has a public domain license doesn't hurt.) He lists the potential pitfalls of using this method on the page: http://www.sqlite.org/mmap.html and leaves it to users to determine if those risks are acceptable to them. I'm sure he's been thinking about this method for a while.


I first raised the issue of mmap in SQLite 2 years ago. http://www.mailinglistarchive.com/html/sqlite-dev@sqlite.org...

Richard started playing with it a couple months later http://www.mailinglistarchive.com/html/sqlite-dev@sqlite.org...

but didn't get any performance benefit at the time because he was Doing It Wrong(tm).

SQLite does a lot of excessive work; this work is justified if you're using it on an embedded processor that doesn't have a full virtual-memory based OS underneath. But on any other platform it's a waste of CPU/power/memory/time. And it's been decades since the majority of embedded applications ran on bare metal without some kind of OS.

This new mmap feature in SQLite promises "up to 2x" performance improvement, while LMDB in SQLite still yields 20x improvement over stock SQLite.


> I'm not much for using SQLite but using mmap() is a no-brainer for fast file access and I'm surprised they hadn't used it before. I mean, its trivial to use if you know how to use fopen()...

mmap() is great for reads, but much tougher for writes. this is because it is hard for the kernel to determine what it is you are trying to do. madvise() helps here, but not many projects seem to use it. using plain old write[v]() for writes can be informative enough to the kernel to "do the right thing".


> using mmap() is a no-brainer for fast file access and I'm surprised they hadn't used it before

With mmap, you map a file into process memory and read and write to it as if it were memory. When you write to this memory there isn't really a way to indicate that an error occurred because there's no function call to return an error code. Instead, the process is sent a SIGBUS to indicate that something happened. This causes control to jump from the memory access into the signal handler.

Normally this would be fine and you can handle the error there. But sqlite is a library. It lives in-process with the "real" application using it. What if the program had a signal handler there? You don't want to intercept signals that they need to receive, and you definitely don't want to mask errors that they are getting on their own mmapped files, and you don't know how to handle the errors that they should be seeing instead. And what if they have some crazy green-threading going on that you've now ripped control away from by entering the signal handler?

In this particular case, sqlite's answer is to crash when that happens, taking down the entire host application with it. That's not okay for a lot of applications, and why it defaults to being turned off.

And maybe this particular case does or doesn't apply to sqlite, but there are always cases that your application's domain-specific knowledge of how it uses disk and memory and CPU can out-perform the OS's more general optimisations.

That's not the only issue, and the particular subtleties of mmap aren't important here. What's important is that when very smart people like the authors of sqlite choose to not do something, it's generally not as much as "no-brainer" as some know-it-all types would have you believe. It's easy to read something like Varnish's "always use mmap and leave it to the OS!" manifesto and think that you now automatically know better than all of those silly plebeians that just don't get it, but usually things are more nuanced than that.

> I mean, its trivial to use if you know how to use fopen

I seriously doubt that the difficulty of implementing it is a concern


> With mmap, you map a file into process memory and read and write to it as if it were memory.

This could be an implimentation difference, but I've always accessed mmapped files as files, but I've only ever used it on *BSD and Solaris and it was very straightforward and easy.


It's just a poorly titled mis-link, clicking through describes in more detail why it's not entirely a 'no brainer':

http://www.sqlite.org/mmap.html

tl;dr is "Improves performance in some cases with certain caveats/risks, is disabled by default"


> but using mmap() is a no-brainer for fast file access and I'm surprised they hadn't used it before.

Is it always? I remember benchmark sequential file reads and writes and plain old read and write were sometimes faster or there was no difference.


I think a lot of people get bit by small stdio buffers and switch to mmap, not realizing that read() based solutions can be improved greatly by better buffering strategies.


I never saw much of a reason for SQLite either until I started doing mobile development. However, it's nearly a must for doing any sort of intensive storage though with Android or iOS (unless use cases are simple enough for XML and JSON).

I just wish doing asynchronous queries with SQLite on Android was not so obtuse and resulted in a bunch of boilerplate or using third party libraries for a core part of the API.


Have to agree the problem with SQLite isn't SQLite it's the generally terrible wrappers companies put around it.

On both iOS and Android the default ones are terrible. And even many third party ones aren't the best.


On Android at least, I try to use loaderex[1] whenever possible, but it has some limitations. Best third party solution I've found so far. I was kind of fishing for someone to give some alternatives for Android and iOS, but seems like we're all kind of just in a rut :/

[1] https://github.com/commonsguy/cwac-loaderex


It's a no-brainer for "fast", but you had better brain pretty hard if your goal is "reliable".


I had assumed SQLite had "reliable" down :)




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: