Don't Share Java FileChannels

Arnavion · on March 12, 2023

So why does Java close the channel when an "interrupt" happens? I looked at the git-blame of [1] to see if the commit message would explain it, but it's just been there since the first OpenJDK commit in 2007. The only other info I could find by searching for that exception name is [2], but it doesn't elaborate on the "IO safety issues". And are these "interrupts" just the underlying syscall returning EINTR or something Java-specific?

[1]: https://github.com/openjdk/jdk/blame/c313e1ac7b3305b1c012755...

[2]: https://stackoverflow.com/questions/1161297/

---

Edit: marginalia_nu's and cesarb's comments make sense. Searching for "interruptiblechannel" yields:

https://www.taogenjia.com/2020/07/13/Java-understanding-nio/

>NIO’s designers chose to shut down a channel when a blocked thread is interrupted because they couldn’t find a way to reliably handle interrupted I/O operations in the same manner across operating systems. The only way to guarantee deterministic behavior was to shut down the channel.

Since the interrupts being talked about here are a Java concept, Java would need to interrupt the syscall itself, and it makes sense they chose to close the file since there's no way to do that equally well on all OSes.

layer8 · on March 12, 2023

Thread interrupts are a purely Java thing [1]. Probably when the blocking operation detects that the interrupt flag is set, they cancel the internally asynchronous operation, but can’t do so on all platforms without also closing the channel.

You should still be able to share FileChannels between threads when all immediate client code is under your control so you can handle that situation gracefully.

[1] It’s just a flag you can set on a thread, requesting the thread to stop its operation (the thread is not required to comply), and longer-running operations like I/O can choose to test that flag, and typically throw some exception when it is set. The default exception type for this is InterruptedException, but here they chose to throw a subclass of IOException, probably exactly because it also closes the channel.
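To make the footnote concrete, here is a minimal sketch (the class and method names are made up for illustration). It sets the interrupt flag on the current thread and then performs a blocking read on a FileChannel. Per the documented InterruptibleChannel behaviour, the read should fail with ClosedByInterruptException and leave the channel closed.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class, not taken from the article or the JDK.
class InterruptClosesChannel {
    /** Returns true if reading with the interrupt flag set left the channel closed. */
    static boolean readWhileInterrupted() throws IOException {
        Path tmp = Files.createTempFile("interrupt-demo", ".bin");
        try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
            Thread.currentThread().interrupt();   // only sets a flag on this thread
            try {
                ch.read(ByteBuffer.allocate(16)); // the channel notices the flag
                return false;                     // not expected to be reached
            } catch (ClosedByInterruptException e) {
                return !ch.isOpen();              // closed for every thread sharing it
            }
        } finally {
            Thread.interrupted();                 // the flag is still set; clear it
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("channel closed by interrupt: " + readWhileInterrupted());
    }
}
```

The point to notice is the catch block: the exception is an IOException subclass, and the channel object is unusable afterwards for anyone else holding a reference to it.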

twic · on March 13, 2023

This is it - a thread that's blocked on I/O won't see the interrupt flag until the system call returns, so interrupting it also closes the file descriptor, to force the system call to return straight away. It's a nasty kludge.

I wonder if the virtual threads from Project Loom could let the JVM do the right thing here, and interrupt the virtual thread without waiting for the carrier thread to return from the system call. That might be extremely hairy, though - your thread could return from a read call with an InterruptedException, and then some time later, the data from the read could end up in your buffer as if by magic!

marginalia_nu · on March 12, 2023

"AbstractInterruptibleChannel" seems to be doing this, and the comments/javadocs offer some hint. As to why they're designed this way, that's a good question.

https://github.com/openjdk-mirror/jdk7u-jdk/blob/master/src/...

pron · on March 13, 2023

It's important for sockets, less so for files, but they inherit that behaviour from the specification of the interface (https://docs.oracle.com/en/java/javase/19/docs/api/java.base...).

See an old discussion about that here:

https://mail.openjdk.org/pipermail/nio-dev/2018-February/004...

nitwit005 · on March 12, 2023

Not every OS supports cancel-able file IO, but they all support closing the file outright, which will get this behavior of canceling all IO against the file.

cesarb · on March 12, 2023

> And are these "interrupts" just the underlying syscall returning EINTR or something Java-specific?

It's something Java-specific: https://docs.oracle.com/javase/8/docs/api/java/lang/Thread.h...

Szpadel · on March 13, 2023

Just a guess: afaik the syscalls for reads and writes take the buffer as a pointer, and Java has a GC, so if the thread gets interrupted, a GC cycle could happen and the memory could be moved to a different location, so the read would write to an outdated memory location, potentially causing memory corruption. (I didn't check the code, but I bet the canceling happens before the thread is suspended.)

Could be similar to why Rust has Pin.

layer8 · on March 13, 2023

You can be 100% sure the internal I/O buffers are pinned in Java’s implementation, and you seem to have a misconception about the Java-specific meaning of “thread interrupt” in this context (it has nothing to do with thread context switches).

Szpadel · on March 13, 2023

Oh sorry, I don't know Java well enough; I tried to extrapolate from other languages. This was the only logical reason I could see for why you would need to cancel the operation here. So this is about interrupt()? So basically canceling the thread?

vbezhenar · on March 13, 2023

When you call thread.interrupt() and the target thread is sleeping in a blocking I/O call, that blocking call should throw a Java exception.

If the target thread is not sleeping in blocking I/O (or uses some kind of unsupported blocking I/O, e.g. implemented with a third-party JNI library), only an internal flag is set and no magic happens.

It's a pretty simple implementation, actually. The thread is not cancelled or corrupted in any way; if you handle this situation, it's perfectly OK to continue execution in this thread. Interrupts are usually used to coordinate termination, indeed, but that's not the only use.

So apparently they didn't find a better way to cancel a blocking I/O operation than to close the file handle from the other thread, which causes the OS to return some kind of error code from the blocking call, which in the end translates to a Java exception.
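A small sketch of the flag semantics described above, using Thread.sleep as the blocking call instead of file I/O (the class name is hypothetical). It shows that interrupt() only sets a flag, that a blocking call reacts to the flag with an exception, and that the thread keeps running afterwards.

```java
// Hypothetical demo class illustrating the interrupt flag; not from the article.
class InterruptFlagDemo {
    static String run() {
        StringBuilder log = new StringBuilder();

        // Nothing is blocking right now, so this only sets the flag.
        Thread.currentThread().interrupt();
        log.append("flag=").append(Thread.currentThread().isInterrupted());

        try {
            Thread.sleep(10); // a blocking call that honours the flag
            log.append(" slept");
        } catch (InterruptedException e) {
            // Thread.sleep clears the flag when it throws.
            log.append(" interrupted flagAfter=")
               .append(Thread.currentThread().isInterrupted());
        }

        // The thread was not cancelled; it simply carries on.
        log.append(" stillRunning");
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Unlike Thread.sleep, the channel classes leave the flag set when they throw ClosedByInterruptException, so code that wants to keep using the thread for I/O has to clear it explicitly.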

avg_dev · on March 12, 2023

I enjoyed the article. Long time since I programmed in Java. I recall being recommended a book called Java Concurrency in Practice and it languishing on my shelf.

I had a few notes:

1. It is really nice to have such thorough documentation. Even if a programmer doesn’t always have or make the time to read it.

2. I _think_ I remember reading an interview with Peter Norvig in Coders at Work where he talks about programmers never having the time to fully grok their API docs (it may well have been a more general statement about rushing to get stuff done).

Some time ago I personally learned a virtualization discovery API whereby one could make some calls and learn, through some sort of traversal, how a topology of VMs was laid out. My title at the time was Intermediate Software Developer. I remember I was pretty happy with my solution and shared during standup that it was working well but was kind of slow, and that there was another, more complicated and finicky type of traversal mentioned in the docs, and that my reading of the docs was that learning and coding this other method was necessary or helpful in some use cases but that for our situation it would not make a difference and would just add complexity. Well, within a couple of weeks another team member - Junior Software Developer - read the same docs and tried out the more finicky version and wouldn’t you know, the discovery process suddenly became blazing fast.

Arnavion · on March 12, 2023

And for those looking for the equivalent POSIX C API, it's `pread` and `pwrite`. I've come across a lot of people (me included) who resort to locking + seek because they don't know these exist.
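For comparison, the Java-side counterpart is the positional overloads `FileChannel.read(ByteBuffer, long)` and `FileChannel.write(ByteBuffer, long)`, which are documented not to modify the channel's position. A minimal sketch with a hypothetical class name and a throwaway temp file:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class; only the FileChannel calls are real API.
class PositionalIo {
    static String roundTrip() throws IOException {
        Path tmp = Files.createTempFile("positional", ".bin");
        try (FileChannel ch = FileChannel.open(
                tmp, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Positional write: the offset is passed explicitly, no seek needed.
            ByteBuffer src = ByteBuffer.wrap("hello world".getBytes(StandardCharsets.US_ASCII));
            while (src.hasRemaining()) {
                ch.write(src, src.position());
            }

            // Positional read of 5 bytes starting at offset 6.
            ByteBuffer dst = ByteBuffer.allocate(5);
            while (dst.hasRemaining() && ch.read(dst, 6 + dst.position()) > 0) {
                // keep reading until the buffer is full or EOF is hit
            }

            String text = new String(dst.array(), 0, dst.position(), StandardCharsets.US_ASCII);
            // The channel's own position was never touched by either call.
            return text + "@" + ch.position();
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip());
    }
}
```

Because no shared position is involved, several threads can use these overloads on one channel without a lock around seek plus read. That does not protect against the interrupt-closes-channel behaviour discussed in this thread.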

vbezhenar · on March 12, 2023

They should have seen ClosedByInterruptException in the logs and the whole mystery would be resolved much faster.

rezonant · on March 13, 2023

This is a good point, but presumably the ClosedByInterruptException was expected, aka they called thread.interrupt() on that thread intentionally. It was the side effect on seemingly unrelated threads that is the surprising behavior.

EDIT: Looking closer, it's just that they caught IOException instead of the ClosedByInterruptException subclass and expected it to be a non exceptional exception. One would hope they'd log the subclass type in that case and if so then yeah they should've seen that.
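A sketch of what catching the subclass first could look like. The `IoAction` interface and `describe` method are hypothetical helpers; the exception types and their hierarchy (ClosedByInterruptException extends AsynchronousCloseException extends ClosedChannelException extends IOException) are from java.nio.channels.

```java
import java.io.IOException;
import java.nio.channels.AsynchronousCloseException;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.ClosedChannelException;

// Hypothetical helper; shows catch ordering from most to least specific.
class ChannelFailures {
    interface IoAction {
        void run() throws IOException;
    }

    static String describe(IoAction action) {
        try {
            action.run();
            return "ok";
        } catch (ClosedByInterruptException e) {
            return "closed because this thread was interrupted";
        } catch (AsynchronousCloseException e) {
            return "closed by another thread during the operation";
        } catch (ClosedChannelException e) {
            return "channel was already closed";
        } catch (IOException e) {
            return "other I/O error: " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(() -> { throw new ClosedByInterruptException(); }));
        System.out.println(describe(() -> { throw new ClosedChannelException(); }));
    }
}
```

A thread that shares a channel with an interrupted thread would typically land in the second or third branch, which is the "seemingly unrelated" failure mentioned above.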

thayne · on March 13, 2023

The throws documentation can be really important. I recently ran into something for the API for a JDBC ResultSet. From the signature, and description, it would seem to be fine to call on a forward only result set, as long as you haven't moved past the first row yet. But in the throws section it says that if the result set is TYPE_FORWARD_ONLY, then it will throw a SQLException. And to make things more fun, not all drivers throw that exception, so it might work fine, until you switch drivers, or even upgrade your driver.

spuz · on March 12, 2023

Good insights. Would the solution therefore be to synchronize on all the FileChannel methods (not just those you think you need to synchronize on) or is there another way to get around the too many open files error?

CodesInChaos · on March 12, 2023

> is there another way to get around the too many open files error?

Since this isn't an actual leak, raising the limit should be fine. The default limit on Linux is 1024 due to some issues with select(), but you can easily raise it to a much higher value if you don't use that API.

mritun · on March 12, 2023

Raising the FD limit is the right answer. 1024 is absurdly low in modern context and is a carry over from the past that needs to die.

p4l4g4 · on March 12, 2023

You can't just lock on the file operations, since this problem comes from thread interruptions. No interrupt, no problem. So, instead you need to make file operations and _any_ thread interrupt mutually exclusive.

Finding and patching all possible locations which could interrupt your threads doing file operations is probably a foolish effort.

So, raising the limit, or load balancing (depending on the type of application) is probably the best solution.

layer8 · on March 12, 2023

No, the solution would be to handle the unexpected closing specially and gracefully, for example trying to reopen the channel and retrying the higher-level operation once.

The other question is why the threads are receiving interrupts in the first place. Depending on the reason, a different course of action might be appropriate.
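One possible shape for "reopen and retry once", as a rough sketch only. `ReopeningFile` is a hypothetical wrapper, it handles read-only positional reads, and it assumes the file can simply be reopened by path. The interrupted thread itself does not retry, because its interrupt flag is still set and a retry would close the fresh channel again.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical wrapper: reopens the channel if some thread's interrupt closed it.
class ReopeningFile implements AutoCloseable {
    private final Path path;
    private FileChannel channel;

    ReopeningFile(Path path) throws IOException {
        this.path = path;
        this.channel = FileChannel.open(path, StandardOpenOption.READ);
    }

    private synchronized FileChannel current() throws IOException {
        if (!channel.isOpen()) {
            channel = FileChannel.open(path, StandardOpenOption.READ);
        }
        return channel;
    }

    int read(ByteBuffer dst, long position) throws IOException {
        try {
            return current().read(dst, position);
        } catch (ClosedByInterruptException e) {
            throw e; // this thread was interrupted: let the caller decide
        } catch (ClosedChannelException e) {
            // Another thread's interrupt closed the channel under us: retry once.
            return current().read(dst, position);
        }
    }

    @Override
    public synchronized void close() throws IOException {
        channel.close();
    }

    /** Simulates an interrupt closing the channel, then reads again. */
    static String demo() throws IOException {
        Path tmp = Files.createTempFile("reopen-demo", ".txt");
        Files.write(tmp, "abc".getBytes(StandardCharsets.US_ASCII));
        try (ReopeningFile file = new ReopeningFile(tmp)) {
            Thread.currentThread().interrupt();
            try {
                file.read(ByteBuffer.allocate(3), 0);
            } catch (ClosedByInterruptException expected) {
                Thread.interrupted(); // clear the flag before doing more I/O
            }
            ByteBuffer dst = ByteBuffer.allocate(3);
            int n = file.read(dst, 0); // transparently reopens the channel
            return n + ":" + new String(dst.array(), 0, Math.max(n, 0), StandardCharsets.US_ASCII);
        } finally {
            Thread.interrupted();
            Files.deleteIfExists(tmp);
        }
    }
}
```

A real version would also need to decide what to do after close() has been called and how to treat writes that may have been partially applied; both are left out here.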

stefan_ · on March 12, 2023

You have now serialized all your request handlers. No, just raise the fricking limit. It's a holdover.

pianoben · on March 12, 2023

Cache the file contents, perhaps? Isolate actual file I/O to dedicated threads and vend reads and writes from it? Buffer writes in-memory, only flushing at some interval or when the buffer fills up? Use a DB server and not raw files?

Lots of ways to skin this cat, but it really depends on what the application is doing and why.
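As one illustration of the "dedicated I/O thread" option, here is a sketch with hypothetical names. A single-thread executor owns the FileChannel, and request threads only wait on a Future. If a request thread is interrupted, the wait is what gets interrupted, and the thread actually touching the channel never sees the interrupt.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical class: all channel access happens on one executor thread.
class FileIoThread implements AutoCloseable {
    private final ExecutorService io = Executors.newSingleThreadExecutor();
    private final FileChannel channel;

    FileIoThread(Path path) throws IOException {
        this.channel = FileChannel.open(path, StandardOpenOption.READ);
    }

    byte[] read(long position, int length) throws IOException, InterruptedException {
        Future<byte[]> result = io.submit(() -> {
            ByteBuffer dst = ByteBuffer.allocate(length);
            while (dst.hasRemaining() && channel.read(dst, position + dst.position()) > 0) {
                // keep reading until the buffer is full or EOF is hit
            }
            return Arrays.copyOf(dst.array(), dst.position());
        });
        try {
            // An interrupt of the caller surfaces here as InterruptedException.
            // The task is deliberately not cancelled with cancel(true), since
            // that would interrupt the I/O thread and close the channel.
            return result.get();
        } catch (ExecutionException e) {
            if (e.getCause() instanceof IOException) {
                throw (IOException) e.getCause();
            }
            throw new IllegalStateException(e.getCause());
        }
    }

    @Override
    public void close() throws IOException {
        io.shutdown(); // not shutdownNow(), which would interrupt the I/O thread
        channel.close();
    }

    /** Interrupts the calling thread, then checks the channel still works. */
    static String demo() throws IOException, InterruptedException {
        Path tmp = Files.createTempFile("io-thread-demo", ".txt");
        Files.write(tmp, "hello".getBytes(StandardCharsets.US_ASCII));
        try (FileIoThread file = new FileIoThread(tmp)) {
            Thread.currentThread().interrupt();
            try {
                file.read(0, 5);
            } catch (InterruptedException possible) {
                // may or may not happen, depending on timing
            }
            Thread.interrupted(); // make sure the flag is clear either way
            byte[] bytes = file.read(0, 5);
            return new String(bytes, StandardCharsets.US_ASCII) + " open=" + file.channel.isOpen();
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

The trade-off is the one raised elsewhere in the thread: a single I/O thread serializes all file access, so a small pool with one channel per worker may be a better fit for heavier workloads.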

agilob · on March 12, 2023

I don't know much about these file descriptors but in this case I would protect it using ReentrantLock or Monitor from Guava

https://docs.oracle.com/javase/7/docs/api/java/util/concurre...

They are more powerful than synchronized and can produce log messages when things go sideways with locking, dead locking or external crashes.

CodesInChaos · on March 12, 2023

Would not using these interrupts be a realistic alternative?

pkolaczk · on March 12, 2023

Actually this was the way we solved it. Although, that's still quite risky and I'd not recommend it as a good workaround.

p4l4g4 · on March 12, 2023

I would say you can get a long way with trying to prevent your own code from emitting interrupts. But how do you stop libraries or the JVM from emitting them?

brazzy · on March 12, 2023

The problem is that the interrupts may be used by library or framework code that you're not aware of, in hard-to-reproduce edge cases.

jesprenj · on March 12, 2023

Don't write multithreaded software (:

yjftsjthsd-h · on March 12, 2023

I mean, unironically if you can[0] that's probably not a bad approach - just document (or programmatically ensure) that your code isn't multi thread safe and call it a day.

[0] A major caveat, but often true for "boring" business logic applications.

mcapodici · on March 12, 2023

I have done barely any multithreaded coding, but if I had to I would look at akka etc. first; i.e. use a framework!

marginalia_nu · on March 12, 2023

Java's actually got pretty sane multithreading support. Some classes are doing things that are a bit unintuitive (like this one), but once you grok the Java Memory Model it's actually very straightforward.

exabrial · on March 13, 2023

I agree with this. Memory safety, plus concurrency built into the memory model, and very well documented API pretty much make it a breeze.

midoBB · on March 13, 2023

Akka is the only way I could be sure that my concurrent code would most likely not set everything on fire. Especially Typed Akka.

yjftsjthsd-h · on March 12, 2023

Oh sure - there are lots of great tools to do it these days. My point is mostly that in most cases you can just skip it altogether.

karmakaze · on March 12, 2023

That's the XP way of not sharing FileChannels.

cutler · on March 12, 2023

Solution: use Clojure's STM. If you listen to Rich Hickey's early presentations following Clojure 1.0 he detailed how spending over a decade trying to write concurrent code in Java with locks drove him to search for something simpler.

MrBuddyCasino · on March 12, 2023

STM is not that fast. The easier solution is to switch to virtual threads and sidestep the issue.

eternalban · on March 13, 2023

Clojure's STMs (originally borrowed from Scala iirc) are not going to help you with concurrent file IO.