Random Stuff of Randomness

Simon's

Java NIO and NIO2 Overview

The Java NIO and NIO2 interfaces bring some improvements to the file/stream interfaces. Let’s see what has changed since the dark ages of the inter-blago-tubes of the 90s. Here is a brief history of the Java I/O interfaces:

  • The first is Java I/O (java.io), introduced in 1996 in the very first version of the JDK.
  • The second, Java NIO, was added to the JDK in 2002 with Java 1.4.
  • The third, Java NIO2, was introduced in 2011 with Java 7.

Over time, the following improvements have been added: buffered readers/writers, bidirectional channels, off-heap buffering, proper support for charsets, and asynchronous operations.

Probably the biggest difference between java.io and java.nio, however, is that java.nio can be used in non-blocking mode. This means that a single thread can serve multiple channels instead of dedicating one blocked thread to each stream. In theory, this should improve performance, because switching between threads comes at a cost.
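
To make the "one thread, many channels" idea concrete, here is a minimal sketch. It uses java.nio.channels.Pipe instead of real sockets so that it runs without a network. The class name SelectorSketch and the method readAll are made up for this illustration, and the code assumes short, non-empty messages that arrive in a single read.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class SelectorSketch {

    /** Reads one message per pipe, using a single thread and a single Selector. */
    static List<String> readAll(List<String> messages) throws IOException {
        List<String> received = new ArrayList<>();
        List<Pipe> pipes = new ArrayList<>();
        try (Selector selector = Selector.open()) {
            for (String message : messages) {
                Pipe pipe = Pipe.open();
                pipes.add(pipe);
                // Only channels in non-blocking mode can be registered with a selector.
                pipe.source().configureBlocking(false);
                pipe.source().register(selector, SelectionKey.OP_READ);
                // Stand-in for data arriving from a client (assumed non-empty).
                pipe.sink().write(ByteBuffer.wrap(message.getBytes(StandardCharsets.UTF_8)));
            }

            ByteBuffer buffer = ByteBuffer.allocate(256);
            while (received.size() < messages.size()) {
                selector.select(); // blocks until at least one channel is readable
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove();
                    Pipe.SourceChannel source = (Pipe.SourceChannel) key.channel();
                    buffer.clear();
                    if (source.read(buffer) > 0) {
                        buffer.flip();
                        received.add(StandardCharsets.UTF_8.decode(buffer).toString());
                        key.cancel(); // this sketch expects one message per channel
                    }
                }
            }
        } finally {
            for (Pipe pipe : pipes) {
                pipe.sink().close();
                pipe.source().close();
            }
        }
        return received;
    }

    public static void main(String[] args) throws IOException {
        // The order of the results depends on which channel is selected first.
        System.out.println(readAll(Arrays.asList("alpha", "beta", "gamma")));
    }
}
```

With java.io streams, the same job would typically need one blocked thread per stream. A real server would register SocketChannel instances in the same way.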

I/O

NIO

The motivation for NIO

Early Java versions had some shortcomings in the java.io package [Ref]:

Primary shortcomings:

  • No buffered streams (always byte-by-byte access).
  • No bidirectional transfer: a stream was either a read stream or a write stream.
  • No internal charset support, which meant working on byte streams with manual conversion if needed.
  • Streams were always blocking.
  • Buffered readers/writers used heap memory and were therefore managed by the garbage collector (negative performance impact on large data).

Secondary shortcomings:

  • The File class lacked even commonly needed functionality. For instance, it had no copy method to copy a file or directory.
  • The File class defined many methods that returned a boolean value. In case of an error, they returned false rather than throwing an exception, so the developer had no way of knowing why the call failed.
  • The File class did not provide good support for handling symbolic links.
  • The File class handled directories and paths in an inefficient way (it did not scale well).
  • The File class provided access to a very limited set of file attributes, which was insufficient in many situations.
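
The point about boolean return values is easiest to see side by side. The following sketch deletes a file that does not exist, once with java.io.File and once with java.nio.file.Files (added later in Java 7). The class name DeleteSketch and its method names are made up for this example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class DeleteSketch {

    /** java.io.File: a failure is reported only as 'false', without a reason. */
    static boolean deleteWithFile(Path path) {
        return path.toFile().delete();
    }

    /** java.nio.file.Files: a failure is reported as a specific exception. */
    static String deleteWithFiles(Path path) {
        try {
            Files.delete(path);
            return "deleted";
        } catch (IOException e) {
            // For a missing file this is a NoSuchFileException.
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("delete-sketch");
        Path missing = dir.resolve("does-not-exist.txt");

        System.out.println(deleteWithFile(missing));  // no hint about what went wrong
        System.out.println(deleteWithFiles(missing)); // names the actual problem

        Files.delete(dir);
    }
}
```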

Improvements in NIO

To overcome these problems, Java introduced NIO (New I/O) in Java 1.4. The key features of NIO were:

  • Channels and Selectors: A channel is an abstraction over lower-level file system features (such as memory-mapped files and file locking) that lets you transfer data faster. Channels can be bidirectional, and selectable channels such as SocketChannel can run in non-blocking mode. A selector lets one thread wait on many such channels at once.
  • Buffers: The java.nio.Buffer class and its subclasses offer operations such as clear, flip, mark, reset, and rewind. Buffers can be kept off-heap and can be very large (think gigabytes, although a single buffer is capped at about 2 GB) without burdening the garbage collector.
  • Charset: java.nio.charset includes encoders and decoders that map between bytes and Unicode characters.
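
The three features above can be combined in a few lines. This is a minimal sketch, not a recommended way to read text files: it encodes a string with a charset, writes it through a FileChannel, and reads it back through the same channel into an off-heap (direct) buffer. The class name BufferChannelSketch and the method roundTrip are made up for this illustration. FileChannel.open and java.nio.file.Path come from Java 7 and are used here only for brevity.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class BufferChannelSketch {

    /** Writes text to a file and reads it back through the same FileChannel. */
    static String roundTrip(Path file, String text) throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE)) {

            // Charset: encode characters into bytes. The returned buffer is ready to be drained.
            ByteBuffer encoded = StandardCharsets.UTF_8.encode(text);
            int length = encoded.remaining();
            while (encoded.hasRemaining()) {
                channel.write(encoded);
            }

            // Off-heap buffer: its memory lives outside the garbage-collected heap.
            ByteBuffer direct = ByteBuffer.allocateDirect(length);

            // Bidirectional: the same channel is now used for reading.
            channel.position(0);
            while (direct.hasRemaining() && channel.read(direct) != -1) {
                // keep reading until the buffer is full or the file ends
            }

            direct.flip(); // switch the buffer from "being filled" to "being drained"
            return StandardCharsets.UTF_8.decode(direct).toString();
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("buffer-sketch", ".txt");
        System.out.println(roundTrip(file, "hello NIO"));
        Files.delete(file);
    }
}
```

Forgetting the flip call is the classic mistake here: without it, the decoder would start at the current position, which is the end of the data, and find nothing to read.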

Improvements in NIO.2

With Java 7 (also known as Java 1.7), Java introduced more comprehensive support for I/O operations under the name NIO.2. Java 7 introduced:

  • File package: java.nio.file offers better handling of symbolic links, comprehensive attribute access, and extended file system support.
  • Path: A Path object represents a hierarchical sequence of directory and file names separated by a delimiter. The root component is furthest to the left, and the file name is furthest to the right.
  • Asynchronous channel APIs: Asynchronous counterparts of existing channels were added to the java.nio.channels package. Simply put, their class names are prefixed with the word Asynchronous (for example, AsynchronousFileChannel).
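
Here is a small sketch of how these pieces fit together: it builds paths, copies a file (something java.io.File could not do), reads an attribute, and performs an asynchronous read. The class name Nio2Sketch and its method names are made up for this example. It also assumes the file is small enough for a single read to return all of it.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

public class Nio2Sketch {

    /** Copies a file and returns the size of the copy, taken from its attributes. */
    static long copyAndGetSize(Path source, Path target) throws IOException {
        Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
        BasicFileAttributes attrs = Files.readAttributes(target, BasicFileAttributes.class);
        return attrs.size();
    }

    /** Starts an asynchronous read and collects the result through a Future. */
    static String readAsync(Path file)
            throws IOException, InterruptedException, ExecutionException {
        try (AsynchronousFileChannel channel =
                     AsynchronousFileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buffer = ByteBuffer.allocate((int) Files.size(file));
            Future<Integer> pending = channel.read(buffer, 0); // returns immediately
            // Other work could be done here while the read is in flight.
            pending.get(); // wait for the read to complete
            buffer.flip();
            return StandardCharsets.UTF_8.decode(buffer).toString();
        }
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("nio2-sketch");
        Path source = dir.resolve("source.txt");
        Path target = dir.resolve("copy.txt");
        Files.write(source, "hello NIO.2".getBytes(StandardCharsets.UTF_8));

        System.out.println(target.getFileName());          // right-most path element
        System.out.println(Files.isSymbolicLink(source));  // explicit symbolic link check
        System.out.println(copyAndGetSize(source, target));
        System.out.println(readAsync(target));

        Files.delete(target);
        Files.delete(source);
        Files.delete(dir);
    }
}
```

Besides the Future style shown here, the asynchronous channels also accept a CompletionHandler callback, which avoids blocking on get.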

Performance

Claims that NIO is faster than I/O are floating around the internet.

This seems to depend heavily on the use case and on the hardware and operating system. For example, this benchmark from 2012 claims that NIO in a single-threaded environment performs worse on file operations (approx. half the throughput compared to streams).

Another benchmark from 2010 looked at socket throughput in NIO compared to I/O. It concluded that NIO performs slightly better as more clients connect. With fewer clients, however, performance was pretty similar:

Echo Throughput Test

Another benchmark, conducted in 2017, was published at github.com/romromov/java-io-benchmark. It compared performance across Linux and Windows.

All read speeds
All write speeds

Their conclusion:

What we have seen in our small survey, is that channels are faster than streams if buffer size is selected appropriately. Direct byte buffers are faster than indirect ones. Also we cannot confirm a common opinion that first opening of a direct byte buffer is always slower that the following. This was only seen on hdd1 system.

It is clear that I/O performance heavily depends on configuration of the system. In particular, cache size, file system page size, RAM etc. If you need high performance consequent I/O on Java you need to experiment and compare. I would consider approaches according to the following priority:

  • Memory-mapped files
  • Direct ByteBuffer
  • Indirect ByteBuffer / streams
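
For reference, the first option in that list, a memory-mapped file, could look roughly like this. It is a minimal sketch and not code from the benchmark. The class name MappedFileSketch and the method sumBytes are made up for this example.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MappedFileSketch {

    /** Sums all bytes of a file (as unsigned values) through a read-only memory mapping. */
    static long sumBytes(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // The operating system pages the file content in on demand.
            // A single mapping is limited to Integer.MAX_VALUE bytes.
            MappedByteBuffer mapped =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            long sum = 0;
            while (mapped.hasRemaining()) {
                sum += mapped.get() & 0xFF; // convert the signed byte to 0..255
            }
            return sum;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("mapped-sketch", ".bin");
        // On some platforms a mapped file cannot be deleted while the mapping
        // is still alive, so the cleanup is deferred to JVM exit.
        file.toFile().deleteOnExit();
        Files.write(file, new byte[] {1, 2, 3, (byte) 250});
        System.out.println(sumBytes(file));
    }
}
```

Whether this beats a direct ByteBuffer or a plain stream depends on the system, as the quote above says, so it is worth measuring on the target machine.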

Conclusion

On modern platforms there does not seem to be a general performance difference between java.io and java.nio. Performance gains (one way or another) seem to be highly situational (concurrency, hardware, network/files, etc.).

Being able to use multiple channels in one thread makes programming much simpler and less error-prone. Also, channels and selectors are conceptually much closer to what the POSIX select and poll system calls do: event handling.

Depending on your JVM implementation and underlying hardware, there could be a significant benefit to using NIO on large streams, because working with off-heap memory seems to be faster.

Bibliography

  • S G Ganesh, Tushar Sharma: Oracle Certified Professional Java SE 7 Programmer Exams 1Z0-804 and 1Z0-805.
    Apress, 1st edition (February 26, 2013). ISBN 978-1430247647.