The difference between APIs and SPIs
Bear Giles | May 19, 2018
At work we implement a number of filesystems via the java.nio.file.spi.FileSystemProvider SPI. This lets us refer to files by URL in our code, and adding support for a new filesystem requires nothing more than implementing a dozen or so classes in an abstraction layer. Our application code doesn’t need to change beyond, perhaps, adding an entry to a pulldown menu or an account class that captures a new type of authentication information. (Examples of the latter are AWS credentials, or a Kerberos principal and keytab file.)
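To make the application-side view concrete, here is a minimal sketch of code that only ever sees a location as a URI and lets java.nio pick the provider from the scheme. The class and method names (UriFileAccess, readText, writeTempFile) are made up for illustration, and the demo sticks to file:// because that provider is present in every JVM. A custom scheme would resolve the same way only if its provider is installed.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.spi.FileSystemProvider;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; none of these names come from our code base.
class UriFileAccess {

    /** Lists the URI schemes of the providers installed in this JVM. */
    static List<String> installedSchemes() {
        List<String> schemes = new ArrayList<>();
        for (FileSystemProvider provider : FileSystemProvider.installedProviders()) {
            schemes.add(provider.getScheme());
        }
        return schemes;
    }

    /** Reads a UTF-8 text file; the URI scheme selects the provider. */
    static String readText(URI location) {
        try {
            Path path = Paths.get(location);
            return new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Demo helper: writes text to a temporary file and returns its file: URI. */
    static URI writeTempFile(String text) {
        try {
            Path tmp = Files.createTempFile("spi-demo", ".txt");
            tmp.toFile().deleteOnExit();
            Files.write(tmp, text.getBytes(StandardCharsets.UTF_8));
            return tmp.toUri();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Installed schemes: " + installedSchemes());
        URI location = writeTempFile("hello");
        System.out.println(location + " contains: " + readText(location));
    }
}
```

Nothing in readText mentions a particular filesystem, which is why supporting a new one doesn’t have to touch the calling code.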
The java.nio SPI is stable, so it’s literally a case where we can write it once, test it well, and then forget about it. We can eventually remove it from our primary source tree and use it like any other library. We aren’t quite there yet since some servers don’t fully implement their RFC standard and we have to find workarounds. (I’m looking at you, Windows FTP server.) But soon….
A recent issue highlighted that some developers didn’t understand the key difference between an API and an SPI:
An API (Application Programming Interface) is a contract from the library to the developer. The developer can use as much or as little of the API as they wish.
An SPI (Service Provider Interface) is a contract from the developer to the library. The developer must fully implement the SPI according to the specification. Ideally the test coverage is based on a reference implementation, e.g., I wrote my hdfs:// tests by writing analogous file:/// tests and verifying that the results matched, instead of basing them solely on my understanding of the SPI javadoc. There were some significant differences.
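The two directions of the contract can be sketched with a toy library. Everything here is hypothetical (BlobStoreProvider, BlobStores, InMemoryBlobStore, and the mem:// scheme are invented for this example), and providers are registered by hand rather than discovered the way java.nio does it. The point is only who owes what to whom: the provider has to implement every method of the SPI, while the caller picks whichever API methods it needs.

```java
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// All types below are hypothetical and exist only to illustrate the idea.
class ApiVersusSpi {

    /** SPI: what an implementer owes the library. Every method is required. */
    interface BlobStoreProvider {
        String scheme();
        byte[] read(URI location);
        void write(URI location, byte[] data);
    }

    /** API: what the library offers callers. Use as much or as little as you like. */
    static final class BlobStores {
        private static final Map<String, BlobStoreProvider> PROVIDERS = new HashMap<>();

        private BlobStores() {
        }

        static void register(BlobStoreProvider provider) {
            PROVIDERS.put(provider.scheme(), provider);
        }

        static byte[] read(URI location) {
            return providerFor(location).read(location);
        }

        static void write(URI location, byte[] data) {
            providerFor(location).write(location, data);
        }

        // Dispatches on the URI scheme, in the same spirit as java.nio.
        private static BlobStoreProvider providerFor(URI location) {
            BlobStoreProvider provider = PROVIDERS.get(location.getScheme());
            if (provider == null) {
                throw new IllegalArgumentException("No provider for scheme: " + location.getScheme());
            }
            return provider;
        }
    }

    /** A provider written by "the developer": it fulfils the whole SPI. */
    static final class InMemoryBlobStore implements BlobStoreProvider {
        private final Map<URI, byte[]> blobs = new HashMap<>();

        @Override
        public String scheme() {
            return "mem";
        }

        @Override
        public byte[] read(URI location) {
            byte[] data = blobs.get(location);
            if (data == null) {
                throw new IllegalArgumentException("No such blob: " + location);
            }
            return data.clone();
        }

        @Override
        public void write(URI location, byte[] data) {
            blobs.put(location, data.clone());
        }
    }

    public static void main(String[] args) {
        BlobStores.register(new InMemoryBlobStore());
        URI location = URI.create("mem://bucket/greeting.txt");
        BlobStores.write(location, "hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(BlobStores.read(location), StandardCharsets.UTF_8));
    }
}
```

Code that only ever calls BlobStores.read is a perfectly valid user of the API. An InMemoryBlobStore that skipped write would not be a valid provider.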
There are two caveats to the requirement that a developer fully implement the SPI. First, many SPIs specify that some methods can throw UnsupportedOperationException. These methods are optional and the developer can safely throw that exception if they don’t want to implement the method for some reason. Second, if the developer knows, with absolute certainty, which SPI methods will never be called then they can throw an exception if these methods are called. This is extremely risky though, since methods you don’t need today have a tendency to become critical a few years later.
On the one hand we want to make the most efficient use of our time and it’s wasteful to implement something we’ll never need. On the other hand, if we do need that functionality later, it’s far more efficient to implement everything at once while all of the nuances are fresh in our minds. It can be costly to refamiliarize yourself with the servers and code later.
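Here is a rough sketch of the first caveat, an optional SPI method. The SPI (StoreProvider), its implementation (MapStore), and the linkOrCopy helper are all hypothetical. The pattern imitates the way java.nio’s FileSystemProvider treats operations such as createSymbolicLink, whose default implementation throws UnsupportedOperationException. The copy fallback on the caller’s side is my own assumption about what a reasonable caller might do, not something any SPI requires.

```java
import java.util.HashMap;
import java.util.Map;

// All types below are hypothetical and exist only to illustrate the idea.
class OptionalSpiMethods {

    /** SPI with two required methods and one optional one. */
    interface StoreProvider {
        byte[] get(String key);

        void put(String key, byte[] value);

        /** Optional operation: providers that can't link simply inherit this. */
        default void createLink(String link, String target) {
            throw new UnsupportedOperationException("createLink is optional and not implemented");
        }
    }

    /** Implements only the required methods; createLink keeps its default. */
    static final class MapStore implements StoreProvider {
        private final Map<String, byte[]> entries = new HashMap<>();

        @Override
        public byte[] get(String key) {
            byte[] value = entries.get(key);
            if (value == null) {
                throw new IllegalArgumentException("No such key: " + key);
            }
            return value.clone();
        }

        @Override
        public void put(String key, byte[] value) {
            entries.put(key, value.clone());
        }
    }

    /**
     * Caller-side handling: try the optional operation and fall back to a
     * plain copy if the provider doesn't support it. Returns true if a real
     * link was created, false if the fallback copy was used.
     */
    static boolean linkOrCopy(StoreProvider store, String link, String target) {
        try {
            store.createLink(link, target);
            return true;
        } catch (UnsupportedOperationException e) {
            store.put(link, store.get(target));
            return false;
        }
    }

    public static void main(String[] args) {
        MapStore store = new MapStore();
        store.put("original", new byte[] {42});
        boolean linked = linkOrCopy(store, "alias", "original");
        System.out.println("linked=" + linked + ", alias[0]=" + store.get("alias")[0]);
    }
}
```

Because the SPI itself declares createLink optional, callers know to expect the exception. Throwing from a required method gives them no such warning, which is what makes the second caveat so risky.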
In our particular case, having a full implementation wouldn’t have solved the immediate problem. However, the discussion brought up the difference between APIs and SPIs, and I decided that this is a common enough confusion to be worth discussing.