Maintaining Compatibility in Kotlin Libraries

2019.03.30. 22h • Márton Braun

A decent library author will know that, say, removing a method from their library in a new version will cause problems for clients that rely on that method. When updating, they’ll have to find the changelog (which hopefully exists), and find out what other way there is - if any - to achieve the functionality they had before, using the new version of the library.

The time the library’s client has to spend on this migration after a breaking change is time taken away from making progress on their actual project. They pay this price merely to maintain their existing functionality. Sure, they could just stick to the old version, but what if the new version brings crucial bugfixes or security updates? (“Of course”, such updates shouldn’t have API changes or new features at all, unless the fixes themselves necessitate them.)

The blatant removal of a method would probably be an obvious red flag for any developer, but there are much more subtle ways to break compatibility for your clients. We’ll touch on three topics:

  • Source compatibility
  • Binary compatibility
  • Deprecation

Source compatibility

The simplest form of compatibility to maintain is source compatibility, also referred to as API (Application Programming Interface) compatibility. This is what most people will think of when they talk about breaking changes in libraries, and this is what most library authors tend to keep in mind.

Source compatibility means that code written against the old version of the library should still compile against the new version. This can be verified if you’re using all of the API of your own library somewhere - for example, in the library’s unit tests. If you broke your tests, you’ve broken client code.

Let’s take an example library, called adder, which contains a single function that lets you add two numbers together. This is the 1.0 implementation:

package co.zsmb.example.adder

fun add(x: Int, y: Int): Int { // 1.0
    return x + y
}

If we publish this library, client applications can start using it:

package co.zsmb.example.app

import co.zsmb.example.adder.add

fun main() {
    println(add(1, 2)) // 3
}

So far, so good! Now, what if we want to make a change and let users add three numbers together instead of two? The naive way would be to just add the third parameter:

package co.zsmb.example.adder

fun add(x: Int, y: Int, z: Int): Int { // 2.0 attempt
    return x + y + z
}

This change would break source compatibility. The application code seen in the previous snippet would no longer compile, because the add function can no longer be called with two parameters.

If we want to keep existing clients happy, we can provide them source compatibility by giving the new parameter a default value. This way, existing code passing in just two parameters won’t be broken, and clients who update to the new version also have the opportunity to add up three numbers in one fell swoop! A seemingly perfect 2.0.

package co.zsmb.example.adder

fun add(x: Int, y: Int, z: Int = 0): Int { // 2.0
    return x + y + z
}

If we’re feeling really fancy, we can even make a 3.0 later on, now with a vararg parameter, so that clients can add up as many whole numbers as they want. This, again, is source compatible with both the 1.0 and 2.0 versions, as the add function is still in the same package, and can be called with 2 or 3 arguments.

Edit: This isn’t entirely source compatible, since calls made with named arguments would break, as pointed out by ilya-g on Reddit.

package co.zsmb.example.adder

fun add(vararg numbers: Int): Int { // 3.0
    var sum = 0
    for (n in numbers) {
        sum += n
    }
    return sum
}
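The named-argument caveat is easy to reproduce. The sketch below copies the 2.0 signature into a standalone file (the package line is dropped purely for illustration) and calls it the way a client might. The comments describe what would happen to the same calls against the 3.0 signature.

```kotlin
// The 2.0 signature from above, copied into a standalone file for illustration.
fun add(x: Int, y: Int, z: Int = 0): Int {
    return x + y + z
}

fun main() {
    // A positional two-argument call compiles against 1.0, 2.0 and 3.0 alike.
    println(add(1, 2)) // 3

    // Named calls compile here, because 2.0 still has parameters named x, y and z.
    println(add(x = 1, y = 2)) // 3
    println(add(1, 2, z = 3)) // 6

    // Against the 3.0 signature, `fun add(vararg numbers: Int)`, neither named
    // call would compile: there is no longer a parameter called x, y or z.
}
```

Parameter names are part of the source-level API in Kotlin, so renaming or removing them deserves the same care as changing a function's name.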

So… Did we forget anything here at any point?

Binary compatibility

We’ve made a huge assumption, namely, that whatever code uses our library will be recompiled after they update the library’s version. The grim reality is that there are a couple of ways we can end up with bytecode compiled against the 1.0 of the library calling into the 2.0 (or 3.0) implementation.

So our library doesn’t only have its regular interface that we see from source code (the API) to maintain, but a binary interface (ABI, Application Binary Interface) as well. This means that we’ll need binary compatibility from the adder library, if possible.

Let’s look at two ways we can produce the version mismatch alluded to above.

The first way is rather simple, as the assumption of client code always being recompiled is far from being true when working on the JVM. We could easily create a compiled .jar of our application, and run it by putting both it and the library’s 1.0 .jar on the classpath. In this scenario, we could update the library to 2.0 by swapping its .jar to a newer one, without recompiling our own application that references it.

JAR replacement illustrated


This doesn’t tend to be an issue when running on Android, since the application code itself is always recompiled after updating a dependency, in order to produce a new APK. No easy swapping of .jars on the classpath here. It would seem that source compatibility would suffice, but there’s a catch - let me show you the second way to get a version discrepancy.

We’ll create another library, this time to calculate the average of two numbers. This library will depend on our adder library to add the numbers together.

package co.zsmb.example.averager

import co.zsmb.example.adder.add

fun average(x: Int, y: Int): Double {
    val sum = add(x, y)
    return sum / 2.0
}

We’ll use a command line application as our example, which we’ll assume is recompiled every time, as would be the case for an Android app. This application will depend on both of the libraries.

Structure of the three modules

package co.zsmb.example.app

import co.zsmb.example.adder.add
import co.zsmb.example.averager.average

fun main() {
    println(add(1, 2))
    println(average(2, 3))
}

If we swap the adder library to the 2.0 version in our dependencies, the code in our main function will continue to work thanks to source compatibility. The averager library will now call into this new 2.0 version as well, since there can’t be two versions included at the same time. Note that since averager is included as a .jar, it won’t be recompiled.


We’ve now seen two ways of ending up with a .jar file containing code that wants to call into the 1.0 of the adder library, depending on a .jar that actually contains the 2.0 version. (And the same could, of course, be done with version 3.0). We won’t have any compilation errors along the way in these scenarios, but at runtime, we’d see crashes.

This happens because while there’s still an add function that can be called with two parameters in the newer versions, it’s not the same add function. Not as far as the bytecode is concerned.

This is what a call to the original, 1.0 add function looks like in our averager library’s bytecode:

ILOAD 0
ILOAD 1
INVOKESTATIC co/zsmb/example/adder/AdderKt.add (II)I

It loads the two parameters onto the stack, and then calls the static add method of the AdderKt class. This is the class that the compiler generated to wrap our top-level function, as the JVM won’t allow a function without a class to escort it at all times at the bytecode level. We see the package of our generated class prefixing it, which is separated by slashes in the bytecode, but is otherwise the same as in the source code.

The really important part here is the signature of the add method, which we can also make out: (II)I. The parentheses represent the parameter list, which contains two primitive integers. The return value noted afterwards is also a primitive int.
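If you'd like to get a feel for this descriptor notation, the JDK can print it for you. The following is a small sketch using java.lang.invoke.MethodType from the standard library; the descriptor helper and the intType name are made up for this example and aren't part of the adder library.

```kotlin
import java.lang.invoke.MethodType

// The JVM's primitive `int` type (not the boxed java.lang.Integer).
val intType: Class<*> = Int::class.javaPrimitiveType!!

// Hypothetical helper: builds the JVM descriptor for a method with the
// given return type and parameter types.
fun descriptor(returnType: Class<*>, vararg parameterTypes: Class<*>): String {
    return MethodType.methodType(returnType, parameterTypes).toMethodDescriptorString()
}

fun main() {
    // Two int parameters, int return value: the 1.0 add function.
    println(descriptor(intType, intType, intType)) // (II)I

    // Three int parameters.
    println(descriptor(intType, intType, intType, intType)) // (III)I

    // A single int array parameter, which is what a vararg of Int compiles to.
    println(descriptor(intType, IntArray::class.java)) // ([I)I
}
```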

If we look at the bytecode produced by the 1.0 of the adder library, we’ll see a class and a method in it that matches this signature:

public final class co/zsmb/example/adder/AdderKt {
  public final static add(II)I // 1.0
}

Here’s an overview of how the code we’ve seen so far interacts:

Module structure with bytecode snippets

However, in the 2.0 and 3.0 versions, this method won’t exist anymore, and we’ll get this crash when our code tries to invoke it:

Exception in thread "main" java.lang.NoSuchMethodError: 
  co.zsmb.example.adder.AdderKt.add(II)I
	at co.zsmb.example.averager.AveragerKt.average(Averager.kt:6)
	at co.zsmb.example.app.MainKt.main(Main.kt:9)
	at co.zsmb.example.app.MainKt.main(Main.kt)

You can see the signature of the method the calling code expected in the exception, which you can now understand!

To show you the entire picture, this is what the 2.0 bytecode looks like.

public final class co/zsmb/example/adder/AdderKt {
  public final static add(III)I
  public static synthetic add$default(IIIILjava/lang/Object;)I
}

There is now a three-parameter add method in the class. There’s also a synthetic method that the Kotlin compiler translates any two-parameter calls into. This method calls into the first one, passing along the first two parameters, and adding the default 0 as the third. (It also actually has five parameters, but why that is the case is beyond the scope of our current investigation.)

Module structure with now incompatible bytecode snippets

Finally, in the 3.0 version, as we’d expect, there’s an add method that takes an array of integers, marked by [I.

public final class co/zsmb/example/adder/AdderKt {
  public final static varargs add([I)I
}

Neither of these versions has the original method, hence the crashes if we update the implementation called by the client code without recompiling it.
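The lookup-by-exact-signature behaviour can be imitated in plain code. This is only an analogy for what the JVM does when it links a call, not the actual linking mechanism: the AdderV2 object is a hypothetical stand-in for a class that only has a three-int add method, and hasStaticAdd is a helper invented for the example.

```kotlin
import java.lang.invoke.MethodHandles
import java.lang.invoke.MethodType

// Hypothetical stand-in for a newer AdderKt class: only a three-int `add` exists.
object AdderV2 {
    @JvmStatic
    fun add(x: Int, y: Int, z: Int): Int = x + y + z
}

// Returns true if AdderV2 has a static `add` method with exactly this signature.
fun hasStaticAdd(type: MethodType): Boolean {
    return try {
        MethodHandles.lookup().findStatic(AdderV2::class.java, "add", type)
        true
    } catch (e: NoSuchMethodException) {
        false
    }
}

fun main() {
    val int = Int::class.javaPrimitiveType!!

    // (II)I: what code compiled against 1.0 asks for. It is not there.
    println(hasStaticAdd(MethodType.methodType(int, int, int))) // false

    // (III)I: the method that actually exists.
    println(hasStaticAdd(MethodType.methodType(int, int, int, int))) // true
}
```

A method that "could accept" two integers at the source level is of no help here: the name and the full descriptor have to match exactly.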

What could’ve been done to maintain binary compatibility? We could’ve kept our old methods when making our updates. By our 3.0, we would’ve ended up with this library code:

package co.zsmb.example.adder

fun add(x: Int, y: Int): Int { // 1.0
    return x + y
}

fun add(x: Int, y: Int, z: Int = 0): Int { // 2.0
    return x + y + z
}

fun add(vararg numbers: Int): Int { // 3.0
    var sum = 0
    for (n in numbers) {
        sum += n
    }
    return sum
}

All three signatures now exist in the bytecode, so any code already compiled against the 1.0 or 2.0 implementation will continue to work.

public final class co/zsmb/example/adder/AdderKt {
  public final static add(II)I
  public final static add(III)I
  public static synthetic add$default(IIIILjava/lang/Object;)I
  public final static varargs add([I)I
}

Any newly compiled code against this 3.0 version will pick the two-parameter method if it’s called with two parameters, the three-parameter one if it’s invoked with three, and the vararg one otherwise.
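To see this overload selection in action, here's a sketch with the same three parameter lists. The one change made for illustration is that each function returns a label instead of a sum, so we can tell which overload the compiler picked.

```kotlin
// Same parameter lists as the three add functions above; the String return
// values are an assumption made for this demo, to reveal the chosen overload.
fun add(x: Int, y: Int): String = "two-parameter"

fun add(x: Int, y: Int, z: Int = 0): String = "three-parameter"

fun add(vararg numbers: Int): String = "vararg"

fun main() {
    println(add(1, 2))       // two-parameter
    println(add(1, 2, 3))    // three-parameter
    println(add(1))          // vararg
    println(add(1, 2, 3, 4)) // vararg
}
```

The compiler prefers a candidate that needs no default values filled in, and prefers fixed parameters over a vararg, which is why the vararg version only wins when nothing else fits.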

Edit: The good news is that there’s tooling to help you with binary compatibility, as pointed out here.

Deprecation

We’ve maintained both source and binary compatibility, but clients using the 3.0 of the library from source code might not understand why there are so many add methods to choose from. If they wanted to add two numbers, all three of these would technically be able to do the job.

Luckily for us, Kotlin has amazing support for marking obsolete APIs as deprecated in the form of the @Deprecated annotation. While the java.lang.Deprecated annotation could only produce a warning along the lines of “x is deprecated”, the Kotlin version can provide a descriptive message, as well as set the level of the deprecation, which is our focus here.

It defines three deprecation levels:

  • Warning: this will produce a warning in client source code, which can be ignored or suppressed. The message provided in the annotation is shown.
    An example of the warning deprecation level
  • Error: this will produce an error in client source code, preventing it from being compiled. The message provided in the annotation is shown.
    An example of the error deprecation level
  • Hidden: the declaration will not show up in autocompletion, and will remain unresolved if typed in manually.
    An example of the hidden deprecation level
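As a rough side-by-side of the three levels, consider the sketch below. All four function names are hypothetical and exist only for this comparison. The calls that would not compile are left commented out so that the file as a whole still builds.

```kotlin
@Deprecated("Use newAdd instead", level = DeprecationLevel.WARNING)
fun oldAddWarning(x: Int, y: Int): Int = x + y

@Deprecated("Use newAdd instead", level = DeprecationLevel.ERROR)
fun oldAddError(x: Int, y: Int): Int = x + y

@Deprecated("Use newAdd instead", level = DeprecationLevel.HIDDEN)
fun oldAddHidden(x: Int, y: Int): Int = x + y

fun newAdd(x: Int, y: Int): Int = x + y

// The warning-level call compiles; the warning is suppressed here on purpose.
@Suppress("DEPRECATION")
fun main() {
    println(oldAddWarning(1, 2)) // 3

    // println(oldAddError(1, 2))  // compile error that shows the message
    // println(oldAddHidden(1, 2)) // compile error: the call cannot be resolved

    println(newAdd(1, 2)) // 3
}
```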

The exciting option for our case is the very last one. Hiding our old methods will maintain binary compatibility, as the old methods will still be present in the bytecode. Any code that’s recompiled against our library’s new version will now have all of its calls against the vararg add method in its bytecode.

package co.zsmb.example.adder

@Deprecated(
    message = "Superseded by the three-param version",
    level = DeprecationLevel.HIDDEN
)
fun add(x: Int, y: Int): Int { // 1.0
    return x + y
}

@Deprecated(
    message = "Superseded by the vararg version",
    level = DeprecationLevel.HIDDEN
)
fun add(x: Int, y: Int, z: Int = 0): Int { // 2.0
    return x + y + z
}

fun add(vararg numbers: Int): Int { // 3.0
    var sum = 0
    for (n in numbers) {
        sum += n
    }
    return sum
}

This ensures both source and binary compatibility. Clients can either recompile their code or not, but if they do, they’ll be seamlessly moved from the old implementations to the new one.
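Here's a compact way to watch that move happen at the source level. As a demo-only assumption, the functions return a label rather than a sum, and only the two-parameter overload is kept next to the vararg one to keep things short.

```kotlin
// Hidden from source code, but still compiled into the bytecode for old callers.
@Deprecated(
    message = "Superseded by the vararg version",
    level = DeprecationLevel.HIDDEN
)
fun add(x: Int, y: Int): String = "two-parameter (1.0)"

fun add(vararg numbers: Int): String = "vararg (3.0)"

fun main() {
    // The hidden overload is skipped during resolution, so this two-argument
    // call is compiled as a call to the vararg function.
    println(add(1, 2)) // vararg (3.0)
}
```

Without the annotation, the same call would have picked the two-parameter function, so hiding it is what redirects recompiled clients.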

Conclusion

Not complicating the lives of clients using a library is one of the most important duties of the maintainer. Any changes made to public API can break client code - in more ways than we might usually think about.

So make sure you know what’s public API in your library, and strive to maintain it as best you can. Provide users with appropriate migrations, and especially beware of breaking binary compatibility.