Coroutine Cancellation 101

2020-03-04 • Márton Braun

This article assumes that you already know the basics of coroutines, and what a suspending function is.

Starting a coroutine

The rule of thumb with coroutines is that suspending functions can only be called from suspending functions. But how do we make our first call to a suspending function, if we need to already be in a suspending function to do so? This is the purpose that coroutine builders serve. They let us bridge the gap between the regular, synchronous world and the suspenseful world of coroutines.

launch is usually the first coroutine builder that we encounter when learning coroutines. The “trick” with launch is that it’s a non-suspending function, but it takes a lambda parameter, which is a suspending function:

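// Simplified signature: the real launch is an extension on CoroutineScope,
// takes an extra `start` parameter, and its block has a CoroutineScope receiver.
fun launch(
    context: CoroutineContext = EmptyCoroutineContext,
    block: suspend () -> Unit
): Job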

It creates a new coroutine, which will execute the suspending block of code passed to it. launch is a fire-and-forget style coroutine builder, as it’s not supposed to return a result. So it returns immediately after starting the coroutine, while the started coroutine fires off asynchronously.
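To make the "returns immediately" part concrete, here is a minimal sketch. It assumes the coroutine is launched inside runBlocking, where the child only starts once the parent suspends; the function name launchOrder is made up for this illustration. With other dispatchers the ordering is not guaranteed.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper: records the order in which things happen around launch.
fun launchOrder(): List<String> = runBlocking {
    val events = mutableListOf<String>()
    val job = launch {
        events += "coroutine body"   // runs later, once the parent suspends
    }
    events += "launch returned"      // runs first: launch only schedules the block
    job.join()                       // suspend until the launched coroutine finishes
    events
}

fun main() {
    println(launchOrder()) // [launch returned, coroutine body]
}
```

The call to launch hands back control before the block has run at all.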

It does return a Job instance, which, according to the documentation…

is a cancellable thing with a life-cycle that culminates in its completion.

Basically, this Job represents a piece of work being performed for us by a coroutine, and can be used to keep track of that coroutine. We can check if it’s still running, or cancel it:

val job = GlobalScope.launch {
    println("Job is running...")
    delay(500)
    println("Job is done!")
}

Thread.sleep(200L)
if (job.isActive) {
    job.cancel()
}

delay is a handy suspending function that we can use inside coroutines to wait for a given amount of time in a non-blocking way.

Since the coroutine above is cancelled before the delay is over, only its first print statement will be executed.

Job is running...

This happens because we call cancel while the suspending delay call is happening in the coroutine.
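As a rough sketch of what the Job looks like afterwards, the snippet below cancels a coroutine in the middle of a delay and then reads its state flags. It uses runBlocking and cancelAndJoin instead of GlobalScope and Thread.sleep so that it can run as a standalone program; the function name cancelDuringDelay is only for illustration.

```kotlin
import kotlinx.coroutines.*

// Returns the job's (isActive, isCancelled, isCompleted) flags
// after cancelling it while it is suspended in delay.
fun cancelDuringDelay(): Triple<Boolean, Boolean, Boolean> = runBlocking {
    val job = launch {
        println("Job is running...")
        delay(500)                 // suspension point: cancellation is noticed here
        println("Job is done!")    // never reached
    }
    delay(200)                     // let the child reach its delay call
    job.cancelAndJoin()            // cancel, then wait for the coroutine to finish
    Triple(job.isActive, job.isCancelled, job.isCompleted)
}

fun main() {
    println(cancelDuringDelay()) // (false, true, true)
}
```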

Blocking execution

What if there were no suspension points in the coroutine, and its entire body was just blocking code? For example, if we replace the delay call with Thread.sleep:

val job = GlobalScope.launch {
    println("Job is running...")
    Thread.sleep(500L)
    println("Job is done!")
}

Thread.sleep(200L)
if (job.isActive) {
    job.cancel()
}

If we run the code again, we’ll see this output:

Job is running...
Job is done!

We’re in trouble: cancellation is now broken! It turns out that coroutines can only be cancelled cooperatively. While a coroutine is continuously running blocking code, it won’t notice that it has been cancelled.

Cooperation

Why doesn’t the Thread that the coroutine is running on get shut down forcibly? Because doing something like this would be dangerous. Whenever you write blocking code, you expect all those lines of code to be executed together, one after another. (Kind of like in a transaction!) If this gets cut off in the middle, completely unexpected things can happen in the application. Hence the cooperative approach instead.

So how do we cooperate? For one, we can call functions from kotlinx.coroutines that already support cancellation; delay was an example of this. If our coroutine is cancelled while we are waiting for delay, it will throw a JobCancellationException (a subtype of CancellationException) instead of returning normally. If our coroutine was cancelled some time before a call to delay, and this cancellation wasn’t handled, delay will also throw this exception as soon as it’s called.
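The exception can be observed directly. The sketch below is only an illustration (the function name observeCancellation is invented): it catches the public CancellationException type, since JobCancellationException itself is internal to the library, and rethrows it so that the coroutine still finishes as cancelled.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper: records what the coroutine sees when it is
// cancelled while suspended in delay.
fun observeCancellation(): List<String> = runBlocking {
    val events = mutableListOf<String>()
    val job = launch {
        try {
            events += "started"
            delay(500)               // throws once the job is cancelled
            events += "finished"     // skipped
        } catch (e: CancellationException) {
            events += "cancelled"
            throw e                  // rethrow so the coroutine completes as cancelled
        }
    }
    delay(200)
    job.cancelAndJoin()
    events
}

fun main() {
    println(observeCancellation()) // [started, cancelled]
}
```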

For example, let’s say that we have a list of entities to save to two different places, which we perform by calling these two simple, blocking functions:

fun saveToServer(entity: String)
fun saveToDisk(entity: String)

We don’t want to end up in a situation where we’ve saved an entity to one of these places, but not the other. We either want both of these calls to run for an entity, or neither of them.

A first approach to this problem would be to use withContext to suspend the caller and move the operation to another thread. The code below blocks a thread on the IO dispatcher for the entire length of the operation and contains no suspension points, so once it has started, a cancellation can’t interrupt it partway through:

suspend fun processEntities(entities: List<String>) = withContext(Dispatchers.IO) {
    entities.forEach { entity ->
        saveToServer(entity)
        saveToDisk(entity)
    }
}

However, we can also add cancellation support by manually checking whether our current coroutine has been cancelled. For example, we can do this after processing each entity:

suspend fun processEntities(entities: List<String>) = withContext(Dispatchers.IO) {
    entities.forEach { entity ->
        saveToDisk(entity)
        saveToServer(entity)
        if (!isActive) {
            return@withContext
        }
    }
}

If our coroutine is cancelled while we run the blocking part of our code, that entire blocking part will still be executed together, but then we’ll eventually notice the cancellation at the end of the loop, and stop performing further work, in a safe way.
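Here is a runnable sketch of that behaviour. The two save functions are hypothetical stand-ins that sleep for a moment and record the entity in a list, and the timings are arbitrary assumptions. How many entities get processed depends on timing, but both lists should always end up with the same contents.

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.CopyOnWriteArrayList

// Hypothetical stand-ins for the article's blocking functions.
val savedToDisk = CopyOnWriteArrayList<String>()
val savedToServer = CopyOnWriteArrayList<String>()

fun saveToDisk(entity: String) { Thread.sleep(50); savedToDisk += entity }
fun saveToServer(entity: String) { Thread.sleep(50); savedToServer += entity }

suspend fun processEntities(entities: List<String>) = withContext(Dispatchers.IO) {
    entities.forEach { entity ->
        saveToDisk(entity)
        saveToServer(entity)
        if (!isActive) {             // checked only between entities
            return@withContext
        }
    }
}

// Cancels the work while some entity is (most likely) mid-save.
fun runDemo(): Pair<List<String>, List<String>> = runBlocking {
    val job = launch { processEntities(listOf("a", "b", "c", "d", "e")) }
    delay(120)                       // approximate: lands inside a blocking call
    job.cancelAndJoin()
    savedToDisk.toList() to savedToServer.toList()
}

fun main() {
    val (disk, server) = runDemo()
    println("disk=$disk server=$server")
}
```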

Quick cancellation checks

Note: this section was updated on 2020.03.22., thanks to a great suggestion by Fred Porciúncula on reddit.

There is a dedicated function in kotlinx.coroutines to check for a cancelled coroutine: ensureActive.

All ensureActive does is handle cancellation (meaning that it throws a JobCancellationException when it’s invoked in a cancelled coroutine):

public fun Job.ensureActive(): Unit {
    if (!isActive) throw getCancellationException()
}

This means that we can call it every once in a while when performing lots of blocking work, to give the coroutine an opportunity to be cancelled. The call is explicit and manual, though, so we remain aware of exactly where cancellation can happen. ensureActive can easily replace manual cancellation checks, if terminating with an exception upon cancellation is good enough for us:

suspend fun processEntities(entities: List<String>) = withContext(Dispatchers.IO) {
    entities.forEach { entity ->
        saveToDisk(entity)
        saveToServer(entity)
        ensureActive()
    }
}

Just like with delay, even if the coroutine happens to have been cancelled some time before ensureActive, it will notice this, and throw an exception. The cancellation doesn’t have to happen at the exact time that ensureActive is called.
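A small sketch of this "late" detection, with assumed timings and an invented function name (lateCheck): the cancellation arrives while the coroutine is blocked in Thread.sleep, goes unnoticed there, and is only picked up by the ensureActive call that follows.

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.CopyOnWriteArrayList

// Hypothetical helper: shows that ensureActive notices a cancellation
// that happened earlier, during blocking work.
fun lateCheck(): List<String> = runBlocking {
    val events = CopyOnWriteArrayList<String>()
    val job = launch(Dispatchers.Default) {
        events += "blocking work started"
        Thread.sleep(300)            // the cancel call arrives during this sleep
        events += "blocking work finished"
        ensureActive()               // throws here, after the blocking part is done
        events += "not reached"
    }
    delay(100)
    job.cancelAndJoin()
    events.toList()
}

fun main() {
    println(lateCheck())
}
```

The blocking part still runs to completion; only the work after the check is skipped.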

Note that if there’s some cleanup of the coroutine to do (freeing up resources, etc.) when cancelled, manual cancellation checks can still be very handy, and should be used instead of ensureActive.
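As an illustration of such a cleanup, the sketch below releases a resource before stopping. TempBuffer and processWithCleanup are hypothetical names invented for this example, and the sleep is a stand-in for real blocking work.

```kotlin
import kotlinx.coroutines.*

// Hypothetical resource, used only for illustration.
class TempBuffer {
    @Volatile var released = false
    fun write(entity: String) = Thread.sleep(40)  // stand-in for blocking work
    fun release() { released = true }
}

suspend fun processWithCleanup(entities: List<String>, buffer: TempBuffer) =
    withContext(Dispatchers.IO) {
        for (entity in entities) {
            buffer.write(entity)
            if (!isActive) {
                buffer.release()     // clean up first...
                return@withContext   // ...then stop doing further work
            }
        }
        buffer.release()             // normal completion also releases the buffer
    }

fun runCleanupDemo(): Boolean = runBlocking {
    val buffer = TempBuffer()
    val job = launch { processWithCleanup(List(10) { "entity$it" }, buffer) }
    delay(100)
    job.cancelAndJoin()
    buffer.released
}

fun main() {
    println("released: ${runCleanupDemo()}")
}
```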

Another function from kotlinx.coroutines is yield, which has the original purpose of performing a manual suspension of the current coroutine, just to give other coroutines waiting for the same dispatcher a chance to execute. It essentially reschedules the rest of our coroutine to be executed on the same dispatcher that it’s currently on. If there’s nothing else waiting to use this dispatcher, this is roughly the same as a 0-length delay. Since it’s nearly a no-op, and handles cancellation (throws an exception if the coroutine it’s in is cancelled), you might see it used for the same purpose as ensureActive.
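A minimal sketch of yield used this way, assuming a CPU-bound loop on Dispatchers.Default (the function name cancelBusyLoop is made up). Without the yield call the loop would have no suspension point, and cancelAndJoin would wait forever.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper: a busy loop that stays cancellable thanks to yield.
fun cancelBusyLoop(): Boolean = runBlocking {
    val job = launch(Dispatchers.Default) {
        var counter = 0L
        while (true) {
            counter++                // stand-in for a small piece of CPU-bound work
            yield()                  // suspension point: throws once the job is cancelled
        }
    }
    delay(100)
    job.cancelAndJoin()              // returns only because yield cooperates
    job.isCancelled
}

fun main() {
    println("cancelled: ${cancelBusyLoop()}")
}
```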

Conclusion

We’ve seen that coroutines always rely on cooperative cancellation. We can either check ourselves whether the coroutine we’re executing has been cancelled, or invoke kotlinx.coroutines functions in our code. The latter will perform the check for us, and attempt to stop the execution of the coroutine with an exception.

If you want to learn even more about coroutine cancellation, the following talk is for you: KotlinConf 2019: Coroutines! Gotta catch ‘em all! by Florina Muntenescu & Manuel Vivo.

