An overview of exception handling in over 600,000 Java projects on GitHub and SourceForge
Java is one of the few languages that use checked exceptions. They are enforced during compile time, and require handling of some sort. But… what happens in practice? Do most developers actually handle anything? And how do they do that?
In this post we’ll go over the data from a recent research study by the University of Waterloo that covered the use of exceptions in over 600,000 Java projects from GitHub and SourceForge. Let’s dig in and answer some questions.
The top 10 exception types in catch clauses
Sound familiar? We recently published the results of a data crunch covering over 1,000 applications in production, where we examined the top 10 thrown exception types.
In this data crunch, the researchers analyzed Java projects on GitHub and SourceForge, looked into the catch clauses and reported on the findings. Let’s see what the dataset looks like:
The top 10 exception types in catch clauses, source: “Analysis of Exception Handling Patterns in Java”
Well well, what do we have here? The research found that checked exceptions account for almost three times the number of unchecked exceptions in Java projects. Can’t upset the compiler here. In the production data crunch, we saw the opposite result: the top exceptions were unchecked.
An important difference to note here is that the production crunch took the thrown type into account, while this research refers to the caught type, which could be a different, higher-level type than the thrown object.
Another insight is that developers often catch checked exceptions at the top of the class hierarchy, using the Throwable and Exception classes. The plot thickens.
To learn more about how checked exceptions are handled, the researchers examined the Exception and Throwable handlers. 78% of the methods that caught Exception did not catch any of its subclasses, and the same was true for 84% of the methods that caught Throwable. In other words, catch-all clauses that say nothing about what actually failed.
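To make the difference concrete, here is a minimal sketch of a catch-all handler next to handlers for specific types. The class name, method names and file name are made up for illustration and do not come from the study.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical example class, not taken from the study.
final class CatchGranularity {

    // Catch-all handler: a missing file, a permissions problem and a plain
    // bug (such as a null path) all end up looking exactly the same.
    static String readCatchAll(Path path) {
        try {
            return new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
        } catch (Exception e) {
            return "failed";
        }
    }

    // Specific handlers: each failure mode gets its own response, and
    // unexpected unchecked exceptions are left to propagate.
    static String readSpecific(Path path) {
        try {
            return new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
        } catch (NoSuchFileException e) {
            return "missing: " + e.getFile();
        } catch (IOException e) {
            return "unreadable: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        Path missing = Paths.get("no-such-file-for-demo.txt");
        System.out.println(readCatchAll(missing));
        System.out.println(readSpecific(missing));
        // The catch-all version also hides a programming error:
        System.out.println(readCatchAll(null));
    }
}
```

Note that the more specific NoSuchFileException has to be caught before IOException, since it is a subclass of it.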
Next up, let’s find out what’s going on inside these catch clauses. Maybe there’s hope.
“Most programmers ignore checked exceptions and leave them unnoticed”
Sounds bad? Keep reading. It’s an actual, real, official takeaway from the study. Many of us have had this tingling spidey-sense feeling about checked exceptions, but in software development it’s unusual to have data that provides cold, hard proof of issues around actual code style, as opposed to personal experience and qualitative rather than quantitative studies.
The following chart shows the top operations performed in the top 3 checked exception catch blocks:
Top operations in checked exception catch clauses, source: “Analysis of Exception Handling Patterns in Java”
We see that log statements and e.printStackTrace() are at the top, making them the most common operations in checked exception catch blocks. Both help debug the situation and understand what happened.
Sneaking up on them are the notorious empty catch blocks. Joshua Bloch describes in “Effective Java” what would ideally happen: “To capture the failure, the detail message of an exception should contain the values of all parameters and fields that contributed to the exception”. Empty catch blocks defeat this purpose.
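As a rough illustration, compare an empty catch block with one that at least records what went wrong. This is a sketch that assumes java.util.logging is good enough for the job. The SettingsLoader class and its methods are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical example class, not taken from the study.
final class SettingsLoader {
    private static final Logger LOG = Logger.getLogger(SettingsLoader.class.getName());

    // Anti-pattern: the checked exception is swallowed and the failure
    // leaves no trace. The caller just gets an empty list.
    static List<String> loadSilently(Path path) {
        List<String> lines = Collections.emptyList();
        try {
            lines = Files.readAllLines(path, StandardCharsets.UTF_8);
        } catch (IOException e) {
            // empty catch block
        }
        return lines;
    }

    // Same fallback, but the path that failed and the full stack trace
    // are recorded before falling back to the defaults.
    static List<String> loadAndReport(Path path) {
        try {
            return Files.readAllLines(path, StandardCharsets.UTF_8);
        } catch (IOException e) {
            LOG.log(Level.WARNING,
                    "Could not read settings from " + path.toAbsolutePath() + ", using defaults", e);
            return Collections.emptyList();
        }
    }
}
```

Both methods return the same value on failure. The only difference is whether anyone will be able to find out later that a failure happened at all.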
Another common use case is throwing an unchecked exception that replaces the checked exception.
Mario Fusco summed it up pretty well on his Twitter feed:
What Java devs do in checked exception catch blocks demonstrates that if you oblige a dev to do smtg unnecessary he will do smtg stupid
— Mario Fusco (@mariofusco) June 6, 2016
But wait, there’s more
Looking at the bigger picture of both checked and unchecked exceptions, only on GitHub this time, we see a similar picture, with rethrows gaining some more popularity:
Top operations used in exception handling (GitHub), source: “Analysis of Exception Handling Patterns in Java”
20% of the total (6,172,462) catch blocks are empty. This is quite bad. Connecting the dots with the fact that exceptions higher up in the hierarchy are used more frequently than specific types, the researchers arrived at the conclusion that “most participants seemed to give a low priority to exception handling as a task, or included exceptions in their code only when the language forced them to handle checked exceptions”.
Eventually, the product quality suffers.
What’s going on with the re-throws?
Since throwing exceptions up the call stack hierarchy is the most popular catch clause operation, the researchers further looked into what kind of conversions are most popular. The results are summed up in the following table:
Top exception transformations, source: “Analysis of Exception Handling Patterns in Java”
Coming in at #1: transforming Exception to RuntimeException. Most conversions from any exception type were made to RuntimeException, making checked exceptions unchecked.
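The study counts these conversions but, as far as this post goes, it doesn’t tell us whether the original exception survives them. Here is a small sketch of the two ways the Exception to RuntimeException conversion can be written. The Rethrow class and its method names are assumptions made up for this example.

```java
import java.util.concurrent.Callable;

// Hypothetical example class, not taken from the study.
final class Rethrow {

    // Lossy conversion: the original type, message and stack trace are gone.
    static <T> T callLossy(Callable<T> task) {
        try {
            return task.call();
        } catch (Exception e) {
            throw new RuntimeException("task failed");
        }
    }

    // Chained conversion: the checked exception becomes unchecked,
    // but it stays reachable through getCause().
    static <T> T callChained(Callable<T> task) {
        try {
            return task.call();
        } catch (RuntimeException e) {
            throw e; // already unchecked, no need to wrap it again
        } catch (Exception e) {
            throw new RuntimeException("task failed: " + e.getMessage(), e);
        }
    }
}
```

With the chained version, whoever ends up logging the RuntimeException still gets the root cause in the stack trace as a “Caused by” section.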
Exception best practices
In addition to the data crunch and its insights, the article mentions Joshua Bloch’s guidelines for dealing with exceptions from the famous 2nd edition of his book: “Effective Java” (chapter 9). We thought it would be a good idea to list them here as well:
1. “Use exceptions only for exceptional scenarios”
Exceptions cause considerable overhead on the JVM, and using them for normal flow control is a source of trouble (yes, even though many developers abuse them this way). In our actionable exceptions post, we’ve elaborated on this “normal exceptions” issue.
2. “Use checked exceptions for recoverable conditions and runtime exceptions for programming errors”
This implies that if a developer finds a checked exception unrecoverable, it’s ok to wrap it in an unchecked exception with its state and throw it up the hierarchy for logging and handling.
3. “Avoid unnecessary use of checked exceptions”
Use checked exceptions only when the condition cannot be prevented by proper use of the API and the caller can take some useful action once the exception occurs.
4. “Favor the use of standard exceptions”
Using standard exceptions from the already extensive Java API promotes readability.
5. “Throw exceptions appropriate to the abstraction”
As you go higher in the layers of your code, catch lower-level exceptions and throw exception types that make sense at the current level of abstraction.
6. “Document all exceptions thrown by each method”
No one likes surprises when it comes down to exceptions.
7. “Include failure-capture information in detail messages”
Without information about the state the JVM was in, there’s not much you can do to make sure the exception doesn’t happen again. Not everyone has Takipi in place to cover their back.
8. “Don’t ignore exceptions”
All exceptions should lead to some action. What else do you need them for?
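A few of these guidelines are easier to see in code. The sketch below tries to combine #5 and #7: a low-level IOException is translated into an exception that fits the caller’s abstraction, and the detail message carries the values that contributed to the failure. ProfileStore and ProfileLoadException are hypothetical names invented for this example, and the file layout is an assumption.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical domain-level exception: callers deal with "profile could not
// be loaded", not with the file system details underneath.
final class ProfileLoadException extends Exception {
    ProfileLoadException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Hypothetical example class that stores one file per user.
final class ProfileStore {
    private final Path directory;

    ProfileStore(Path directory) {
        this.directory = directory;
    }

    String load(String userId) throws ProfileLoadException {
        Path file = directory.resolve(userId + ".json");
        try {
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } catch (IOException e) {
            // Translate to the caller's abstraction, keep the cause, and put
            // the values that led to the failure into the detail message.
            throw new ProfileLoadException(
                    "Could not load profile: userId=" + userId + ", file=" + file, e);
        }
    }
}
```

Whether ProfileLoadException should be checked or unchecked depends on guideline #2: it is checked here on the assumption that the caller can recover, for example by falling back to a default profile.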
To read more about these guidelines, check out this previous blog post about actionable exceptions, and the lessons from a recent production data crunch covering over 1,000 production applications, to see what’s in their logs and what the top 10 exceptions they encounter are.
Whaaaaaat exactly were we looking at here?
The data for this study comes from a research paper by Suman Nakshatri, Maithri Hegde, and Sahithi Thandra from the David R. Cheriton School of Computer Science at the University of Waterloo, Ontario, Canada.
The research team looked through a database of 7.8m GitHub projects and 700k SourceForge projects, extracted the Java projects, and examined the use of catch blocks with Boa, a domain-specific language for mining software repositories.
The dataset by the numbers
Exceptions should be reserved for exceptional situations, but… other things happen in practice. Checked exceptions become unchecked, empty catch blocks are all over the place, control flow is mixed with error flow, there’s lots of noise and critical data goes missing. It’s a mess.
This was one of our main motivations for building Takipi, a Java agent that monitors JVMs in production and takes care of filling in the blanks with everything you need to know about how exceptions behave (and how to avoid them).