The Java log levels showdown: SEVERE FATAL ERROR OMG PANIC

Capitalized log levels induce high levels of stress. What if, instead of ERROR, we just used “oops”? On a more serious note, we recently ran a huge data crunch over GitHub’s top Java projects and the logging statements they use, revealing the log level breakdown of the average Java project.

In this post, we’ll look at the resulting dataset from another angle and focus on the use of the standard java.util.logging levels versus the more popular frameworks: Log4j (and Log4j 2) and Logback.

Step right in.

Meet the players

Logging utilities can be roughly divided into two categories: the logging facade and the logging engine.

As far as logging facades go, you pretty much have two choices: slf4j and Apache’s commons-logging. In practice, 4 out of 5 Java projects choose slf4j, based on data from the top Java libraries on GitHub in 2016. The motivation for using a logging facade is straightforward: it is an abstraction on top of your logging engine of choice, allowing you to replace the engine without changing the actual code and logging statements.
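To make the facade idea concrete, here is a minimal sketch. It does not use slf4j or commons-logging; `AppLog`, `JulLog`, `BufferLog`, and `processOrder` are hypothetical names made up for this illustration. The point is only that the application code depends on the facade type, so the engine behind it can be swapped without touching the logging statements.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class FacadeSketch {

    /** Hypothetical minimal facade; real facades such as slf4j are far richer. */
    interface AppLog {
        void warn(String message);
    }

    /** Binding #1: delegates to java.util.logging. */
    static final class JulLog implements AppLog {
        private final Logger delegate;

        JulLog(String name) {
            this.delegate = Logger.getLogger(name);
        }

        @Override
        public void warn(String message) {
            delegate.log(Level.WARNING, message);
        }
    }

    /** Binding #2: a stand-in "engine" that just collects lines in memory. */
    static final class BufferLog implements AppLog {
        final StringBuilder out = new StringBuilder();

        @Override
        public void warn(String message) {
            out.append("WARN ").append(message).append('\n');
        }
    }

    /** Application code: knows only the facade, not the engine behind it. */
    static void processOrder(AppLog log, int itemCount) {
        if (itemCount == 0) {
            log.warn("order has no items");
        }
    }

    public static void main(String[] args) {
        // Same call site, two different engines.
        processOrder(new JulLog("orders"), 0);

        BufferLog buffer = new BufferLog();
        processOrder(buffer, 0);
        System.out.print(buffer.out);
    }
}
```

With a real facade the swap usually happens through the build or classpath rather than through a constructor argument, but the dependency direction is the same.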

As for the logging engine, the most popular picks are Logback, which was designed as a successor to Log4j; Log4j itself; and its newer version from the Apache Software Foundation, Log4j 2. Trailing behind is Java’s default logging engine, java.util.logging, aka JUL.

Pointing fingers and calling names

On the “superficial” side of things, each logging framework uses slightly different names for its logging levels.

Log Levels

In the rare case where slf4j is used with java.util.logging, the following mapping takes place:

FINEST -> TRACE
FINER -> DEBUG
FINE -> DEBUG
INFO -> INFO
WARNING -> WARN
SEVERE -> ERROR
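
The mapping above can be sketched as a small lookup in plain Java. This is an illustration of the table, not slf4j’s own code; the class and method names are made up, and comparing integer thresholds (rather than matching names) is our choice here. One assumption to flag: JUL’s CONFIG level is not in the table, and with this threshold approach it lands on INFO.

```java
import java.util.logging.Level;

public class JulToSlf4jLevels {

    /**
     * Returns the slf4j-style level name for a JUL level, following the
     * table in the post. Uses integer thresholds, so levels that sit
     * between two named levels fall into the next bucket up.
     */
    static String toSlf4jName(Level julLevel) {
        int value = julLevel.intValue();
        if (value <= Level.FINEST.intValue()) {
            return "TRACE";
        }
        if (value <= Level.FINE.intValue()) {
            return "DEBUG"; // covers both FINER and FINE
        }
        if (value <= Level.INFO.intValue()) {
            return "INFO";  // assumption: CONFIG also ends up here
        }
        if (value <= Level.WARNING.intValue()) {
            return "WARN";
        }
        return "ERROR";
    }

    public static void main(String[] args) {
        Level[] levels = {
            Level.FINEST, Level.FINER, Level.FINE,
            Level.INFO, Level.WARNING, Level.SEVERE
        };
        for (Level level : levels) {
            System.out.println(level.getName() + " -> " + toSlf4jName(level));
        }
    }
}
```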

Another thing to notice here is that Logback and java.util.logging have no FATAL equivalent. Behind those level names are simple integer values that help control the logging level in a running application. Each library also contains values for OFF and ALL, which set the logger to transmit nothing or everything, respectively. Setting a logger level to WARN, for instance, would only log WARN messages and above; it’s practically the default setting for production environments.
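
Here is a small sketch of how those integer values act as a threshold, using java.util.logging since it ships with the JDK (so the level is called WARNING rather than WARN). The class and the `wouldLog` helper are made up for this example; `Level.intValue()`, `Logger.setLevel()`, and `Logger.isLoggable()` are standard JUL calls.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class LevelThresholdDemo {

    /** True if a logger set to loggerLevel would accept a message at messageLevel. */
    static boolean wouldLog(Level loggerLevel, Level messageLevel) {
        Logger logger = Logger.getAnonymousLogger();
        logger.setLevel(loggerLevel);
        return logger.isLoggable(messageLevel);
    }

    public static void main(String[] args) {
        // The integers behind the names.
        Level[] levels = {
            Level.ALL, Level.FINEST, Level.FINER, Level.FINE, Level.CONFIG,
            Level.INFO, Level.WARNING, Level.SEVERE, Level.OFF
        };
        for (Level level : levels) {
            System.out.println(level.getName() + " = " + level.intValue());
        }

        // A production-style threshold: WARNING and above only.
        System.out.println("INFO at WARNING?   " + wouldLog(Level.WARNING, Level.INFO));
        System.out.println("SEVERE at WARNING? " + wouldLog(Level.WARNING, Level.SEVERE));

        // ALL lets everything through, OFF lets nothing through.
        System.out.println("FINEST at ALL?     " + wouldLog(Level.ALL, Level.FINEST));
        System.out.println("SEVERE at OFF?     " + wouldLog(Level.OFF, Level.SEVERE));
    }
}
```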

By the way, one of the cool things about the tool we’re building is that you can get log messages lower than WARN in production, even if you’ve set the logger level to WARN. Check out this video for a quick (25 sec) demonstration.

How does the level naming breakdown look in practice?

For the data crunch, we focused on the top-starred Java projects with at least 100 logging statements in either of the two naming schemes. Examining this set of projects, here’s what we found:

Logging Levels by Type

Only 4.4% of projects exclusively used the java.util.logging naming scheme.

The average non-JUL project looked like this (based on 1,313 projects):

The Average Java Log Level Distribution

To look at the average java.util.logging project, we filtered the data down to include only projects that had at least 100 statements from levels that don’t overlap with the non-JUL naming scheme (WARNING and INFO). This left us with a smaller dataset, so it might not be big enough to draw definitive conclusions from:

JUL Logging Average

With that said, it looks like in both cases roughly ⅔ of logging statements are disabled in production, since only WARN and above are enabled there.

Fun fact: As an extra datapoint, we also looked at ALL / OFF levels. Turns out only 8.6% of the projects examined used them both.

How did we reach the data?

The starting point for this research is the GitHub Archive and its datasets on Google BigQuery. We wanted to focus on qualified Java projects, excluding Android, sample projects, and simple testers. A natural choice was to look at the most starred projects, taking in the database of the top 400,000 repositories.

We ended up with 15,797 repositories with Java source files, 4% of the initial dataset. But it didn’t stop there. Looking at the number of logging statements, we decided to only focus on projects with at least 100 different statements. The dataset is available right here.

We believe this is a fairly representative sample for our purposes. For the full walkthrough and the steps we took to reach the data, including the exact SQL queries, check out the last part of this post.

Final Thoughts

This post underscores that java.util.logging is, well, practically dead. Most serious projects choose to go with third-party logging frameworks. Did you find anything else that we might have missed in the dataset? Do you have other interesting questions that can be answered through this or similar data?

Feel free to suggest your ideas in the comment section below.

Yoda
Some kind of monster @ OverOps, GDG Haifa lead.
  • Michael McCallum

    The order of levels in slf4j and Logback is error, warn, info, debug, trace.

    final public int TRACE_INT = 00;
    final public int DEBUG_INT = 10;
    final public int INFO_INT = 20;
    final public int WARN_INT = 30;
    final public int ERROR_INT = 40;

    AND

    public static final int OFF_INT = Integer.MAX_VALUE;
    public static final int ERROR_INT = 40000;
    public static final int WARN_INT = 30000;
    public static final int INFO_INT = 20000;
    public static final int DEBUG_INT = 10000;
    public static final int TRACE_INT = 5000;
    public static final int ALL_INT = Integer.MIN_VALUE;

    The only place I’ve seen info and debug reversed was Sails for Node.js.

    • http://www.takipi.com/ Alex Zhitnitsky

      That is correct, thanks for the comment! We’ve updated the ordering in the table.

  • Aman Jain

    Hi,

    Nice article!

    How do you compare Log4j2 and Logback as logging engine for use in production? Is it preferred to use asynchronous appenders for efficient logging?

    Thanks,
    Aman

    • Niranjan Nanda

      Async logging is definitely better than sync because in both Log4j 1.x and Logback, the actual log writer is a synchronized method. Under peak load we used to see a lot of threads blocking on log writing (in thread dump analysis). This is why, six months ago, we migrated our logging framework from Log4j 1.2.x to Log4j2, and it has been in production since then without any performance impact. Log4j2 uses the LMAX Disruptor for its async logging instead of a blocking queue which, IMHO, is a good decision.

      • Aman Jain

        Thanks, it is good to know. Does async logging result in an increased memory footprint or the loss of some log information in case of a sudden shutdown of the application?

        • Niranjan Nanda

          We haven’t observed either of the issues as of now.

        • grobmeier

          Just note one thing: if you need to use logging for audit purposes, async logging might not be your choice. If some logging event fails in async mode, there is no way to recognise that. In these rare cases you might want to go standard paths.

  • Bit Hammer

    I understand the benefits of an abstract logging layer, but as I’ve only ever used Log4j and now Log4j2, I don’t like the idea of my logging calls going through another layer. There were a (very) few minor changes to go from Log4j to Log4j2, but I see that as much less of a pain than introducing a layer of abstraction. java.util.logging is dead, it should never have existed in the first place, and if we would all just switch over to Log4j2 we could kill off slf4j and logback.

    • gacl

      “if we would all just switch over to…” That isn’t how the world works. Most agree it would be nice if the US would just switch over to the metric system, for example, but that’s dramatically more effort than it would seem.

  • gacl

    Any library or middleware should use slf4j. Any final end application or service should choose the specific logging implementation. Lots of Java middleware libraries currently use log4j 1.x directly, and IMO, they should change over.

  • Adam Smith

    Two popups interrupting me while I’m trying to read the article is two too many.

  • Ralph Goers

    Two points:
    1. Besides SLF4J and Commons Logging, Log4j 2 also provides a Logging Facade by way of its API. If you code to the API logging can still be routed to other logging frameworks.
    2. Besides the logging levels mentioned above, Log4j 2 allows for custom log levels.

  • Scott Palmer

    java.util.logging is the only logging I use. I have not found enough benefits in any other logging framework to justify adding the dependency. I also hate getting into a quagmire of different logging frameworks because of libraries that each make a unique choice for logging. Libraries should only use java.util.logging because of the mess it causes when they don’t and the lack of any real benefit.

  • Rohit

    Those are some sick Pie charts. What tool were they made in?