Building lightweight in-memory caches with Google Guava - no more putIfAbsent

July 9, 2012 | 4 min Read

I can’t count the number of times I have found myself implementing some sort of cache. Caching is useful in many situations, e.g. when computing a value is expensive or when resources have to be loaded. I bet you have implemented a cache many times too, and you have probably used a Java Map to store the values. One disadvantage of using a Map for caching is that you have to implement the eviction of entries yourself, e.g. to keep the size below a given limit. In a concurrent environment the task gets more complicated, because a simple Map is not sufficient. You need to switch to a thread-safe solution such as a ConcurrentHashMap. A ConcurrentHashMap solves the concurrency problem, but the code gets ugly: you have to deal with the fact that several threads may try to add a value for the same key at the same time.

A few months ago I learned about Google Guava’s Caches. I was so impressed by their simplicity that I have not implemented a cache myself since. Let’s see how Guava can help by looking at a simple example that compares a ConcurrentHashMap-based cache with an implementation using Guava Caches. I have created a cache that stores simple String values. The values are computed by the method createRandom. You will notice that it just creates a simple String. In a real production environment this method would contain some expensive computation… be creative ;)

import java.util.concurrent.ConcurrentHashMap;

public class Cache {

  private static final long MAX_SIZE = 100;

  private final ConcurrentHashMap<String, String> map;

  public Cache() {
    map = new ConcurrentHashMap<String, String>();
  }

  public String getEntry( String key ) {
    String result = createCacheEntry( key );
    removeOldestCacheEntryIfNecessary();
    return result;
  }

  private String createCacheEntry( String key ) {
    String result = map.get( key );
    if( result == null ) {
      String newValue = createRandom();
      // putIfAbsent returns the value another thread stored first,
      // or null if our new value was stored
      String putResult = map.putIfAbsent( key, newValue );
      if( putResult != null ) {
        result = putResult;
      } else {
        result = newValue;
      }
    }
    return result;
  }

  private void removeOldestCacheEntryIfNecessary() {
    if( map.size() > MAX_SIZE ) {
      // removes an arbitrary key, not truly the oldest one... very effective ;)
      String keyToDelete = map.keys().nextElement();
      map.remove( keyToDelete );
    }
  }

  private String createRandom() {
    return "I'm a random string or resource... Be creative ;)";
  }
}

As you can see, this implementation is pretty messy. From my point of view there are two big code smells. The first one is the putIfAbsent usage: all the ugly null checking and reassignment is anything but readable. The second one is the eviction. To keep the size from growing past 100 we have to intercept every add operation. The chosen eviction strategy is very basic; it is only there to show what sort of code we have to write when implementing a cache on our own.
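To make the putIfAbsent smell more concrete, here is a minimal sketch of why the null check is needed at all: putIfAbsent returns the value that was already mapped to the key, or null if the new value was stored. The class PutIfAbsentDemo and the helper getOrCreate are hypothetical names chosen for this illustration; they are not part of the cache above.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical demo class, only meant to illustrate putIfAbsent's return value.
class PutIfAbsentDemo {

  // Returns the value that ends up cached for the key, no matter who stored it first.
  static String getOrCreate( ConcurrentMap<String, String> map, String key, String candidate ) {
    // null means "nothing was mapped before, the candidate is now stored"
    String existing = map.putIfAbsent( key, candidate );
    return existing != null ? existing : candidate;
  }

  public static void main( String[] args ) {
    ConcurrentMap<String, String> map = new ConcurrentHashMap<String, String>();
    System.out.println( map.putIfAbsent( "a", "first" ) );  // prints null
    System.out.println( map.putIfAbsent( "a", "second" ) ); // prints first
    System.out.println( getOrCreate( map, "b", "third" ) ); // prints third
  }
}
```

Note that the candidate value is always computed before the call, even if another thread wins the race, which is one more reason this pattern gets awkward for expensive values.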

The good news is that everything I have described as ugly can be avoided by using Guava’s caches. The example above can be rewritten like this:

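import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;

public class Cache {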

  private static final long MAX_SIZE = 100;

  private final LoadingCache<String, String> cache;

  public Cache() {
    cache = CacheBuilder.newBuilder().maximumSize( MAX_SIZE ).build( new CacheLoader<String, String>() {
        @Override
        public String load( String key ) throws Exception {
          return createRandom();
        }
      }
    );
  }

  public String getEntry( String key ) {
    return cache.getUnchecked( key );
  }

  private String createRandom() {
    return "I'm a random string or resource... Be creative ;)";
  }
}

From my point of view the ugly code has now vanished. Thread-safe storage and eviction are both handled by Guava’s internal implementation. Guava also provides a nice API that makes our code much more readable. If you want to understand the caches in detail, the Guava User Guide does a great job.
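If you want to see this behaviour for yourself, a small sketch like the following can help. It assumes Guava is on the classpath; the class GuavaCacheDemo, the LOADS counter and the "value for ..." strings are hypothetical and only exist for this illustration. It counts how often the loader runs and checks that the cache stays within its configured size.

```java
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class, not part of the Cache class above.
class GuavaCacheDemo {

  // Counts how often the loader is actually invoked
  static final AtomicInteger LOADS = new AtomicInteger();

  static LoadingCache<String, String> newCache( long maxSize ) {
    return CacheBuilder.newBuilder()
      .maximumSize( maxSize )
      .build( new CacheLoader<String, String>() {
        @Override
        public String load( String key ) {
          LOADS.incrementAndGet();
          return "value for " + key;
        }
      } );
  }

  public static void main( String[] args ) {
    LoadingCache<String, String> cache = newCache( 100 );
    cache.getUnchecked( "a" );
    cache.getUnchecked( "a" ); // second call is served from the cache
    System.out.println( "loads for key a: " + LOADS.get() );

    for( int i = 0; i < 1000; i++ ) {
      cache.getUnchecked( "key" + i );
    }
    // size() is approximate, but should not exceed the configured maximum
    System.out.println( "entries after 1000 inserts: " + cache.size() );
  }
}
```

Keep in mind that Guava may evict entries before the limit is reached, so the reported size can be lower than the maximum.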


Caching is a sensitive topic and Guava Caches are not intended to solve all requirements for everyone. The main reasons why I like this little library are its simplicity, the fact that I can trust the implementation because it is heavily tested, and that it makes my code more readable. Feel free to disagree :-) and leave a comment…