Proj4j Race Condition and the Importance of Updating Dependencies

osha1
Feb 3, 2020

We recently encountered a severe (but rare) issue on one of our servers: on startup, it would sometimes fail to handle any request. It took a couple of months to nail down. Inspecting the stack trace suggested an issue with a library called proj4j:

One of my colleagues found the following code that looks related:

But… it looks pretty clear that datum was added to the list here…

Take a minute and see if you can spot the issue in the code above.


The root of the issue is the tree. Or, more accurately, TreeSet, which is not thread-safe. You can see that supportedParams is a member of the class, and if it is initialized by multiple threads at the same time, the state of the TreeSet can be corrupted, which in our case showed up as a missing “datum” String.
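To make the failure mode concrete, here is a minimal sketch of one way an unguarded lazy initialization can go wrong. It is not the proj4j source: the class name, the two latches, and the helper methods are hypothetical, and the latches exist only to force the unlucky interleaving on every run. The point is that the shared field becomes non-null before the set is fully populated, so a second caller can walk away with a set that has no "datum" in it. Truly concurrent add calls on a TreeSet can also damage its internal tree, which this sketch does not try to reproduce.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.CountDownLatch;

final class LazyInitRace {
    // Hypothetical stand-in for the shared field; not the proj4j source.
    static Set<String> supportedParams = null;

    // Demo-only hooks that force the unlucky interleaving deterministically.
    static CountDownLatch halfBuilt;
    static CountDownLatch resume;

    // Unsafe lazy initialization: the field is non-null before the set is complete.
    static Set<String> supportedParameters() {
        if (supportedParams == null) {
            supportedParams = new TreeSet<>();
            supportedParams.add("a");
            halfBuilt.countDown();   // demo hook: set is published but incomplete
            await(resume);           // demo hook: pause where a thread could be preempted
            supportedParams.add("datum");
        }
        return supportedParams;
    }

    static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Returns what a second caller observes while the first is still initializing.
    static boolean secondCallerSeesDatum() {
        supportedParams = null;
        halfBuilt = new CountDownLatch(1);
        resume = new CountDownLatch(1);

        Thread first = new Thread(LazyInitRace::supportedParameters);
        first.start();
        await(halfBuilt);

        // The field is already non-null, so this call skips initialization.
        boolean seen = supportedParameters().contains("datum");

        resume.countDown();
        try {
            first.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("second caller sees datum: " + secondCallerSeesDatum()); // false
        System.out.println("after init completes: " + supportedParams.contains("datum")); // true
    }
}
```

In real code there are no latches, which is exactly why the bug is rare: it only shows up when two threads happen to hit the method at the same moment, typically right after startup.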

There are various ways to solve the issue, but first let’s check whether it has already been fixed.

Apparently yes, 8 years ago, in this commit! When we dug a little deeper, it turned out we were using our own bogus clone of the lib and not the official distribution. There are two “official” forks, both of them with the fix:

  • org.osgeo, which doesn’t look actively maintained, but has the fix.
  • LocationTech (the maintainers of JTS), which looks more actively maintained.

We tried the second one, but failed to migrate to it due to this issue:

So eventually we decided to use the osgeo port of proj4j. The migration was not entirely smooth because of the above bug and API changes, but luckily for us there was not a lot of code to migrate on our side.

Afterthoughts

  • I am not sure if you noticed, but the fix was adding a synchronized keyword on the method. I can think of better alternatives. I would probably initialize the set once, when the class loads, instead of lazily on first use.
  • Depending on an 8-year-old, undocumented lib is also something you should try to avoid.
  • Concurrency bugs can be tricky!
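
The two options from the first point can be sketched side by side. This is an illustration under assumptions, not the library's code: the class name, the method names, and the short parameter list are hypothetical. The first option keeps the lazy initialization and guards it with synchronized, building the set in a local variable before publishing it. The second builds an unmodifiable set in a static initializer, which the JVM runs exactly once during class initialization, so no locking is needed on the read path.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

final class Keywords {
    // Option 1: keep lazy initialization, but let only one thread in at a time.
    private static Set<String> lazyParams = null;

    static synchronized Set<String> supportedParametersLazy() {
        if (lazyParams == null) {
            // Build in a local variable and publish only the finished set.
            Set<String> params = new TreeSet<>();
            params.add("a");
            params.add("datum");
            params.add("units");
            lazyParams = params;
        }
        return lazyParams;
    }

    // Option 2: build once during class initialization and never mutate again.
    // The parameter names here are an illustrative subset, not the full list.
    private static final Set<String> SUPPORTED_PARAMS =
            Collections.unmodifiableSet(
                    new TreeSet<>(Arrays.asList("a", "datum", "units")));

    static Set<String> supportedParameters() {
        return SUPPORTED_PARAMS;
    }

    static boolean isSupported(String name) {
        return SUPPORTED_PARAMS.contains(name);
    }

    public static void main(String[] args) {
        System.out.println(isSupported("datum"));                       // true
        System.out.println(supportedParametersLazy().contains("datum")); // true
    }
}
```

The eager version trades a tiny bit of startup work for a set that is complete before any caller can see it, and wrapping it in an unmodifiable view means a later accidental add fails loudly instead of silently mutating shared state.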
