Earlier this year I was looking into an NPE (NullPointerException) that had started occurring in a rules engine used by an insurance quoting application to determine discount eligibility. The nature of the error was evident soon enough: a rule was written in an unsafe way and the data it needed was not present. What wasn't immediately clear was why that data wasn't being mapped properly.
The error had only been reported a few times, and only in a development environment. I was able to determine that it first happened the morning after an update to Spring 3. Debugging locally with the code as it stood just a few days earlier didn't trigger the error, so I knew the Spring 3 upgrade had to be related. The missing data was supposed to come from a call to another rules library, maintained by a different team, that derives pricing attributes from information on a request.
After a bit of debugging, I could see that the data in question should have been derived by this other rules engine, but no data of any kind was being mapped from it. No errors were logged in this scenario, and debugging was very fiddly. Notably, the error messages I saw at various points while debugging were different on subsequent requests than on the first request after startup, so I had to restart the application locally after each attempt to reproduce the error. That made me suspect that some static state was involved.
That rules engine uses the popular Jackson package to parse YAML files containing lists of rules to be executed subject to constraints. I could see that the parsing initially worked, but that things failed shortly into the execution flow: no rules were being executed even though they were being scoped for execution. After a few hours of incremental debugging, I found the true culprit: a class from the Apache Commons library was missing at runtime. The ClassNotFoundException was silently ignored and processing was allowed to continue, which only resulted in an NPE in the limited number of scenarios that required this additional rules engine. The class in question should have been provided transitively through our dependency on the other team's rules engine, but migrating to Spring 3 seemed to interfere with that transitive dependency. Adding Apache Commons to our build config (and fixing the unsafe code) resolved the issue, but I still don't fully understand why it was happening. I'll probably look back at it in the near future.
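Since the real problem here was a swallowed exception that only showed up much later as an NPE, one option is to check at startup that the classes you depend on can actually be loaded, and fail loudly if they can't. Below is a minimal sketch of that idea, not what we actually shipped. The class name RequiredClassCheck and the class names passed to it are placeholders I made up for illustration; in a real application you would list the fully qualified names you rely on.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: the name and the checked class names are illustrative only.
class RequiredClassCheck {

    /** Returns the subset of the given class names that cannot be loaded. */
    static List<String> missingClasses(List<String> classNames) {
        List<String> missing = new ArrayList<>();
        ClassLoader loader = RequiredClassCheck.class.getClassLoader();
        for (String name : classNames) {
            try {
                // initialize=false: only verify the class is loadable,
                // without running its static initializers.
                Class.forName(name, false, loader);
            } catch (ClassNotFoundException | LinkageError e) {
                // Record the failure instead of swallowing it.
                missing.add(name);
            }
        }
        return missing;
    }

    /** Throws at startup if any required class is absent from the classpath. */
    static void requireClasses(List<String> classNames) {
        List<String> missing = missingClasses(classNames);
        if (!missing.isEmpty()) {
            throw new IllegalStateException(
                "Required classes missing from the classpath: " + missing);
        }
    }

    public static void main(String[] args) {
        // Passes: java.lang.String is always present.
        requireClasses(List.of("java.lang.String"));

        // A placeholder name that does not exist is reported as missing.
        System.out.println(
            missingClasses(List.of("java.lang.String", "com.example.DoesNotExist")));
        // prints [com.example.DoesNotExist]
    }
}
```

Calling something like requireClasses during application startup would have turned a confusing, intermittent NPE into an immediate and clearly worded failure right after the upgrade.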