That explains the jndi lookup, but not why variables are parsed when they are not part of the format string. That just asks for unexpected issues and exploits.
JNDI lookup in it self is not a problem, problem is that user input is not sanitised and can include templates which can have JNDI lookup in them. I would expect user input with {} template symbols to be escaped and not evaluated.
Maybe, but still this seems as vanity feature added because "It would be really convenient"... this wasn't something what was needed, but something what was added to make life of maybe 0.1% of users little bit easier. My guess is that most of users of Log4J2 don't even know that it is able to do such magic, and would be horrified knowing it.
IMHO Log library should log, not do some magic stuff.
The method responsible for variable substitution is here [1].
There are other lookup mechanisms, (didn't check them all) but they only retrieve environment/static values (this one [2] for example retrieves kubernetes attributes). I think the jndi one is the only one that load and execute code.
Edit : I think my understanding of the docs is incorrect so ignore the next paragraph
From the documentation [3], I have the impression that these variables should be evaluated only when loading the pattern layout configuration. But in reality they are also evaluated when formatting the final log message (log.info(...)).
I agree that the input should be sanitized but only if the formatting behavior is a bug and was not intentional.
One possible explanation for the current situation, is that developers assumed that all lookup mechanism will retrieve only static values (env variables for example).
And then another dev introduced the jndi lookup which execute code, but no one noticed the impact on the already existing behavior (evaluation variable when formatting the final msg).
> I agree that the input should be sanitized but only if the formatting behavior is a bug and was not intentional.
Non-pattern arguments should not do any substitution, because otherwise developers have to jump through hoops to output strings verbatim. You don’t want "Invalid identifier: '${<some valid log4j syntax>}'" to be turned into "Invalid identifier: '<the log4j replacement>'" when the actual invalid identifier (e.g. from user input) was the "${…}" syntax. I’m surprised that log4j behaves that way still after two decades.
Recursive replacement is somewhat of a WTF yes but not really in a causitive relationship ro this vulnerability, right? The main cause is the order in which ${} variables are evaluated. They are evaluated after substitution of `{}` in the format string, instead of before. That's the key behavior problem.
A simple rule such as "If you evaluate a placeholder of type `{}` you should stop evaluating further recursively" would maintain most of existing behavior while only removing vulnerable behavior.
> I agree that the input should be sanitized but only if the formatting behavior is a bug and was not intentional.
The behavior is fundamentally wrong as explained by layer8's comment and in the blog post jcheng linked. Apparently it was intentional, and to me that's a million times worse than if it were a bug that could just be fixed.
log4j is unsuitable for use in a way that many many people are suddenly discovering today. The log4j developers need to rethink this, and even if they do, log4j's users should still strongly consider switching to something else. Otherwise they need to audit the whole codebase for other surprising, broken, insecure design decisions not only in its present state but also on each update. I don't see how it's possible to trust the log4j developers' design sensibilities after this.
> Apparently it was intentional, and to me that's a million times worse than if it were a bug that could just be fixed.
I've been on an 'archaeological dig' into the Log4j commit history, and sure enough, there's evidence that the formatting behavior was intentional from the outset.