The author of SnakeYaml argued vehemently that a library executing untrusted cod...

ludwik · on Sept 21, 2023

Wow, this is truly terrifying. It reads as if Donald Trump was the maintainer of a popular library. "Low quality tooling! Sad!!!".

He claims with a straight face that all software using his library only loads YAML from sources that are 100% trusted to execute code. He is given example after example to the contrary, which he ignores, opting instead to keep blaming "low quality tooling" for generating "false reports" about his perfect software. People can be really weird sometimes.

sheepshear · on Sept 21, 2023

A lot of people misunderstand what he's actually saying.

There are two categories of constructors. One is for data that should not be executed, the other is for trusted data that should be executed.

There are two libraries. One has default constructors that can execute data, the other has default constructors that don't execute data.

He's saying to rtfm and choose the library with the correct defaults, choose the correct constructor from that library, and stop trying to take away the choice.

richbell · on Sept 21, 2023

> He's saying to rtfm and choose the library with the correct defaults, choose the correct constructor from that library, and stop trying to take away the choice.

Nobody was trying to "take away the choice".

The problem is that you have to explicitly opt-in to be safe. If you followed the code snippets from the README, your application would be vulnerable to RCE without you realizing it; as people pointed out, it would be more secure to have Constructor (safe by default) + DangerousConstructor rather than Constructor (unsafe by default) + SafeConstructor.
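To make the "Constructor (safe by default) + DangerousConstructor" shape concrete, here is a minimal sketch in Python. It is not SnakeYAML's API: the names `load`, `dangerous_load`, `TAG_KEY` and `KNOWN_CONSTRUCTORS` are hypothetical, JSON stands in for YAML because the standard library has no YAML parser, and the opt-in mode only builds objects from a small allow-list rather than arbitrary classes.

```python
import json


class UnsafeTagError(ValueError):
    """Raised when input asks for object construction but the caller did not opt in."""


# Hypothetical marker key; stands in for a YAML tag that names a class to instantiate.
TAG_KEY = "!construct"

# Hypothetical allow-list of factories that the opt-in mode may call.
KNOWN_CONSTRUCTORS = {"complex": complex, "set": set}


def _build(node, allow_construct):
    """Walk parsed data, constructing tagged objects only when allowed."""
    if isinstance(node, list):
        return [_build(item, allow_construct) for item in node]
    if isinstance(node, dict):
        if TAG_KEY in node:
            if not allow_construct:
                raise UnsafeTagError(f"refusing to construct {node[TAG_KEY]!r}")
            factory = KNOWN_CONSTRUCTORS[node[TAG_KEY]]
            args = _build(node.get("args", []), allow_construct)
            return factory(*args)
        return {key: _build(value, allow_construct) for key, value in node.items()}
    return node  # plain scalar: str, int, float, bool, None


def load(text):
    """Safe default: plain data only; tagged nodes are rejected."""
    return _build(json.loads(text), allow_construct=False)


def dangerous_load(text):
    """Explicit opt-in, named so the risk is visible at the call site."""
    return _build(json.loads(text), allow_construct=True)
```

With this layout, copying the shortest snippet (`load(...)`) gives the safe behaviour, and anyone who wants object construction has to type the word "dangerous" to get it.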

His argument was that "100% of applications using SnakeYaml do not accept untrusted data".

sheepshear · on Sept 21, 2023

I understood him to be speaking tautologically, that you trust the data when you choose the trusting library without using the safe constructor, whether or not you realize the implication. He seems very well informed that some people are using the trusting constructors on untrusted data.

As he explained, this library is, by design, convenient by default. Those seeking safe by default should consider using the other library.

"take away the choice" is my summary of several comments that would have the feature removed. One was about how its existence is a vulnerability if file access is compromised. Another was about how code execution is not in the spec. And so on.

richbell · on Sept 21, 2023

> I understood him to be speaking tautologically, that you trust the data when you choose the trusting library without using the safe constructor, whether or not you realize the implication.

He was speaking literally; he even rejected several of the provided examples because "users have to login first, therefore the data is trusted" which isn't an argument that any security-conscious person would make.

> As he explained, this library is, by design, convenient by default. Those seeking safe by default should consider using the other library.

That is a negligent mindset to have. Log4j added the ability to make arbitrary DNS and LDAP calls for the sake of convenience, which resulted in one of the most consequential vulnerabilities of the past decade.

Opt-in security is dangerous and should never be the default — especially when the feature in question is executing arbitrary input.

sheepshear · on Sept 21, 2023

He also said to sanitize any data that you intend to use with the unsafe constructors. Taken together, he's pointing out that you decide how much you trust the data and you control which constructor to use. "Problem in chair"

"Should" statements are always relative to what you value. Clearly he thinks this trade-off is fine for him. His other library accommodates your security needs but this one accommodates his convenience needs. Can the man not make something for himself?

I assume it would be costly for him to make and propagate the changes. Maybe money could persuade him.

richbell · on Sept 21, 2023

> He also said to sanitize any data that you intend to use with the unsafe constructors. Taken together, he's pointing out that you decide how much you trust the data and you control which constructor to use. "Problem in chair"

That doesn't change the fact that it's a poorly designed API that's insecure by default. There are countless situations where people are inadvertently exposed to risk via transitive dependencies, through no fault of their own.

> Can the man not make something for himself?

He did not make it for himself, he made it to be consumed by others. SnakeYAML is a widely used package.

sheepshear · on Sept 21, 2023

He said he designed it for his use case of executing trusted configuration code, which some others appreciate. Obviously there was some misunderstanding about the goals and priorities of the project.

Making this change would cost him something he values without giving him something else he values in return. According to him, Snake Engine already provides a default safe solution, so he's not leaving anyone without a remedy. It would cost you to switch, but you would get something in return. That seems fair to me.

sznio · on Sept 21, 2023

If you don't trust the data you're supposed to use SafeConstructor.

It's like blaming the browser vendor for an XSS vulnerability rather than the backend implementation not using HtmlEncode.
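For readers unfamiliar with the analogy, output encoding on the backend looks roughly like this. The sketch uses Python's standard `html.escape` as a stand-in for HtmlEncode; `render_comment` is a hypothetical helper, not part of any framework.

```python
import html


def render_comment(author, body):
    """Build an HTML fragment, escaping both user-controlled fields."""
    return f"<p><b>{html.escape(author)}</b>: {html.escape(body)}</p>"


# A comment body that would run as script if it were inserted unescaped.
payload = "<script>alert(1)</script>"
fragment = render_comment("mallory", payload)
print(fragment)
```

The browser then displays the payload as text instead of executing it, which is the responsibility the analogy assigns to the backend.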

zkldi · on Sept 21, 2023

why isn't the default secure? if the default isn't secure, we have learned time and time again that people will use it unknowingly, exposing themselves to security holes.

Here's just a couple examples off the top of my head:

- `$variables` in bash are subject to arbitrary code execution via word splitting without escaping

- PHP register_globals

- PHP, express, and some others parse `?a[b]="foo"` in a query string as an object, allowing for prototype pollution or other exploits

- string concatenation for SQL + escape_string being the default for years

- perl array expansion in function calls

- XML entity inclusion on by default allowing you to read arbitrary files

- log4j executing arbitrary code inside its logs

- passing a variable to printf's first arg

- no difference between escaped and unescaped tags in php

- xargs splitting on whitespace

- yaml allowing arbitrary code execution (it got rails good!)

and there's probably loads more.
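As one illustration from the list above, the SQL string concatenation item can be sketched with Python's standard `sqlite3` module. The table, the `find_user_*` helpers and the attack string are made up for the example.

```python
import sqlite3

# In-memory database with two illustrative rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [("alice", 1), ("bob", 0)])


def find_user_unsafe(name):
    # String concatenation: the input becomes part of the SQL text itself.
    query = "SELECT name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(name):
    # Placeholder binding: the input is passed as data and never parsed as SQL.
    return conn.execute("SELECT name FROM users WHERE name = ?", (name,)).fetchall()


# Closes the string literal early and appends a condition that is always true.
attack = "nobody' OR '1'='1"

print(len(find_user_unsafe(attack)))  # matches every row in the table
print(find_user_safe(attack))         # matches nothing; no user has that literal name
```

The concatenating version is the one that is shorter to write, which is the pattern the list is pointing at: the easy path and the safe path are different paths.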

lodovic · on Sept 21, 2023

wow, what a read. I'm now convinced to immediately remove that library if I ever come across it.